Download Reference microRNA sequences from miRBase
Fetch a copy of the mature microRNA sequences:
wget https://www.mirbase.org/download/CURRENT/mature.fa
gzip -c mature.fa > mature.fa.gz
And the hairpin sequences:
wget https://www.mirbase.org/download/CURRENT/hairpin.fa
gzip -c hairpin.fa > hairpin.fa.gz
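To sanity-check the downloads, you can count the FASTA records (header lines starting with ">") in each file; each should report tens of thousands of sequences for a current miRBase release:
grep -c ">" mature.fa
grep -c ">" hairpin.fa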
Run a test
Before running the pipeline with real data, run the following test:
nextflow run nf-core/smrnaseq -profile test,singularity --outdir results -r 2.1.0
To submit the above command to the HPC cluster, prepare the following script:

#!/bin/bash -l
#PBS -N nfsmrnaseq
#PBS -l select=1:ncpus=2:mem=4gb
#PBS -l walltime=24:00:00

#work on current directory (folder)
cd $PBS_O_WORKDIR

#load java and set up memory settings to run nextflow
module load java
export NXF_OPTS='-Xms1g -Xmx4g'
nextflow run nf-core/smrnaseq -profile test,singularity --outdir results -r 2.1.0
Submitting the job
Once you have created the folder for the run, the samplesheet.csv file, nextflow.config, and launch.pbs, you are ready to submit.
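If you have not written launch.pbs yet, the test script above can be adapted for the real run. The sketch below is one way it might look; the --mature, --hairpin and --mirtrace_species parameters shown here are assumptions, so check them against the nf-core/smrnaseq documentation for your pipeline version and organism:

#!/bin/bash -l
#PBS -N nfsmrnaseq
#PBS -l select=1:ncpus=2:mem=4gb
#PBS -l walltime=24:00:00

#work on current directory (folder)
cd $PBS_O_WORKDIR

#load java and set up memory settings to run nextflow
module load java
export NXF_OPTS='-Xms1g -Xmx4g'

#--mature/--hairpin point at the miRBase files downloaded above (assumed parameter names)
#--mirtrace_species hsa (human) is only an example; replace it with the code for your organism
nextflow run nf-core/smrnaseq -r 2.1.0 -profile singularity \
    --input samplesheet.csv \
    --mature mature.fa.gz \
    --hairpin hairpin.fa.gz \
    --mirtrace_species hsa \
    --outdir results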
Submit the run with this command:
qsub launch.pbs
Monitoring the Run
You can check on the jobs you are running with the command
qstat -u $USER
Alternatively, use the command
qjobs
Nextflow will launch additional jobs during the run.
You can also check the .nextflow.log file, written in the directory from which the run was launched, for details on the run's progress.
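For example, to follow the log as the pipeline progresses:
tail -f .nextflow.log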
Preparing a sample metadata file
Now let's prepare a samplesheet.csv file that specifies the names of your samples and the locations of their raw FASTQ files. The script below generates it for you:
#!/bin/bash -l
#PBS -N nfsmrnaseq
#PBS -l select=1:ncpus=2:mem=4gb
#PBS -l walltime=12:00:00

#work on current directory (folder)
cd $PBS_O_WORKDIR

#User defined variables
##########################################################
DIR='/path/to/raw/FASTQ/files'
INDEX='samplesheet.csv'
##########################################################

#load python module
module load python/3.10.8-gcccore-12.2.0

#fetch the script to create the sample metadata table
wget -L https://raw.githubusercontent.com/nf-core/rnaseq/master/bin/fastq_dir_to_samplesheet.py
chmod +x fastq_dir_to_samplesheet.py

#generate initial sample metadata file
./fastq_dir_to_samplesheet.py $DIR index.csv \
    --strandedness auto \
    --read1_extension .fastq.gz

#format index file
cat index.csv | awk -F "," '{print $1 "," $2}' > ${INDEX}

#Remove intermediate files:
rm index.csv fastq_dir_to_samplesheet.py
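Submit this script with qsub as before. The resulting samplesheet.csv should look something like this (the sample names and paths below are illustrative; yours will be derived from your FASTQ file names):

sample,fastq_1
Control_Rep1,/path/to/raw/FASTQ/files/Control_Rep1.fastq.gz
Control_Rep2,/path/to/raw/FASTQ/files/Control_Rep2.fastq.gz
Treatment_Rep1,/path/to/raw/FASTQ/files/Treatment_Rep1.fastq.gz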