Overview
Similar to exercise 6.4, we will:
Use the “samplesheet.csv” metadata file created in exercise 6.4 for the small RNA-seq datasets.
Use a “nextflow.config” file in the working directory to override Nextflow parameters (e.g., specify where to find the pipeline assets).
Use a PBS script to run the expression profiling of miRNAs against MirGeneDB, a curated database that includes experimentally validated miRNAs.
Prepare pipeline inputs
Let’s move to the working directory:
cd $HOME/workshop/2024-2/session6_smallRNAseq/runs/run2_human_MirGeneDB
Now, let’s copy the samplesheet.csv, the nextflow.config and the PBS script that runs the pipeline against MirGeneDB:
cp $HOME/workshop/2024-2/session6_smallRNAseq/data/human_disease/samplesheet.csv .
cp $HOME/workshop/2024-2/session6_smallRNAseq/scripts/nextflow.config .
cp $HOME/workshop/2024-2/session6_smallRNAseq/scripts/launch_nf-core_smallRNAseq_MirGeneDB.pbs .
Print the content of the launch script:
cat launch_nf-core_smallRNAseq_MirGeneDB.pbs
Submit the job to the HPC cluster:
qsub launch_nf-core_smallRNAseq_MirGeneDB.pbs
Monitor the progress:
qjobs
The job will take several hours to run, hence we will use precomputed results for the statistical analysis in the next section.
Outputs
The pipeline will produce two folders, one called “work,” where all the processing is done, and another called “results,” where we can find the pipeline's outputs. The content of the results folder is as follows:
results/
├── bowtie_index
│   ├── mirna_hairpin
│   └── mirna_mature
├── fastp
│   └── on_raw
├── fastqc
│   ├── raw
│   └── trimmed
├── mirna_quant
│   ├── bam
│   ├── edger_qc  <-- Expression of mature miRNAs (mature_counts.csv) and precursor miRNAs (hairpin_counts.csv)
│   ├── mirtop
│   ├── reference
│   └── seqcluster
├── mirtrace
│   ├── mirtrace-report.html
│   ├── mirtrace-results.json
│   ├── mirtrace-stats-contamination_basic.tsv
│   ├── mirtrace-stats-contamination_detailed.tsv
│   ├── mirtrace-stats-length.tsv
│   ├── mirtrace-stats-mirna-complexity.tsv
│   ├── mirtrace-stats-phred.tsv
│   ├── mirtrace-stats-qcstatus.tsv
│   ├── mirtrace-stats-rnatype.tsv
│   ├── qc_passed_reads.all.collapsed
│   └── qc_passed_reads.rnatype_unknown.collapsed
├── multiqc
│   ├── multiqc_data
│   ├── multiqc_plots
│   └── multiqc_report.html
└── pipeline_info
    ├── execution_report_2024-08-26_14-38-10.html
    ├── execution_report_2024-08-26_17-44-50.html
    ├── execution_timeline_2024-08-26_14-38-10.html
    ├── execution_timeline_2024-08-26_17-44-50.html
    ├── execution_trace_2024-08-26_17-44-50.txt
    ├── nf_core_smrnaseq_software_mqc_versions.yml
    ├── params_2024-08-26_17-45-00.json
    ├── pipeline_dag_2024-08-26_14-38-10.html
    └── pipeline_dag_2024-08-26_17-44-50.html
The mature miRNA and hairpin (precursor) expression counts can be found in the results/mirna_quant/edger_qc directory, relative to the working directory:
cd results/mirna_quant/edger_qc
├── hairpin_counts.csv
├── hairpin_CPM_heatmap.pdf
├── hairpin_edgeR_MDS_distance_matrix.txt
├── hairpin_edgeR_MDS_plot_coordinates.txt
├── hairpin_edgeR_MDS_plot.pdf
├── hairpin_log2CPM_sample_distances_dendrogram.pdf
├── hairpin_log2CPM_sample_distances_heatmap.pdf
├── hairpin_log2CPM_sample_distances.txt
├── hairpin_logtpm.csv
├── hairpin_logtpm.txt
├── hairpin_normalized_CPM.txt
├── hairpin_unmapped_read_counts.txt
├── mature_counts.csv  <-- Expression profile of mature miRNAs.
├── mature_CPM_heatmap.pdf
├── mature_edgeR_MDS_distance_matrix.txt
├── mature_edgeR_MDS_plot_coordinates.txt
├── mature_edgeR_MDS_plot.pdf
├── mature_log2CPM_sample_distances_dendrogram.pdf
├── mature_log2CPM_sample_distances_heatmap.pdf
├── mature_log2CPM_sample_distances.txt
├── mature_logtpm.csv
├── mature_logtpm.txt
├── mature_normalized_CPM.txt
└── mature_unmapped_read_counts.txt
Let’s inspect the mature_counts.csv file by printing it to the screen with the ‘cat’ command:
cat mature_counts.csv
"","hsa-let-7a-5p","hsa-let-7a-3p","hsa-let-7a-2-3p","hsa-let-7b-5p","hsa-let-7b-3p","hsa-let-7c-5p","hsa-let-7c-3p","hsa-let-7d-5p","hsa-let-7d-3p","hsa-
"ERR409882",364608,341,16,59417,1998,68342,44,14861,3790,29486,207,211184,228,1462,7002,2,49664,1,1091,174,326,43,6,468,7,1482,1615,9,17256,534,573,6526,0
"ERR409879",305651,184,6,52115,1476,58425,30,12397,2659,23604,201,198778,151,1013,5486,1,48381,4,945,202,194,40,7,368,3,1097,1317,6,12662,561,372,3693,2,1
"ERR409881",712880,165,9,83857,2335,162724,83,30556,4503,68044,385,456864,348,1818,9893,0,111712,5,1495,259,174,48,6,318,2,1466,2220,4,17865,466,551,10360
"ERR409884",182178,111,3,27892,913,39989,21,7751,1886,13902,159,127386,132,743,3651,3,40311,0,629,117,97,21,11,305,2,1147,902,2,8313,368,242,2276,0,1146,4
"ERR409889",568269,257,13,92339,2239,100021,45,20819,3511,44172,207,276474,259,1376,12407,5,83908,5,1971,467,403,70,30,1082,7,3082,3172,14,24112,819,421,6
"ERR409894",314053,137,9,44708,1220,74145,74,12313,2827,25295,196,196866,158,896,4681,3,43677,1,806,138,131,22,7,296,3,1181,1169,5,11145,611,360,3742,5,12
"ERR409887",178201,48,4,25678,733,41506,27,7833,1613,15724,121,123391,98,497,3288,0,39434,1,445,97,65,15,3,150,2,539,461,3,5837,186,161,2958,2,847,3,1544,
"ERR409880",318121,136,3,46347,1260,65606,39,11095,2269,24585,200,191072,194,1118,5599,2,67420,3,1242,155,168,22,2,505,6,1708,1836,3,11293,482,359,3652,1,
"ERR409890",332579,105,7,40131,955,73537,38,13528,2029,31807,158,207846,175,962,5146,0,42402,0,659,149,102,20,4,219,3,964,1086,4,11957,423,385,6017,4,1556
Note: the “mature_counts.csv” file needs to be transposed prior to running the statistical analysis, so that miRNAs are in rows and samples are in columns. This can be done either in the R script or with a script called “transpose_csv.py”.
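Before transposing, it can help to confirm the orientation of the table without scrolling through the full printout. The following is a minimal sketch using only the Python standard library; the function name summarize_counts is hypothetical (it is not part of the pipeline or the workshop scripts), and it assumes the layout shown above: an empty first header cell followed by miRNA identifiers, then one row per sample.

```python
import csv
import io


def summarize_counts(handle):
    """Return (n_samples, n_mirnas, first_mirna_ids) for a samples-by-miRNA CSV.

    Assumes the first header cell is the (empty) row-name column and that
    each following row starts with a sample ID.
    """
    reader = csv.reader(handle)
    header = next(reader)
    mirna_ids = header[1:]  # skip the empty row-name cell
    sample_ids = [row[0] for row in reader if row]  # ignore blank lines
    return len(sample_ids), len(mirna_ids), mirna_ids[:3]


if __name__ == "__main__":
    # Two samples and two miRNAs taken from the printout above, for illustration.
    demo = io.StringIO(
        '"","hsa-let-7a-5p","hsa-let-7a-3p"\n'
        '"ERR409882",364608,341\n'
        '"ERR409879",305651,184\n'
    )
    print(summarize_counts(demo))
```

To run it on the real file, open it with open("mature_counts.csv", newline="") and pass the file handle to summarize_counts. If the first number is the sample count and the second is the miRNA count, the table is still in its original, untransposed orientation.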
Let’s copy the transpose_csv.py script to the working folder:
cp /work/training/2024/smallRNAseq/scripts/transpose_csv.py .
To check how to use the script, run:
python transpose_csv.py --help
usage: transpose_csv.py [-h] --input INPUT --output OUTPUT

Transpose a CSV file and generate a tab-delimited TXT file.

optional arguments:
  -h, --help       show this help message and exit
  --input INPUT    Input CSV file containing mature miRNA counts.
  --output OUTPUT  Output tab-delimited TXT file.
To transpose the initial “mature_counts.csv” file, run:
python transpose_csv.py --input mature_counts.csv --output mature_counts.txt
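The internals of transpose_csv.py are not shown in this exercise. As a rough sketch of what such a transposition involves, assuming the script simply swaps rows and columns and labels the first column “microRNA”, a standard-library version could look like the following. The function names transpose_rows and transpose_csv are hypothetical and are not taken from the workshop script.

```python
import csv


def transpose_rows(rows, corner="microRNA"):
    """Swap rows and columns of a rectangular table given as a list of lists.

    The top-left cell of the result is set to `corner`, an assumed label
    for the miRNA identifier column.
    """
    rows = [list(r) for r in rows if r]  # drop blank lines
    if not rows:
        return []
    width = len(rows[0])
    if any(len(r) != width for r in rows):
        raise ValueError("input table is not rectangular")
    transposed = [list(col) for col in zip(*rows)]
    transposed[0][0] = corner
    return transposed


def transpose_csv(input_path, output_path):
    """Read a comma-separated file and write its transpose as tab-delimited text."""
    with open(input_path, newline="") as src:
        rows = list(csv.reader(src))
    with open(output_path, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        writer.writerows(transpose_rows(rows))
```

Under these assumptions, transpose_csv("mature_counts.csv", "mature_counts.txt") would produce a table with one miRNA per row and one sample per column. Refusing non-rectangular input is a deliberate choice here: zip would otherwise silently truncate every column to the shortest row.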
Let’s now print the transposed mature counts table:
cat mature_counts.txt
microRNA ERR409882 ERR409879 ERR409881 ERR409884 ERR409889 ERR409894 ERR409887 ERR409880 ERR409890 ERR409878 ERR409885 ERR409886 ERR409891 ERR409899 ERR409893 ERR409896 ERR409895 ERR409888 ERR409892 ERR409898 ERR409897 ERR409883 ERR409900
hsa-let-7a-5p 364608 305651 712880 182178 568269 314053 178201 318121 332579 432950 546049 208284 351586 289926 417695 421395 531417 320229 249354 186910 242774 287209 1258946
hsa-let-7a-3p 341 184 165 111 257 137 48 136 105 288 205 85 100 47 114 102 106 167 88 85 101 262 439
hsa-let-7a-2-3p 16 6 9 3 13 9 4 3 7 17 12 4 7 2 5 9 9 123 13
hsa-let-7b-5p 59417 52115 83857 27892 92339 44708 25678 46347 40131 61357 59795 27498 37230 43602 62467 45870 75630 56350 28694 25897 28396 100340 174494
hsa-let-7b-3p 1998 1476 2335 913 2239 1220 733 1260 955 2535 1771 894 1118 999 1404 1294 1670 2293 1343 764 793 1180 4682
hsa-let-7c-5p 68342 58425 162724 39989 100021 74145 41506 65606 73537 69994 128501 47783 75857 66085 86159 84951 118227 53364 62697 41282 46267 117557 271459
hsa-let-7c-3p 44 30 83 21 45 74 27 39 38 41 70 33 34 38 54 55 77 290 23 28 108 155
hsa-let-7d-5p 14861 12397 30556 7751 20819 12313 7833 11095 13528 15326 22746 7913 14405 14122 19831 17317 27318 10690 9305 8698 10109 9076 55680
hsa-let-7d-3p 3790 2659 4503 1886 3511 2827 1613 2269 2029 3311 3875 1999 2370 2978 3305 2460 5372 3015 2857 1747 1800 3443 8451
hsa-let-7e-5p 29486 23604 68044 13902 44172 25295 15724 24585 31807 29935 46550 18482 29525 26036 39737 33998 51987 20686 20070 15501 19495 50124 114605
hsa-let-7e-3p 207 201 385 159 207 196 121 200 158 229 393 195 198 150 248 154 285 199 185 141 168 512 686
hsa-let-7f-5p 211184 198778 456864 127386 276474 196866 123391 191072 207846 250192 376484 129113 231751 223462 288225 276677 425085 154352 133796 126870 155630 150954 802971
hsa-let-7f-1-3p 228 151 348 132 259 158 98 194 175 233 261 116 156 118 187 174 243 159 192 124 103 226 560
hsa-let-7f-2-3p 1462 1013 1818 743 1376 896 497 1118 962 1387 1999 783 1062 701 1100 1127 1215 642 1024 578 799 1847 3221
hsa-miR-15a-5p 7002 5486 9893 3651 12407 4681 3288 5599 5146 9213 7931 3166 4055 5151 5804 5614 12518 4751 4126 3567 3704 4998 16596
hsa-miR-15a-3p 2 1 0 3 5 3 0 2 0 6 0 0 1 0 1 0 1 09
hsa-miR-16-5p 49664 48381 111712 40311 83908 43677 39434 67420 42402 73300 73543 36662 45433 50446 54609 55368 106992 56124 49628 34812 35133 42637 177920
hsa-miR-16-1-3p 1 4 5 0 5 1 1 3 0 3 1 2 1 1 3 1 3 011 15
hsa-miR-17-5p 1091 945 1495 629 1971 806 445 1242 659 1301 1079 662 666 548 805 681 822 934 590 537 561 1311 2793
hsa-miR-17-3p 174 202 259 117 467 138 97 155 149 247 216 153 110 157 161 134 311 132 113 136 133 380 420
hsa-miR-18a-5p 326 194 174 97 403 131 65 168 102 327 137 104 104 76 115 98 101 222 69 85 74 649 455
hsa-miR-18a-3p 43 40 48 21 70 22 15 22 20 38 44 23 18 25 40 22 62 421 24 34 17 86
hsa-miR-19a-5p 6 7 6 11 30 7 3 2 4 12 3 4 7 2 4 2 6 010
hsa-miR-19a-3p 468 368 318 305 1082 296 150 505 219 542 473 399 298 232 247 237 307 358 346 253 259 817 772
hsa-miR-19b-1-5p 7 3 2 2 7 3 2 6 3 5 2 3 1 0 1 1 113 10
hsa-miR-19b-3p 1482 1097 1466 1147 3082 1181 539 1708 964 1773 1656 1216 1054 766 884 878 1046 1284 1312 788 912 5402 3276
hsa-miR-20a-5p 1615 1317 2220 902 3172 1169 461 1836 1086 2163 1632 884 1199 855 1156 1097 1036 1176 649 673 869 1269 4785
hsa-miR-20a-3p 9 6 4 2 14 5 3 3 4 6 6 7 3 1 2 5 7 96
hsa-miR-21-5p 17256 12662 17865 8313 24112 11145 5837 11293 11957 21365 20088 14302 13403 11423 12353 15500 17497 10842 8743 7668 9397 21234 33370
hsa-miR-21-3p 534 561 466 368 819 611 186 482 423 803 398 595 296 540 464 376 510 470 306 261 267 490 1435
hsa-miR-22-5p 573 372 551 242 421 360 161 359 385 483 562 219 293 160 357 346 231 406 363 229 229 998 1543
hsa-miR-22-3p 6526 3693 10360 2276 6428 3742 2958 3652 6017 4830 7321 4210 4172 4826 6177 5011 9282 3041 5602 3335 2297 10176 16041
hsa-miR-23a-5p 0 2 6 0 7 5 2 1 4 1 1 2 1 9 4 1 7 012 10
hsa-miR-23a-3p 1785 1388 3082 1146 2153 1286 847 2136 1556 1966 2835 2553 1797 1339 1637 1648 2208 1208 2269 894 1054 4353 5098
hsa-miR-24-1-5p 24 15 11 4 23 2 3 11 7 25 14 6 7 2 9 6 8 111 7 2 60 36
hsa-miR-24-3p 5206 4549 6172 2715 6773 3471 1544 4085 3320 5937 4608 3329 2943 2039 3436 2830 2510 3465 3630 3148 1886 23971 13975
hsa-miR-24-2-5p 1 0 9 3 2 3 2 3 1 3 7 6 3 3 11 2 1 114 7
hsa-miR-25-5p 6 2 5 1 6 5 2 1 1 4 1 2 0 2 1 1 4 07
hsa-miR-25-3p 10678 8254 14145 4943 16696 7599 4914 9122 7057 11815 10600 5989 5872 7239 8869 7667 14988 8566 5127 3997 5421 12219 27213
hsa-miR-26a-5p 942607 674879 1353549 460648 1173421 1104148 481683 1026301 905383 1514327 1284627 1028272 954763 588897 719907 838025 918553 861348 1194165 539098 578238 1940621 2160732
hsa-miR-26a-1-3p 33 13 38 11 23 15 5 13 19 12 19 21 11 7 26 15 114 4 11 92 67
hsa-miR-26b-5p 8873 8470 12404 6004 16179 7036 4039 10340 7269 13698 16293 6699 6824 5658 6723 8201 10241 8101 6964 5297 5465 12834 21565
hsa-miR-26b-3p 117 120 161 55 191 64 47 86 88 139 165 86 73 81 85 74 176 886 52 62 93 260
hsa-miR-27a-5p 10 5 6 3 3 47 3 6 4 13 5 12 4 5 12 2 7 347 5 3 15 22
hsa-miR-27a-3p 6316 6048 11563 4314 8880 4538 3946 7333 7578 8045 10429 11134 5353 7110 7001 7213 11208 4270 8606 4711 4313 23050 18861
hsa-miR-28-5p 1189 1003 2766 714 2114 963 798 1139 1316 1423 2938 1438 1285 1523 1462 1706 3236 894 1527 826 943 1703 3740
hsa-miR-28-3p 9931 8512 22673 5850 17423 10445 6004 12452 11139 12577 20995 14581 12398 12654 12736 12438 25137 8456 9078 4997 7356 19199 36389
hsa-miR-29a-5p 139 119 221 78 197 133 48 154 137 129 278 95 129 56 115 114 136 6165 76 76 402 302
hsa-miR-29a-3p 47851 40318 86030 31781 67338 40611 24682 43362 52719 43606 87923 43894 39369 32490 41001 40846 60529 26618 64033 30538 27862 171798 127958
hsa-miR-30a-5p 53741 58354 125902 46942 92089 61011 33253 92094 76030 69153 138612 76104 54456 41981 58085 62219 82200 31718 61622 27745 49837 268845 163743
hsa-miR-30a-3p 5559 4462 11786 3361 6961 6289 3173 7023 5990 6561 12540 4597 5335 4929 6342 5095 9863 3229 6463 2249 3918 16800 17563
hsa-miR-31-5p 203 200 445 273 239 161 134 182 235 193 407 107 191 189 205 135 240 9180 210 140 572 859
hsa-miR-31-3p 16 9 19 1 5 5 3 5 6 3 7 1 6 0 10 4 6 153 15
hsa-miR-32-5p 824 539 651 344 1123 336 205 575 356 895 663 293 333 324 431 450 604 354 267 305 247 491 1286
hsa-miR-32-3p 45 34 52 21 61 21 17 29 28 51 45 19 34 27 28 46 37 321 12 13 22 146
hsa-miR-33a-5p 652 573 664 327 1196 541 185 494 498 841 478 339 222 276 526 325 525 181 341 285 309 3858 1399
hsa-miR-33a-3p 114 83 116 44 123 62 31 67 70 109 109 31 39 38 83 57 72 753 34 49 115 233
hsa-miR-92a-1-5p 13 2 13 4 4 14 1 3 3 2 5 2 5 3 4 4 711 2 2 3 31
hsa-miR-92a-3p 42246 38723 50223 20238 68324 35052 19661 34443 24237 42942 41887 22290 22354 41671 37906 29984 79714 41615 17825 19775 19460 39084 108578
hsa-miR-93-5p 6901 5416 7821 3750 9831 4443 2768 5608 3821 7290 5275 3226 3474 3914 5811 4263 8262 6879 2346 2739 2781 10365 19463
hsa-miR-93-3p 28 10 38 8 42 9 6 15 18 19 20 22 15 6 25 10 19 122 12 9 47 47
hsa-miR-95-5p 84 53 122 46 126 58 26 68 88 83 113 62 76 39 65 82 77 386 32 48 546 217
hsa-miR-95-3p 1110 969 2741 778 1742 1128 644 1520 1625 1217 2296 994 1364 889 1325 1264 1222 735 1296 818 824 3110 4329
hsa-miR-96-5p 26 9 41 12 23 9 47 35 36 17 18 217 27 37 16 24 97 119 9 9 56 58
hsa-miR-98-5p 50824 30986 80693 21549 45532 38319 21471 33600 39626 48053 58479 21535 42277 31044 46855 45473 54780 35260 25804 22597 25577 51970 142624
hsa-miR-98-3p 719 484 911 349 492 393 221 427 478 485 1016 320 645 568 656 668 1103 258 449 299 332 292 1828
hsa-miR-99a-5p 15965 13513 33096 12728 25485 14486 9571 20179 18562 13461 23895 22426 15968 14763 15562 12327 16006 11674 24577 8407 8378 64336 54743
hsa-miR-99a-3p 302 365 621 291 422 335 185 376 358 436 591 258 303 210 318 287 439 240 294 178 251 1491 1198
hsa-miR-100-5p 79481 37070 63987 30825 107620 30986 18025 44661 40955 44899 60082 52166 39781 40535 41072 28735 28319 53908 38033 20461 14628 91004 131887
hsa-miR-100-3p 154 89 184 37 166 144 36 62 62 180 168 40 89 68 93 111 101 101 86 29 48 65 319
hsa-miR-101-5p 449 636 966 420 894 557 310 586 529 627 901 538 456 481 574 496 829 243 728 374 412 753 1422
hsa-miR-101-3p 19431 18440 29284 14575 31144 18471 7449 17823 18165 24323 34198 9800 13607 10769 15929 17565 21584 9448 9053 8167 13225 57684 46489
hsa-miR-29b-1-5p 126 99 298 100 157 143 45 118 143 112 198 62 135 94 151 129 119 83 119 80 62 417 520
hsa-miR-29b-3p 9231 10383 16612 7804 18075 9233 3548 12501 11360 9461 18227 6548 9338 4033 8918 6990 7204 3688 7076 5152 6062 35669 28289
hsa-miR-29b-2-5p 43 33 86 31 59 46 23 61 47 34 48 29 46 31 53 27 223 41 26 21 180 164
hsa-miR-103a-2-5p 47 32 74 21 36 45 22 46 53 36 64 26 25 20 25 30 218 56 21 18 244 149
hsa-miR-103a-3p 57341 56545 120575 45558 74052 60887 33067 65870 58842 65720 82673 35489 52492 42411 73148 55872 72334 50436 50157 29897 36824 208477 252692
hsa-miR-103a-1-5p 6 2 4 0 1 4 2 3 4 4 0 0 1 0 3 4 222 4
hsa-miR-105-5p 94 67 195 23 93 96 64 50 103 74 177 43 95 93 129 124 188 3109 59 41 370 360
hsa-miR-105-3p 7 5 16 4 14 6 7 5 11 13 12 5 7 2 8 5 11 650 44
hsa-miR-106a-5p 99 71 304 88 191 114 43 249 146 128 125 164 83 83 130 104 159 682 63 128 292 575
hsa-miR-106a-3p 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 00
hsa-miR-107 12864 12924 25287 10703 16544 12143 8198 13385 13381 13838 20856 7270 10120 10615 15478 12161 22185 9413 10800 6737 8071 37333 47477
hsa-miR-16-2-3p 13 13 23 3 28 11 11 22 15 22 19 11 8 21 15 14 68 110 10 32
hsa-miR-192-5p 10850 10721 16893 7812 16012 8565 5717 11078 9463 14552 16540 8147 7275 8285 9508 10099 11859 7911 7783 5007 6835 23833 26345
hsa-miR-192-3p 3 1 5 1 1 1 1 3 2 7 5 4 1 2 1 1 1 24
hsa-miR-196a-5p 0 0 19 3 11 11 0 9 6 0 5 12 1 8 0 0 30 011 0 1
hsa-miR-196a-1-3p 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 00
hsa-miR-197-5p 2 7 6 3 2 5 4 1 2 1 5 4 2 4 3 4 2 112 14
hsa-miR-197-3p 4641 5186 9789 4043 6037 4628 3665 6083 4062 5562 7040 6654 4797 5322 6538 4461 8605 6048 7292 3844 2793 15747 20189
hsa-miR-199a-5p 140 82 1008 145 269 206 165 646 153 249 885 715 383 207 241 300 643 9596 93 173 723 520
hsa-miR-199a-3p 939 837 5648 918 1410 1120 1047 3302 915 1580 6687 3161 2028 1693 1651 2319 4909 433 2284 496 1079 2042 3434
hsa-miR-208a-3p 0 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 1 01
hsa-miR-129-5p 6973 6508 13303 3986 6530 7258 3774 5453 7559 7194 7962 3709 6279 4931 10167 6221 6809 4274 4919 3070 4283 64162 33485
hsa-miR-129-1-3p 1734 1327 2143 840 1159 857 773 1283 1321 1454 1387 546 1278 486 1733 870 749 1014 1408 642 784 4986 4616
hsa-miR-148a-5p 89 66 372 67 142 114 63 176 85 120 403 132 115 70 100 135 181 576 41 72 266 286
hsa-miR-148a-3p 6130 5176 25339 4577 8570 6976 5268 12690 6384 7454 37987 11049 9217 4590 5783 8804 11497 2906 10277 2773 5167 16279 19534
hsa-miR-30c-5p 8513 6090 11549 4502 9475 6710 3768 8687 7097 7538 13893 6697 6948 4477 6579 5918 6878 5295 9006 3720 4738 18333 20643
hsa-miR-30c-2-3p 361 296 668 213 521 294 199 357 322 374 798 339 353 324 535 380 823 241 238 149 293 2099 1116
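A table this wide is hard to verify by eye. One quick sanity check before the statistical analysis is to confirm that every miRNA row has the same number of tab-separated fields as the header, i.e. one value per sample. The sketch below is only an illustration of that check; the function name find_ragged_rows is hypothetical and it assumes the file is tab-delimited, as stated in the script's help text.

```python
def find_ragged_rows(lines, delimiter="\t"):
    """Return (line_number, n_fields) for rows whose field count differs from the header.

    `lines` is any iterable of text lines, for example an open file handle.
    Line numbers are 1-based and blank lines are skipped.
    """
    expected = None
    ragged = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue
        n_fields = len(line.split(delimiter))
        if expected is None:
            expected = n_fields  # the first non-blank line is taken as the header
        elif n_fields != expected:
            ragged.append((number, n_fields))
    return ragged
```

For example, opening mature_counts.txt and passing the file handle to find_ragged_rows should return an empty list if the table is rectangular; any entries it returns point at lines worth inspecting before loading the table into R.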