Shell
How do I use basename with GNU parallel?
I have files like these on a Linux system:
10S1_S5_L002_chrm.fasta SRR3184711_chrm.fasta SRR3987378_chrm.fasta SRR4029368_chrm.fasta SRR5204465_chrm.fasta SRR5997546_chrm.fasta 13_S7_L003_chrm.fasta SRR3184712_chrm.fasta SRR3987379_chrm.fasta SRR4029369_chrm.fasta SRR5204520_chrm.fasta SRR5997547_chrm.fasta 14_S8_L003_chrm.fasta SRR3184713_chrm.fasta SRR3987380_chrm.fasta SRR4029370_chrm.fasta SRR5208699_chrm.fasta SRR5997548_chrm.fasta 17_S4_L002_chrm.fasta SRR3184714_chrm.fasta SRR3987415_chrm.fasta SRR4029371_chrm.fasta SRR5208700_chrm.fasta SRR5997549_chrm.fasta 3_S1_L001_chrm.fasta SRR3184715_chrm.fasta SRR3987433_chrm.fasta SRR4029372_chrm.fasta SRR5208701_chrm.fasta SRR5997550_chrm.fasta 4_S2_L001_chrm.fasta SRR3184716_chrm.fasta SRR3987482_chrm.fasta SRR4029373_chrm.fasta SRR5208770_chrm.fasta SRR5997551_chrm.fasta 50m_S10_L004_chrm.fasta SRR3184717_chrm.fasta SRR3987489_chrm.fasta SRR4029374_chrm.fasta SRR5208886_chrm.fasta SRR5997552_chrm.fasta 5_S3_L001_chrm.fasta SRR3184718_chrm.fasta SRR3987493_chrm.fasta SRR4029375_chrm.fasta SRR5211153_chrm.fasta SRR6050903_chrm.fasta 65m_S11_L005_chrm.fasta SRR3184719_chrm.fasta SRR3987495_chrm.fasta SRR4029376_chrm.fasta SRR5211162_chrm.fasta SRR6050905_chrm.fasta 6_S6_L002_chrm.fasta SRR3184720_chrm.fasta SRR3987647_chrm.fasta SRR4029377_chrm.fasta SRR5211163_chrm.fasta SRR6050920_chrm.fasta 70m_S12_L006_chrm.fasta SRR3184721_chrm.fasta SRR3987651_chrm.fasta SRR4029378_chrm.fasta SRR5215118_chrm.fasta SRR6050921_chrm.fasta 80m_S1_L002_chrm.fasta SRR3184722_chrm.fasta SRR3987657_chrm.fasta SRR4029379_chrm.fasta SRR5247122_chrm.fasta SRR6050958_chrm.fasta
There are 423 of them in total, and I was asked to split them into 32 parts for optimal parallelization on 32 CPUs, so now I have this:
10S1_S5_L002_chrm.part-10.fasta SRR3986254_chrm.part-26.fasta SRR4029372_chrm.part-22.fasta SRR5581526-1_chrm.part-20.fasta 10S1_S5_L002_chrm.part-11.fasta SRR3986254_chrm.part-27.fasta SRR4029372_chrm.part-23.fasta SRR5581526-1_chrm.part-21.fasta 10S1_S5_L002_chrm.part-12.fasta SRR3986254_chrm.part-28.fasta SRR4029372_chrm.part-24.fasta SRR5581526-1_chrm.part-22.fasta 10S1_S5_L002_chrm.part-13.fasta SRR3986254_chrm.part-29.fasta SRR4029372_chrm.part-25.fasta SRR5581526-1_chrm.part-23.fasta 10S1_S5_L002_chrm.part-14.fasta SRR3986254_chrm.part-2.fasta SRR4029372_chrm.part-26.fasta SRR5581526-1_chrm.part-24.fasta 10S1_S5_L002_chrm.part-15.fasta SRR3986254_chrm.part-30.fasta SRR4029372_chrm.part-27.fasta SRR5581526-1_chrm.part-25.fasta 10S1_S5_L002_chrm.part-16.fasta SRR3986254_chrm.part-31.fasta SRR4029372_chrm.part-28.fasta SRR5581526-1_chrm.part-26.fasta 10S1_S5_L002_chrm.part-17.fasta SRR3986254_chrm.part-32.fasta SRR4029372_chrm.part-29.fasta SRR5581526-1_chrm.part-27.fasta 10S1_S5_L002_chrm.part-18.fasta SRR3986254_chrm.part-3.fasta SRR4029372_chrm.part-2.fasta SRR5581526-1_chrm.part-28.fasta 10S1_S5_L002_chrm.part-19.fasta SRR3986254_chrm.part-4.fasta SRR4029372_chrm.part-30.fasta SRR5581526-1_chrm.part-29.fasta 10S1_S5_L002_chrm.part-1.fasta SRR3986254_chrm.part-5.fasta SRR4029372_chrm.part-3.fasta SRR5581526-1_chrm.part-2.fasta 10S1_S5_L002_chrm.part-20.fasta SRR3986254_chrm.part-6.fasta SRR4029372_chrm.part-4.fasta SRR5581526-1_chrm.part-30.fasta 10S1_S5_L002_chrm.part-21.fasta SRR3986254_chrm.part-7.fasta SRR4029372_chrm.part-5.fasta SRR5581526-1_chrm.part-31.fasta
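I want to apply a command from the CRISPRCasFinder tool. The command works well when I run it on a single file, namefile.fasta. It also works well when I run it with parallel on namefile.part*.fasta. But when I try to make the command more generic by using basename, nothing works. I want to use basename to keep the name of the input file in the name of the output folder. I tried this on a smaller dataset: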
time parallel 'dossierSortie=$(basename -s .fasta {}) ; singularity exec -B $PWD /usr/local/CRISPRCasFinder-release-4.2.20/CrisprCasFinder.simg perl /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl -so /usr/local/CRISPRCasFinder/sel392v2.so -cf /usr/local/CRISPRCasFinder/CasFinder-2.0.3 -drpt /usr/local/CRISPRCasFinder/supplementary_files/repeatDirection.tsv -rpts /usr/local/CRISPRCasFinder/supplementary_files/Repeat_List.csv -cas -def G --meta -out /databis/defontis/Dossier_fasta_chrm_avec_CRISPRCasFinder/Test/Result{} -in /databis/defontis/Dossier_fasta_chrm_avec_CRISPRCasFinder/Test/{}' ::: *_chrm.part*.fasta
It produced this:
ERR358546_chrm.part-1.fasta SRR4029114_k141_23527.fna.bck SRR5100341_k141_10416.fna.lcp SRR5100345_k141_3703.fna.al1 ERR358546_chrm.part-2.fasta SRR4029114_k141_23527.fna.bwt SRR5100341_k141_10416.fna.llv SRR5100345_k141_3703.fna.bck ERR358546_chrm.part-3.fasta SRR4029114_k141_23527.fna.des SRR5100341_k141_10416.fna.ois SRR5100345_k141_3703.fna.bwt ERR358546_chrm.part-4.fasta SRR4029114_k141_23527.fna.lcp SRR5100341_k141_10416.fna.prj SRR5100345_k141_3703.fna.des ERR358546_chrm.part-5.fasta SRR4029114_k141_23527.fna.llv SRR5100341_k141_10416.fna.sds SRR5100345_k141_3703.fna.lcp ERR358546_chrm.part-6.fasta SRR4029114_k141_23527.fna.ois SRR5100341_k141_10416.fna.sti1 SRR5100345_k141_3703.fna.llv ERR358546_k141_26987.fna SRR4029114_k141_23527.fna.prj SRR5100341_k141_10416.fna.suf SRR5100345_k141_3703.fna.ois ERR358546_k141_33604.fna SRR4029114_k141_23527.fna.sds SRR5100341_k141_10416.fna.tis SRR5100345_k141_3703.fna.prj ERR358546_k141_90631.fna SRR4029114_k141_23527.fna.sti1 SRR5100341_k141_10942.fna SRR5100345_k141_3703.fna.sds ResultERR358546_chrm.part-3 SRR4029114_k141_23527.fna.suf SRR5100341_k141_164.fna SRR5100345_k141_3703.fna.sti1 ResultERR358546_chrm.part-4 SRR4029114_k141_23527.fna.tis SRR5100341_k141_3046.fna SRR5100345_k141_3703.fna.suf ResultSRR4029114_chrm.part-1 SRR5100341_chrm.part-10.fasta SRR5100341_k141_3968.fna SRR5100345_k141_3703.fna.tis ResultSRR4029114_chrm.part-4 SRR5100341_chrm.part-11.fasta SRR5100341_k141_631.fna SRR5100345_k141_4429.fna ResultSRR5100341_chrm.part-10 SRR5100341_chrm.part-12.fasta SRR5100341_k141_6376.fna SRR5100345_k141_4832.fna ResultSRR5100341_chrm.part-11 SRR5100341_chrm.part-13.fasta SRR5100341_k141_8699.fna SRR5100345_k141_6139.fna ResultSRR5100341_chrm.part-3 SRR5100341_chrm.part-1.fasta SRR5100341_k141_8892.fna SRR5100345_k141_731.fna ResultSRR5100341_chrm.part-9 SRR5100341_chrm.part-2.fasta SRR5100345_chrm.part-10.fasta SRR5100345_k141_731.fna.al1 ResultSRR5100345_chrm.part-1 SRR5100341_chrm.part-3.fasta 
SRR5100345_chrm.part-1.fasta SRR5100345_k141_731.fna.bck ResultSRR5100345_chrm.part-4 SRR5100341_chrm.part-4.fasta SRR5100345_chrm.part-2.fasta SRR5100345_k141_731.fna.bwt ResultSRR5100345_chrm.part-9 SRR5100341_chrm.part-5.fasta SRR5100345_chrm.part-3.fasta SRR5100345_k141_731.fna.des SRR4029114_chrm.part-1.fasta SRR5100341_chrm.part-6.fasta SRR5100345_chrm.part-4.fasta SRR5100345_k141_731.fna.lcp SRR4029114_chrm.part-2.fasta SRR5100341_chrm.part-7.fasta SRR5100345_chrm.part-5.fasta SRR5100345_k141_731.fna.llv SRR4029114_chrm.part-3.fasta SRR5100341_chrm.part-8.fasta SRR5100345_chrm.part-6.fasta SRR5100345_k141_731.fna.ois SRR4029114_chrm.part-4.fasta SRR5100341_chrm.part-9.fasta SRR5100345_chrm.part-7.fasta SRR5100345_k141_731.fna.prj SRR4029114_chrm.part-5.fasta SRR5100341_k141_10416.fna SRR5100345_chrm.part-8.fasta SRR5100345_k141_731.fna.sds SRR4029114_k141_14384.fna SRR5100341_k141_10416.fna.al1 SRR5100345_chrm.part-9.fasta SRR5100345_k141_731.fna.sti1 SRR4029114_k141_16765.fna SRR5100341_k141_10416.fna.bck SRR5100345_k141_1211.fna SRR5100345_k141_731.fna.suf SRR4029114_k141_23527.fna SRR5100341_k141_10416.fna.bwt SRR5100345_k141_2884.fna SRR5100345_k141_731.fna.tis SRR4029114_k141_23527.fna.al1 SRR5100341_k141_10416.fna.des SRR5100345_k141_3703.fna
The folder names are not what I want: I would like, for example, just ResultERR358546 rather than ResultERR358546_chrm.part-2.fasta. I don't want one result folder per part, only one per ID.
Your basename command only strips the fixed .fasta extension; as far as I know, it cannot remove a variable pattern. GNU parallel, however, provides Perl expression replacement strings, which are much more powerful than basename. For example, given:
$ ls *_chrm.part*.fasta
ERR358546_chrm.part-2.fasta ERR358546_chrm.part-5.fasta ERR358546_chrm.part-8.fasta
ERR358546_chrm.part-3.fasta ERR358546_chrm.part-6.fasta ERR358546_chrm.part-9.fasta
ERR358546_chrm.part-4.fasta ERR358546_chrm.part-7.fasta
then:
$ parallel echo Result'{= s:_.*$:: =}' ::: *_chrm.part*.fasta
ResultERR358546
ResultERR358546
ResultERR358546
ResultERR358546
ResultERR358546
ResultERR358546
ResultERR358546
ResultERR358546
The substitution s:_.*$:: removes everything from the first underscore to the end of the name. Ported to your original command:
time parallel '
  singularity exec -B "$PWD" /usr/local/CRISPRCasFinder-release-4.2.20/CrisprCasFinder.simg \
    perl /usr/local/CRISPRCasFinder/CRISPRCasFinder.pl \
    -so /usr/local/CRISPRCasFinder/sel392v2.so \
    -cf /usr/local/CRISPRCasFinder/CasFinder-2.0.3 \
    -drpt /usr/local/CRISPRCasFinder/supplementary_files/repeatDirection.tsv \
    -rpts /usr/local/CRISPRCasFinder/supplementary_files/Repeat_List.csv \
    -cas -def G --meta \
    -out /databis/defontis/Dossier_fasta_chrm_avec_CRISPRCasFinder/Test/Result'{= s:_.*$:: =}' \
    -in /databis/defontis/Dossier_fasta_chrm_avec_CRISPRCasFinder/Test/{}
' ::: *_chrm.part*.fasta
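If you want to capture and include the part index, you could change the expression to, for example,
Result'{= s:_chrm\.part-(\d+)\.fasta$:_$1: =}'
or
'{= s:_chrm\.part-(\d+)\.fasta$:Result_$1: =}'
For ERR358546_chrm.part-2.fasta these yield ResultERR358546_2 and ERR358546Result_2, respectively.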
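If you would rather stay close to the basename approach from the question, note that the original command assigns dossierSortie but never uses it: -out is still built from {}. A minimal sketch of the shell-only logic, with basename removing the fixed suffix and a parameter expansion removing the variable part, could look like this. The name result_dir is a hypothetical helper, and the _chrm.part-N naming is assumed from the listings in the question.

```shell
#!/bin/sh
# result_dir NAME: derive "Result<ID>" from "<ID>_chrm.part-N.fasta".
# Hypothetical helper for this sketch only.
result_dir() {
  # basename strips only the fixed ".fasta" suffix (same effect as -s .fasta).
  stem=$(basename "$1" .fasta)
  # The shell pattern strips the variable "_chrm.part-N" tail.
  id=${stem%_chrm.part-*}
  printf 'Result%s\n' "$id"
}

result_dir ERR358546_chrm.part-2.fasta        # ResultERR358546
result_dir 10S1_S5_L002_chrm.part-10.fasta    # Result10S1_S5_L002
```

Inside the quoted parallel command, the same two steps would presumably come first, as in id=$(basename {} .fasta); id=${id%_chrm.part-*}; with -out then ending in Result"$id" instead of Result{}. This is untested against CRISPRCasFinder itself.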