Compare the values of one column against all values in another column

July 12, 2014

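I have 2 input files. Each line of File1 should be compared with each line of File2.
The logic is:
If Column1 of File1 does not match any value in Column1 of File2, print the entire line of File1 to the output file. Do this for every value in Column1 of File1, comparing it against every value in Column1 of File2.
If Column1 matches in both files, print the entire line of File1 only if its Column2 value is greater than N+10 or less than N-10, where N is the Column2 value in File2. This comparison has to be made against all matching rows of File2.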
File1:
Contig1  23
Contig1  42
Contig2  68
Contig3  89
Contig3  102
Contig7  79
File2:
Contig1  40
Contig1  49
Contig3  90
Contig2  90
Contig20 200
Contig1  24
Expected output:
Contig2  68
Contig3  102
Contig7  79
Any solution will do, even one that does not use awk or sed.
I found a similar question, but I am not sure what I need to change:
This is the code:
awk 'NR==FNR {
    lines[NR,"col1"] = $1
    lines[NR,"col2"] = $2
    lines[NR,"line"] = $0
    next
}
(lines[FNR,"col1"] != $1) {
    print lines[FNR,"line"]
    next
}
(lines[FNR,"col2"]+10 < $2 || lines[FNR,"col2"]-10 > $2) {
    print lines[FNR,"line"]
}' file1 file2

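The script below does the following, which I think is what you want:

If a contig from file1 does not exist in file2, print all of that contig's lines.
If it does exist in file2, then for each of its values in file1, print the value only if it lies outside the range N-10 to N+10 for every value N that file2 has for that contig. For example, Contig1 23 is not printed because file2 contains Contig1 24, which is within 10 of it.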

#!/usr/bin/env perl
use strict;
use warnings;

my (%file1, %file2);

## read file1, the 1st argument
open(F1, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!\n";
while(<F1>){
   chomp;
   ## Split the line on whitespace into the @F array.
   my @F=split(/\s+/); 

   ## Save all lines in the %file1 hash.
   ## $F[0] is the contig name and $F[1] the value.
   ## The hash will store a list of all values
   ## associated with this contig.
   push @{$file1{$F[0]}},$F[1];
}
close(F1);
## read file2, the second argument
open(F2, '<', $ARGV[1]) or die "Cannot open $ARGV[1]: $!\n";
while(<F2>){
   ## remove newlines
   chomp;
   ## save the fields into array @F
   my @F=split(/\s+/); 
   ## Again, save all values associated with each
   ## contig into the %file2 hash. 
   push @{$file2{$F[0]}},$F[1];
}
close(F2);

## For each of the contigs in file1 (sorted by name, because
## Perl does not guarantee any particular hash key order)
foreach my $contig (sort keys(%file1)) {
   ## If this contig exists in file 2
   if(defined $file2{$contig}){
       ## get the list of values for that contig
       ## in each of the two files
       my @f2_vals=@{$file2{$contig}};
       my @f1_vals=@{$file1{$contig}};
       ## For each of file1's values for this contig
       val1:foreach my $val1 (@f1_vals) {
               ## For each of file2's value for this contig
               foreach my $val2 (@f2_vals) {
                   ## Skip to the next value from file1 unless
                   ## this one falls within the desired range.
                   unless(($val1 < $val2-10) || ($val1 > $val2+10)){
                       next val1;
                   }
               }
               ## We will only get here if none of the values
               ## fell within the desired range. If so, we should
               ## print the value from file1.
               print "$contig $val1\n";
           }
   }
   ## If this contig is not in file2, print the
   ## lines from file1. This will print all lines
   ## from file1 whose contig was not in file2.
   else {
       print "$contig $_\n" for @{$file1{$contig}}
   }
}

Save this in a text file (say foo.pl), make it executable (chmod a+x foo.pl) and run it like this:

./foo.pl file1 file2

On your example, it returns:

$ ./foo.pl file1 file2
Contig2 68
Contig3 102
Contig7 79
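
Since the question asked about awk, here is a minimal sketch of the same logic in awk, offered as an alternative rather than a tested replacement for the Perl script. The function name compare_contigs and the array names count and vals are made up for this illustration. It assumes whitespace-separated columns, numeric values in the second column, and a non-empty file2 (the NR == FNR idiom misfires when the first file awk reads is empty).

```shell
# compare_contigs FILE1 FILE2  (hypothetical helper name)
# Prints each line of FILE1 whose contig is absent from FILE2, or whose
# value is more than 10 away from every FILE2 value for the same contig.
compare_contigs() {
    # awk reads FILE2 first so the reference values are loaded before
    # any line of FILE1 is examined.
    awk '
        # First file on the command line (FILE2): store every value per contig.
        NR == FNR {
            count[$1]++
            vals[$1, count[$1]] = $2
            next
        }
        # Second file (FILE1): keep the line unless some FILE2 value N
        # for the same contig satisfies N-10 <= value <= N+10.
        {
            keep = 1
            for (i = 1; i <= count[$1]; i++) {
                n = vals[$1, i]
                if ($2 >= n - 10 && $2 <= n + 10) {
                    keep = 0
                    break
                }
            }
            if (keep) print
        }
    ' "$2" "$1"
}
```

After defining the function in your shell, call it as compare_contigs file1 file2. Because file1 is streamed line by line, the output keeps file1's original line order and spacing, and on the sample files from the question it should print the Contig2 68, Contig3 102 and Contig7 79 lines.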

Source: https://unix.stackexchange.com/questions/143771
