Bash

Processing a text file: extracting data that starts with a number

  • August 17, 2015

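I have a list of contact information that I need to process. It is a .txt file with comma-separated fields. We only want to keep the records whose SIC code starts with 65 (real estate).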

The command should look for values starting with 65 only in the correct field, not anywhere on the line.

Keep in mind that the number will not always be 6531; it only has to start with 65 (for example, 6521, 6555, and 6587 are also values we want to keep).

2,J,John Foraste Photography,atlanticinn@biri.com,68 Middle Hwy,Barrington,RI,2806, , ,733511,Photographic Engineering,atlanticinn.com
3,X,Xerox Corp,danielle_cook@xeroxscanners.com,10 Orms St # 420,Providence,RI,02904-7815,5594547871,4012763242,504403,Copying & Duplicating Machines & Supls,www.xerox.com
4,S,St Sahag & St Mesrob Armenian,h.ghajanian@osjl.com,70 Jefferson St,Providence,RI,02908-4923,4012722832,4012727712,866107,Churches,www.stsahmes.org
13,C,Century 21 Access America,damonray@mail.com,1025 Tiogue Ave,Coventry,RI,02816-6100,4018282100, ,6531,Real Estate, 
14,B,Baxter's Showroom,rbaxters@aol.com,Null,Warwick,RI,0,4017398222,4017397058,594409,Jewelers,baxtersjewelry.com^^majorfindings.com^^robertbaxter.com^^san
17,R,Re/Max South County,ereadey@yahoo.com,56 Wells Street,Westerly,RI,2891,4015962067, ,6531,Real Estate, 
19,L,Lyn Reale - Block Island Realty,saintwolfe@computer.org,215Chapelstreet,Block Island,RI,2807,4012534311, ,653118,Real Estate,stmichaelsbristolri.org
21,R,Re/Max South County,apage@remax.net,56 Wells Street,Westerly,RI,2891,4015962067, ,6531,Real Estate, 
22,V,Vns Home Health Svc,george@vnshomehealth.org,14 Woodruff Ave # 7,Narragansett,RI,02882-3467,4017882253,4017820500,808201,Home Health Service,

The processed list should be:

13,C,Century 21 Access America,damonray@mail.com,1025 Tiogue Ave,Coventry,RI,02816-6100,4018282100, ,6531,Real Estate, 
17,R,Re/Max South County,ereadey@yahoo.com,56 Wells Street,Westerly,RI,2891,4015962067, ,6531,Real Estate, 
19,L,Lyn Reale - Block Island Realty,saintwolfe@computer.org,215Chapelstreet,Block Island,RI,2807,4012534311, ,653118,Real Estate,stmichaelsbristolri.org
21,R,Re/Max South County,apage@remax.net,56 Wells Street,Westerly,RI,2891,4015962067, ,6531,Real Estate, 

awk

awk -F, '{if ( $11 ~ /^65/ ) print $0}' file

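Explanation: -F, sets the comma as the field separator; if ( $11 ~ /^65/ ) checks whether the 11th field starts with 65 (the ^ anchors the match to the start of the field); if it does, print $0 prints the whole line.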
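The same test can be written more compactly, because awk prints the current line by default when a pattern has no action. The following is a minimal sketch, not part of the original answer: the function name keep_sic65 is hypothetical, and the two input rows are copied from the sample data in the question.

```shell
# keep_sic65 is a hypothetical helper name used only for this illustration.
# A bare pattern with no action block prints every line that matches it.
keep_sic65() {
  awk -F, '$11 ~ /^65/' "$@"
}

# Feed two rows from the question; only the row with ID 19 has an
# 11th field (653118) that starts with 65, so only that row is printed.
keep_sic65 <<'EOF'
3,X,Xerox Corp,danielle_cook@xeroxscanners.com,10 Orms St # 420,Providence,RI,02904-7815,5594547871,4012763242,504403,Copying & Duplicating Machines & Supls,www.xerox.com
19,L,Lyn Reale - Block Island Realty,saintwolfe@computer.org,215Chapelstreet,Block Island,RI,2807,4012534311, ,653118,Real Estate,stmichaelsbristolri.org
EOF
```

Like the command above, this assumes that no field contains an embedded comma; a quoted field such as "Smith, John" would shift the column count.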

This should do it:

#!/usr/bin/env perl

use strict;
use warnings;

while ( <> ) {
   print if (split /,/)[10] =~ m/^65/;
}

You can turn it into a one-liner if you prefer:

perl -ne 'print if (split /,/)[10] =~ m/^65/;' yourfile

Source: https://unix.stackexchange.com/questions/223683