Bash: remove leading whitespace from the second column

I have a space-delimited file that looks like this:

12  12.57428314.57490104 ENSG00000065361 rs2271194 rs61939899
2  2.198148577.198835577 ENSG00000065413 rs4524134 rs2697288 rs6738721
6  6.84279922.84407274 ENSG00000065609 rs2016358 rs35791305
10  10.104585135.104956335 ENSG00000065613 rs72811696
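I want to remove the leading space from the second column (there are two spaces between columns 1 and 2 instead of one). Does anyone know a sed or awk command for this?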

With cut:

cut -d " " -f 1,3- file
Output:

12 12.57428314.57490104 ENSG00000065361 rs2271194 rs61939899
2 2.198148577.198835577 ENSG00000065413 rs4524134 rs2697288 rs6738721
6 6.84279922.84407274 ENSG00000065609 rs2016358 rs35791305
10 10.104585135.104956335 ENSG00000065613 rs72811696
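
The cut command works because, with a single space as the delimiter, two adjacent spaces enclose an empty field 2, and `-f 1,3-` skips it. The sketch below illustrates this and two cases where it may not do what you want. `drop_empty_second_field` is a hypothetical helper name, and the shortened sample lines are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads space-delimited lines on stdin and drops field 2.
# With -d ' ', two adjacent spaces enclose an empty field, so field 2 is the
# empty string between them.
drop_empty_second_field() {
    cut -d ' ' -f 1,3-
}

# Two spaces after column 1: the empty field 2 is removed.
printf '12  12.57428314.57490104 ENSG00000065361\n' | drop_empty_second_field

# Three spaces after column 1: fields 2 and 3 are both empty, so one extra
# space survives in the output.
printf '12   12.57428314.57490104 ENSG00000065361\n' | drop_empty_second_field

# Only one space after column 1: field 2 is real data and gets dropped.
printf '12 12.57428314.57490104 ENSG00000065361\n' | drop_empty_second_field
```

This approach therefore seems safe only when every line has exactly two spaces after the first column, as in the sample file.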

tr -s (or tr --squeeze-repeats) collapses each run of a repeated character into a single occurrence. So, to squeeze every run of repeated spaces, you can write:

tr -s ' ' < input-file > output-file
Output:

12 12.57428314.57490104 ENSG00000065361 rs2271194 rs61939899
2 2.198148577.198835577 ENSG00000065413 rs4524134 rs2697288 rs6738721
6 6.84279922.84407274 ENSG00000065609 rs2016358 rs35791305
10 10.104585135.104956335 ENSG00000065613 rs72811696
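
Note that tr -s acts on every run of spaces on the line, not only the one after the first column. A minimal sketch of that behaviour, where `squeeze_spaces` is a hypothetical helper name and the input line is invented for illustration:

```shell
#!/bin/sh
# Hypothetical helper: squeeze each run of spaces on stdin to a single space.
squeeze_spaces() {
    tr -s ' '
}

# Both the double space after column 1 and the triple space later in the
# line are reduced to one space each.
printf '12  12.57428314.57490104   ENSG00000065361\n' | squeeze_spaces
```

If later columns may legitimately contain repeated spaces, one of the column-specific approaches is probably a better fit.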

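With GNU sed, replace the run of whitespace characters after the first column with a single space:

sed -E 's/^(\S+)\s+/\1 /' ip.txt
For other versions, use

  • [[:space:]] in place of \s
  • [^[:space:]] in place of \S

You may prefer [:blank:] (space and tab only) to [:space:] (all whitespace characters).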
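
For a sed without \s and \S, the bracket-expression form might look like the sketch below. It uses basic regular expressions (`\{1,\}` instead of `+`) so that no -E flag is needed. `normalize_first_gap` is a hypothetical helper name, and the sample input is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: replace the whitespace run after the first column
# with one space, using only POSIX character classes and BRE syntax.
normalize_first_gap() {
    sed 's/^\([^[:space:]]\{1,\}\)[[:space:]]\{1,\}/\1 /'
}

# Only the gap after column 1 changes; the later double space is kept.
printf '12  12.57428314.57490104 ENSG00000065361  rs2271194\n' | normalize_first_gap

# A mix of tabs and spaces after column 1 is also reduced to one space.
printf '10\t \t10.104585135.104956335 ENSG00000065613\n' | normalize_first_gap
```

Unlike the tr approach, this leaves spacing in the rest of the line untouched.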


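This awk command replaces every run of consecutive spaces with a single space. The trailing 1 prints every line, including any line that contains no space at all (using gsub alone as the pattern would silently drop such lines):

$ awk '{ gsub(/ +/, " ") } 1' file
12 12.57428314.57490104 ENSG00000065361 rs2271194 rs61939899
2 2.198148577.198835577 ENSG00000065413 rs4524134 rs2697288 rs6738721
6 6.84279922.84407274 ENSG00000065609 rs2016358 rs35791305
10 10.104585135.104956335 ENSG00000065613 rs72811696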
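
The gsub call above rewrites every run of spaces on the line. If only the gap after the first column should change, sub, which replaces just the first match, is one option. This is a sketch that assumes lines do not begin with a space; `squeeze_first_gap` is a hypothetical helper name and the input is invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: collapse only the first run of spaces on each line.
# sub() replaces the leftmost match; the trailing 1 prints every line.
squeeze_first_gap() {
    awk '{ sub(/ +/, " ") } 1'
}

# The gap after column 1 becomes a single space; the later double space
# is left as it was.
printf '12  12.57428314.57490104 ENSG00000065361  rs2271194\n' | squeeze_first_gap
```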

Since the extra space is always the first space on the line, simply delete the first space on each line:

$ sed 's/ //' file
12 12.57428314.57490104 ENSG00000065361 rs2271194 rs61939899
2 2.198148577.198835577 ENSG00000065413 rs4524134 rs2697288 rs6738721
6 6.84279922.84407274 ENSG00000065609 rs2016358 rs35791305
10 10.104585135.104956335 ENSG00000065613 rs72811696
