Awk: split one .txt file into multiple files based on a condition


I have a problem: I want to split one file into several files based on a condition. Input: a text file

variable chrom=chr1
1000 10
1010 20
1020 10
variable chrom=chr2
1000 20
1100 30
1200 10

Output: two files in this example

chr1.txt

variable chrom=chr1
1000 10
1010 20
1020 10

chr2.txt

variable chrom=chr2
1000 20
1100 30
1200 10

So the split condition is: whenever a marker line with chrom=chr$i (i = 1..22) appears, that line and the lines after it go to a separate text file.

Thank you.

Something along these lines:

awk 'BEGIN { filename="unknown.txt" } /^variable chrom=/ { close(filename); filename = substr($0, index($0, "=") + 1) ".txt"; } { print > filename }'

where the awk code is

BEGIN { filename="unknown.txt" }   # default file name, used only if the
                                   # file doesn't start with a variable chrom=
                                   # line
/^variable chrom=/ {               # in such a line:
  close(filename)                  # close the previous file (if open)
                                   # and set the new filename
  filename = substr($0, index($0, "=") + 1) ".txt"
}
{ print > filename }               # print everything to the current file.
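To try the script end to end, here is a minimal sketch that recreates the sample input from the question and runs the splitter on it. The scratch directory and the input name textfile.txt are assumptions made for the demo, not part of the question.

```shell
# Work in a scratch directory so the demo does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample input from the question (file name is an assumption).
cat > textfile.txt <<'EOF'
variable chrom=chr1
1000 10
1010 20
1020 10
variable chrom=chr2
1000 20
1100 30
1200 10
EOF

# Each "variable chrom=" line closes the current output and switches
# to a new file; every line is then printed to the current file.
awk 'BEGIN { filename = "unknown.txt" }
/^variable chrom=/ {
  close(filename)
  filename = substr($0, index($0, "=") + 1) ".txt"
}
{ print > filename }' textfile.txt

# List the result and show the first section.
ls
cat chr1.txt
```

Because the sample input begins with a marker line, nothing should be written to unknown.txt in this run.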

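The basic algorithm is very simple: read the file line by line, change the file name whenever a line starts a new section, and always print the current line to the current file. The crux is therefore isolating the file name from the marker line.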

filename = substr($0, index($0, "=") + 1) ".txt"

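This approach is simplistic, but it works for the example you showed: it takes everything after the first "=" and appends ".txt" to get the file name. If your marker lines are more complex than

variable chrom=filenamestub

it will have to be amended, but in that case I can only guess at your requirements and might guess wrong.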
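As one possible amendment, suppose the marker lines carry extra fields after the chromosome name, for example a trailing span=10. That format is purely hypothetical and does not appear in the question. Under that assumption, a sketch using the standard awk match() function can pick out only the value that follows chrom=:

```shell
# Work in a scratch directory so the demo does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: marker lines with an extra span= field (an assumption).
cat > textfile.txt <<'EOF'
variable chrom=chr1 span=10
1000 10
1010 20
variable chrom=chr2 span=10
1000 20
EOF

awk 'BEGIN { filename = "unknown.txt" }
/^variable chrom=/ {
  close(filename)
  # match() sets RSTART and RLENGTH for "chrom=" plus the run of
  # non-blank characters that follows it.
  match($0, /chrom=[^ \t]+/)
  # Skip the 6 characters of "chrom=" so only the value remains.
  filename = substr($0, RSTART + 6, RLENGTH - 6) ".txt"
}
{ print > filename }' textfile.txt

ls
```

With the original one-liner, the same input would yield file names that include the trailing field, since it keeps everything after the first "=".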


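If you know how many lines lie between the markers, you can use

split -l 4 textfile.txt

This splits the text file after every fourth line, producing the files xaa, xab, and so on.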
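A small sketch of the split approach on the sample data from the question follows. It relies on every section being exactly four lines long, which happens to hold for the sample but is an assumption about real input; the file name textfile.txt and the scratch directory are also just for the demo.

```shell
# Work in a scratch directory so the demo does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input: two sections of four lines each.
cat > textfile.txt <<'EOF'
variable chrom=chr1
1000 10
1010 20
1020 10
variable chrom=chr2
1000 20
1100 30
1200 10
EOF

# Cut the file into consecutive 4-line pieces with the default
# output names xaa, xab, ...
split -l 4 textfile.txt

# Show the first line of each piece.
head -n 1 xaa xab
```

Unlike the awk solution, the pieces are not named after the chromosome, so they would still need renaming if names like chr1.txt are required.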
