Csv: Tabularizing text file data
I have a text file containing information in this format:
%%%
key1 = value1
key2 = value2
key3 = subkey1:subvalue1;subkey2:subvalue2
%%%
key1 = value1
key2 = value2
key3 = subkey1:subvalue1;subkey2:subvalue2
%%%
I would like to convert it to a CSV-like format:
key1,key2,key3_subkey1,key3_subkey2
value1,value2,subvalue1,subvalue2
value1,value2,subvalue1,subvalue2
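What is the best way to do this? I am hoping there is a Unix utility such as Awk/Sed/Grep that I can use instead of writing a Python/Perl program that reads each line, maintains state, and converts to CSV format.

I'm not sure it gets much easier than a quick manual parse. Because Pandas builds the table, the code below can even handle any set of keys, including arbitrary keys with subkeys: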
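import pandas

data = []
with open('input.txt') as f:
    for line in f:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith('%%%'):
            # a delimiter line starts a new record
            o = {}
            data.append(o)
            continue
        key, value = line.split(' = ', 1)
        if ':' in value:
            # flatten "sub:val;sub:val" into key_sub columns
            for pairstring in value.split(';'):
                subkey, subvalue = pairstring.split(':', 1)
                o[f'{key}_{subkey}'] = subvalue
        else:
            o[key] = value

# drop the empty record created by the trailing '%%%';
# index=False keeps the row-number column out of the CSV
records = [o for o in data if o]
pandas.DataFrame.from_records(records).to_csv('output.csv', index=False)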
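If pulling in Pandas is not an option, the same idea can be sketched with only the standard library. This is a minimal sketch, not a definitive implementation: the function name `records_to_csv` is made up for illustration, and it assumes that every data line has the form `key = value` and that subkey pairs are separated by `;` and `:` as in the sample input.

```python
import csv
import io


def records_to_csv(text: str) -> str:
    """Convert '%%%'-delimited 'key = value' records to CSV text (hypothetical helper)."""
    records = []   # one dict per record
    columns = []   # column names in first-seen order
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith('%%%'):
            # a delimiter line starts a new record
            current = {}
            records.append(current)
            continue
        if current is None:
            # tolerate input that does not begin with a delimiter
            current = {}
            records.append(current)
        key, value = line.split(' = ', 1)
        if ':' in value:
            # flatten "sub:val;sub:val" into key_sub columns
            pairs = (p.split(':', 1) for p in value.split(';'))
            items = [(f'{key}_{sub}', subval) for sub, subval in pairs]
        else:
            items = [(key, value)]
        for name, val in items:
            current[name] = val
            if name not in columns:
                columns.append(name)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval='',
                            lineterminator='\n')
    writer.writeheader()
    # skip the empty record created by the trailing '%%%'
    writer.writerows(r for r in records if r)
    return out.getvalue()


if __name__ == '__main__':
    sample = (
        "%%%\n"
        "key1 = value1\n"
        "key2 = value2\n"
        "key3 = subkey1:subvalue1;subkey2:subvalue2\n"
        "%%%\n"
    )
    print(records_to_csv(sample), end='')
```

Columns are collected in first-seen order, and a record that lacks a key gets an empty cell (`restval=''`), so records with differing key sets still line up under one header.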
$ cat tst.awk
BEGIN {
    FS  = "[[:space:]]*=[[:space:]]*"
    OFS = ","
}
!/%%%/ {
    n = split($2,subFlds,/[:;]/)
    if ( n == 1 ) {
        # plain "key = value" line
        hdrs = hdrs sep $1
        vals = vals sep $2
        sep  = OFS
    }
    else {
        # "key = sub:val;sub:val" line: one key_sub column per pair
        for ( i=1; i<=n; i+=2 ) {
            hdrs = hdrs sep $1 "_" subFlds[i]
            vals = vals sep subFlds[i+1]
            sep  = OFS
        }
        # assumes the line with subkeys is the last line of each record
        if ( !doneHdr++ ) {
            print hdrs
        }
        print vals
        hdrs = vals = sep = ""
    }
}
$ awk -f tst.awk file
key1,key2,key3_subkey1,key3_subkey2
value1,value2,subvalue1,subvalue2
value1,value2,subvalue1,subvalue2