Sorting: organizing inconsistent values
Tags: sorting, csv, text, awk

I'm not sure whether it's appropriate to ask this here, since it isn't really programming, but I don't know where else to go: I'd like to organize the following data in a consistent way. Right now it's a mess, and only the first two columns are consistently comma-separated. There can be anywhere from 1 to 9 further columns, and they usually differ from row to row.

In other words, I'd like to sort it so that all the value entries line up in one column, all the recoil entries in another, and so on. Then I could delete the text, add headers, and it would still make sense.
bm_wp_upg_o_t1micro, sight, value = 3, zoom = 3, recoil = 1, spread_moving = -1
bm_wp_upg_o_marksmansight_rear, sight, value = 3, zoom = 1, recoil = 1, spread = 1
bm_wp_upg_o_marksmansight_front, extra, value = 1
bm_wp_m4_upper_reciever_edge, upper_reciever, value = 3, recoil = 1
bm_wp_m4_upper_reciever_round, upper_reciever, value = 1
bm_wp_m4_uupg_b_long, barrel, value = 4, damage = 1, spread = 1, spread_moving = -2, concealment = -2
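Any suggestions would be great, even if it's only about where the right place to ask this would be.

The context is just raw data from a game file that I'm trying to tidy up.

I'm afraid regular expressions won't help you much here. Your input is irregular, so a regex could probably match it, but arranging the result one way or another would be a hassle. This can be done quite easily in any programming language, but for something like this, I always go to awk.

Assuming your input is in a file named input.txt, put the following into a program named parse.awk:

And get your output: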
id, sight, value, zoom, recoil, spread_moving, extra, upper_receiver, barrel, damage, spread_moving, concealment
bm_wp_upg_o_t1micro, x, 3, 3, 1, -1, , , , , -1,
bm_wp_upg_o_marksmansight_rear, x, 3, 1, 1, , , , , , ,
bm_wp_upg_o_marksmansight_front, , 1, , , , x, , , , ,
bm_wp_m4_upper_reciever_edge, , 3, , 1, , , , , , ,
bm_wp_m4_upper_reciever_round, , 1, , , , , , , , ,
bm_wp_m4_uupg_b_long, , 4, , , -2, , , x, 1, -2, -2
Note that I chose to use just an "x" for sight, which seems to be a present/not-present kind of thing. Use whatever you like.
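The per-line work that any such script has to do (split on commas, then split each `key = value` item) can be sketched as below. This is only an illustration, not the parse.awk from the answer: the helper name `to_row` and the fixed column list (value, zoom, recoil) are assumptions made for the example.

```shell
# Hypothetical helper: turn one "id, type, key = value, ..." line into
# "id,type,<value>,<zoom>,<recoil>", leaving a blank for each missing key.
to_row() {
    printf '%s\n' "$1" | awk -F' *, *' '
    {
        split("", kv)                  # clear the per-line key/value map
        for (i = 3; i <= NF; i++) {
            split($i, p, " *= *")      # p[1] = key, p[2] = value
            kv[p[1]] = p[2]
        }
        # The column list below is an assumption chosen for this example.
        print $1 "," $2 "," kv["value"] "," kv["zoom"] "," kv["recoil"]
    }'
}

to_row "bm_wp_m4_upper_reciever_edge, upper_reciever, value = 3, recoil = 1"
# prints bm_wp_m4_upper_reciever_edge,upper_reciever,3,,1
```

A key that does not occur on the line simply yields an empty cell, which is what keeps every row at the same number of columns.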
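If you are on Linux or a Macintosh, you should already have awk available. If you are on Windows, you will have to install it.

Stick with Ethan's answer; this is just me enjoying myself. Yes, that makes me weird.

awk script

Output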
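I made another awk version. I think this one should be easier to read. It reads all the values/columns from the file, to keep it as dynamic as possible.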
awk -F, '
{
    ID[$1]=$2                       # use column 1 as index
    for (i=3;i<=NF;i++ )            # loop through all fields from #3 to end
    {
        gsub(/ +/,"",$i)            # remove space from field
        split($i,a,"=")             # split field in name and value a[1] and a[2]
        COLUMN[a[1]]++              # store field name as column name
        DATA[$1" "a[1]]=a[2]        # store data value in DATA using field #1 and column name as index
    }
}
END {
    printf "%49s ","info"           # print info
    for (i in COLUMN)
        {printf "%15s",i}           # print column name
    print ""
    for (i in ID)                   # loop through all ID
    {
        printf "%32s %16s ",i, ID[i]        # print ID and info
        for (j in COLUMN)
        {
            printf "%14s ",DATA[i" "j]+0    # print value
        }
        print ""
    }
}' file
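The column order this script prints comes from `for (i in COLUMN)`, and awk does not specify the traversal order of `for ... in`, so the columns may come out in a different order with another awk. A minimal sketch of one way to pin the order to first appearance, assuming the same input layout; the names `list_columns`, `SEEN`, and `ORDER` are made up for illustration.

```shell
# Sketch: remember column names in first-seen order instead of relying
# on the unspecified order of "for (j in COLUMN)".
list_columns() {
    awk -F, '
    {
        for (i = 3; i <= NF; i++) {
            gsub(/ +/, "", $i)         # same whitespace cleanup as the script above
            split($i, a, "=")          # a[1] = column name
            if (!(a[1] in SEEN)) {     # first time this name appears
                SEEN[a[1]] = 1
                ORDER[++n] = a[1]      # ORDER[1..n] keeps insertion order
            }
        }
    }
    END {
        for (k = 1; k <= n; k++)       # numeric loop, so the order is deterministic
            printf "%s%s", (k > 1 ? " " : ""), ORDER[k]
        print ""
    }' "$@"
}

printf '%s\n' \
    "bm_wp_upg_o_t1micro, sight, value = 3, zoom = 3, recoil = 1, spread_moving = -1" \
    "bm_wp_m4_uupg_b_long, barrel, value = 4, damage = 1, spread = 1" |
    list_columns
# prints value zoom recoil spread_moving damage spread
```

Looping over `ORDER[1..n]` in the END block, for the header and for each row, would then give the same column order on every run.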
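Comments:

First of all, your question really is vague. You say the first two columns are consistent, but they are not consistent at all. The only thing I understood is that you want to tidy something up, but what? I have no idea. Also, if you don't specify the language you are using, how do you expect to get a targeted answer? Besides, we are not a free coding service; you have to show us what you tried.

Nice work. Do you need the value "X" or "1" for keys such as "extra" and "upper_receiver"? I wonder whether it is worth handling an arbitrary set of keys rather than a predetermined set of keys? The first is probably straightforward and worthwhile, even necessary. The second is more speculative, harder to do, and falls into the "only if you have the time and energy" category.

Hey, yes, I thought about that while writing this, but my awk is a little rusty. Maybe when I have time, I'll go the extra mile…

Thanks! I really didn't know where to go with this kind of problem, or what to use. You solved it perfectly!

Looks great! I'm not sure I understand it yet, but it has given me some ideas for collecting the data directly from my source files. At the moment I use regular expressions to identify and clean up the relevant parts in order to produce the data I included in my first post.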
The awk script from the answer that defers to Ethan's, followed by its input file (data) and its output:
awk 'BEGIN {
    # f_idx[field] holds the column number c for a field=value item
    # f_name[c]    holds the names
    # f_width[c]   holds the width of the widest value (or the field name)
    # f_fmt[c]     holds the appropriate format
    FS = " *, *"; n = 2;
    f_name[0] = "id";   f_width[0] = length(f_name[0])
    f_name[1] = "type"; f_width[1] = length(f_name[1])
}
{
    #-#print NR ":" $0
    # Columns 0 and 1 (id and type) are always present
    line[NR,0] = $1
    len = length($1)
    if (len > f_width[0])
        f_width[0] = len
    line[NR,1] = $2
    len = length($2)
    if (len > f_width[1])
        f_width[1] = len
    # Remaining fields are "name = value" items; allocate columns on first sight
    for (i = 3; i <= NF; i++)
    {
        split($i, fv, " = ")
        #-#print "1:" fv[1] ", 2:" fv[2]
        if (!(fv[1] in f_idx))
        {
            f_idx[fv[1]] = n
            f_width[n++] = length(fv[1])
        }
        c = f_idx[fv[1]]
        f_name[c] = fv[1]
        gsub(/ /, "", fv[2])
        len = length(fv[2])
        if (len > f_width[c])
            f_width[c] = len
        line[NR,c] = fv[2]
        #-#print c ":" f_name[c] ":" f_width[c] ":" line[NR,c]
    }
}
END {
    # Build one right-aligned format per column: separator, then padded value
    for (i = 0; i < n; i++)
        f_fmt[i] = "%s%" f_width[i] "s"
    #-#for (i = 0; i < n; i++)
    #-#    printf "%d: (%d) %s %s\n", i, f_width[i], f_name[i], f_fmt[i]
    # Header row
    pad = ""
    for (j = 0; j < n; j++)
    {
        printf f_fmt[j], pad, f_name[j]
        pad = ","
    }
    printf "\n"
    # Data rows; cells that were never set print as blanks
    for (i = 1; i <= NR; i++)
    {
        pad = ""
        for (j = 0; j < n; j++)
        {
            printf f_fmt[j], pad, line[i,j]
            pad = ","
        }
        printf "\n"
    }
}' data
bm_wp_upg_o_t1micro, sight, value = 3, zoom = 3, recoil = 1, spread_moving = -1
bm_wp_upg_o_marksmansight_rear, sight, value = 3, zoom = 1, recoil = 1, spread = 1
bm_wp_upg_o_marksmansight_front, extra, value = 1
bm_wp_m4_upper_receiver_edge, upper_receiver, value = 3, recoil = 1
bm_wp_m4_upper_receiver_round, upper_receiver, value = 1
bm_wp_m4_uupg_b_long, barrel, value = 4, damage = 1, spread = 1, spread_moving = -2, concealment = -2
id, type,value,zoom,recoil,spread_moving,spread,damage,concealment
bm_wp_upg_o_t1micro, sight, 3, 3, 1, -1, , ,
bm_wp_upg_o_marksmansight_rear, sight, 3, 1, 1, , 1, ,
bm_wp_upg_o_marksmansight_front, extra, 1, , , , , ,
bm_wp_m4_upper_receiver_edge,upper_receiver, 3, , 1, , , ,
bm_wp_m4_upper_receiver_round,upper_receiver, 1, , , , , ,
bm_wp_m4_uupg_b_long, barrel, 4, , , -2, 1, 1, -2
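The script that produced this output sets `FS = " *, *"`, a regular expression that swallows the spaces around each comma, whereas a plain `,` separator leaves the leading space inside the field. A small sketch of the difference; the helper name `second_field` is made up for the example.

```shell
# Print the second field in brackets so that stray spaces are visible.
# $1 is the field separator to use, $2 is the input line.
second_field() {
    printf '%s\n' "$2" | awk -v FS="$1" '{ print "[" $2 "]" }'
}

second_field ','     'bm_wp_upg_o_t1micro, sight, value = 3'   # prints [ sight]
second_field ' *, *' 'bm_wp_upg_o_t1micro, sight, value = 3'   # prints [sight]
```

With the regex separator the fields arrive already trimmed, which is why that script can measure widths with `length($2)` without a separate cleanup step.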
Output of the shorter awk version (the one that builds the ID, COLUMN and DATA arrays):
info spread recoil zoom concealment spread_moving damage value
bm_wp_m4_upper_reciever_round upper_reciever 0 0 0 0 0 0 1
bm_wp_m4_uupg_b_long barrel 1 0 0 -2 -2 1 4
bm_wp_upg_o_marksmansight_rear sight 1 1 1 0 0 0 3
bm_wp_upg_o_marksmansight_front extra 0 0 0 0 0 0 1
bm_wp_m4_upper_reciever_edge upper_reciever 0 1 0 0 0 0 3
bm_wp_upg_o_t1micro sight 0 1 3 0 -1 0 3
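The zeros in this output come from the `+0` in `DATA[i" "j]+0`: a cell that was never set is an empty string, and adding 0 turns it into 0, so a missing value cannot be told apart from a real 0. A minimal sketch of one alternative, assuming a placeholder is wanted instead; the `-` marker and the name `show_cell` are arbitrary choices for the example.

```shell
# Sketch: test for the key with "in" and print a placeholder for missing cells.
show_cell() {
    awk 'BEGIN {
        DATA["a zoom"] = 3                 # only "zoom" is set for id "a"
        split("zoom recoil", cols, " ")    # columns to print, in order
        for (k = 1; k <= 2; k++) {
            key = "a " cols[k]
            cell = (key in DATA) ? DATA[key] : "-"
            printf "%s=%s\n", cols[k], cell
        }
    }'
}

show_cell
# prints zoom=3 then recoil=-
```

Unlike reading `DATA[key]` directly, the `in` test also avoids creating an empty element for every missing combination.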