Shell Unix - Remove inner double quotes when the comma is the terminator
Input file:
"1","2col",""3col " "
"2","2col"," "3c,ol " "
"3","2col"," 3co,l"
"4","2col","3co,l"
"5","2col",""3co,l "" "
"6","2col",""3c,ol ""3c,ol"""
Output file:
"1","2col","3col "
"2","2col"," 3c,ol "
"3","2col"," 3co,l"
"4","2col","3co,l"
"5","2col","3co,l "
"6","2col","3c,ol 3c,ol"
Please help me get the above output using Unix commands. Note that the third column is modified in the output: all of its inner double quotes are removed.
The comma is the terminator. A comma that sits between double quotes is not treated as a terminator. See line 6: after the second comma, a comma appears as text between double quotes, which is fine.
What I have tried so far:
sed 's/""|/|/g'
sed -e "s/\"\"//g"
perl -pe 's/(?<!^)(?<!\,)"(?!\,)(?!$)/""/g'
Assuming that the first and second columns are "clean", that is, they do not contain a comma (,).
Input:
"1","2col",""3col " "
"2","2col"," "3c,ol " "
"3","2col"," 3co,l"
"4","2col","3co,l"
"5","2col",""3co,l "" "
"6","2col",""3c,ol ""3c,ol"""
Command:
tr -d '"' < input | awk -F',' -v OFS=',' '{$1="\""$1"\"";$2="\""$2"\"";printf $1 OFS $2 OFS "\"";for(u=3;u<=NF;u++){if(u!=NF)printf $u OFS;else printf $u};printf "\"" RS}'
Output:
"1","2col","3col "
"2","2col"," 3c,ol "
"3","2col"," 3co,l "
"4","2col","3co,l"
"5","2col","3co,l "
"6","2col","3c,ol 3c,ol"
Explanation:
'{
$1="\""$1"\""
$2="\""$2"\""
printf $1 OFS $2 OFS "\""
for(u=3;u<=NF;u++)
{
if(u!=NF)printf $u OFS
else printf $u
}
printf "\"" RS
}'
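tr -d '"' deletes every ".
| awk pipes the output to awk.
-F',' -v OFS=',' defines both the input and the output field separator as a comma.
$1="\""$1"\"";$2="\""$2"\""; wraps each of the first two columns in ", and printf $1 OFS $2 OFS "\""; then prints them, followed by the opening " of the third column.
for(u=3;u<=NF;u++){...} prints the remaining fields with the commas put back between them, and the final printf "\"" RS writes the closing " and ends the record.
Use the quotes to find the first two fields and concatenate the other fields: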
awk -F '"' '
BEGIN {q="\""}
{printf "%s", q$2q$3q$4q$5q; for (i=6;i<=NF;i++) printf "%s", $i; print q}
' inputfile
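To see why the command above prints $2 through $5 and starts its loop at field 6, it can help to dump the fields awk produces when the separator is a double quote. The following is only a minimal sketch: show_fields is a hypothetical helper name and the sample record is a shortened, made-up variant of record 6, not part of the original answer.

```shell
# show_fields (hypothetical helper): print every field awk sees when
# the field separator is a double quote.
show_fields() {
  awk -F '"' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

# Shortened sample record (an assumption, not taken verbatim from the input)
printf '%s\n' '"6","2col",""3c,ol""' | show_fields
# Fields 2 and 4 hold the first two values, fields 3 and 5 are the commas
# between them, and fields 6 onward are the pieces of the third column.
```

Because every double quote acts as a separator, gluing fields 6..NF back together without a separator is what drops the inner quotes.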
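Hello Allan, sed 's/""|/|/g' and sed -e "s/\"\"//g" and perl -pe 's/(?…
perl's Text::CSV module might solve this problem nicely.
Why, in the first line of the test example, does the input have two trailing spaces: "1","2col","3col[]"" " while the output has four trailing spaces: "1","2col","3col[]"? Please adjust the example so that it matches your requirements exactly.
The double quotes in record 5 are unbalanced... can you clarify that?
@Allan A few days ago I upvoted the answer below. A comma in the first two fields will fail: "1","2,col","3col"
@WalterA: I have edited the answer and added a disclaimer. In the OP's post, col1 and col2 look fine, so I built my solution on that assumption.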
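As the comments point out, splitting on commas breaks as soon as one of the first two fields contains a comma. One possible way around that, sketched here under the assumption that the first two fields contain no double quotes, is to anchor a regular expression on the first two quoted fields and strip the quotes from whatever follows. strip_third_field_quotes is a hypothetical name, and printf "%s" is used so that a % in the data is not interpreted as a format directive.

```shell
# strip_third_field_quotes (hypothetical helper): keep the first two
# quoted fields as they are and remove every double quote from the rest.
# Assumption: fields 1 and 2 contain no double quotes (commas are fine).
strip_third_field_quotes() {
  awk '{
    # match the first two quoted fields together with their trailing commas
    if (match($0, /^"[^"]*","[^"]*",/)) {
      head = substr($0, 1, RLENGTH)    # e.g. "1","2,col",
      rest = substr($0, RLENGTH + 1)   # everything after the second field
      gsub(/"/, "", rest)              # drop all quotes in the third column
      printf "%s\"%s\"\n", head, rest  # re-wrap the third column in quotes
    } else {
      print                            # leave non-matching lines untouched
    }
  }'
}

# Made-up record with a comma inside the second field
printf '%s\n' '"1","2,col",""3c,ol""' | strip_third_field_quotes
```

Lines that do not start with two quoted fields are passed through unchanged, so malformed records stay visible in the output.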