Bash: sed statement to change/modify CSV separators and delimiters


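I have some CSV files containing comma-separated values, and some column values can contain characters such as
,!/\&

I'm trying to convert the CSV into a comma-separated, quote-enclosed CSV.

Sample data: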

DateCreated,DateModified,SKU,Name,Category,Description,Url,OriginalUrl,Image,Image50,Image100,Image120,Image200,Image300,Image400,Price,Brand,ModelNumber
2012-10-19 10:52:50,2013-06-11 02:07:16,34,Austral Foldaway 45 Rotary Clothesline,Home & Garden > Household Supplies > Laundry Supplies > Drying Racks & Hangers,"Watch the Product Video            Plenty of Space to Hang a Family Wash  Austral's Foldaway 45 rotary clothesline is a folding head rotary clothes hoist beautifully finished in either Beige or Heritage Green.  Even though the Foldaway 45 is compact, you still get a large 45 metres of line space, big enough for a full family wash.  If you want the advantage of a rotary hoist, but dont want to lose your yard, then the Austral Foldaway 45 is the clothesline for you.   Installation Note: A core hole is only required when installing into existing concrete, e.g. a pathway. Not required in the ground(grass/soil).  To watch video on YouTube, click the following link: Austral Foldaway 45 Rotary Clothesline                   //           Customer Video Reviews            ",https://track.commissionfactory.com.au/p/10604/1718695,http://www.lifestyleclotheslines.com.au/austral-foldaway-45-rotary-clothesline/,http://content.commissionfactory.com.au/Products/7228/1718695.jpg,http://content.commissionfactory.com.au/Products/7228/1718695@50x50.jpg,http://content.commissionfactory.com.au/Products/7228/1718695@100x100.jpg,http://content.commissionfactory.com.au/Products/7228/1718695@120x120.jpg,http://content.commissionfactory.com.au/Products/7228/1718695@200x200.jpg,http://content.commissionfactory.com.au/Products/7228/1718695@300x300.jpg,http://content.commissionfactory.com.au/Products/7228/1718695@400x400.jpg,309.9000 AUD,Austral,FA45GR
The result I'm trying to achieve is:

"DateCreated","DateModified","SKU","Name","Category","Description","Url","OriginalUrl","Image","Image50","Image100","Image120","Image200","Image300","Image400","Price","Brand","ModelNumber"
"2012-10-19 10:52:50","2013-06-11 02:07:16","34","Austral Foldaway 45 Rotary Clothesline","Home & Garden > Household Supplies > Laundry Supplies > Drying Racks & Hangers","Watch the Product Video            Plenty of Space to Hang a Family Wash  Austral's Foldaway 45 rotary clothesline is a folding head rotary clothes hoist beautifully finished in either Beige or Heritage Green.  Even though the Foldaway 45 is compact, you still get a large 45 metres of line space, big enough for a full family wash.  If you want the advantage of a rotary hoist, but dont want to lose your yard, then the Austral Foldaway 45 is the clothesline for you.   Installation Note: A core hole is only required when installing into existing concrete, e.g. a pathway. Not required in the ground(grass/soil).  To watch video on YouTube, click the following link: Austral Foldaway 45 Rotary Clothesline                   //           Customer Video Reviews            ","https://track.commissionfactory.com.au/p/10604/1718695","http://www.lifestyleclotheslines.com.au/austral-foldaway-45-rotary-clothesline/","http://content.commissionfactory.com.au/Products/7228/1718695.jpg","http://content.commissionfactory.com.au/Products/7228/1718695@50x50.jpg","http://content.commissionfactory.com.au/Products/7228/1718695@100x100.jpg","http://content.commissionfactory.com.au/Products/7228/1718695@120x120.jpg","http://content.commissionfactory.com.au/Products/7228/1718695@200x200.jpg","http://content.commissionfactory.com.au/Products/7228/1718695@300x300.jpg","http://content.commissionfactory.com.au/Products/7228/1718695@400x400.jpg","309.9000 AUD","Austral","FA45GR"

Any help would be greatly appreciated.

It sounds like you want every line in the file to start and end with a double quote. If so, this should work:

sed -i.bak 's/^\(.*\)$/"\1"/' filename
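As a quick check of what that substitution does, here is a minimal sketch that runs the same expression on standard input instead of editing a file in place. The sample line is made up for illustration. Note that it wraps the whole line in a single pair of quotes rather than quoting each field:

```shell
# Same expression as above, applied to stdin rather than in place.
# The input line is a made-up example, not taken from the real data.
printf '%s\n' 'abc,345,some words' | sed 's/^\(.*\)$/"\1"/'
# prints: "abc,345,some words"
```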
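Try this solution. It is better than my previous version because it now uses a parser that correctly handles commas inside fields. It requires the Text::CSV_XS module to work: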

#!/usr/bin/env perl

use strict;
use warnings;
use Text::CSV_XS;

die qq|Usage: perl $0 <csv-file>\n| unless @ARGV == 1;

open my $fh, '<', shift or die qq|ERROR: Could not open input file\n|;

my $csv = Text::CSV_XS->new( {
        always_quote => 1,
} );

while ( my $row = $csv->getline( $fh ) ) { 
        $csv->print( *STDOUT, $row );
        print "\n";
}
$csv->eof;
close $fh;
First, let's try the simple (and "not good enough") solution, which just adds double quotes around every field (including fields that already have them, which is not what you want).
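The command for this first attempt is not shown here. A minimal sketch consistent with the description that follows might look like this, assuming GNU sed (where -r enables extended regular expressions; -E is an equivalent spelling). The function name quote_naive is hypothetical:

```shell
# Naive quoting: wrap every run of non-comma characters in double quotes.
# Assumes GNU sed; quote_naive is an illustrative name, not from the thread.
quote_naive() {
    sed -r 's/([^,]*)/"\1"/g'
}

# Try it on the sample line used in this answer.
printf '%s\n' 'abc,345, some words ,"some text","text,with,commas"' | quote_naive
```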

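Good: the first part looks for sequences that contain no comma, the second part puts double quotes around them, and the final "g" means the substitution is applied as many times as possible on each line.

This will turn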

abc,345, some words ,"some text","text,with,commas"
into
"abc","345"," some words ",""some text"",""text","with","commas""

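A few things to note:

  • It correctly wraps " some words " with the spaces between the words, but it also wraps the leading and trailing spaces. I assume that's OK, but if not, it can be fixed.

  • If a field already has quotes, it gets quoted again, which is wrong. This needs fixing.

  • If a field already has quotes and the text inside contains commas (which should not be treated as field separators), the pieces between those commas get quoted too. This also needs fixing.

So we want to match two different regexps: either a quoted string, or a field without commas: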

sed -r 's/([^,"]*|"[^"]*")/"\1"/g'
The result is now

"abc","345"," some words ",""some text"",""text,with,commas""
As you can see, the text that was originally quoted now has doubled double quotes around it. We have to remove these with a second sed command:

sed -r 's/([^,"]*|"[^"]*")/"\1"/g' | sed 's/""/"/g'
resulting in

"abc","345"," some words ","some text","text,with,commas"

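The two-step pipeline above can be wrapped in a small function for reuse. This is only a sketch, assuming GNU sed; the name quote_fields is hypothetical:

```shell
# Quote every field, then collapse the doubled quotes that appear around
# fields that were already quoted. Assumes GNU sed (-r for extended regex).
# quote_fields is an illustrative name, not from the thread.
quote_fields() {
    sed -r 's/([^,"]*|"[^"]*")/"\1"/g' | sed 's/""/"/g'
}

printf '%s\n' 'abc,345, some words ,"some text","text,with,commas"' | quote_fields
# prints: "abc","345"," some words ","some text","text,with,commas"
```

One limitation of this approach: the cleanup step also collapses the "" that the first step produces for an empty field, so a line such as a,,b does not survive intact. For data with empty fields or quotes embedded inside fields, a real CSV parser such as the Text::CSV_XS script shown earlier is the safer choice.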
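Sorry, it's not every line that should be quoted; every column in the CSV should be enclosed in "

@TimDunkley: Please provide more information. Are there any errors? No modification at all? Do some lines work but not all? What happens if you try with a small part of your whole file?

@TimDunkley: Ah, OK. The lines that fail do so because they have commas inside fields. That's too complex; this awk program cannot handle that case. It's necessary to switch to a language with a CSV parser. I'll try to fix it.

@TimDunkley: I've edited the answer to use a CSV parser instead. I also had to change the language; it is now perl.

I've updated the original post. For some reason, perl Text::CSV_XS seems to cause problems with my existing script.

@TimDunkley: .convert.pl: is this a perl script? I don't understand your whole script, but I think you should look at the line that assigns a value to the _header_columns variable. There you convert the header and remove the leading double quote (using sed), but you don't remove the trailing one.

Thanks for your reply. I've just tested your sed syntax, but it doesn't seem to give the desired result. I don't know if I'm missing something; I'm still getting the hang of all this :) If you want to look at one of the original files, use the following to fetch the CSV:
wget --trust-server-names --header="Content-Type:text/csv" -O EXAMPLE_IN.csv --user-agent="Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.205 Safari/534.16" 'http://dashboard.commissionfactory.com.au/Affiliate/Creatives/DataFeeds/jPSA4dbg17SY7svvjeSX5Jf1iO@b5JXshOfY@ovjzeKj4PGivuyn5qqxrLDr86GysLTj$bTyoKaj77Pltfmh9dvnkOCS4MHzjvTSlK6Dfg==/'

sed: -e expression #1, char 24: Unmatched ( or \(
is the output from my Ubuntu client when trying to fetch the file. I got a file named "EXAMPLE_IN.csv", but it is empty :(

OK, found the problem. Your file was created on Windows, so its line endings are incompatible with Linux. Windows marks a new line with two characters: \r followed by \n. Linux uses only \n. This incompatibility is what's biting you. We first need to remove the \r from your file. Try the following: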
cat EXAMPLE_IN.csv | tr -d '\r' | sed -r 's/([^,"]*|"[^"]*")/"\1"/g'