Regex: Using a regular expression to modify specific columns in a CSV

I would like to convert some strings in a CSV from 0000-2400 (HHMM) format to 00-24 (hour only) format, e.g.

2011-01-01,"AA",12478,31703,12892,32575,"0906",-4.00,"1209",-26.00,2475.00
2011-01-02,"AA",12478,31703,12892,32575,"0908",-2.00,"1236",1.00,2475.00
2011-01-03,"AA",12478,31703,12892,32575,"0907",-3.00,"1239",4.00,2475.00
Columns 7 and 9 are the departure and arrival times, respectively. Ideally, when I'm done, the lines should look like this:

2011-01-01,"AA",12478,31703,12892,32575,"09",-4.00,"12",-26.00,2475.00
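The whole CSV will eventually be imported into R, and I'd like to do some of the processing beforehand because the file is fairly large. I initially tried to do this in Perl, but I had trouble matching multiple digits with a regex. With a lookbehind expression I can get a single digit before a given comma, but not more than one.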


I'm also open to being told that doing this in Perl is needlessly silly and that I should stick with R. :)

I might as well offer my own solution, which is:

s/"(\d\d)\d\d"/"$1"/g
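To illustrate what that substitution does, here is a minimal sketch that applies it to the first sample row from the question. Note that it shortens every quoted four-digit field on the line, not only columns 7 and 9, so it relies on the assumption that no other column holds a quoted four-digit value.

```perl
use strict;
use warnings;

# Sample row taken from the question.
my $line = '2011-01-01,"AA",12478,31703,12892,32575,"0906",-4.00,"1209",-26.00,2475.00';

# Copy the line, then keep only the first two digits of every
# quoted four-digit field ("0906" becomes "09").
(my $short = $line) =~ s/"(\d\d)\d\d"/"$1"/g;

print "$short\n";
# prints: 2011-01-01,"AA",12478,31703,12892,32575,"09",-4.00,"12",-26.00,2475.00
```

On a whole file the same substitution could be run as a one-liner such as perl -pe 's/"(\d\d)\d\d"/"$1"/g' input.csv > output.csv, where input.csv and output.csv are placeholder file names.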

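As I mentioned in the comments, using a CSV module is the safe choice. Here is a quick sample script showing how to use it. You will notice that it does not preserve the quotes, although I expected it to, since I set keep_meta_info. If that matters to you, I'm sure there is a way around it.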

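use strict;
use warnings;

use Text::CSV;
my $csv = Text::CSV->new({
        binary => 1,
        eol => $/,
        keep_meta_info => 1,
});
# Read the sample rows from the __DATA__ section, one parsed row at a time.
while (my $row = $csv->getline(*DATA)) {
    # Columns 7 and 9 (zero-based indexes 6 and 8) hold the HHMM times.
    for ($row->[6], $row->[8]) {
        # \K keeps the first two digits; only the last two are removed.
        s/\d\d\K\d\d//;
    }
    $csv->print(*STDOUT, $row);
}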

__DATA__
2011-01-01,"AA",12478,31703,12892,32575,"0906",-4.00,"1209",-26.00,2475.00
2011-01-02,"AA",12478,31703,12892,32575,"0908",-2.00,"1236",1.00,2475.00
2011-01-03,"AA",12478,31703,12892,32575,"0907",-3.00,"1239",4.00,2475.00
Output:

2011-01-01,AA,12478,31703,12892,32575,09,-4.00,12,-26.00,2475.00
2011-01-02,AA,12478,31703,12892,32575,09,-2.00,12,1.00,2475.00
2011-01-03,AA,12478,31703,12892,32575,09,-3.00,12,4.00,2475.00
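
If the quotes need to survive, one possible workaround is to skip the CSV parser and edit the two columns directly. The following is only a sketch under a strong assumption: no field contains an embedded comma or newline, which is exactly the case a CSV module guards against. The helper name shorten_times is hypothetical and not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: truncate HHMM to HH in the given zero-based columns.
# Assumes no field contains an embedded comma or newline.
sub shorten_times {
    my ($line, @cols) = @_;
    my @fields = split /,/, $line, -1;    # -1 keeps trailing empty fields
    for my $i (@cols) {
        next unless defined $fields[$i];
        # Keep an optional opening quote plus two digits, drop the next
        # two digits, and keep the optional closing quote.
        $fields[$i] =~ s/^("?\d\d)\d\d("?)$/$1$2/;
    }
    return join ',', @fields;
}

my $row = '2011-01-02,"AA",12478,31703,12892,32575,"0908",-2.00,"1236",1.00,2475.00';
print shorten_times($row, 6, 8), "\n";
# prints: 2011-01-02,"AA",12478,31703,12892,32575,"09",-2.00,"12",1.00,2475.00
```

Unlike a substitution applied to the whole line, this only touches the columns that are passed in, so a quoted four-digit value elsewhere in the row is left alone.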

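I would consider using a module designed to handle CSV.

Thanks for this update. I originally just wanted something quick and unsafe, but this is probably wiser.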