Regex: use a regular expression to match or remove a string that occurs multiple times between two strings

Tags: regex, pcre

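I have a large CSV export in which the columns are misaligned, because some values were accidentally spread over several cells instead of one. Luckily, these values sit between two unique strings. I would like to use a regular expression to merge them into a single cell. Sample data:

"apple","NULL","0","0","0",",","1",",","fruit","red","sweet","D$","object"
"horse","NULL","0","0","0",",","1",",","animal","large","tail","D$","object"
"Los Angeles","NULL","0","0","0",",","1",",","city","California","smoggy","entertainment","D$","location"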
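The unmerged values begin after

"NULL","0","0","0",",","1",",","
and end before

","D$"
I am trying to work out a regular expression that removes the "," between the values so that they are merged, giving output like this: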

"apple","NULL","0","0","0",",","1",",","fruit,red,sweet","D$","object"
"horse","NULL","0","0","0",",","1",",","animal,large,tail","D$","object"
"Los Angeles","NULL","0","0","0",",","1",",","city,California,smoggy,entertainment","D$","location"

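The best I could manage with a regular expression alone is to match the whole run of values, rather than putting each one into a capture group. That means the match/replace cannot be done without a callback function. How you do this depends on your language, but I will show a PHP example. Here is the expression:

(?<="NULL","0","0","0",",","1",",",)(?:"[^"]+",?)+(?=,"D\$")

It will match:

"fruit","red","sweet"

In PHP, I use preg_replace_callback() to loop through each match and replace every instance of "," with a plain comma. When $csv equals the sample data, this gives you the expected output:

$csv = preg_replace_callback(
    // The lookbehind ends just after the "," cell; the lookahead stops before "D$".
    '/(?<="NULL","0","0","0",",","1",",",)(?:"[^"]+",?)+(?=,"D\$")/',
    function($matches) {
        // $matches[0] is the whole run of quoted values.
        return str_replace('","', ',', $matches[0]);
    },
    $csv
);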
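A self-contained run of the callback approach on the three sample rows might look like the sketch below. It assumes every row has "1",",", immediately before the unmerged values (as in the expected output), so the lookbehind ends in ",", and the names $merged and the CSV nowdoc label are only illustrative.

```php
<?php
// Sample rows; each is assumed to contain "1",",", right before the unmerged values.
$csv = <<<'CSV'
"apple","NULL","0","0","0",",","1",",","fruit","red","sweet","D$","object"
"horse","NULL","0","0","0",",","1",",","animal","large","tail","D$","object"
"Los Angeles","NULL","0","0","0",",","1",",","city","California","smoggy","entertainment","D$","location"
CSV;

$merged = preg_replace_callback(
    // Fixed-length lookbehind for the start marker, lookahead for the end marker.
    '/(?<="NULL","0","0","0",",","1",",",)(?:"[^"]+",?)+(?=,"D\$")/',
    function ($matches) {
        // Collapse the "," separators inside the matched run into plain commas.
        return str_replace('","', ',', $matches[0]);
    },
    $csv
);

echo $merged, PHP_EOL;
```

Because the lookbehind must have a fixed length, this only works while the start marker is exactly the same in every row.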
You can do it like this:

$pattern = '~(?:"NULL","0","0","0",",","1",",","|(?!^)\G)[^"]+\K","(?!D\$)~';
$csv = preg_replace($pattern, ',', $csv);
Pattern details:

~             # delimiter
(?:
    "NULL","0","0","0",",","1",",","
  |           
    (?!^)\G   # anchor for the end of the last match
)
[^"]+         # content between quotes
\K            # removes all on the left from match result
","           # ","
(?!D\$)       # not followed by D$
~
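The idea of this pattern is to use the \G anchor, which means "start of the string" or "end of the last match". I added (?!^) to rule out the first case.

"NULL","0","0","0",",","1",",","  serves as the entry point for the first match. The content between the quotes is then matched. Because \K removes everything on its left from the match result, only the "," is replaced.

The following matches use \G as their entry point, and consecutive matches continue until (?!D\$) fails, that is, until the next cell is D$.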
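To see the \G chain in action, the following sketch lists what the pattern actually matches on the first sample row and then performs the replacement. The variable names ($row, $found, $result) are illustrative and not part of the answer.

```php
<?php
$row = '"apple","NULL","0","0","0",",","1",",","fruit","red","sweet","D$","object"';
$pattern = '~(?:"NULL","0","0","0",",","1",",","|(?!^)\G)[^"]+\K","(?!D\$)~';

// Each reported match is only the "," separator, because \K discards
// everything the pattern consumed before it.
preg_match_all($pattern, $row, $found, PREG_OFFSET_CAPTURE);
foreach ($found[0] as $m) {
    echo 'offset ', $m[1], ': ', $m[0], PHP_EOL;
}

// Replacing those separators with a plain comma merges the cells.
$result = preg_replace($pattern, ',', $row);
echo $result, PHP_EOL;
```

Only the separators after fruit and red are matched; the one after sweet is rejected by (?!D\$), which ends the chain.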

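Is it intentional that you break the csv here: "1","fruit  or is it a typo?
That was a typo, corrected it! Thanks for the quick response and explanation.
@Sam: \K is also useful for emulating a variable-length lookbehind.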
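As a small illustration of that last comment: a PCRE lookbehind cannot contain an unbounded quantifier such as \s*, but matching the prefix normally and then resetting the match start with \K gives a similar effect. The input string and the key name color are made up for this sketch.

```php
<?php
// Hypothetical input: replace the value after "color =" whatever the spacing.
$line = 'color   =   red';

// \s* cannot go inside (?<=...), so the prefix is matched normally
// and \K drops it from the reported match; only the value is replaced.
$updated = preg_replace('~color\s*=\s*\K\w+~', 'blue', $line);

echo $updated, PHP_EOL; // color   =   blue
```

Unlike a true lookbehind, the text before \K is still consumed, so it cannot overlap with a previous match.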