JavaScript regular expressions and capture groups


I'm new to regular expressions in JavaScript and am having trouble getting an array of matches from a text string like this:

Sentence would go here
-foo
-bar
Another sentence would go here
-baz
-bat

I would like to get an array of matches like this:

match[0] = [
    'foo',
    'bar'
]
match[1] = [
    'baz',
    'bat'
]

In summary, what I'm looking for is:

any dash + word (-foo, -bar, etc.) that comes after a sentence.

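Can anyone provide a pattern that captures every iteration instead of only the last one? A repeated capture group only keeps its last iteration. Forgive me if this is a dumb question. I'm using regex101, if anyone wants to send me some tests.

Regexp captures do not handle an unbounded number of groups well. Splitting works better here: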

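var text = document.getElementById('text').textContent;
// Split into blocks at the start of every line that does not begin with a dash.
var blocks = text.split(/^(?!-)/m);
var result = blocks.map(function (block) {
    // Split each block on its leading dashes and drop the sentence itself.
    return block.split(/^-/m).slice(1).map(function (line) {
        return line.trim();
    });
});
document.getElementById('text').textContent = JSON.stringify(result);

Sentence would go here
-foo
-bar
Another sentence would go here
-baz
-bat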
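
The splitting idea can also be sketched without the DOM, which makes it easier to try in a console. The function name `groupDashWords` and the sample string are illustrative assumptions, and the lookahead `(?!-)` is an assumed form of the block-splitting pattern. Empty groups are filtered out so that a blank leading line does not produce an empty first array.

```javascript
// Hypothetical helper: groups "-word" lines under the sentence that precedes them.
function groupDashWords(text) {
  return text
    // Split before every line that does not start with a dash.
    .split(/^(?!-)/m)
    .map(function (block) {
      // Split on the dash that starts each line and drop the sentence itself.
      return block.split(/^-/m).slice(1).map(function (line) {
        return line.trim();
      });
    })
    // Drop blocks that contained no dash words (for example, a blank first line).
    .filter(function (group) {
      return group.length > 0;
    });
}

var sample = [
  'Sentence would go here', '-foo', '-bar',
  'Another sentence would go here', '-baz', '-bat'
].join('\n');

console.log(JSON.stringify(groupDashWords(sample)));
```

Because both patterns are anchored with `^` under the `m` flag, a dash inside a sentence (as in "well-known") should not start a new entry.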
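The first regex I came up with is the following:
/([^-]+)(-\w+)/g

The first group, ([^-]+), grabs everything that is not a dash. It is followed by the capture group we actually want, (-\w+). The g flag makes the regex object keep track of the last position it looked at, so each call to regex.exec(search) returns the next match, the same matches you see in regex101.

Note: JavaScript's \w is equivalent to [a-zA-Z0-9_]. So if you only want letters, use [a-zA-Z] in place of \w.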
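
As a rough illustration of how the g flag drives repeated exec calls, the sketch below records each dash word together with the regex's `lastIndex` after the match. The variable names and the shortened sample string are assumptions made for this example only.

```javascript
// Sketch: a global regex remembers its position in lastIndex between exec() calls.
var regex = /([^-]+)(-\w+)/g;
var search = 'Sentence would go here\n-foo\n-bar';
var found = [];
var match;

while ((match = regex.exec(search)) !== null) {
  // match[2] is the dash word; regex.lastIndex now points just past it.
  found.push({ word: match[2], lastIndex: regex.lastIndex });
}

// After the final failed exec(), lastIndex is reset to 0.
console.log(JSON.stringify(found), regex.lastIndex);
```

Without the g flag, `lastIndex` is not advanced, so exec would return the first match every time and this loop would never terminate.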


Here is the code that implements this regex:
<p id = "input">
    Sentence would go here
    -foo
    -bar
    Another sentence would go here
    -baz
    -bat
</p>

<p id = "output">

</p>

<script>
    // Returns true when the text contains a word character, i.e. the match starts a new sentence.
    function check_for_word(search) {return search.split(/\w/).length > 1}
    function capture(regex, search) {
        var 
        // The initial match.
            match  = regex.exec(search),
        // Stores all of the results from the search.
            result = [],
        // Used to gather results.
            gather;
        while(match) {
            // Create something empty.
            gather = [];
            // Push onto the gather.
            gather.push(match[2]);
            // Get the next match.
            match = regex.exec(search);
            // While we have more dashes...
            while(match && !check_for_word(match[1])) {
                // Push result on!
                gather.push(match[2]);
                // Get the next match to be checked.
                match = regex.exec(search);
            };
            // Push what was gathered onto the result.
            result.push(gather);
        }
        // Hand back the result.
        return result;
    };
    var output = capture(/([^-]+)(-\w+)/g, document.getElementById("input").innerHTML);
    document.getElementById("output").innerHTML = JSON.stringify(output);
</script>

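In the second regex, /[^-]+((?:-\w+[^-\w]*)+)/g, the extra [^-\w]* allows some separation between each dash word. The non-capturing group (?:) is then added so that + can match one or more dash words. We also don't need the () around [^-]+ because, as you will see below, that data is no longer needed. The first regex is more flexible about the separation between dash words, but I find this one much cleaner.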

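function capture(regex, search) {
    var
    // The initial match.
        match  = regex.exec(search),
    // Stores all of the results from the search.
        result = [],
    // Used to gather results.
        gather;
    while (match) {
        // Create something empty.
        gather = [];
        // Break up the large match.
        var temp = match[1].split('-');
        for (var i = 0; i < temp.length; i++) {
            // Strip every non-word character (whitespace, punctuation).
            temp[i] = temp[i].replace(/\W+/g, "");
            // Make sure there is actually something to gather.
            if (temp[i].length > 0)
                gather.push("-" + temp[i]);
        }
        // Push what was gathered onto the result.
        result.push(gather);
        // Get the next match.
        match = regex.exec(search);
    }
    // Hand back the result.
    return result;
}
var output = capture(/[^-]+((?:-\w+[^-\w]*)+)/g, document.getElementById("input").innerHTML);
document.getElementById("output").innerHTML = JSON.stringify(output);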

Sentence would go here
-foo
-bar
Another sentence would go here
-baz
-bat
My own sentence!
-get
-all
-of
-these!
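
For comparison, the second pattern can be paired with `String.prototype.matchAll` so that no manual exec loop is needed. This is only a sketch, assuming an environment that supports `matchAll` (ES2020); the helper name `captureGroups` and the sample string are illustrative and not part of the answer above.

```javascript
// Hypothetical helper built on the second pattern: one match per sentence,
// with every dash word that follows it held together in capture group 1.
function captureGroups(search) {
  var pattern = /[^-]+((?:-\w+[^-\w]*)+)/g;
  return Array.from(search.matchAll(pattern), function (match) {
    // Pull the individual dash words out of the combined capture.
    return match[1].match(/-\w+/g);
  });
}

var input = [
  'Sentence would go here', '-foo', '-bar',
  'Another sentence would go here', '-baz', '-bat'
].join('\n');

console.log(JSON.stringify(captureGroups(input)));
```

The outer group is what sidesteps the "only the last iteration" limitation: the repeated non-capturing group matches every dash word, and the single capturing group around it keeps the whole run as one string that can then be split or re-matched.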

