PHP regex to preserve words joined by hyphens and underscores while keeping punctuation

php regex

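I have been reading, searching, and trying different ways of writing the regex, such as \p{L}, [a-z], and \w, but I can't seem to get the result I want.

The problem: I have an array of complete sentences with punctuation, and I parse each one with the following preg_match_all call, which works well at keeping words and punctuation: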

preg_match_all('/(\w+|[.;?!,:])/', $match, $matches)

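However, I now have words like these:

Word-another-word
more_words_like_these

I would like to retain the integrity of these words as they are (connected), but my current preg_match_all call breaks them down into individual words.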

I have tried a few patterns that I found elsewhere, but none of them produces this desired result:

Array ( [0] A, [1] word, [2] like_this, [3] connected, [4] ; ,[5] with-relevant-punctuation)

Ideally, I would also be able to account for special characters, since some of these words may have accents.

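Simply add the hyphen to a character class together with \w, so that a hyphen counts as part of a word. Note that the hyphen needs to appear at the beginning or end of the character class (or be escaped); otherwise it is treated as a range operator. Placing the hyphen in the punctuation class instead, as in (\w+|[-.;?!,:]), would match each hyphen as a separate token and still split the word.

([\w-]+|[.;?!,:])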

Example

Sample text

However, I now have words like these:

Word-another-word
more_words_like_these

and I would like to be able to retain the integrity of these words as they are (connected) but my current preg_match breaks them down into individual words.

Sample matches

The other words are captured as before, but the hyphenated words are now captured whole as well.

Omitted Match 1-9 for brevity 

MATCH 10
1.  [39-56] `Word-another-word`

MATCH 11
1.  [57-78] `more_words_like_these`

Omitted Match 12+ for brevity
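
To check the matches listed above from PHP, here is a minimal sketch. The variable names ($text, $joined, $split) are made up for illustration, and the second pattern is included only to contrast what happens when the hyphen sits in the punctuation class rather than the word class.

```php
<?php
// Sketch only: $text reuses the sample text; variable names are illustrative.
$text = "However, I now have words like these:\n\nWord-another-word\nmore_words_like_these";

// Hyphen inside the word class: a run of word characters and hyphens is one token.
preg_match_all('/[\w-]+|[.;?!,:]/', $text, $joined);

// Hyphen inside the punctuation class: each hyphen becomes its own token.
preg_match_all('/\w+|[-.;?!,:]/', $text, $split);

print_r(array_slice($joined[0], 9, 2)); // Word-another-word, more_words_like_these
print_r(array_slice($split[0], 9, 5));  // Word, -, another, -, word
```

Underscores need no special handling because \w already includes them.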

Explanation

Have you tried

[\w.;?!,:]+

? Is the input something like

A word like_this connected; with-relevant-punctuation

? ... Or even

\S+

would do it:

preg_match_all("/(\S+)/", $match, $matches)
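
One caveat with the \S+ suggestion is worth seeing in code: it splits on whitespace only, so punctuation stays attached to the preceding word instead of becoming its own element as in the desired output. A small sketch, with an assumed sample sentence and illustrative variable names:

```php
<?php
// Sketch only: the sentence is adapted from the desired output in the question.
$sentence = 'A word like_this connected; with-relevant-punctuation';

// \S+ grabs every run of non-whitespace, so the semicolon stays glued to "connected".
preg_match_all('/\S+/', $sentence, $bySpace);

// Word class with hyphen, or one punctuation mark: the semicolon is its own token.
preg_match_all('/[\w-]+|[.;?!,:]/', $sentence, $byToken);

print_r($bySpace[0]); // A, word, like_this, connected;, with-relevant-punctuation
print_r($byToken[0]); // A, word, like_this, connected, ;, with-relevant-punctuation
```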

Thank you for your answer and for taking the time to explain it; very useful. It worked.

Explanation of the pattern ([\w-]+|[.;?!,:]):

NODE                     EXPLANATION
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [\w-]+                   any character of: word characters (a-z,
                             A-Z, 0-9, _), '-' (1 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    [.;?!,:]                 any character of: '.', ';', '?', '!',
                             ',', ':'
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
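
The question also asks about accented words, which the pattern above does not address directly. One possible approach, sketched here under the assumption that the input is valid UTF-8, is to add the u modifier and spell out the word class with Unicode properties (\p{L} for letters, \p{N} for digits). The sample sentence and variable names are invented for illustration.

```php
<?php
// Sketch only: assumes UTF-8 input; the sample words are made up.
$text = 'Un café-crème bien_sûr; très-élégant!';

// The u modifier makes PCRE treat pattern and subject as UTF-8.
// \p{L} = any letter, \p{N} = any digit; '_' and '-' join word parts.
preg_match_all('/[\p{L}\p{N}_-]+|[.;?!,:]/u', $text, $tokens);

print_r($tokens[0]); // Un, café-crème, bien_sûr, ;, très-élégant, !
```

Without the u modifier, \w is matched byte by byte and its treatment of non-ASCII letters depends on the locale, so the explicit \p{L} form tends to be more predictable.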