Regex: How to select words with a specific number of characters using a regular expression

I have the following text:

Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum 
has been the industry's standard dummy text ever since the fivec harword 1500s, when an unknown printer 
took a galley of type and scrambled it to make a type specimen fivec harword book. It has survived not
only five centuries, but also the leap into electronic typesetting, remaining essentially 
unchanged. It was popularised in the 1960s with the release of fivec harword Letraset sheets containing 
Lorem Ipsum passages, and more recently with desktop publishing software like Aldus 
PageMaker including versions of Lorem Ipsum.
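Here is what I need the regular expression to do:

1 - Select a five-character word

2 - Select a single space after step 1

3 - Select a seven-character word after step 2


It should capture every occurrence of the string
fivec harword
in the text. How can I do this?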

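This should do it:

(^|\W)\w{5}\s\w{7}($|\W)
(^|\W)
Start of the string, or a non-word character

\w{5}
A run of exactly 5 word characters

\s
A whitespace character

\w{7}
A run of exactly 7 word characters

($|\W)
End of the string, or a non-word character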

If you specifically want whitespace around the string (rather than punctuation and the like), replace
\W
with
\s
and that will do it.
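
As a rough illustration (not part of the original answer), here is how that pattern might be tried in Python. The sample text, the variable names, and the extra capturing group around the two words are assumptions made for this sketch. The group is there so that `findall` returns only the two words and not the delimiter characters on either side.

```python
import re

# Made-up sample: the pair appears at the start, before a comma, and at the end.
text = "fivec harword at the start, then a fivec harword, and last fivec harword"

# The answer's pattern, with non-capturing delimiters and one capturing group
# around the two words (the grouping is an addition for this sketch).
any_delimiter = re.compile(r"(?:^|\W)(\w{5}\s\w{7})(?:$|\W)")

# Variant from the answer's closing note: \W swapped for \s, so only
# whitespace (or the string edges) may surround the pair.
space_delimiter = re.compile(r"(?:^|\s)(\w{5}\s\w{7})(?:$|\s)")

pairs_any = any_delimiter.findall(text)
pairs_space = space_delimiter.findall(text)

print(pairs_any)    # all three pairs
print(pairs_space)  # the pair followed by a comma is skipped
```

One thing to be aware of: this pattern consumes the delimiter on each side, so if two pairs are separated by just one space, the second pair has no delimiter left in front of it and is not reported.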

Use this:

\b\w{5}\s\w{7}\b
Explanation:

The regular expression:

(?-imsx:\b\w{5}\s\w{7}\b)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
----------------------------------------------------------------------
  \w{5}                    word characters (a-z, A-Z, 0-9, _) (5
                           times)
----------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
  \w{7}                    word characters (a-z, A-Z, 0-9, _) (7
                           times)
----------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
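
For anyone who wants to try the `\b` version, a minimal Python sketch follows. The excerpt is a shortened, slightly reworded piece of the question's text, and the variable names are made up for the example. Because `\b` is zero-width, it does not consume any surrounding character, so the pair is also found at the very start or end of a string and in back-to-back pairs.

```python
import re

# Shortened excerpt based on the question's text (wording trimmed for the example).
text = ("standard dummy text ever since the fivec harword 1500s, "
        "and a type specimen fivec harword book.")

pattern = re.compile(r"\b\w{5}\s\w{7}\b")

# Print each match together with the offset where it starts.
for m in pattern.finditer(text):
    print(m.start(), m.group())

matches = pattern.findall(text)

# \b is zero-width: nothing is needed before or after the pair,
# and adjacent pairs separated by one space are both found.
alone = pattern.findall("fivec harword")
adjacent = pattern.findall("fivec harword fivec harword")
```
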
Try this:

\b[a-zA-Z]{5}\s[a-zA-Z]{7}\b
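\b marks a word boundary

[a-zA-Z] matches any single letter of the alphabet

{5} repeats the preceding expression exactly 5 times

\s matches a single whitespace character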
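
The practical difference between a letters-only class and `\w` is that `\w` also accepts digits and underscores. A small Python comparison, using a made-up sample string and made-up variable names, might look like this:

```python
import re

# \w accepts letters, digits and underscore; [a-zA-Z] accepts ASCII letters only.
word_chars = re.compile(r"\b\w{5}\s\w{7}\b")
letters_only = re.compile(r"\b[a-zA-Z]{5}\s[a-zA-Z]{7}\b")

# Hypothetical sample with a digit-bearing pair and an underscore-bearing pair.
sample = "fivec harword and 1500s printer and a_b_c d_e_f_g"

with_word_chars = word_chars.findall(sample)
with_letters_only = letters_only.findall(sample)

print(with_word_chars)    # includes "1500s printer" and "a_b_c d_e_f_g"
print(with_letters_only)  # only the purely alphabetic pair
```

Which one is preferable depends on whether tokens such as `1500s` should count as five-character words.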

This will not match at the start or the end of the string.

Why doesn't this one work? I added a 5-character word to the end of the regex:
\W\w{5}\s\w{7}\s\w{5}\W
I tried it against the right text and it should find it. M42 is right; I have adjusted mine to accommodate the start and end of the string (while still allowing flexible whitespace matching if necessary).
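
To make the point in these comments concrete, here is a small Python sketch. Both the `strict` pattern (a bare `\W` on each side) and the three-word pattern spelled with lowercase `\w` for the words are assumptions about what was meant, not quotes from the thread. The capturing group and variable names are also additions for the example.

```python
import re

# Hypothetical pattern that demands a non-word character on both sides.
strict = re.compile(r"\W\w{5}\s\w{7}\W")

# The adjusted pattern from the answer, which also accepts the string edges.
flexible = re.compile(r"(^|\W)\w{5}\s\w{7}($|\W)")

at_edges = "fivec harword"       # nothing before or after the pair
in_middle = "a fivec harword."   # a delimiter on both sides

strict_edges = strict.search(at_edges)      # None: \W has no character to consume
flexible_edges = flexible.search(at_edges)  # matches thanks to ^ and $
strict_middle = strict.search(in_middle)    # matches, delimiters included

# Three-word variant (assumed spelling): 5, 7 and 5 word characters.
three_words = re.compile(r"\W(\w{5}\s\w{7}\s\w{5})\W")
excerpt = "ever since the fivec harword 1500s, when an unknown printer"
found = three_words.search(excerpt)

print(strict_edges)
print(flexible_edges.group())
print(strict_middle.group())
print(found.group(1))
```

Under these assumptions the three-word pattern only succeeds where the word after the pair is exactly five characters long, as in the `1500s` occurrence.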