Regular expression: greedy, but stop before a string match


I have some data that I want to convert into a table format.

Here is the input data:

1- This is the 1st line with a 
newline character
2- This is the 2nd line

Each entry can contain multiple newline characters.

Output:

<td>1- This the 1st line with 
a new line character</td>
<td>2- This is the 2nd line</td>


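I tried the following:

^(\d{1,3}-[^\d]*

But it seems to match only up to the digit 1 in "1st".

I would like the match to stop once it finds another \d{1,3}\- in the string. Any suggestions?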

Edit:

I am using EditPad Lite.

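You did not specify a language (there are many regexp implementations), but in general what you are looking for is a "positive lookahead". It lets you add a pattern that affects where the match ends without becoming part of the match.

Search for "lookahead" in the documentation of whatever language you are using.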
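As a rough illustration of the idea in one concrete engine, here is a minimal Python sketch. It assumes Unix line endings and that every entry starts at the beginning of a line with 1 to 3 digits and a dash; the names ENTRY and wrap_entries are made up for this example and do not come from the question.

```python
import re

# Entry start: 1 to 3 digits and a dash at the start of a line.
# .*? is lazy and, with re.DOTALL, also crosses newlines.
# The lookahead (?=...) stops the match just before the next entry
# or at the very end of the string, without consuming anything.
ENTRY = re.compile(r"^\d{1,3}-.*?(?=\n\d{1,3}-|\Z)", re.MULTILINE | re.DOTALL)


def wrap_entries(text: str) -> str:
    """Wrap every numbered entry in <td>...</td> (hypothetical helper)."""
    return ENTRY.sub(r"<td>\g<0></td>", text)


sample = "1- This is the 1st line with a \nnewline character\n2- This is the 2nd line"
print(wrap_entries(sample))
```

Because the lookahead is zero-width, the newline before "2-" stays outside the first cell and the next match can start cleanly on the following line.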

Edit: the following example seems to work in vim:

:%s#\v(^\d+-\_.{-})\ze(\n\d+-|%$)#<td>\1</td>


Annotated:

%      - for all lines
s#     - substitute the following (you can use any delimiter, and slash is most
         common, but as that will require that we escape slashes in the command
         I chose to use the number sign)
\v     - very magic mode, lets us use fewer backslashes
(      - start group for back referencing
^      - start of line
\d+    - one or more digits (as many as possible)
-      - a literal dash!
\_.    - any character, including a newline
{-}    - zero or more of these (as few as possible)
)      - end group
\ze    - end match (anything beyond this point will not be included in the match)
(      - start a new group
[\n\r] - newline (in any format - thanks Alan)
\d+    - one or more digits
-      - a dash
|      - or
%$     - end of file
)      - end group
#      - start substitute string
<td>\1</td> - a TD tag around the first matched group


This one is for vim, using a zero-width positive lookahead:

/^\d\{1,3\}-\_.*[\r\n]\(\d\{1,3\}-\)\@=

Step by step:

/^\d\{1,3\}-              1 to 3 digits followed by -
\_.*                      any number of characters including newlines/linefeeds
[\r\n]\(\d\{1,3\}-\)\@=   followed by a newline/linefeed ONLY if it is followed 
                          by 1 to 3 digits followed by - (the first condition)

Edit: in PCRE/Ruby it looks like this:

/(\d{1,3}-.*?[\r\n])(?=(?:\d{1,3}-)|\Z)/m

Note: the string must end with a newline for the last entry to match.
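The trailing-newline caveat is easy to see by running the same pattern in Python. This is only a sketch: Ruby's /m flag (dot matches newline) is assumed to correspond to re.DOTALL here, Python's \Z matches only at the very end of the string, and the sample strings are made up.

```python
import re

# Same pattern as above; Ruby's /m (dot matches newline) becomes re.DOTALL.
ENTRY = re.compile(r"(\d{1,3}-.*?[\r\n])(?=(?:\d{1,3}-)|\Z)", re.DOTALL)

without_newline = "1- first\nentry\n2- second"
with_newline = without_newline + "\n"

# The pattern requires a line break at the end of every entry, so the
# last entry is only found when the input ends with one.
print(len(ENTRY.findall(without_newline)))
print(len(ENTRY.findall(with_newline)))
```

Note that each match includes its closing line break, so a replacement built on it would place the closing tag after that line break.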

You could match only the delimiters and split on them. For example, in C# it can be done like this:

string s = "1- This is the 1st line with a \r\nnewline character\r\n2- This is the 2nd line";
string ss = "<td>" + string.Join("</td>\r\n<td>", Regex.Split(s.Substring(3), "\r\n\\d{1,3}- ")) + "</td>";
MessageBox.Show(ss);
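For comparison, the same split-and-join idea can be sketched in Python. This is a variation, not a port: the C# snippet above consumes the entry numbers as part of the delimiter, whereas this sketch uses a lookahead so the numbers stay in the cells. The function name to_cells is hypothetical, and DOS line endings are assumed as in the C# sample.

```python
import re


def to_cells(text: str) -> str:
    """Split before each entry number and wrap the pieces in <td> tags."""
    # Only the line break is consumed; the lookahead leaves "2- " in place
    # so it becomes the start of the next piece.
    parts = re.split(r"\r\n(?=\d{1,3}- )", text)
    return "<td>" + "</td>\r\n<td>".join(parts) + "</td>"


s = "1- This is the 1st line with a \r\nnewline character\r\n2- This is the 2nd line"
print(to_cells(s))
```

Line breaks inside an entry are untouched because they are not followed by a number, a dash and a space.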


Would doing it in three steps work for you?

(These are Perl regular expressions.)

Replace the first one:

$input =~ s/^(\d{1,3})/<td>$1/;


Replace the rest:

$input =~ s/\n(\d{1,3})/<\/td>\n<td>$1/gm;


Add the last one:

$input .= '</td>';
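The three steps translate fairly directly to other languages. Below is a minimal Python sketch of the same sequence; the function name wrap_in_three_steps is invented for illustration, and Unix line endings are assumed. As in the Perl version, any continuation line that happens to start with a digit would be treated as a new entry, since the pattern does not check for the dash.

```python
import re


def wrap_in_three_steps(text: str) -> str:
    # Step 1: open a cell before the first entry number (start of string only).
    text = re.sub(r"^(\d{1,3})", r"<td>\1", text, count=1)
    # Step 2: close the previous cell and open a new one before each later number.
    text = re.sub(r"\n(\d{1,3})", r"</td>\n<td>\1", text)
    # Step 3: close the last cell.
    return text + "</td>"


sample = "1- This is the 1st line with a \nnewline character\n2- This is the 2nd line"
print(wrap_in_three_steps(sample))
```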


SEARCH:   ^\d+-.*(?:[\r\n]++(?!\d+-).*)*

REPLACE:  <td>$0</td>

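[\r\n]+ matches one or more carriage returns or linefeeds, so you don't have to worry about whether the file uses Unix (\n), DOS (\r\n), or old Mac (\r) line separators.

(?!\d+-) asserts that the first thing after the line separator is not another line number.

I used a possessive quantifier in [\r\n]++ to make sure it matches the whole separator. Otherwise, if the separator is \r\n, [\r\n]+ could back off to match only the \r, and (?!\d+-) would then succeed at the \n.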

Tested in EditPad Pro, but it should work in Lite as well.
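The effect of the possessive quantifier can be reproduced outside EditPad. The following Python sketch makes two substitutions that are not part of the answer: Python's re module only supports ++ from version 3.11, so the possessive [\r\n]++ is emulated with a capturing lookahead plus a backreference, and [^\r\n]* stands in for .* because Python's dot also matches \r. The sample text is made up.

```python
import re

# (?=([\r\n]+))\1 emulates the possessive [\r\n]++ : the lookahead grabs the
# whole separator once and the backreference consumes it without backtracking.
ENTRY = re.compile(
    r"^\d+-[^\r\n]*(?:(?=([\r\n]+))\1(?!\d+-)[^\r\n]*)*",
    re.MULTILINE,
)

# Same pattern with a plain greedy [\r\n]+, for comparison.
NAIVE = re.compile(r"^\d+-[^\r\n]*(?:[\r\n]+(?!\d+-)[^\r\n]*)*", re.MULTILINE)

sample = "1- first\r\nentry\r\n2- second"
fixed = ENTRY.sub(r"<td>\g<0></td>", sample)
naive = NAIVE.sub(r"<td>\g<0></td>", sample)

print(repr(fixed))
print(repr(naive))
```

With the greedy version the first match swallows the \r of the final separator, so the closing tag ends up between \r and \n, which is the failure the answer describes.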

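I tried this, but it seems to match everything except the last line. So if there are 10 lines {1- bla bla, 2- bla bla, ..., 9- bla bla, 10- bla bla}, it matches all of them except the 10th.

By the way, I would suggest an approach like Eugene's here, unless this is just for learning regexes.

This matches the entire text! :) I am using EditPad Lite, if that is of any help.

I tried it before posting, with your own sample. It works if you are using a PCRE engine.

I tried my example with your expression and it did the same thing. Is regexpal PCRE? I'll try it in Java and let you know.

@Nerrve, I have edited my answer to include an example that works with vim, which you can easily download. I hope this helps.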

