Python: how to split a string by identifying a pattern and then splitting on a sub-part of that same pattern

python regex

I'm still getting used to the re library, and here is the problem:

For example, I have this string:

x = '501086110 - Werfen Portugal, Lda | 501524177 - Biomérieux Portugal, Lda | 503387398 - ALFAGENE,LDA. | 503842770 - VWR | 504282921 - Roche - Sistemas de Diagnóstico | 507699321 - B|Nice Juices'

I want to split it on "|", but only when the "|" is preceded by a letter or special character and followed by a digit.

I am doing:

re.split(pattern= '\w\s\W\s\d|\W\s\W\s\d', string = x)

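For this example it does roughly what I want, but it removes the last character of the first split and the first character of the second split.

Can you suggest a better way to achieve this? Ideally, I would get the following as output: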

["501086110 - Werfen Portugal, Lda",  "501524177 - Biomérieux Portugal, Lda","503387398 - ALFAGENE,LDA.","503842770 - VWR","504282921 - Roche - Sistemas de Diagnóstico", "507699321 - B|Nice Juices"]
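The behaviour described in the question can be reproduced with a short sketch. It uses the asker's own string and pattern; the variable name consuming is only illustrative. The point it tries to show is that every \w, \W, \s and \d in that pattern is part of the match, so re.split throws those characters away together with the "|".

```python
import re

x = ('501086110 - Werfen Portugal, Lda | 501524177 - Biomérieux Portugal, Lda | '
     '503387398 - ALFAGENE,LDA. | 503842770 - VWR | '
     '504282921 - Roche - Sistemas de Diagnóstico | 507699321 - B|Nice Juices')

# The pattern from the question: the character before the pipe and the
# digit after it are matched (consumed), so split() discards them.
consuming = re.split(r'\w\s\W\s\d|\W\s\W\s\d', x)

print(consuming[0])   # 501086110 - Werfen Portugal, Ld
print(consuming[1])   # 01524177 - Biomérieux Portugal, Ld
print(consuming[-1])  # 07699321 - B|Nice Juices
```

The last piece still contains "B|Nice" intact, because that "|" has no whitespace around it and is not followed by a digit.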

You can use this regex with lookarounds in split:

(?<=.)\s*\|\s*(?=\d)
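To make the logic of the pattern easier to follow, here is the same regex written with re.VERBOSE so that each part carries a comment. This is only a sketch: the sample string is a shortened, made-up input, not the asker's data.

```python
import re

# Same pattern as above, spread out with re.VERBOSE for readability.
pattern = re.compile(r"""
    (?<=.)   # lookbehind: some character must precede the match (not consumed)
    \s*      # optional whitespace before the pipe (consumed)
    \|       # a literal pipe character (consumed)
    \s*      # optional whitespace after the pipe (consumed)
    (?=\d)   # lookahead: a digit must come next (not consumed)
""", re.VERBOSE)

# Hypothetical sample input for illustration.
sample = "111 - Foo, Lda | 222 - B|Nice Juices"
parts = pattern.split(sample)
print(parts)  # ['111 - Foo, Lda', '222 - B|Nice Juices']
```

Because the lookbehind and lookahead only assert what is around the match without being part of it, split removes just the pipe and its surrounding whitespace. The "|" in "B|Nice" is followed by a letter, so the lookahead fails there and that pipe is kept.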

Maybe you just need x.split(' | ')? If you only need to split on space + | + space, then no regex is needed.

Thanks for the reply. That was my first attempt, but I also have annoying cases like "507699321 - B|Nice Juices" that break that solution.

But that case is fine: there are no spaces around the | in 507699321 - B|Nice Juices.

@anubhava that is godlike, it worked! Can you teach me the logic behind it?

@anubhava thank you so much! Big help!
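The trade-off raised in the comments can be sketched as follows. Both input strings are made up for illustration: a plain str.split(' | ') is enough when every delimiter is exactly space, pipe, space, while the lookaround regex also copes with missing or extra whitespace.

```python
import re

# Hypothetical inputs: one with regular " | " delimiters, one without.
regular = "111 - Foo | 222 - B|Nice Juices"
irregular = "111 - Foo|222 - Bar |  333 - B|Nice Juices"

pattern = re.compile(r"(?<=.)\s*\|\s*(?=\d)")

plain_regular = regular.split(" | ")
plain_irregular = irregular.split(" | ")
regex_irregular = pattern.split(irregular)

print(plain_regular)    # ['111 - Foo', '222 - B|Nice Juices']
print(plain_irregular)  # ['111 - Foo|222 - Bar', ' 333 - B|Nice Juices']
print(regex_irregular)  # ['111 - Foo', '222 - Bar', '333 - B|Nice Juices']
```

One caveat applies to the regex as well: it would also split inside a name if a "|" there happened to be followed by a digit.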

>>> import re
>>> x = '501086110 - Werfen Portugal, Lda | 501524177 - Biomérieux Portugal, Lda | 503387398 - ALFAGENE,LDA. | 503842770 - VWR | 504282921 - Roche - Sistemas de Diagnóstico | 507699321 - B|Nice Juices'
>>> reg = re.compile(r'(?<=.)\s*\|\s*(?=\d)')
>>> arr = reg.split(x)
>>> print("\n".join(arr))
501086110 - Werfen Portugal, Lda
501524177 - Biomérieux Portugal, Lda
503387398 - ALFAGENE,LDA.
503842770 - VWR
504282921 - Roche - Sistemas de Diagnóstico
507699321 - B|Nice Juices