Python: How do I split a string when there is a positive lookahead and a positive lookbehind, but no delimiter?

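For example:

s = "Thisissometext andthisissometext"

I want to split the text between "is" and "some".

If I do this:

re.split("(?<=is)s(?=ome)", s)
-->    ['Thisis', 'ometext andthisis', 'ometext']
re.split((?

You need the newer regex module, which supports splitting on empty matches: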

import regex as re

s = "Thisissometext andthisissometext"

print(re.split(r"(?V1)(?<=is)(?=some)", s))
# ['Thisis', 'sometext andthisis', 'sometext']
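
On Python 3.7 and later, the standard-library re module also splits on empty matches, so the same lookbehind/lookahead pair may work without the third-party regex package. A minimal sketch, assuming a 3.7+ interpreter (the variable name parts is only illustrative):

```python
import re

s = "Thisissometext andthisissometext"

# Zero-width split point: preceded by "is" and followed by "some".
# Assumes Python 3.7+, where re.split accepts patterns that can match an empty string.
parts = re.split(r"(?<=is)(?=some)", s)
print(parts)
# ['Thisis', 'sometext andthisis', 'sometext']
```

On older interpreters the regex module with (?V1) remains the way to get this behaviour.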

Rather than split, here is a regex that you can use with re.findall to do the job:

>>> import re
>>> s = "Thisissometext andthisissometext"
>>> print(re.findall(r'[\w\s]+?(?:is(?=some)|$)', s))
['Thisis', 'sometext andthisis', 'sometext']

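Regex breakdown:

  • [\w\s]+? : matches one or more word or whitespace characters (non-greedy)
  • (?: : starts a non-capturing group
    • is : matches the literal is
    • (?=some) : which must be followed by some
    • | : or
    • $ : the end of the string
  • ) : ends the non-capturing group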
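
The breakdown above can also be written directly into the pattern with re.VERBOSE, which ignores whitespace and allows comments inside the regex. This is only a sketch of the same pattern; the names pattern and pieces are illustrative:

```python
import re

s = "Thisissometext andthisissometext"

pattern = re.compile(
    r"""
    [\w\s]+?        # one or more word or whitespace characters, non-greedy
    (?:             # start of the non-capturing group
        is(?=some)  # the literal "is", only when followed by "some"
        |           # or
        $           # the end of the string
    )               # end of the non-capturing group
    """,
    re.VERBOSE,
)

pieces = pattern.findall(s)
print(pieces)
# ['Thisis', 'sometext andthisis', 'sometext']
```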

A simple and fast approach, if you know of a character that does not occur in the text ('@' here):

s.replace('issome','is@some').split('@')
# ['Thisis', 'sometext andthisis', 'sometext']
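
Because the trick silently gives wrong pieces when the placeholder character does occur in the input, it may be worth guarding against that. A minimal sketch, where split_between and its parameters are hypothetical names introduced here for illustration, not part of the original answer:

```python
def split_between(text, left, right, sentinel="@"):
    """Split text wherever `left` is immediately followed by `right`."""
    # The approach is only valid if the sentinel appears nowhere in the inputs.
    if sentinel in text or sentinel in left + right:
        raise ValueError(f"sentinel {sentinel!r} already occurs in the input")
    # Mark each boundary with the sentinel, then split on it.
    return text.replace(left + right, left + sentinel + right).split(sentinel)


s = "Thisissometext andthisissometext"
print(split_between(s, "is", "some"))
# ['Thisis', 'sometext andthisis', 'sometext']
```

Note that str.replace only handles non-overlapping occurrences, so for inputs where boundaries overlap the result can differ from the lookaround-based split.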
Timing:

In [300]: %timeit s.replace('issome','is@some').split('@')
976 ns ± 21.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [301]: %timeit regex.split(r"(?V1)(?<=is)(?=some)", s)
7.36 µs ± 145 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [302]: %timeit re.findall(r'[\w\s]+?(?:is(?=some)|$)', s)
4.28 µs ± 97.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
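
To rerun the comparison outside IPython, something like the following sketch with the standard timeit module could be used. It assumes Python 3.7+ and substitutes the standard-library re.split for regex.split, so the figures will not match the ones above; candidates, expected, and the loop count are illustrative choices:

```python
import re
import timeit

s = "Thisissometext andthisissometext"

# The three approaches from the thread, wrapped as zero-argument callables.
candidates = {
    "replace + split": lambda: s.replace("issome", "is@some").split("@"),
    "re.split": lambda: re.split(r"(?<=is)(?=some)", s),
    "re.findall": lambda: re.findall(r"[\w\s]+?(?:is(?=some)|$)", s),
}

expected = ["Thisis", "sometext andthisis", "sometext"]
loops = 10_000

for name, func in candidates.items():
    # Check that every approach produces the same pieces before timing it.
    assert func() == expected
    seconds = timeit.timeit(func, number=loops)
    print(f"{name}: {seconds / loops * 1e6:.2f} microseconds per call")
```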

Hoi Jan, nice solution! I had never heard of "(?V1)". Wow.

@Reman: Glad to help. I've added another option at the bottom of the answer.

Thanks for your solution. Very nice, but sometimes I do need a regex to split my string. Also +1 for the timeit!

Alternatively, pass the version as a flag rather than inline (again with import regex as re):

print(re.split(r"(?<=is)(?=some)", s, flags=re.VERSION1))