Finding substrings in a specific order in Python
I have a long list of strings that contain substrings of interest in a given order, but here is a small example using sentences in a text file:
This is a long drawn out sentence needed to emphasize a topic I am trying to learn.
It is new idea for me and I need your help with it please!
Thank you so much in advance, I really appreciate it.
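From this text file, I want to find any sentence that contains both "I" and "need", but they must appear in that order.
So in this example, 'I' and 'need' both occur in sentences 1 and 2, but in sentence 1 they are in the wrong order, so I do not want it returned. I only want the second sentence returned, because it has "I need" in order.
I have used this example to identify the substrings, but I cannot figure out how to find them only in order: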
id1 = "I"
id2 = "need"
with open('fun.txt') as f:
    for line in f:
        if id1 and id2 in line:
            print(line[:-1])
This returns:
This is a long drawn out sentence needed to emphasize a topic I am trying to learn.
It is new idea for me and I need your help with it please!
But I only want:
It is new idea for me and I need your help with it please!
Thanks!
Just do this:
import re
match = re.match('pattern','yourString' )
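So the pattern you are looking for is 'I(.*)need'.
You may need to build your pattern differently, because I do not know whether there are exceptions. If there are, you can run the regex twice: once to get a subset of the original strings, and again to get the exact match you need.
You need to look for id2 in the part of the line that follows id1: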
infile = [
    "This is a long drawn out sentence needed to emphasize a topic I am trying to learn.",
    "It is new idea for me and I need your help with it please!",
    "Thank you so much in advance, I really appreciate it.",
]
id1 = "I"
id2 = "need"
for line in infile:
    if id1 in line:
        # position of the first occurrence of id1
        pos1 = line.index(id1)
        # look for id2 only in the text after id1
        if id2 in line[pos1 + len(id1):]:
            print(line)
Output:
It is new idea for me and I need your help with it please!
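The same idea extends to any number of substrings. Below is a minimal sketch, assuming a hypothetical helper named contains_in_order (it is not part of the answer above) that uses str.find with a start offset instead of slicing:

```python
def contains_in_order(line, parts):
    """Return True if every string in `parts` occurs in `line`, left to right."""
    pos = 0
    for part in parts:
        # search only from the end of the previous hit onward
        found = line.find(part, pos)
        if found == -1:
            return False
        pos = found + len(part)
    return True


sentences = [
    "This is a long drawn out sentence needed to emphasize a topic I am trying to learn.",
    "It is new idea for me and I need your help with it please!",
    "Thank you so much in advance, I really appreciate it.",
]
matches = [s for s in sentences if contains_in_order(s, ["I", "need"])]
print(matches)
```

Like the answer above, this matches raw substrings, so the "I" at the start of "It" counts as a hit for "I". If only whole words should count, a word-based check is needed instead.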
You can use a regular expression to check this. One possible solution is:
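import re

id1 = "I"
id2 = "need"
# id1, then anything, then id2, anywhere in the line
regex = re.compile(r'^.*{}.*{}.*$'.format(id1, id2))
with open('fun.txt') as f:
    for line in f:
        if re.search(regex, line):
            # strip only the trailing newline, if there is one
            print(line.rstrip('\n'))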
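If the search terms might contain regex metacharacters, or should only match as whole words, the pattern can be tightened. This is a sketch under those assumptions: ordered_word_pattern is a hypothetical helper name, and the word-boundary behaviour assumes the terms are plain words made of letters or digits.

```python
import re


def ordered_word_pattern(first, second):
    # re.escape neutralises regex metacharacters in the search terms;
    # \b limits each term to a whole word, so "need" does not match "needed"
    return re.compile(
        r"\b{}\b.*\b{}\b".format(re.escape(first), re.escape(second))
    )


pattern = ordered_word_pattern("I", "need")
sentences = [
    "This is a long drawn out sentence needed to emphasize a topic I am trying to learn.",
    "It is new idea for me and I need your help with it please!",
    "Thank you so much in advance, I really appreciate it.",
]
matches = [s for s in sentences if pattern.search(s)]
print(matches)
```

search is used rather than match because re.match only matches at the beginning of the string, while the two words can appear anywhere in the sentence.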
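You can define a function that computes the intersection of two sets (the words of each sentence and the words of I need) and uses sorted with a key that orders the result by where each word first appears in the sentence. That way you can check whether the order of the resulting list matches the order in I need: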
a = ['I', 'need']
l = ['This is a long drawn out sentence needed to emphasize a topic I am trying to learn.',
     'It is new idea for me and I need your help with it please!',
     'Thank you so much in advance, I really appreciate it.']
The custom function. It returns True if the strings appear in the same order:
def same_order(l1, l2):
    words = l2.split(' ')
    # common words, sorted by their first position in the sentence
    inters = sorted(set(l1) & set(words), key=words.index)
    return inters == l1
Keep the strings from list l for which it returns True:
[l[i] for i, j in enumerate(l) if same_order(a, j)]
#['It is new idea for me and I need your help with it please!']
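A related word-based check can be written without sets by consuming an iterator over the tokens. This is a sketch of an alternative technique, not the answer's method, and words_in_order is a hypothetical name:

```python
def words_in_order(words, sentence):
    """Return True if `words` appear as whole tokens in `sentence`, in order."""
    tokens = iter(sentence.split())
    # `word in tokens` consumes the iterator up to the first hit, so each
    # later word is only searched for among the tokens that follow it
    return all(word in tokens for word in words)


sentences = [
    "This is a long drawn out sentence needed to emphasize a topic I am trying to learn.",
    "It is new idea for me and I need your help with it please!",
    "Thank you so much in advance, I really appreciate it.",
]
print([s for s in sentences if words_in_order(["I", "need"], s)])
```

One difference from the set-based version: sets and index only see the first occurrence of each word, so a sentence such as "need I need" is rejected there but accepted here. Both versions compare whole tokens, so a word with trailing punctuation such as "need." is not treated as "need".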
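Check my answer here; the same applies to if id1 and id2 in line:.
This is a general use case, not a single pattern. If 'I' is the first word in the sentence and 'need' is the last, for example 'I have everything we need', I would still want that sentence returned.
Using a regex here is certainly a good idea, but your answer does not satisfy the question. The pattern I\sneed and the code re.compile('pattern','yourString') are both wrong. Please improve or delete it.
Nice! One improvement would be to use regex = re.compile(r'{}.*{}'.format(id1, id2)) and regex.search(line).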
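On the if id1 and id2 in line: condition from the question, which the first comment refers to: the expression parses as id1 and (id2 in line), so only id2 is actually tested. A small illustration, using a made-up sample line:

```python
line = "This sentence is needed."  # sample text, chosen for illustration
id1, id2 = "I", "need"

# `id1 and id2 in line` parses as `id1 and (id2 in line)`.
# A non-empty string is truthy, so only id2 is really checked.
loose = bool(id1 and id2 in line)

# Testing each substring explicitly gives the intended "both present" check.
strict = id1 in line and id2 in line

print(loose, strict)  # True False
```

This explains why the question's code also printed the first sentence: it contains "need" (inside "needed"), and the presence or position of "I" was never checked.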