How to search a file for specific from and to keywords and print the sentence in Python

python regex file search

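I am trying to take a file as input and search it for a special character. I take the from and to keywords as input. If the to keyword is on a following line, I should keep printing until the to keyword is found.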

for line in contents:
    if line.startswith("*CHI: "):
        line = line.strip("*")
        tokenize = line.split()
        p.token_filter(tokenize)

Suppose I have a file:

*CHI: (hi) new [/] friend [//] there [/-] [ bch] [ bch ] /[]/ [/=]
<new>
<there>.
%mod: hi there.
*CHI: <dude>
<there>
*CHI: &=sighs <and instead the> [//] and then the giraffe got it and gave it to the elephant .
*CHI: <he> [/] <he> [/] he hold it .
*CHI: then [/-] the doctor give the [/] money to the man
*CHI: and (i)s then (.) the little (.) giraffe is crying because it (i)s sinking

My goal is to print:

['new', '[/]', 'friend', '[/]', 'there', 'bch', '/[]/', 'new', 'there']

For arbitrary text, you can use a regular expression:

>>> import re
>>> text = "*foo* bar *foobar*"
>>> re.findall(r"\*[^/*]*\*", text)
['*foo*', '*foobar*']

To strip the asterisks, do the following:

>>> [s.replace("*", "") for s in re.findall(r"\*[^/*]*\*", text)]
['foo', 'foobar']
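The same result can be obtained in one step with a capturing group, which also makes it easier to see what each part of the pattern does. This is only a sketch on the same sample string; the name `pattern` is introduced here for illustration, and the character class is simplified to `[^*]` on the assumption that slashes do not need to be excluded.

```python
import re

text = "*foo* bar *foobar*"

# \*       a literal asterisk (escaped, because * is otherwise a quantifier)
# ([^*]*)  capture any run of characters that are not asterisks
# \*       the closing literal asterisk
pattern = re.compile(r"\*([^*]*)\*")

# With exactly one capturing group, findall returns only the captured text.
print(pattern.findall(text))  # ['foo', 'foobar']
```

Inside a character class the asterisk has no special meaning, so it needs no backslash there.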

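If you can read the file and convert it to a string, you can use:

string = "123123STRINGabcabc"

def find_between(string, first, last):
    # Return the text between the first occurrence of `first`
    # and the next occurrence of `last`, or "" if either is missing.
    try:
        start = string.index(first) + len(first)
        end = string.index(last, start)
        return string[start:end]
    except ValueError:
        return ""

print(find_between(string, "123", "abc"))

This gives:

123STRING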
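When the to keyword may sit on a later line, or when there are several matches, a regular expression with `re.DOTALL` can collect every span between the two keywords. The following is a minimal sketch rather than part of the answer above; `find_all_between` and the sample text are hypothetical names and data chosen for illustration.

```python
import re

def find_all_between(text, first, last):
    """Return every shortest span found between `first` and `last`."""
    # re.escape treats the keywords as literal text, not as regex syntax.
    # re.DOTALL lets "." match newlines, so a span may cross line breaks.
    pattern = re.escape(first) + r"(.*?)" + re.escape(last)
    return re.findall(pattern, text, flags=re.DOTALL)

# Hypothetical sample: the second span continues onto the next line.
sample = "say <new>\nand <there\nagain> now"
print(find_all_between(sample, "<", ">"))  # ['new', 'there\nagain']
```

For a real file, the text could first be read into one string, for example with `open(path, encoding="utf-8").read()`, and then passed to the function.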

But I am unable to print "new" and "there"...

Those lines do not start with the prefix your code checks for, so your first if statement returns False.

Do you want to print every line? I'm not sure what the goal is.

@SuperStew OP probably wants to grab everything between the two markers.

Is something like that what you want?

The regex answer is probably the best approach.

Just wrap [^/*]* in a capture group? By the way, there is no need to escape inside a character set. Your slash is also wrong.

Can someone explain just this part: ("\*[^/*]*\*", text)?
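As the comments note, `<new>` and `<there>.` sit on continuation lines that do not begin with `*CHI: `, so a per-line `startswith` test skips them. One possible approach, sketched here under the assumption that an utterance continues until the next line starting with `*` or `%`, is to merge continuation lines before tokenizing. The function name `chi_utterances` is hypothetical, and the `p.token_filter` step from the question is left out because its behavior is not shown.

```python
def chi_utterances(lines, speaker="*CHI:"):
    """Yield each speaker utterance with its continuation lines merged."""
    current = None  # parts of the utterance being collected, if any
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(("*", "%")):
            # A new tier begins, so the previous utterance is complete.
            if current is not None:
                yield " ".join(current)
            # Start collecting only when the tier belongs to the speaker.
            if line.startswith(speaker):
                current = [line[len(speaker):].strip()]
            else:
                current = None
        elif current is not None:
            # Assumed continuation line of the current utterance.
            current.append(line)
    if current is not None:
        yield " ".join(current)

# Shortened sample based on the file in the question.
sample = [
    "*CHI: (hi) new [/] friend",
    "<new>",
    "<there>.",
    "%mod: hi there.",
    "*CHI: <dude>",
    "<there>",
]
for utterance in chi_utterances(sample):
    print(utterance.split())
```

Each merged utterance can then be split into tokens and filtered as needed, for instance to keep only the tokens listed in the desired output.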