Python: removing specific words from a list

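I am trying to remove specific words from a list, such as <title> and </title>, which were found in a text file, and I also need to remove the words contained in another list, such as

words=[a,is,and,there,here]

My list lines consists of the following text:

lines=[<title>The query complexity of estimating weighted averages.</title>', '<title>New bounds for the query complexity of an algorithm that learns DFAs with correction and equivalence queries.</title>', '<title>A general procedure to check conjunctive query containment.</title>]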

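Please help me remove the words contained in the list, together with the <title> and </title> tags.

Assuming you start from the following (slightly corrected) input, you can do this with the re.sub function: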

>>> import re
>>> lines= ['<title>The query complexity of estimating weighted averages.</title>', '<title>New bounds for the query complexity of an algorithm that learns DFAs with correction and equivalence queries.</title>', '<title>A general procedure to check conjunctive query containment.</title>']
>>> words=['a','is','and','there','here']
>>> [re.sub(r'</?title>|\b(?:'+'|'.join(words)+r')\b', r'', line) for line in lines]
['The query complexity of estimating weighted averages.', 'New bounds for the query complexity of an algorithm that learns DFAs with correction  equivalence queries.', 'A general procedure to check conjunctive query containment.']
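The double space in "correction  equivalence" appears because the word "and" is deleted while the spaces on both sides of it stay. The following is a minimal sketch of one way to tidy that up, not part of the original answer: it assumes a hypothetical helper named remove_words, escapes each word with re.escape in case a word contains regex metacharacters, and collapses leftover whitespace.

```python
import re

def remove_words(lines, words):
    """Strip <title> tags and whole-word matches of `words` from each line.

    `remove_words` is a hypothetical helper name used for illustration.
    """
    # re.escape guards against words that contain regex metacharacters.
    alternation = "|".join(re.escape(w) for w in words)
    pattern = re.compile(r"</?title>|\b(?:" + alternation + r")\b")
    cleaned = []
    for line in lines:
        stripped = pattern.sub("", line)
        # Collapse runs of whitespace left behind by the removed words.
        cleaned.append(" ".join(stripped.split()))
    return cleaned

lines = [
    "<title>The query complexity of estimating weighted averages.</title>",
    "<title>New bounds for the query complexity of an algorithm that learns "
    "DFAs with correction and equivalence queries.</title>",
    "<title>A general procedure to check conjunctive query containment.</title>",
]
words = ["a", "is", "and", "there", "here"]
result = remove_words(lines, words)
print(result)
```

The match is still case-sensitive, so the leading "A" in the third title is kept, just as in the output above.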

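Putting \b before and after the group of words makes the pattern match whole words only. \b is called a word boundary: it matches the empty position between a word character and a non-word character (or the start or end of the string).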
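To see what the word boundary changes, here is a small illustrative comparison. The sample sentence is made up for this sketch and is not from the question.

```python
import re

text = "this is a cat and a hat"

# Without \b, "is" and "a" are also deleted inside longer words.
no_boundary = re.sub(r"(?:is|a)", "", text)

# With \b, only the standalone words "is" and "a" are deleted.
with_boundary = re.sub(r"\b(?:is|a)\b", "", text)

print(no_boundary.split())
print(with_boundary.split())
```

Without the boundaries, "this" and "cat" lose letters. With them, both words survive untouched.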

First of all, you should always post what you have tried so far.

Using only built-in functionality:

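# range(len(...)) visits every index, including the last one.
for i in range(len(lines)):
    for it in range(len(words)):
        # Note: str.replace removes every occurrence of the substring, even inside longer words.
        lines[i] = lines[i].replace(words[it], '')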

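The code, explained line by line:

For each item in the list lines, i is the index of the current line.

For each item in the list words, it is the index of the current word in words; every occurrence of that word found in the current item of lines is replaced with ''.

The current item in lines is overwritten with the result, while the current item in words is left unchanged.

Because str.replace matches substrings, a word such as 'a' is also removed from inside longer words.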
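As a hedged illustration of the substring behaviour of str.replace, and of one way around it without regular expressions, the sketch below compares a plain replace with filtering whitespace-separated tokens. The sample sentences are invented for the example.

```python
lines = ["a general procedure is here", "weighted averages and bounds"]
words = ["a", "is", "and", "there", "here"]

# str.replace works on substrings: every letter "a" disappears, not just the word.
naive = lines[0].replace("a", "")

# Splitting on whitespace compares whole tokens, so "general" is left intact.
filtered = [" ".join(tok for tok in line.split() if tok not in words)
            for line in lines]

print(repr(naive))
print(filtered)
```

Token filtering does not strip punctuation, so a token like "here." would not match "here" in this sketch.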

You can do this more efficiently without regular expressions:

lines = ['<title>The query complexity of estimating weighted averages.</title>',
         '<title>New bounds for the query complexity of an algorithm that learns DFAs with correction and equivalence queries.</title>',
         '<title>A general procedure to check conjunctive query containment.</title>']
words = {"a", "is", "and", "there", "here"}

print([" ".join([w for line in lines
             for w in line[7:-8:].split(" ")
             if w.lower() not in words])])


['The query complexity of estimating weighted averages. New bounds for the query complexity of an algorithm that learns DFAs with correction equivalence queries. general procedure to check conjunctive query containment.']
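Note that the comprehension above joins all titles into a single string. If one cleaned string per title is wanted instead, a possible variant (a sketch that assumes every line starts with "<title>" and ends with "</title>", as in the example data) moves the join inside the outer loop:

```python
lines = [
    "<title>The query complexity of estimating weighted averages.</title>",
    "<title>New bounds for the query complexity of an algorithm that learns "
    "DFAs with correction and equivalence queries.</title>",
    "<title>A general procedure to check conjunctive query containment.</title>",
]
words = {"a", "is", "and", "there", "here"}

# Slice off the literal tags, then keep only tokens that are not in `words`.
cleaned = [
    " ".join(w for w in line[len("<title>"):-len("</title>")].split()
             if w.lower() not in words)
    for line in lines
]
print(cleaned)
```

Because a set is used for words, each membership test takes constant time on average.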

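If case matters, remove the w.lower() call. Also, if you are getting the lines by parsing a web page, I would suggest extracting the text from the tags before writing it to the file.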
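One possible way to extract the text from the tags up front is the standard library's html.parser module. The sketch below is an assumption about how that could look, not the answerer's code. TitleExtractor is a hypothetical class name, and the input string stands in for the real page content.

```python
from html.parser import HTMLParser

class TitleExtractor(HTMLParser):
    """Collect the text found inside <title> elements (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current title.
        if self.in_title:
            self.titles[-1] += data

parser = TitleExtractor()
parser.feed(
    "<title>The query complexity of estimating weighted averages.</title>"
    "<title>A general procedure to check conjunctive query containment.</title>"
)
parser.close()
titles = parser.titles
print(titles)
```

With the tags already gone, the word filter no longer needs the `</?title>` alternative or the `[7:-8]` slice.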

You have a list of strings and can work on each line in turn. Please attach some code and show the specific problem.

Show your code first.

Put the words to be removed into a second list, iterate over them, and remove them from the first list.

But now I get output like the following:

[['T', 'h', 'e', 'q', 'u', 'e', 'r', 'y', 'c', 'o', 'm', 'p', 'l', 'e', 'x', 'i', 't', 'y', ...

You shouldn't; I don't. Maybe you are operating on just the first title rather than on the list of titles shown in the answer?

Don't put the quotes outside the square brackets; they were missing inside the brackets in the OP.
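Another option is to pad the words with spaces and use str.replace:

lines=['<title>The query complexity of estimating weighted averages.</title>', '<title>New bounds for the query complexity of an algorithm that learns DFAs with correction and equivalence queries.</title>', '<title>A general procedure to check conjunctive query containment.</title>']

# Padding each word with spaces makes replace match whole words only; the tags are matched as-is.
words = [' a ', ' is ', ' and ', ' there ', ' here ', '<title>', '</title>']

for i in words:
  for j in range(0,len(lines)):
    # A padded word is replaced by one space so that its neighbours stay separated.
    lines[j]=lines[j].replace(i, ' ' if i.startswith(' ') else '')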
