Find and remove words from a sentence on a partial word match (Python 3.7)

I have a sentence like the one below:

mainsentence="My words aren't available give didn't give apple and did happening me"

stopwords=['are','did','word', 'able','give','happen']

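If a stop word occurs anywhere inside a word of the sentence, that word should be removed (for example, 'word' should match "words" and remove it, 'did' should match "didn't" and remove it, and 'able' should remove "available", because 'able' is contained in "available"). The expected result is: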

finalsentence="My apple and me"

I tried the following code:

querywords = mainsentence.split()
resultwords  = [word for word in querywords if word.lower() not in stopwords]
result = ' '.join(resultwords)
print(result)

But it only works for exact matches.

Please help me.

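A loop over the words will do what you describe in the question, but the result may not be what you actually want. The general structure should be right, but you may want to change the condition used for the partial match (`stopword in testword`).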
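A minimal sketch of such a loop, assuming the `mainsentence` and `stopwords` values from the question. The names `testword` and `stopword` come from the answer; `resultwords` and `result` are borrowed from the question's attempt, and `keep` is a hypothetical helper flag:

```python
mainsentence = "My words aren't available give didn't give apple and did happening me"
stopwords = ['are', 'did', 'word', 'able', 'give', 'happen']

resultwords = []
for testword in mainsentence.split():
    keep = True
    for stopword in stopwords:
        # Partial-match condition: change this test to tune what counts as a match.
        if stopword in testword:
            keep = False
            break
    if keep:
        resultwords.append(testword)

result = ' '.join(resultwords)
print(result)  # My apple and me
```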

Alternatively, use a list comprehension with `all()` (`any()` can be used equivalently).
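A sketch of the comprehension form, under the same assumptions as the question's data. A word is kept only when every stop word is absent from it; `not any(stopword in testword ...)` expresses the same condition:

```python
mainsentence = "My words aren't available give didn't give apple and did happening me"
stopwords = ['are', 'did', 'word', 'able', 'give', 'happen']

# Keep a word only if no stop word occurs inside it.
resultwords = [
    testword
    for testword in mainsentence.split()
    if all(stopword not in testword for stopword in stopwords)
]
result = ' '.join(resultwords)
print(result)  # My apple and me
```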

You can do the following:

>>> ' '.join([word for word in mainsentence.split() if not any([stopword in word for stopword in stopwords])])
'My apple and me'

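EDIT: This does not need to check in both directions; it only needs to check whether the word contains the stopword.
EDIT2: Updated the result to match the updated question parameters.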

Case-insensitive version:

' '.join([word for word in mainsentence.split() if not any([stopword.lower() in word.lower() for stopword in stopwords])])

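A more sustainable solution to your problem takes the following steps:

1. Expand contractions, e.g. I've -> I have, didn't -> did not.

2. Use lemmatization to reduce each word to its base (root) form. For example, playing, plays and played all become play. Let's call the corpus in this state the clean corpus.

3. Now remove all the stop words from the clean corpus.

You may also find an article I wrote interesting; it also covers spelling correction and can be used to build a text-cleaning pipeline.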
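The three steps above can be sketched with the standard library only. Everything here is an illustrative assumption: `CONTRACTIONS` is a tiny hand-written table, and `crude_stem` is a naive suffix stripper standing in for a real lemmatizer, which this answer does not name.

```python
# Hypothetical, deliberately tiny contraction table; a real pipeline needs a fuller one.
CONTRACTIONS = {"aren't": "are not", "didn't": "did not", "i've": "i have"}


def expand_contractions(text):
    # Step 1: replace each whitespace-separated token that appears in the table.
    return " ".join(CONTRACTIONS.get(tok.lower(), tok) for tok in text.split())


def crude_stem(word):
    # Step 2 (stand-in for lemmatization): strip a few common suffixes.
    for suffix in ("ing", "ed", "s"):
        if word.lower().endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def clean(text, stopwords):
    # Step 3: drop tokens whose base form is exactly a stop word.
    stop = {s.lower() for s in stopwords}
    tokens = expand_contractions(text).split()
    return " ".join(t for t in tokens if crude_stem(t).lower() not in stop)


mainsentence = "My words aren't available give didn't give apple and did happening me"
stopwords = ['are', 'did', 'word', 'able', 'give', 'happen']
print(clean(mainsentence, stopwords))  # My not available not apple and me
```

Note that this matches whole base forms rather than substrings, so "available" survives and the expanded "not" tokens remain. The result therefore differs from the `finalsentence` in the question.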

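You can use the power of regular expressions to solve this kind of problem.

import re

You can get all the matching words like this:

words = re.findall(r'[a-z]*did[a-z]*', mainsentence)

You can also replace them:

re.sub(r'[a-z]*able[a-z]*', '', mainsentence)

So the final answer is:

mainsentence = "My words aren't available give didn't give apple and did happening me"
stopwords = ['are', 'did', 'word', 'able', 'give', 'happen']
for word in stopwords:
    # Remove every run of letters/apostrophes that contains the stop word.
    mainsentence = re.sub(fr'[a-z\']*{word}[a-z\']*', '', mainsentence)
# Collapse the runs of spaces left behind by the removed words.
mainsentence = ' '.join(mainsentence.split())
# My apple and me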
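As a variation on the loop above, the stop words can be combined into a single pattern so the sentence is scanned once. This is only a sketch: the function name `remove_matching_words` is hypothetical, `re.escape` guards against stop words that contain regex metacharacters, and `re.IGNORECASE` is an added assumption so that capitalized words are matched too.

```python
import re


def remove_matching_words(sentence, stopwords):
    """Remove every word that contains any of the stop words as a substring."""
    if not stopwords:
        return sentence
    # Build one alternation such as (?:are|did|word|...), escaping each stop word.
    alternation = "|".join(re.escape(s) for s in stopwords)
    pattern = re.compile(rf"[a-z']*(?:{alternation})[a-z']*", re.IGNORECASE)
    # Substitute matches with nothing, then collapse the leftover whitespace.
    return " ".join(pattern.sub("", sentence).split())


mainsentence = "My words aren't available give didn't give apple and did happening me"
stopwords = ['are', 'did', 'word', 'able', 'give', 'happen']
print(remove_matching_words(mainsentence, stopwords))  # My apple and me
```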

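The problem here is that you want partial matches, but […] would be a partial match for most of your words. Also, `occure` should be in the `finalsentence`.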

@tituszban: Corrected the question.

It sounds like you need to check not against your word list itself, but against a list of synonyms for the words in it. There are several ways to do this; one of them is PyDictionary.

Note that it will be case-sensitive. You also don't need to create the intermediate list inside `any()`.
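To illustrate the last comment: passing a generator expression to `any()` avoids building the intermediate list and lets `any()` stop at the first match. A sketch, where `remove_partial_matches` is a hypothetical name and the lowercasing is added to make the check case-insensitive:

```python
def remove_partial_matches(sentence, stopwords):
    """Drop every word that contains a stop word, ignoring case."""
    lowered = [s.lower() for s in stopwords]
    return " ".join(
        word
        for word in sentence.split()
        # Generator expression: no list is built, and any() short-circuits.
        if not any(stop in word.lower() for stop in lowered)
    )


mainsentence = "My words aren't available give didn't give apple and did happening me"
stopwords = ['are', 'did', 'word', 'able', 'give', 'happen']
print(remove_partial_matches(mainsentence, stopwords))  # My apple and me
```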