How to extract specific words from a string using regular expressions in Python


I have two strings of the following form:

text1 = 'Mau/VBT ngasih/NN hadiah/NN untuk/IN Anniv/NN ,/, Graduation/NN ,/, Birthday/NN ,/, Wedding/NN ,/, dll/VBT ?/. Nih/DT ,/, ada/VBI hadiah/NN kece/JJ yang/SC at/IN Yasmin/NNP 33/CDP'
text2 = 'Yang/SC kelaparan/NN habis/VBI latihan/NN ilovenaylambem/NN at/IN Jl/NNP Halimun/NNP Raya/NNP ,/, Menteng/NN'
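I would like to extract the phrase that starts at a word tagged /IN and continues through the following words tagged /NNP or /CDP. Here is my code so far (it still only handles the /NNP tag):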

The output of the code so far is:

Mau/VBT ngasih/NN hadiah/NN untuk/IN Anniv/NN ,/, Graduation/NN ,/, Birthday/NN ,/, Wedding/NN ,/, dll/VBT ?/. Nih/DT ,/, ada/VBI hadiah/NN kece/JJ yang/SC at/IN Yasmin/NNP 33/CDP
['at/IN Yasmin/NNP']

Yang/SC kelaparan/NN habis/VBI latihan/NN ilovenaylambem/NN at/IN Jl/NNP Halimun/NNP Raya/NNP ,/, Menteng/NN
['at/IN Jl/NNP Halimun/NNP Raya/NNP']
As we can see, for the first string (text1), entityExtractPreposition still fails to capture 33/CDP. How can I make entityExtractPreposition work with both the /CDP tag in text1 and the /NNP tag in text2?

The expected result is:

Mau/VBT ngasih/NN hadiah/NN untuk/IN Anniv/NN ,/, Graduation/NN ,/, Birthday/NN ,/, Wedding/NN ,/, dll/VBT ?/. Nih/DT ,/, ada/VBI hadiah/NN kece/JJ yang/SC at/IN Yasmin/NNP 33/CDP
['at/IN Yasmin/NNP 33/CDP']

Yang/SC kelaparan/NN habis/VBI latihan/NN ilovenaylambem/NN at/IN Jl/NNP Halimun/NNP Raya/NNP ,/, Menteng/NN
['at/IN Jl/NNP Halimun/NNP Raya/NNP']
Thank you.

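The following pattern matches from a word tagged /IN up to the last /NNP or /CDP tag that comes before the next /IN, which gives the expected result for both strings:

\b[^\s/]+/IN\b(?:(?!/IN\b).)*/(?:NNP|CDP)\b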
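
As a minimal sketch of how the pattern might be applied, the snippet below runs it over both strings with re.findall. The helper name extract_preposition_phrases is hypothetical (the asker's entityExtractPreposition is not shown in the question), so this is one possible way to wire the pattern in, not the original function.

```python
import re

# Pattern quoted from the answer, compiled as a raw string.
#   \b[^\s/]+/IN\b     a word carrying the /IN tag
#   (?:(?!/IN\b).)*    any characters, stopping before another /IN tag
#   /(?:NNP|CDP)\b     the match must end on an /NNP or /CDP tag
PATTERN = re.compile(r"\b[^\s/]+/IN\b(?:(?!/IN\b).)*/(?:NNP|CDP)\b")


def extract_preposition_phrases(tagged_text):
    """Hypothetical helper: return every /IN ... /NNP|/CDP phrase found."""
    return PATTERN.findall(tagged_text)


text1 = ('Mau/VBT ngasih/NN hadiah/NN untuk/IN Anniv/NN ,/, Graduation/NN ,/, '
         'Birthday/NN ,/, Wedding/NN ,/, dll/VBT ?/. Nih/DT ,/, ada/VBI '
         'hadiah/NN kece/JJ yang/SC at/IN Yasmin/NNP 33/CDP')
text2 = ('Yang/SC kelaparan/NN habis/VBI latihan/NN ilovenaylambem/NN at/IN '
         'Jl/NNP Halimun/NNP Raya/NNP ,/, Menteng/NN')

for text in (text1, text2):
    print(text)
    print(extract_preposition_phrases(text))
    print()
```

Note that untuk/IN in text1 produces no match, because no /NNP or /CDP tag appears before the next /IN word. Since the middle part of the pattern is greedy, a match runs to the last /NNP or /CDP tag it can reach, so it may also include words with other tags that sit in between.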