Python: return the words from a list that appear in a sentence (multiple string `if` checks)
Tags: python, pandas, combinations, matching
I have a list of words that I want to check with if statements. Here is my list:
list = ['camera', 'display', 'price', 'memory']  # will have 200+ words in the list
Here is my code:
def check_it(sentences):
    if 'camera' in sentences and 'display' in sentences and 'price' in sentences:
        return "Camera/Display/Price"
    if 'camera' in sentences and 'display' in sentences:
        return "Camera/Display"
    ...
    return "Others"

h.loc[:, 'Category'] = h.Mention.apply(check_it)
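There will be far too many combinations to write out like this, and I would also like the matched words returned on separate rows.
Does anyone know how to do this and return the words individually, instead of hard-coding results such as "Camera/Display/Price"?

Use a regex: join all values of the list with | to build the pattern, then join the values that were found with /: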
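import pandas as pd

df = pd.DataFrame({'Mention': ['camera in sentences and display in sentences',
                               'camera in sentences price']})
L = ['camera', 'display', 'price', 'memory']
# Build one alternation of whole-word patterns: \bcamera\b|\bdisplay\b|...
pat = '|'.join(r"\b{}\b".format(x) for x in L)
# findall returns every match per row; join the matches with '/'
df['Category'] = df['Mention'].str.findall(pat).str.join('/')
print(df)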
                                        Mention        Category
0  camera in sentences and display in sentences  camera/display
1                     camera in sentences price    camera/price
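The question's original function falls back to "Others" when nothing matches, which the regex answer above does not reproduce. The following is a minimal sketch of one way to add that, under two assumptions that are not in the original answer: keywords are passed through `re.escape` in case the real 200+ word list contains regex metacharacters, and the third sample sentence is made up for illustration.

```python
import re
import pandas as pd

df = pd.DataFrame({'Mention': ['camera in sentences and display in sentences',
                               'camera in sentences price',
                               'nothing relevant here']})  # last row is a made-up example
L = ['camera', 'display', 'price', 'memory']

# Escape each keyword and wrap the alternation in a single non-capturing group,
# so the word boundaries apply to the whole alternation.
pat = r'\b(?:{})\b'.format('|'.join(re.escape(w) for w in L))

# Rows with no match produce an empty string after join; map those to 'Others'.
df['Category'] = df['Mention'].str.findall(pat).str.join('/').replace('', 'Others')
print(df)
```

The match is case-sensitive, as in the answer above; passing `flags=re.IGNORECASE` to `str.findall` would relax that if the sentences mix cases.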
Another solution uses a list comprehension; a generator expression with join also works:
df['Category1'] = [[y for y in x.split() if y in L] for x in df['Mention']]
df['Category2'] = ['/'.join(y for y in x.split() if y in L) for x in df['Mention']]
print (df)
                                        Mention          Category1  \
0  camera in sentences and display in sentences  [camera, display]
1                     camera in sentences price    [camera, price]

        Category2
0  camera/display
1    camera/price
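The question also asks for the matched words on separate rows, which neither output above shows. Assuming that means one row per matched word, a possible sketch is to keep the list column and call `DataFrame.explode` on it (available in pandas 0.25 and later). The `keywords` set and the `rows` name are illustrative choices, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'Mention': ['camera in sentences and display in sentences',
                               'camera in sentences price']})
L = ['camera', 'display', 'price', 'memory']

# A set makes each membership test O(1), which helps with a 200+ word list.
keywords = set(L)

# One list of matched words per sentence, in order of appearance.
df['Category'] = [[w for w in s.split() if w in keywords] for s in df['Mention']]

# explode() repeats the sentence once for every matched word.
rows = df.explode('Category').reset_index(drop=True)
print(rows)
```

A sentence with no matches would come out of `explode` as a single row with NaN in `Category`, so a fallback such as `fillna('Others')` may be wanted afterwards.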
Why not check each word in the sentence?
wordsList = ['camera', 'display', 'price', 'memory']  # will have 200+ words in the list

def check_it(sentence, wordsList):
    wordString = ''
    flag = False
    counter = 0
    # Split into words first; iterating over the string itself would check characters
    for word in sentence.split():
        if word in wordsList:
            if counter != 0:
                wordString = wordString + '/' + word
            else:
                wordString = word
            flag = True
            counter += 1
    if flag:
        return wordString
    else:
        return 'Others'
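The loop above can probably be condensed. Here is a minimal sketch that should behave the same way for whitespace-separated words; the name `check_it_compact` and the `words_list` parameter are hypothetical, chosen only to avoid clashing with the function above.

```python
def check_it_compact(sentence, words_list):
    # Hypothetical compact variant of check_it; same inputs, same output format.
    keywords = set(words_list)  # set lookup instead of scanning the list each time
    found = [word for word in sentence.split() if word in keywords]
    # Join the matches with '/', or fall back to 'Others' when nothing matched.
    return '/'.join(found) if found else 'Others'


words = ['camera', 'display', 'price', 'memory']
print(check_it_compact('camera in sentences price', words))
print(check_it_compact('nothing relevant here', words))
```

With pandas it could be applied as `h['Category'] = h['Mention'].apply(check_it_compact, words_list=words)`, assuming the `h` DataFrame from the question. Like the original, it does not strip punctuation, so a token such as `camera,` would not match.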
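Your example is not well expressed. If the sentence contains both "camera" and "display", it is unclear what happens when "price" is present as well, because the second if block will never execute (the function has already returned from the previous block). Presumably the case with the most keywords should be checked first.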
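Are the categories in Category in alphabetical order? For example, should Camera/Price/Display be returned as Camera/Display/Price?
The result only returns "Others"; it does not return the words from the list.
I just edited it. I added sentence.split(), because before that it was checking each character instead of each word.
The result gives me "camera/display/camera" for "camera in sentences, display or camera in sentences". Do you know how to fix this? I only want one camera in the result.
@Aimee - add apply(set) here, like df['Category'] = df['Mention'].str.findall(pat).apply(set).str.join('/')
For the other solutions, use df['Category1'] = [list(set([y for y in x.split() if y in L])) for x in df['Mention']] or df['Category2'] = ['/'.join(set(y for y in x.split() if y in L)) for x in df['Mention']]
@Aimee - if my answer was helpful, don't forget to accept it. To mark an answer as accepted, click the check mark next to the answer to toggle it from hollow to green. Thanks.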
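One caveat with the set-based suggestion: a Python set does not guarantee any particular order, so the joined result could come out as display/camera. Below is a minimal sketch of two alternatives: `dict.fromkeys` drops duplicates while keeping first-appearance order, and `sorted(set(...))` gives alphabetical order, which is what the earlier comment asked about. The helper names `unique_matches` and `sorted_matches` are hypothetical.

```python
import re

L = ['camera', 'display', 'price', 'memory']
pat = r'\b(?:{})\b'.format('|'.join(re.escape(w) for w in L))


def unique_matches(sentence):
    # dict.fromkeys keeps the first occurrence of each word, in order of appearance.
    return '/'.join(dict.fromkeys(re.findall(pat, sentence)))


def sorted_matches(sentence):
    # sorted(set(...)) removes duplicates and orders the words alphabetically.
    return '/'.join(sorted(set(re.findall(pat, sentence))))


text = 'price of camera in sentences, display or camera in sentences'
print(unique_matches(text))
print(sorted_matches(text))
```

Either helper could be used on the DataFrame column with `df['Mention'].apply(unique_matches)`.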