
Python pandas: find common strings in a Series and return the keywords


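I want to improve my approach to searching for strings in a pandas Series based on a Series of keywords. My question now is how to get the keyword found in each DataFrame row as a new column. The keywords in the Series "w" are:

The DataFrame "df" is:

The following solution masks the DataFrame nicely: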

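import re
# Build one alternation from the keywords in w; group 1 captures the matched keyword.
r = re.compile(r'.*({}).*'.format('|'.join(w.values)), re.IGNORECASE)
# list(...) is needed on Python 3, where map returns a lazy iterator.
masked = list(map(bool, map(r.match, df['Tweet'])))
df['Tweet_masked'] = masked
and it returns this: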

   User_ID              Tweet Tweet_masked
0        1             hi all        False
1        2  see you somewhere         True
2        3           So weird        False
3        4         hi all :-)        False
4        5     next big thing         True
5        6  how can i say no?        False
6        7         so strange         True
7        8         not at all        False
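
For comparison, the same boolean mask can be sketched with pandas' own string methods. This is an alternative to the compiled-regex version above, not the poster's code. The contents of w are not shown in the post, so the three keywords below are an assumption inferred from the output table, and the use of re.escape is an addition.

```python
import re
import pandas as pd

# Assumed keywords: the post does not show the contents of w.
w = pd.Series(['somewhere', 'thing', 'strange'])
tweets = pd.Series(['hi all', 'see you somewhere', 'So weird', 'hi all :-)',
                    'next big thing', 'how can i say no?', 'so strange', 'not at all'])

# Escape each keyword so regex metacharacters are treated literally.
pattern = '|'.join(map(re.escape, w))

# case=False plays the role of re.IGNORECASE in the compiled-regex version.
mask = tweets.str.contains(pattern, case=False, regex=True)
print(mask.tolist())
```

Because str.contains searches anywhere in the string, the surrounding .* parts of the pattern are not needed here.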
Now I am looking for a result like this:

User_ID;Tweet;Keyword
01;hi all;None
02;see you somewhere;somewhere
03;So weird;None
04;hi all :-);None
05;next big thing;thing
06;how can i say no?;None
07;so strange;strange
08;not at all;None
Thanks in advance for your support.

How about replacing

masked = map(bool, map(r.match, df['Tweet']))

with

masked = [m.group(1) if m else None for m in map(r.match, df['Tweet'])]

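A minimal end-to-end sketch of this replacement, for readers who want to run it. The DataFrame is rebuilt from the table in the question. The contents of w are not shown in the post, so the three keywords below are an assumption inferred from the expected output, and re.escape is an addition that is not in the original code.

```python
import re
import pandas as pd

# Assumed keywords: the post does not show the contents of w.
w = pd.Series(['somewhere', 'thing', 'strange'])

df = pd.DataFrame({
    'User_ID': range(1, 9),
    'Tweet': ['hi all', 'see you somewhere', 'So weird', 'hi all :-)',
              'next big thing', 'how can i say no?', 'so strange', 'not at all'],
})

# Escape each keyword so regex metacharacters are treated literally.
pattern = '|'.join(map(re.escape, w))
r = re.compile(r'.*({}).*'.format(pattern), re.IGNORECASE)

# Keep the captured keyword (group 1) when there is a match, otherwise None.
df['Keyword'] = [m.group(1) if m else None for m in map(r.match, df['Tweet'])]
keywords = df['Keyword'].tolist()
print(df)
```

Note that the leading .* is greedy, so when a tweet contains several keywords this pattern captures only the last one.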
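Thanks, it worked! I would like to know: how can I find multiple keywords in one string? I mean, if df is updated with a new row containing "09;so strange thing", how can I get both "strange" and "thing" from the mask? I tried the r.search() method to search the whole string, but got no result…

@fblamanna - something like:
df['Tweet'].map(lambda x: tuple(re.findall(r'({})'.format('|'.join(w.values)), x)))
which returns tuples?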
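
The multi-keyword case can be sketched as follows. The keywords in w and the sample tweets are assumptions made for illustration. The sketch uses pandas' Series.str.findall instead of the map/lambda form, which yields lists rather than tuples. The pattern deliberately omits the surrounding .* parts: with them, the greedy prefix consumes everything up to the last keyword, so only one keyword per string is ever reported.

```python
import re
import pandas as pd

# Assumed keywords: the post does not show the contents of w.
w = pd.Series(['somewhere', 'thing', 'strange'])

# Hypothetical tweets, including the "so strange thing" row from the comment.
tweets = pd.Series(['hi all', 'so strange thing', 'Next BIG Thing'])

# Plain alternation of escaped keywords, with no leading or trailing .*
pattern = '|'.join(map(re.escape, w))

# findall returns every non-overlapping match, in order of appearance.
found = tweets.str.findall(pattern, flags=re.IGNORECASE)
result = found.tolist()
print(result)
```

Matches are returned as they appear in the text, so a case-insensitive hit keeps its original capitalization.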