Python: searching for elements in a list within a DataFrame row

Tags: python, python-3.x, pandas

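I am trying to capture elements from a pandas DataFrame column whose values are lists. The code below captures the whole list whenever the string is present. How can I capture, row by row, only the elements that contain a specific string and ignore the rest?

Here is what I tried: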

import numpy as np
import pandas as pd

l1 = [1, 2, 3, 4, 5, 6]
l2 = ['hello world \n my world', 'world is a great place \n we live in it', 'planet earth', np.nan, '\n save the water', '']

df = pd.DataFrame(list(zip(l1, l2)),
                  columns=['id', 'sentence'])
# Split each sentence on newlines into a list; NaN entries stay NaN
df['sentence_split'] = df['sentence'].str.split('\n')
print(df)
Filtering on the joined list does the trick, but it is still not exactly what I am looking for:

df[df.sentence_split.str.join(' ').str.contains('world', na=False)]

Result of this code:

id  sentence                                  sentence_split
1   hello world \n my world                   [hello world , my world]
2   world is a great place \n we live in it   [world is a great place , we live in it]
But I am looking for:

id  sentence                                  sentence_split
1   hello world \n my world                   hello world; my world
2   world is a great place \n we live in it   world is a great place

You are searching for a string inside a Series of lists. One way to do it is:

# Drop NaN rows
df = df.dropna(subset=["sentence_split"])
Then apply a function that keeps only the list elements you are looking for:

# Apply this lambda function
df["sentence_split"] = df["sentence_split"].apply(lambda x: [i for i in x if "world" in i])

   id                                 sentence             sentence_split
0   1                  hello world \n my world  [hello world ,  my world]
1   2  world is a great place \n we live in it  [world is a great place ]
2   3                             planet earth                         []
4   5                        \n save the water                         []
5   6                                                                  []
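
The output above still holds lists, keeps the trailing whitespace, and includes rows with no match, whereas the question asks for joined strings and only the matching rows. A minimal sketch of one way to get there follows. It assumes the matching elements should be stripped and joined with "; ", as in the desired output. The helper name `keep_matching` and its `needle` parameter are made up for illustration and are not part of pandas.

```python
import numpy as np
import pandas as pd

l1 = [1, 2, 3, 4, 5, 6]
l2 = ['hello world \n my world', 'world is a great place \n we live in it',
      'planet earth', np.nan, '\n save the water', '']
df = pd.DataFrame({'id': l1, 'sentence': l2})


def keep_matching(sentence, needle='world'):
    """Hypothetical helper: return the lines of `sentence` that contain `needle`, joined by '; '."""
    # Missing values (NaN) are not strings, so treat them as "no match"
    if not isinstance(sentence, str):
        return ''
    # Split on newlines and strip surrounding whitespace from each piece
    parts = [part.strip() for part in sentence.split('\n')]
    # Keep only the pieces that contain the search string
    return '; '.join(part for part in parts if needle in part)


df['sentence_split'] = df['sentence'].apply(keep_matching)

# Keep only the rows where at least one element matched
result = df[df['sentence_split'] != '']
print(result)
```

Because NaN is handled inside the helper, this variant does not need the `dropna` step. Rows without a match are dropped only by the final filter, so omitting that line keeps every row with an empty string instead.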