Python: Find the words of a DataFrame row in the rows of another DataFrame (Python, Pandas, DataFrame)


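I want to check whether the words in a row of DataFrame B exist in a row of another DataFrame A, and retrieve the line number from DataFrame A.

Sample of DataFrame A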

      LineNumber               Description
2539  5401845  Either the well was very deep, or she fell very slowly,
4546  5409117  for she had plenty of time as she went down to look about her, 
4368  5408517  and to wonder what was going to happen next
Sample of DataFrame B

                 Words
50062   well deep fell
44263   plenty time above
4731    plenty time down look
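I now want to know whether all the words in each row of DataFrame B occur together in any single row of DataFrame A. If they do, I want to retrieve the LineNumber from DataFrame A and assign it to that row of DataFrame B.

The output should look like this: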

                     Words             LineNumber
50062   well deep fell                 5401845
44263   plenty time above
4731    plenty time down look          5409117
I tried something like this, but it doesn't work:

a = 'for she had plenty of time as she went down to look about her,'
str = 'plenty time down look'
if all(x in str for x in a):
    print(True)
else:
    print(False)

Thanks

You are close to what you are trying to do. Try it like this:

a = 'for she had plenty of time as she went down to look about her,'
string = 'plenty time down look'
a = a.split(' ')
string = string.split(' ')
if all(x in a for x in string):
    print(True)
else:
    print(False)
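There were two problems with the way you originally used `x in str for x in a` (the variable is called `string` here so that it no longer shadows the built-in `str`).

The first is that iterating over `string` or `a` yields single characters, so to compare words you need to build lists of words. That is why I added the calls to split.

The second is that the logic `x in string for x in a` returns True if every element of `a` is in `string`, but what you need is `x in a for x in string`, which returns `True` if every element of `string` is in `a`.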
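One caveat about the split-based check: splitting on spaces leaves punctuation attached, so the last token of `a` is `her,` rather than `her`. Below is a minimal sketch of a slightly more tolerant variant, assuming words should be compared case-insensitively and with surrounding punctuation stripped. The helper name `contains_all_words` is made up for illustration and is not part of the original answer.

```python
import string


def contains_all_words(sentence, words):
    """Return True if every word in `words` occurs as a whole word in `sentence`."""
    # Lowercase each token and strip surrounding punctuation so "her," matches "her".
    sentence_words = {w.strip(string.punctuation).lower() for w in sentence.split()}
    # Every requested word must be present among the sentence's words.
    return all(w.lower() in sentence_words for w in words.split())


a = 'for she had plenty of time as she went down to look about her,'
print(contains_all_words(a, 'plenty time down look'))  # True
print(contains_all_words(a, 'plenty time above'))      # False
print(contains_all_words(a, 'about her'))              # True
```

Using a set for the sentence's words keeps each membership test cheap, which matters once the check is repeated for many rows.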

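Make the DataFrames.

Match the description in DataFrame y (selected by index) against DataFrame x, and get the matching index from DataFrame x.

See this post:
Thanks, iterating over the x DataFrame is more efficient.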
import pandas as pd

# Build the two DataFrames
x = pd.DataFrame({"Description": ["for she had plenty of time as she went down to look about her",
                                  "for she had of time as she went down to look about her"]})

>>> x
    Description
0   for she had plenty of time as she went down to look about her
1   for she had of time as she went down to look about her

y = pd.DataFrame({"Description": ["plenty time down look"]})
>>> y
    Description
0   plenty time down look

# Turn the words of y's first row into one lookahead per word, so that
# every word must occur somewhere in the description, in any order
with_words = y["Description"].iloc[0].split()
with_regex = "".join(['(?=.*{})'.format(word) for word in with_words])

>>> with_regex
'(?=.*plenty)(?=.*time)(?=.*down)(?=.*look)'

# Index of the row of x that matches; .item() requires exactly one match
>>> x.loc[x.Description.str.contains(with_regex)].index.item()
0
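
To get from a single lookup to the output asked for in the question, the check has to run once per row of DataFrame B. The following is a rough sketch, not the method from either answer: it uses word sets instead of the lookahead regex, takes the first matching row of A, and compares lowercased whole words. The names `df_a`, `df_b`, `word_sets` and `find_line_number` are assumptions introduced for illustration; the data is copied from the question.

```python
import re

import pandas as pd

# Sample data copied from the question.
df_a = pd.DataFrame(
    {
        "LineNumber": [5401845, 5409117, 5408517],
        "Description": [
            "Either the well was very deep, or she fell very slowly,",
            "for she had plenty of time as she went down to look about her,",
            "and to wonder what was going to happen next",
        ],
    },
    index=[2539, 4546, 4368],
)
df_b = pd.DataFrame(
    {"Words": ["well deep fell", "plenty time above", "plenty time down look"]},
    index=[50062, 44263, 4731],
)

# Pre-compute the set of lowercased words in each row of A (punctuation dropped).
word_sets = {
    idx: set(re.findall(r"\w+", text.lower()))
    for idx, text in df_a["Description"].items()
}


def find_line_number(words):
    """Hypothetical helper: LineNumber of the first row of A containing all words, else None."""
    wanted = set(words.lower().split())
    for idx, present in word_sets.items():
        if wanted <= present:  # subset test: every wanted word is present
            return int(df_a.at[idx, "LineNumber"])
    return None


# The nullable Int64 dtype keeps the line numbers as integers and shows <NA> for no match.
df_b["LineNumber"] = pd.array(
    [find_line_number(w) for w in df_b["Words"]], dtype="Int64"
)
print(df_b)
```

With the sample data, rows 50062 and 4731 receive 5401845 and 5409117, and row 44263 stays missing because "above" does not occur in any row of A. Unlike the lookahead regex, the set comparison matches whole words only, so "down" would not match inside a longer word such as "downtown".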