Python: check string values in a column of one df against a column in another df

Suppose I have two pandas DataFrames that look like this:

import pandas as pd

data_set_1 = [['A big string of words', 30], ['Random data point', 60], ['Big string of words', 50]]
data_set_2 = [['string of', 30], ['Character value', 40], ['Big swords', 90]]

first_df = pd.DataFrame(data_set_1, columns = ['Word_set', 'Numbers'])
second_df = pd.DataFrame(data_set_2, columns = ['Words', 'Numbers'])

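Is there a way to compare the Words column in the second DF with the Word_set column in the first DF? Ideally, any matching values would be saved to a new DF.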

Example output:

Column 1                                  Column 2
-----------                               ------------
'A big string of words', 'string of'      30
'Big string of words', 'Big swords'

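The logic here is to look for matching strings at each index position and then join each matching pair to get the final result. The match is tested with this expression:
any(x in first_df['Word_set'][i] for x in j.split())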
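
To see that check in isolation, here is a minimal sketch. The helper name shares_word is hypothetical and not part of the original answer. Keep in mind that `in` on strings is a case-sensitive substring test, so a word also counts as a match when it appears inside a longer word.

```python
def shares_word(sentence, phrase):
    """Return True if any whitespace-separated word of `phrase` occurs in `sentence`."""
    # phrase.split() breaks the phrase into words; `word in sentence` is a substring test
    return any(word in sentence for word in phrase.split())

print(shares_word('A big string of words', 'string of'))       # True
print(shares_word('Random data point', 'Character value'))     # False
print(shares_word('Big string of words', 'Big swords'))        # True
```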

Please take a look at this code:

import pandas as pd

data_set_1 = [['A big string of words', 30], ['Random data point', 60], ['Big string of words', 50]]
data_set_2 = [['string of', 30], ['Character value', 40], ['Big swords', 90]]

first_df = pd.DataFrame(data_set_1, columns = ['Word_set', 'Numbers'])
second_df = pd.DataFrame(data_set_2, columns = ['Words', 'Numbers'])

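col1 = []
# Walk both frames by row position; keep a pair when any word of the
# phrase from second_df occurs in the sentence from first_df
for i, j in zip(range(len(first_df)), second_df['Words']):
    if any(x in first_df['Word_set'][i] for x in j.split()):
        col1.append(', '.join([first_df['Word_set'][i], j]))

# Numbers that are equal at the same row position in both frames
col2 = list(first_df['Numbers'][first_df['Numbers'] == second_df['Numbers']])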

df = pd.DataFrame(
    data= [col1, col2],
    index=['Column 1', 'Column 2']
).T

print(df)

Output:

                           Column 1 Column 2
0  A big string of words, string of       30
1   Big string of words, Big swords     None
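
The code above builds the two columns as separate lists, so the numbers line up with the text only by their position in each list. A possible variation, assuming Column 2 should hold the number only when both frames agree on that same row, builds each output row in a single pass. The names rows and result are illustrative and not from the original answer.

```python
import pandas as pd

first_df = pd.DataFrame(
    [['A big string of words', 30], ['Random data point', 60], ['Big string of words', 50]],
    columns=['Word_set', 'Numbers'])
second_df = pd.DataFrame(
    [['string of', 30], ['Character value', 40], ['Big swords', 90]],
    columns=['Words', 'Numbers'])

rows = []
# Iterate over the two frames in parallel, one row position at a time
for sentence, phrase, n1, n2 in zip(first_df['Word_set'], second_df['Words'],
                                    first_df['Numbers'], second_df['Numbers']):
    # Keep the row only if some word of the phrase occurs in the sentence
    if any(word in sentence for word in phrase.split()):
        rows.append({
            'Column 1': f'{sentence}, {phrase}',
            # Assumption: report the number only when both frames agree on it
            'Column 2': n1 if n1 == n2 else None,
        })

result = pd.DataFrame(rows, columns=['Column 1', 'Column 2'])
print(result)
```

Because each dictionary describes one complete row, the text and the number cannot drift apart if the lists end up with different lengths.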

Thanks, this solved the problem. Could you explain your code?
@LuiHelleSee Have a look now!