Python: How can I do the equivalent of Excel's INDEX/MATCH with pandas?


python pandas dataframe


I am facing the following challenge.

For example, take this dummy DataFrame:

 col1_a | A | B | C
    a   | 1 | 4 | 7
    b   | 2 | 5 | 8
    c   | 3 | 6 | 9

Take another DataFrame:

 col1_b |col2
    b   | B
    c   | C
    a   | A

The output DataFrame should look like this:

 col1_b | col2 | output
    b   |  B   |  5
    c   |  C   |  9
    a   |  A   |  1

My idea was to create one dictionary per column, in this case

A, B, C

({'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'a': 7, 'b': 8, 'c': 9})

and then use this function:

def func_test1(row):
    if row['col2'] == 'B':
        return test1_df.col1_b.map(B)
    elif row['col2'] == 'C':
        return test1_df.col1_b.map(C)
    elif test1_df['col2'] == 'A':
        return test1_df.col1_b.map(A)
    
test1_df['output'] = test1_df.apply(func_test1, axis=1)

I always get the following error:

 ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index 2')

Besides, I don't think this is an efficient solution at all.

It would be great if there were a built-in pandas feature to help achieve this.
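A note on the error in the function above: its third branch compares the whole column `test1_df['col2']` with `'A'`, which yields a boolean Series, and `elif` cannot reduce a Series to a single True/False. It shows up at index 2 because only the third row reaches that branch. The other branches also return an entire mapped column instead of one value. A minimal row-wise sketch, assuming the two frames from the question (`lookup_tables` is a name introduced here for illustration), might look like this:

```python
import pandas as pd

# Frames rebuilt from the question's sample data.
df1 = pd.DataFrame({"col1_a": ["a", "b", "c"],
                    "A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})
test1_df = pd.DataFrame({"col1_b": ["b", "c", "a"],
                         "col2": ["B", "C", "A"]})

# One dictionary per value column: {'A': {'a': 1, ...}, 'B': {...}, 'C': {...}}
# (hypothetical helper name, not from the original post).
lookup_tables = df1.set_index("col1_a").to_dict()

def func_test1(row):
    # Use only the current row: pick the dictionary named by col2,
    # then the entry keyed by col1_b. Returns a single scalar.
    return lookup_tables[row["col2"]][row["col1_b"]]

test1_df["output"] = test1_df.apply(func_test1, axis=1)
print(test1_df)
```

This still calls a Python function once per row, so it is likely to be slow on large frames; the vectorised answers below avoid that.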

Use DataFrame.join with df1 unpivoted by DataFrame.set_index and DataFrame.stack; Series.rename sets the new column name:

df = df2.join(df1.set_index('col1_a').stack().rename('output'), on=['col1_b','col2'])
print (df)
  col1_b col2  output
0      b    B       5
1      c    C       9
2      a    A       1
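To see why the join above works, it may help to look at the intermediate result. The sketch below rebuilds the sample frames and prints the stacked Series; `stacked` is a name introduced here for illustration. After `set_index('col1_a').stack()`, each value is keyed by a (row label, column label) pair, and `join(..., on=['col1_b','col2'])` matches those two columns of df2 against the two index levels.

```python
import pandas as pd

# Frames rebuilt from the question's sample data.
df1 = pd.DataFrame({"col1_a": ["a", "b", "c"],
                    "A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})
df2 = pd.DataFrame({"col1_b": ["b", "c", "a"],
                    "col2": ["B", "C", "A"]})

# Unpivot df1: a Series with a two-level index (col1_a value, column name).
stacked = df1.set_index("col1_a").stack().rename("output")
print(stacked)

# A single cell can now be addressed by the pair of labels.
print(stacked.loc[("b", "B")])

# Join df2's two key columns against the two index levels.
df = df2.join(stacked, on=["col1_b", "col2"])
print(df)
```

Because this is a left join, a (col1_b, col2) pair that does not exist in df1 would get NaN in `output` rather than raising an error.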

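Another idea is to use DataFrame.melt, rename the columns, and then DataFrame.merge:

df1 = (df1.melt('col1_a', var_name='col2', value_name='output')
          .rename(columns={'col1_a':'col1_b'}))

df = df2.merge(df1, on=['col1_b','col2'], how='left')
print (df)
  col1_b col2  output
0      b    B       5
1      c    C       9
2      a    A       1

Use DataFrame.lookup (on the original df1):

df2['output'] = df1.set_index('col1_a').lookup(df2['col1_b'], df2['col2'])

This works if all combinations in df2 are present in df1.

Yes, you are falling into the sample-data trap. But OK, if the lengths and the values are the same, it works.

I know this is not a general solution, but it may be enough. I agree with you, no problem here :)

@jezrael How would I join the DataFrames if the situation were reversed?

Please take a look at this one. Do you know how to answer this question?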
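Note that DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the lookup call above fails on current versions. One possible replacement, sketched here under the assumption that the labels in `col1_a` and the column names are unique, translates the labels to integer positions with `Index.get_indexer` and then indexes the underlying NumPy array. The names `table`, `row_pos` and `col_pos` are introduced for illustration only.

```python
import pandas as pd

# Frames rebuilt from the question's sample data.
df1 = pd.DataFrame({"col1_a": ["a", "b", "c"],
                    "A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})
df2 = pd.DataFrame({"col1_b": ["b", "c", "a"],
                    "col2": ["B", "C", "A"]})

table = df1.set_index("col1_a")

# Translate labels to integer positions; get_indexer returns -1 for
# labels that are not found.
row_pos = table.index.get_indexer(df2["col1_b"])
col_pos = table.columns.get_indexer(df2["col2"])

# Guard against missing labels: -1 would otherwise silently select
# the last row or column.
if (row_pos < 0).any() or (col_pos < 0).any():
    raise KeyError("some (col1_b, col2) pairs are not present in df1")

# Pairwise NumPy indexing: one value per (row, column) position pair.
df2["output"] = table.to_numpy()[row_pos, col_pos]
print(df2)
```

Like the original lookup call, this sketch only works when every combination in df2 exists in df1; the join and merge approaches are the more forgiving choice when some pairs may be missing.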