Python: Cross-referencing columns within the same DataFrame


python pandas


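**I edited the example df so that columns 'zero' and 'two' hold tuples instead of integers, to illustrate the problem the solution runs into when the data changes from integers to tuples.**

I am trying to create a new column in Pandas whose value depends on a value found in a different row of a separate column; if a match is found, the new column takes the value of a third column from the matching row.

To illustrate, see the example below.

I am using a lambda function in df.apply() to do the following: for the first row, it finds the row whose value in column 'zero' equals the first row's value in column 'two', takes the value of column 'one' from that matching row, and copies it into the new column 'three'.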

df = pd.DataFrame([[(0,9),(1,9),(2,9),(3,9),(4,9)], ['a','b','c','d','e'], [(2,9),(3,9),(4,9),(5,9),(6,9)]]).transpose()

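Note that the values in column 'two' and in column 'zero' are unique, so the filter returns at most one row.

In theory, the result for column 'three' should be: 'c', 'd', 'e', 'nan', 'nan'.

Thanks.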

Just set column 'zero' as the index, which makes it easy to look up column 'one'.

Update: the solution now works with tuple indices.

import pandas as pd
import numpy as np

df = pd.DataFrame([[0,1,2,3,4],['a','b','c','d','e'],[2,3,4,5,6]]).transpose()
df.columns = ['zero','one','two']

# set index for quick lookup    
df_indexed = df.set_index("zero")

# the indexed dataset looks like this
df_indexed
Out[21]: 
     one two
zero        
0      a   2
1      b   3
2      c   4
3      d   5
4      e   6

# apply the mapping logic, taking df_indexed from outside the function
def f(el):
    return df_indexed.at[el, "one"] if el in df_indexed.index else np.nan

df["three"] = df["two"].apply(f)

print(df)
Out[18]: 
  zero one two three
0    0   a   2     c
1    1   b   3     d
2    2   c   4     e
3    3   d   5   NaN
4    4   e   6   NaN

# On the updated dataset
df
Out[71]: 
     zero one     two three
0  (0, 9)   a  (2, 9)     c
1  (1, 9)   b  (3, 9)     d
2  (2, 9)   c  (4, 9)     e
3  (3, 9)   d  (5, 9)   NaN
4  (4, 9)   e  (6, 9)   NaN
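
As a hedged alternative to the apply-based lookup above, the same cross-reference can be sketched with a plain Python dict built from columns 'zero' and 'one'. The helper name cross_reference and the variable lookup are hypothetical, introduced only for illustration; they are not part of the original answer. The sketch assumes the values in 'zero' are unique and hashable (ints or tuples).

```python
import numpy as np
import pandas as pd


def cross_reference(df, key_col="zero", value_col="one", ref_col="two"):
    """For each row, return value_col from the row whose key_col equals this row's ref_col.

    Hypothetical helper: assumes key_col values are unique and hashable.
    """
    # A plain dict treats a tuple key as one single label.
    lookup = dict(zip(df[key_col], df[value_col]))
    # Keys without a match become NaN instead of raising KeyError.
    return df[ref_col].map(lambda key: lookup.get(key, np.nan))


# Tuple-valued version of the example frame from the question.
df = pd.DataFrame({
    "zero": [(0, 9), (1, 9), (2, 9), (3, 9), (4, 9)],
    "one": ["a", "b", "c", "d", "e"],
    "two": [(2, 9), (3, 9), (4, 9), (5, 9), (6, 9)],
})
df["three"] = cross_reference(df)
print(df)
```

Because the lookup goes through an ordinary dict rather than a pandas indexer, it should not matter whether the keys are integers or tuples.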

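Bill, your answer works great here. One thing: my actual index is a tuple, and for some reason that throws it off. I think this has to do with what `el` is, while the items of `df_index.index` just report 'tuple' when I check them with `type(df_index.index[0])`. The KeyError looks like this: `KeyError: "None of [Index([(1, 1), (2, 1), (3, 1)], dtype='object', name='tenor')] are in the [index]"`, which makes me think I need to access the value of `el`?

I can't follow. To me, the problem only concerns columns 'zero', 'one' and 'two', not the index, whatever the index contains. Can you give an example? FYI, you can call `.reset_index()` first, which lets you reassign the index column without losing data.

I have edited the example df above so that columns 'zero' and 'two' hold tuples, which seems to throw the solution off...

I finally got it. `.at[]` should be used instead of `.loc[]`, because only a single value needs to be returned, whatever the index is. Overlooking this was really my fault. The solution should now work with tuple indices. :)

OK, this is great. Works perfectly. Thanks a lot for your help.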
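
To make the `.at[]` point concrete, here is a minimal sketch, assuming a small tuple-labelled frame modelled on the edited example (the variables key, found and value are illustrative only). `.at[]` does scalar access with exactly one row label and one column label, so a tuple is looked up as a single label, whereas the KeyError reported in this thread suggests `.loc[]` was interpreting the tuple as a collection of labels.

```python
import numpy as np
import pandas as pd

# Small tuple-keyed frame modelled on the edited example in the question.
df = pd.DataFrame({
    "zero": [(0, 9), (1, 9), (2, 9)],
    "one": ["a", "b", "c"],
})
df_indexed = df.set_index("zero")

key = (2, 9)
# The membership test checks the whole tuple as one index label.
found = key in df_indexed.index
# Scalar access: one row label (the tuple) and one column label.
value = df_indexed.at[key, "one"] if found else np.nan
print(found, value)
```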
