Python pandas: retrieve the index from one DataFrame to populate another df


python pandas


I have tried to find a solution to this problem, but without success.

I have a main df of transaction data, which includes the credit card name:

transactionId, amount, type,         person
1              -30     Visa          john
2              -100    Visa Premium  john
3              -12     Mastercard    jenny

I group by person and then aggregate the number of records and the amount:

person   numbTrans   Amount
john     2           -130
jenny    1           -12
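
For reference, the aggregation described above might look roughly like the following. This is a minimal sketch rather than the asker's actual code: it assumes pandas 0.25 or later (for named aggregation), and the variable name `summary` is purely illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "transactionId": [1, 2, 3],
    "amount": [-30, -100, -12],
    "type": ["Visa", "Visa Premium", "Mastercard"],
    "person": ["john", "john", "jenny"],
})

# Count transactions and sum amounts per person.
# sort=False keeps the people in order of first appearance.
summary = (
    df.groupby("person", sort=False)
      .agg(numbTrans=("transactionId", "count"), Amount=("amount", "sum"))
      .reset_index()
)
print(summary)
```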

That works fine, but I need to add the credit card type as a dimension to my df. I have grouped the credit cards in use into a separate df:

index    CreditCardName
0        Visa
1        Visa Premium
2        Mastercard

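What I can't work out is how to create a new column in the main DataFrame, called "CreditCard_id", that uses the string "Visa/Visa Premium/Mastercard" to look up the index and bring it into that column: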

transactionId, amount, type,         CreditCardId, person
1              -30     Visa          0             john
2              -100    Visa Premium  1             john
3              -12     Mastercard    2             jenny

I need this because I'm doing some simple k-means clustering and need integers rather than strings (or at least I think I do).

Thanks in advance,

Rob

If you set "CreditCardName" as the index of the second df, you can then call map:

In [80]:
# set up dummy data
import io
import pandas as pd

temp = """transactionId,amount,type,person
1,-30,Visa,john
2,-100,Visa Premium,john
3,-12,Mastercard,jenny"""

temp1 = """index,CreditCardName
0,Visa
1,Visa Premium
2,Mastercard"""
df = pd.read_csv(io.StringIO(temp))
# crucially, set the index column to be the credit card name
df1 = pd.read_csv(io.StringIO(temp1), index_col=[1])
df
Out[80]:
   transactionId  amount          type person
0              1     -30          Visa   john
1              2    -100  Visa Premium   john
2              3     -12    Mastercard  jenny
In [81]:

df1
Out[81]:
                index
CreditCardName       
Visa                0
Visa Premium        1
Mastercard          2

In [82]:
# now call map, passing the Series: each 'type' value is looked up in the Series index (the card name) and the matching 'index' value is returned for our new column
df['CreditCardId'] = df['type'].map(df1['index'])
df
Out[82]:
   transactionId  amount          type person  CreditCardId
0              1     -30          Visa   john             0
1              2    -100  Visa Premium   john             1
2              3     -12    Mastercard  jenny             2
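
If the lookup table is not needed for anything else, integer codes can also be generated directly from the string column. The sketch below uses `pd.factorize`, which is a different technique from the `map` approach above: codes are assigned in order of first appearance, so they match the lookup table only when the cards first appear in that same order. The extra fourth row is an assumption added to show a repeated value.

```python
import pandas as pd

# 'type' values from the question, plus one repeated value (illustrative)
df = pd.DataFrame({"type": ["Visa", "Visa Premium", "Mastercard", "Visa"]})

# factorize returns integer codes and the array of unique values
codes, uniques = pd.factorize(df["type"])
df["CreditCardId"] = codes
print(df)
print(list(uniques))
```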

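Thanks. Unfortunately, df = pd.read_csv(io.StringIO(temp)) fails with "TypeError: initial_value must be unicode or None, not str".

That is just my code for recreating the df; you shouldn't use it to create your df. In your code, you should use df.set_index('CreditCardName', inplace=True) to set the index on the second df to 'CreditCardName'.

Great, I was just trying to run your code to get a feel for it before moving on to my own code. Thanks. Works great!!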
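
The set_index suggestion above can be sketched for DataFrames that already exist in memory, without going through read_csv. This is a hedged illustration: the names `cards` and `lookup` are hypothetical, and it assumes the card DataFrame still has its default integer index, as in the question.

```python
import pandas as pd

# Main transaction data from the question
df = pd.DataFrame({
    "transactionId": [1, 2, 3],
    "amount": [-30, -100, -12],
    "type": ["Visa", "Visa Premium", "Mastercard"],
    "person": ["john", "john", "jenny"],
})

# Existing credit card DataFrame with a default 0..n-1 index (assumed)
cards = pd.DataFrame({"CreditCardName": ["Visa", "Visa Premium", "Mastercard"]})

# Move the integer index into a column called 'index', then index by card name
lookup = cards.reset_index().set_index("CreditCardName")["index"]

# Look up each transaction's card name to get its integer id
df["CreditCardId"] = df["type"].map(lookup)
print(df)
```

A card name that is missing from `cards` would come back as NaN, which also turns the new column into floats, so it may be worth checking for missing values before clustering.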