Python: convert a DataFrame to a dictionary with several keys (member IDs) and summed values (account balances)


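I have the following problem: my goal is to extract the unique member IDs of a DataFrame and use them as keys, and to extract and sum up the transactions belonging to each member and use those sums as values.

unique member ID = e.g. 708504419744905670928446

amount = 150.78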

card_members=df['unique_mem_id'].unique()

Edit: here is an excerpt of the DataFrame:

transaction_date    unique_mem_id   description amount
6/21/2014   7.08504E+22 HILLERS MARKET         NORTHVILLE   MI  61.72
6/22/2014   7.08504E+22 BUSCH'S #1032          PLYMOUTH     MI  25.48
6/23/2014   7.08504E+22 SPEEDWAY XXXXX 5 M     PLYMOUTH     MI  30.73
6/23/2014   7.08504E+22 HENDERSON GLASS INC    NOVI         MI  29.95
6/23/2014   7.08504E+22 HILLERS MARKET         NORTHVILLE   MI  59.6
6/23/2014   7.08504E+22 SPEEDWAY XXXXX 5 M     PLYMOUTH     MI  60.59
6/24/2014   7.08504E+22 BEACHWAY RESORT        SAUGATUCK    MI  1142.4
6/24/2014   7.08504E+22 PUMPERNICKELS EATERY   SAUGATUCK    MI  88.52
6/24/2014   7.08504E+22 DEMOND'S SUPER         DOUGLAS      MI  79.75
6/25/2014   7.08504E+22 DEMOND'S SUPER         DOUGLAS      MI  128.21
End of edit

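DatetimeIndex: 852 entries, 2014-06-21 to 2020-01-23
Data columns (total 4 columns):
 #   Column            Non-Null Count  Dtype
---  ------            --------------  -----
 0   transaction_date  852 non-null    datetime64[ns]
 1   unique_mem_id     852 non-null    object
 2   description       852 non-null    object
 3   amount            852 non-null    float64
dtypes: datetime64[ns](1), float64(1), object(2)
memory usage: 53.3+ KB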

This is the dictionary I tried to build:

#test for transactions
from collections import defaultdict
transaction_dict = defaultdict(list)

for row in df_card.items():
    try:
        key = card_members
        value = df_card.amount
    except ValueError:
        continue

    transaction_dict[key] += value

print(transaction_dict)
The error I get is: unhashable type: 'numpy.ndarray'

I also tried df_card.iterrows(), but that failed as well :(
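The error most likely comes from `key = card_members`: that name holds the whole NumPy array returned by `.unique()`, and arrays are mutable, so Python refuses to hash them as dictionary keys. A minimal sketch that reproduces this, using made-up IDs in place of the real data:

```python
import numpy as np
from collections import defaultdict

# Hypothetical member IDs standing in for df['unique_mem_id'].unique()
card_members = np.array(["id_a", "id_b"])

transaction_dict = defaultdict(float)

try:
    # Using the whole array as a key fails: ndarrays are not hashable
    transaction_dict[card_members] += 1.0
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# A single element of the array is a hashable scalar and works as a key
transaction_dict[card_members[0]] += 1.0
```

So the key has to be one member ID per row, not the array of all IDs.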

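dic = {}
# assumes a default integer index (0..n-1), because df.at looks rows up by label
for i in range(len(df)):
    key = df.at[i, 'unique_mem_id']
    if key in dic:
        dic[key] += df.at[i, 'amount']
    else:
        dic[key] = df.at[i, 'amount']

Try this!
First we create a dictionary. Then we iterate over every row of the DataFrame and check whether that row's unique_mem_id already exists as a key in the dictionary. If it does, we simply add the amount to that key's value; otherwise we create a new key in the dictionary.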

I think you can use df_card.iterrows as follows:

transaction_dict = {}
for i, row in df_card.iterrows():
  key = row['unique_mem_id']
  val = row['amount']
  transaction_dict[key] = transaction_dict.get(key,0) + val

Hope this helps!
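For comparison, the same per-member totals can usually be obtained without an explicit loop by grouping on the ID column. This is only a sketch on made-up sample rows; the column names follow the question, while the IDs and amounts are invented for illustration:

```python
import pandas as pd

# Made-up sample rows; only the column names come from the question
df_card = pd.DataFrame({
    "unique_mem_id": ["A", "A", "B"],
    "amount": [61.72, 25.48, 30.0],
})

# Sum the amounts per member, then turn the resulting Series into a dict
transaction_dict = df_card.groupby("unique_mem_id")["amount"].sum().to_dict()
print(transaction_dict)
```

On larger frames this tends to be noticeably faster than iterrows, since the summation happens inside pandas rather than row by row in Python.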

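To wrap this up and make it useful for others, here is my final solution: I used a workaround that is a by-product of the steps below, but it is similar to the solution Yosua posted.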

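  • Create a new column that labels each transaction as "expense" or "income", then loop with an if check until the first "income" is found and add the amounts up.
  • After adding the new column, iterate over the rows as tuples, stop when an "income" row is hit, and sum all the "expenses" that came before it.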

    Slicing the DataFrame

    df_1 = df_card[['unique_mem_id', 'amount', 'transaction_class']][df_card['unique_mem_id'] == '70850441974905670928446']
    
    Iterating over the tuples

    import numpy as np

    cumulative_amount = []
    amount_list = []
    for row in df_1.itertuples():
        # access data using column names
        if row.transaction_class == "expense":
            #print(index, row.unique_mem_id, row.amount, row.transaction_class)
            amount_list.append(row.amount)
            # running total of the expenses seen so far
            cumulative_amount = np.cumsum(amount_list, axis=0)
            #print(row.unique_mem_id, cumulative_amount)
        else:
            # stop at the first non-expense ("income") row
            #print(f"stopped at user_ID: {row.unique_mem_id}, cumulative sum injected {cumulative_amount[-1]}")
            break
    # print the member ID from the loop and the last element of the running total, which is the amount to be injected
    print(f"unique_member_ID: {row.unique_mem_id}; initial injection needed in USD: {cumulative_amount[-1]}")
    
    This prints the member ID together with the sum of its expenses.
    Hope these variants help :)
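The "sum expenses until the first income" idea above can also be sketched for all members at once, without slicing one ID at a time. The data below is invented, the `transaction_class` labels are assumed to be exactly "expense" and "income", and the rows are assumed to be in chronological order within each member:

```python
import pandas as pd

# Made-up sample data; column names follow the thread
df_card = pd.DataFrame({
    "unique_mem_id": ["A", "A", "A", "A", "B", "B"],
    "amount": [10.0, 20.0, 500.0, 40.0, 5.0, 7.0],
    "transaction_class": ["expense", "expense", "income", "expense",
                          "expense", "expense"],
})

# Count, per member, how many income rows have been seen up to each row
is_income = df_card["transaction_class"].eq("income").astype(int)
income_seen = is_income.groupby(df_card["unique_mem_id"]).cumsum()

# Keep only the rows that come before each member's first income row
before_first_income = df_card[income_seen.eq(0)]

# Sum those expenses per member
injection_needed = (
    before_first_income.groupby("unique_mem_id")["amount"].sum().to_dict()
)
print(injection_needed)
```

A member whose very first row is an income has no rows left after filtering and therefore does not appear in the resulting dictionary.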

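    Please provide some sample data. You seem to have only 3 columns, but each row of your data contains 6 attributes.
    I tried this and made a few small adjustments. Thanks!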