Python: What is the proper way to combine time series data with metadata in pandas?
I have two CSV files:
customers.csv:
id  name    birthday
1   Martin  28.04.1990
2   Twain   30.11.1835
....
and purchases.csv:
purchase_id  customer_id  item                            price
1            1            About the ugly German language  3.14
2            1            Food                            15.92
3            1            Book                            65.35
4            2            Stone                           89.79
I can load both files as DataFrames with
import pandas as pd
df_customers = pd.read_csv('customers.csv')
df_purchases = pd.read_csv('purchases.csv')
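But how do I combine the two so that I can easily answer questions such as:
- How many items did each customer buy?
- What is the average item price per customer?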
Use merge with a right join:
df = pd.merge(df_customers, df_purchases, left_on='id', right_on='customer_id', how='right')
print(df)
   id    name    birthday  purchase_id  customer_id  \
0   1  Martin  28.04.1990            1            1
1   1  Martin  28.04.1990            2            1
2   1  Martin  28.04.1990            3            1
3   2   Twain  30.11.1835            4            2

                             item  price
0  About the ugly German language   3.14
1                            Food  15.92
2                            Book  65.35
3                           Stone  89.79
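With the merged frame in hand, the two questions can probably be answered with a single groupby. The following is only a minimal sketch: it builds the sample rows inline instead of reading the CSV files, and the names summary, n_items and avg_price are made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question, built inline so the sketch
# runs without customers.csv / purchases.csv on disk.
df_customers = pd.DataFrame({
    "id": [1, 2],
    "name": ["Martin", "Twain"],
    "birthday": ["28.04.1990", "30.11.1835"],
})
df_purchases = pd.DataFrame({
    "purchase_id": [1, 2, 3, 4],
    "customer_id": [1, 1, 1, 2],
    "item": ["About the ugly German language", "Food", "Book", "Stone"],
    "price": [3.14, 15.92, 65.35, 89.79],
})

# Same right join as above: one row per purchase, with customer columns attached.
df = pd.merge(df_customers, df_purchases,
              left_on="id", right_on="customer_id", how="right")

# One row per customer: number of purchases and mean price.
# n_items and avg_price are hypothetical column names.
summary = (
    df.groupby(["customer_id", "name"])
      .agg(n_items=("purchase_id", "count"), avg_price=("price", "mean"))
      .reset_index()
)
print(summary)
```

Because of the right join, a purchase whose customer_id has no match in the customers table gets NaN in name, and groupby drops such keys by default. Grouping on customer_id alone would keep those purchases in the summary.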