Python: How do I combine different cells based on customer ID?
I have a transaction dataset that I want to transform by customer ID. A sample is given below:
CustomerID Description
17850 WHITE HANGING HEART T-LIGHT HOLDER
17850 WHITE METAL LANTERN
13047 ASSORTED COLOUR BIRD ORNAMENT
13047 POPPY'S PLAYHOUSE BEDROOM
13047 POPPY'S PLAYHOUSE KITCHEN
I want the dataset rearranged as follows:
17850 WHITE HANGING HEART T-LIGHT HOLDER, WHITE METAL LANTERN
13047 ASSORTED COLOUR BIRD ORNAMENT,POPPY'S PLAYHOUSE BEDROOM, POPPY'S PLAYHOUSE KITCHEN
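The dataset is in CSV format, with each value in a separate cell.
Can anyone suggest a way to do this in Excel, R, or Python?
In Python, you can use pandas. Install it, then try the following: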
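import pandas as pd
# Read the CSV file
df = pd.read_csv('yourFileName.csv')
# Group by CustomerID and join the Descriptions with commas;
# sort=False keeps customers in their order of first appearance
result = df.groupby('CustomerID', sort=False)['Description'].apply(','.join).reset_index()
# Save the grouped result (not the original df) to a CSV file
result.to_csv('resultFileName.csv', index=False)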
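To try the pandas approach without a file on disk, here is a minimal sketch that runs on the sample rows from the question. The in-memory CSV text, the helper name combine_descriptions, and the ", " separator are assumptions made for illustration; they are not part of the original answer.

```python
import io

import pandas as pd

# Sample rows from the question as CSV text
# (assumed layout: one header row, two comma-separated columns).
SAMPLE_CSV = """CustomerID,Description
17850,WHITE HANGING HEART T-LIGHT HOLDER
17850,WHITE METAL LANTERN
13047,ASSORTED COLOUR BIRD ORNAMENT
13047,POPPY'S PLAYHOUSE BEDROOM
13047,POPPY'S PLAYHOUSE KITCHEN
"""


def combine_descriptions(df, sep=", "):
    """Return one row per CustomerID with all Descriptions joined by sep."""
    return (
        df.groupby("CustomerID", sort=False)["Description"]  # keep first-seen order
        .agg(sep.join)   # join the strings of each group
        .reset_index()   # turn CustomerID back into a regular column
    )


df = pd.read_csv(io.StringIO(SAMPLE_CSV))
result = combine_descriptions(df)
print(result.to_csv(index=False))
```

If some Description cells are empty, pandas reads them as NaN and the join fails, so such rows would need to be dropped or filled first.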
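In R, you can use the aggregate() function. I created my own data here, but you can do the same with the data frame above. The Texts are concatenated by Customer number: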
> df <- data.frame(Customer = c(1,1,2,3,3,4), Texts = c("AAA","aaa","BBB","bbb","CCC","ccc"))
> df
Customer Texts
1 1 AAA
2 1 aaa
3 2 BBB
4 3 bbb
5 3 CCC
6 4 ccc
> aggregate(Texts~Customer,toString,data=df)
Customer Texts
1 1 AAA, aaa
2 2 BBB
3 3 bbb, CCC
4 4 ccc
Other approaches use plyr or data.table. data.table is probably the more efficient and simpler of the two, and it offers more control.
library(plyr)
ddply(df, .(ID), summarize, Text = paste(Text, collapse = ","))
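or, with data.table:
library(data.table)  # the package is named data.table, not DT
DT <- data.table(df)
# Group the table by ID and build a new Text column by pasting together
# the values in each group.
DT[, list(Text = paste(Text, collapse = ",")), by = ID]
ID Text
1: 17850 WHITE HANGING HEART T-LIGHT HOLDER,WHITE METAL LANTERN
2: 13047 ASSORTED COLOUR BIRD ORNAMENT,POPPY'S PLAYHOUSE BEDROOM, POPPY'S PLAYHOUSE KITCHEN
Comments:
For code and data, please indent by four spaces so that it stands out. I have done that for you here, so you can see the difference.
Thank you, sir, for formatting the question. Can you suggest a simple way to get the data into the format I want? There are more than ten thousand data points.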
The df used in the plyr and data.table examples above was created as follows:
df <- data.frame(ID = c(17850,17850,13047,13047,13047),
                 Text = c("WHITE HANGING HEART T-LIGHT HOLDER","WHITE METAL LANTERN",
                          " ASSORTED COLOUR BIRD ORNAMENT","POPPY'S PLAYHOUSE BEDROOM",
                          " POPPY'S PLAYHOUSE KITCHEN"))
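As a dependency-free alternative to the answers above, the same grouping can be sketched with only the Python standard library. The function name combine_by_customer, the in-memory sample, and the choice to strip surrounding whitespace are assumptions for illustration. For a real file, the io.StringIO object would be replaced by an opened CSV file.

```python
import csv
import io


def combine_by_customer(rows, sep=", "):
    """Group Description values by CustomerID, keeping first-seen order."""
    grouped = {}  # dicts preserve insertion order in Python 3.7+
    for row in rows:
        # Strip stray whitespace so the joined text is spaced consistently.
        grouped.setdefault(row["CustomerID"], []).append(row["Description"].strip())
    return {customer: sep.join(items) for customer, items in grouped.items()}


# Sample rows from the question (assumed header names: CustomerID, Description).
sample = io.StringIO(
    "CustomerID,Description\n"
    "17850,WHITE HANGING HEART T-LIGHT HOLDER\n"
    "17850,WHITE METAL LANTERN\n"
    "13047,ASSORTED COLOUR BIRD ORNAMENT\n"
    "13047,POPPY'S PLAYHOUSE BEDROOM\n"
    "13047,POPPY'S PLAYHOUSE KITCHEN\n"
)

combined = combine_by_customer(csv.DictReader(sample))
for customer, text in combined.items():
    print(customer, text)
```

Because the rows are consumed one at a time, this should cope with the ten thousand or so rows mentioned in the comments without loading anything beyond the grouped strings.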