R: reshape aggregated rows into new columns, categorical data
I am trying to use R to aggregate rows into columns. Here is a sample of my dataset:
age sex hash emotion color
22 1 b17f9762462b37e7510f0e6d2534530d Lonely #006666
22 1 b17f9762462b37e7510f0e6d2534530d Energetic #66CC00
22 1 b17f9762462b37e7510f0e6d2534530d Calm #FFFFFF
22 1 b17f9762462b37e7510f0e6d2534530d Angry #FF0000
24 1 7bb50ca97a9b517239b39440a966d2f6 Calm #006666
24 1 7bb50ca97a9b517239b39440a966d2f6 Excited #0033cc
24 1 7bb50ca97a9b517239b39440a966d2f6 Empty/void #999999
24 1 7bb50ca97a9b517239b39440a966d2f6 No emotion #FF6600
26 1 209f1ba8ef86e855deccc0aae120825c Comfortable #330066
21 1 b9e9309c0b1255a7efb2edf9ba66ae46 Energetic #330099
21 1 b9e9309c0b1255a7efb2edf9ba66ae46 Happy #330066
26 1 209f1ba8ef86e855deccc0aae120825c No emotion #FFCC00
26 1 209f1ba8ef86e855deccc0aae120825c Calm #006666
21 1 61debd3dea6d1aacce5c9fc7daec4fe5 Empty/void #FFFFFF
21 1 b9e9309c0b1255a7efb2edf9ba66ae46 Calm #006666
26 1 209f1ba8ef86e855deccc0aae120825c No emotion #339900
21 1 61debd3dea6d1aacce5c9fc7daec4fe5 Loved #FF6600
26 1 209f1ba8ef86e855deccc0aae120825c No emotion #66CC00
What I would like to get is this:
age sex hash #000000 #FF0000 ... #FFFFFF
22 1 8798tkojstwz9ei sad happy ... loved
...
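A response is defined by its hash; the associated data are age and sex.
I want each response to be a single row rather than several. Each color should have its own column, with the associated emotion as that column's value.
The full dataset has 13 colors, 20+ emotions and 1000+ responses. It looks exactly like the sample and is stored in a MySQL database.
I tried reshape, but either it does not handle categorical data well or I am not using the appropriate function. Any ideas? The solution can include some MySQL preparation if needed. Java is very slow here since I have 12k+ rows, so R sounds like the right fit for this.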
Thanks.
If I understand your goal correctly, reshape() is indeed the function you need. Assuming your dataset is named mydf, try the following:
reshape(mydf, direction = "wide",
idvar = c("hash", "age", "sex"),
timevar = "color")
# age sex hash emotion.#006666 emotion.#66CC00
# 1 22 1 b17f9762462b37e7510f0e6d2534530d Lonely Energetic
# 5 24 1 7bb50ca97a9b517239b39440a966d2f6 Calm <NA>
# 9 26 1 209f1ba8ef86e855deccc0aae120825c Calm No emotion
# 10 21 1 b9e9309c0b1255a7efb2edf9ba66ae46 Calm <NA>
# 14 21 1 61debd3dea6d1aacce5c9fc7daec4fe5 <NA> <NA>
# emotion.#FFFFFF emotion.#FF0000 emotion.#0033cc emotion.#999999 emotion.#FF6600
# 1 Calm Angry <NA> <NA> <NA>
# 5 <NA> <NA> Excited Empty/void No emotion
# 9 <NA> <NA> <NA> <NA> <NA>
# 10 <NA> <NA> <NA> <NA> <NA>
# 14 Empty/void <NA> <NA> <NA> Loved
# emotion.#330066 emotion.#330099 emotion.#FFCC00 emotion.#339900
# 1 <NA> <NA> <NA> <NA>
# 5 <NA> <NA> <NA> <NA>
# 9 Comfortable <NA> No emotion No emotion
# 10 Happy Energetic <NA> <NA>
# 14 <NA> <NA> <NA> <NA>
If needed, you can rename these columns afterwards.
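As a minimal sketch of that renaming step, the snippet below rebuilds a few rows of the sample (the reduced row set is an assumption made for illustration), runs the same reshape() call, and strips the "emotion." prefix with sub() so that each column is named by its color alone.

```r
# A few rows from the sample data; stringsAsFactors = FALSE keeps text as character
mydf <- data.frame(
  age = c(22, 22, 24, 24),
  sex = c(1, 1, 1, 1),
  hash = c("b17f9762462b37e7510f0e6d2534530d",
           "b17f9762462b37e7510f0e6d2534530d",
           "7bb50ca97a9b517239b39440a966d2f6",
           "7bb50ca97a9b517239b39440a966d2f6"),
  emotion = c("Lonely", "Energetic", "Calm", "Excited"),
  color = c("#006666", "#66CC00", "#006666", "#0033cc"),
  stringsAsFactors = FALSE
)

# Same call as in the answer: one row per hash/age/sex, one column per color
wide <- reshape(mydf, direction = "wide",
                idvar = c("hash", "age", "sex"),
                timevar = "color")

# Drop the "emotion." prefix that reshape() adds to every value column
names(wide) <- sub("^emotion\\.", "", names(wide))
names(wide)
```

Because the new names start with "#", they are not syntactic R names, so columns have to be accessed as wide[["#006666"]] rather than with $ and a bare name.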
Using reshape2:
library(reshape2)
dcast(dat, ... ~ color, value.var = 'emotion')
age sex hash #0033cc #006666 #330066 #330099 #339900 #66CC00 #999999 #FF0000 #FF6600
1 21 1 61debd3dea6d1aacce5c9fc7daec4fe5 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> Loved
2 21 1 b9e9309c0b1255a7efb2edf9ba66ae46 <NA> Calm Happy Energetic <NA> <NA> <NA> <NA> <NA>
3 22 1 b17f9762462b37e7510f0e6d2534530d <NA> Lonely <NA> <NA> <NA> Energetic <NA> Angry <NA>
4 24 1 7bb50ca97a9b517239b39440a966d2f6 Excited Calm <NA> <NA> <NA> <NA> Empty <NA> Noemotion
5 26 1 209f1ba8ef86e855deccc0aae120825c <NA> Calm Comfortable <NA> Noemotion Noemotion <NA> <NA> <NA>
#FFCC00 #FFFFFF
1 <NA> Empty
2 <NA> <NA>
3 <NA> Calm
4 <NA> <NA>
5 Noemotion <NA>
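One thing that may be worth checking on the full dataset before casting: if the same hash and color combination occurs more than once, dcast() without an explicit fun.aggregate falls back to counting rows (it defaults to length) instead of returning the emotion text. A minimal base R sketch of that check follows; the data frame dat here is a hypothetical cut-down version of the sample, and the duplicated row is made up for illustration.

```r
# Hypothetical data: the third row repeats the (hash, color) pair of the first
dat <- data.frame(
  hash = c("b17f9762462b37e7510f0e6d2534530d",
           "b17f9762462b37e7510f0e6d2534530d",
           "b17f9762462b37e7510f0e6d2534530d"),
  emotion = c("Lonely", "Energetic", "Calm"),
  color = c("#006666", "#66CC00", "#006666"),
  stringsAsFactors = FALSE
)

# TRUE for every row whose (hash, color) pair appears more than once
key <- dat[, c("hash", "color")]
is_dup <- duplicated(key) | duplicated(key, fromLast = TRUE)

# Rows that would collide in the same cell of the wide table
dups <- dat[is_dup, ]
nrow(dups)
```

If this returns any rows, it is probably best to decide how to resolve them (keep the first, paste the values together, and so on) before reshaping.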
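I don't think aggregate is the right title (or tag) for this question. Could you consider renaming the question to something that describes more precisely where your problem lies?
Sure, what do you suggest?
Not sure! It depends entirely on where you got stuck, since others may get stuck in the same place.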