Combinations of string arrays in R
I'm just starting to learn R, and even after searching many forums on this topic I couldn't find a good answer. Maybe I'm not searching with the right terms, or maybe this isn't possible in R, so please excuse my ignorance. I would like to know how many times two professionals took part in a given project together. Beyond that, I'd like to plot the positions they held when they worked together. I'm not using any specific notation below. For example, suppose I have the following string arrays:
Project1: Bob (President), Joe (Vice President), Mary (Participant), Paul (Participant)
Project2: Bob (President), Joe (Vice President), Sue (Participant), Bill (Participant)
Project3: Paul (President), Sue (Vice President), Bob (Participant), Joe (Participant)
Project'n: (...)
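The output would be:
Bob (President) and Joe (Vice President) = 2
Bob (President) and Mary (Participant) = 1
Bob (President) and Paul (Participant) = 1
Bob (Participant) and Paul (President) = 1
Sue (Vice President) and Joe (Participant) = 1
It goes on like that, and I assume these results could be aggregated in a histogram. I have 86 names, involved in 38 different projects, in 3 different positions.
Is this possible in R, and if so, any ideas on how to implement it? Is there any code template or documentation I could use to get to this answer?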
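##My attempt (start)
Now, for example, in project "P1" we can see Paul as President and Bob as Vice President. The same happens in project "P2". In "P3" we have Paul as President, with Sue and Bill both as Participants.
My question now is how to count the number of times a given relationship occurs across all projects. For example: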
Paul/President & Bob/Vice = 2 occurrences,
Paul/President & Sue/Participant = 1 occurrence,
Paul/President & Bill/Participant = 1 occurrence, etc
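Basically, a "history" based on specific person/role combinations.
##My attempt (end)
Now that you have Table, you can use apply to count the occurrences of the different types of relationships over different groups of axes: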
How many participants of each type does each project have?
> apply(Table, c(2,3), sum)
     Role
Group Participant President Vice President
   P1           0         1              1
   P2           0         1              1
   P3           2         1              0
How many times does each person/role combination occur?
> apply(Table, c(1,3), sum)
      Role
Name   Participant President Vice President
  Bill           1         0              0
  Bob            0         0              2
  Paul           0         3              0
  Sue            1         0              0
Which projects does each person work on?
> apply(Table, c(1,2), sum)
      Group
Name   P1 P2 P3
  Bill  0  0  1
  Bob   1  1  0
  Paul  1  1  1
  Sue   0  0  1
How many projects does each person work on?
> apply(Table, 1, sum)
Bill  Bob Paul  Sue
   1    2    3    1
How many people are involved in each project?
> apply(Table, 2, sum)
P1 P2 P3
 2  2  3
How many people hold each role?
> apply(Table, 3, sum)
   Participant      President Vice President
             2              3              2
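The construction of Table is not shown above. A minimal sketch of how it might be built, assuming it is a three-way contingency table created with `table()` from a long-format data frame; the data frame name `Membership` is hypothetical, and its rows simply retype the P1 to P3 example described in the question:

```r
# Hypothetical long-format data: one row per person per project
Membership <- data.frame(
  Name  = c("Paul", "Bob", "Paul", "Bob", "Paul", "Sue", "Bill"),
  Group = c("P1", "P1", "P2", "P2", "P3", "P3", "P3"),
  Role  = c("President", "Vice President", "President", "Vice President",
            "President", "Participant", "Participant"),
  stringsAsFactors = FALSE
)

# Three-way contingency table: dimension 1 = Name, 2 = Group, 3 = Role
Table <- table(Membership)

# Summing over the dimensions that are NOT listed in the margin argument
by_group_role       <- apply(Table, c(2, 3), sum)  # Group x Role counts
projects_per_person <- apply(Table, 1, sum)        # one total per Name

print(by_group_role)
print(projects_per_person)
```

For a table object like this, `margin.table(Table, 1)` should give the same totals as `apply(Table, 1, sum)`.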
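Thanks for the tip, @Scottrichie. After some additional reading and testing, I arrived at the following. I imported a csv file with columns for name, project and role. I also added another column at the end as a counter (a constant value of 1 from top to bottom). This is what I did: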
Groupings <- read.csv("~/Documents/TCC_BIGDATA/Test.csv", sep=";")
Groupings$Counter <- as.integer(Groupings$Counter)
print(Groupings)
   Project Name           Role Counter
1       P1 Paul      President       1
2       P1  Bob Vice President       1
3       P1  Sue    Participant       1
4       P1 Bill    Participant       1
5       P2 Paul Vice President       1
6       P2  Bob    Participant       1
7       P2 Bill      President       1
8       P3  Bob      President       1
9       P3 Bill Vice President       1
10      P3  Sue    Participant       1
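Because Test.csv is a local file, anyone wanting to reproduce these steps could rebuild the same data frame inline. This is only a sketch that retypes the printed rows above; nothing about the original file beyond those rows is assumed:

```r
# Rebuild the printed data frame without the CSV file
Groupings <- data.frame(
  Project = c("P1", "P1", "P1", "P1", "P2", "P2", "P2", "P3", "P3", "P3"),
  Name    = c("Paul", "Bob", "Sue", "Bill", "Paul", "Bob", "Bill",
              "Bob", "Bill", "Sue"),
  Role    = c("President", "Vice President", "Participant", "Participant",
              "Vice President", "Participant", "President",
              "President", "Vice President", "Participant"),
  Counter = 1L,  # constant counter column, recycled to every row
  stringsAsFactors = FALSE
)

# The Counter column is a convenience for aggregate(); table() counts
# rows directly and does not need it
name_counts <- table(Groupings$Name)
print(name_counts)
```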
How many times each name, and each name + role combination, appears in the list:
aggregate(Counter ~ Name, data = Groupings, sum)
  Name Counter
1 Bill       3
2  Bob       3
3 Paul       2
4  Sue       2
aggregate(Counter ~ Name + Role, data = Groupings, sum)
  Name           Role Counter
1 Bill    Participant       1
2  Bob    Participant       1
3  Sue    Participant       2
4 Bill      President       1
5  Bob      President       1
6 Paul      President       1
7 Bill Vice President       1
8  Bob Vice President       1
9 Paul Vice President       1
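The counts above are per person or per person/role, not per pair of people. One possible way to get the pairwise counts asked for in the question is a self-join on the project column with `merge()`. This is a sketch rather than a method from the thread: the data frame `Membership` retypes the P1 to P3 example from the question, and all object and column names are hypothetical.

```r
# Hypothetical data: Paul/Bob lead P1 and P2; Paul, Sue, Bill are in P3
Membership <- data.frame(
  Project = c("P1", "P1", "P2", "P2", "P3", "P3", "P3"),
  Name    = c("Paul", "Bob", "Paul", "Bob", "Paul", "Sue", "Bill"),
  Role    = c("President", "Vice President", "President", "Vice President",
              "President", "Participant", "Participant"),
  stringsAsFactors = FALSE
)

# Label each row as "Name (Role)"
Membership$Label <- paste0(Membership$Name, " (", Membership$Role, ")")

# Self-join: every combination of two rows that share a project
Pairs <- merge(Membership, Membership, by = "Project",
               suffixes = c(".a", ".b"))

# Keep each unordered pair of different people exactly once
Pairs <- Pairs[Pairs$Name.a < Pairs$Name.b, ]

# Count in how many projects each labelled pair occurs
Pairs$Pair <- paste(Pairs$Label.a, "&", Pairs$Label.b)
PairCounts <- as.data.frame(table(Pair = Pairs$Pair),
                            stringsAsFactors = FALSE)
PairCounts <- PairCounts[order(-PairCounts$Freq), ]
print(PairCounts)
# "Bob (Vice President) & Paul (President)" has Freq 2; the rest have 1
```

The result could then be charted with something like `barplot(PairCounts$Freq, names.arg = PairCounts$Pair, las = 2)`. With 86 names the self-join grows quickly within large projects, so this is meant for data of roughly the size described in the question.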
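Other exercises and combinations are possible as well. In the end, this is just another approach alongside the one you (@ScottRitchie) built to answer my question. I thought it would be a good idea to share it so that others can apply it.
Yes, this is possible. What have you tried so far?
Hi @Scottrichie, thanks for your reply. I've edited the question with what I tried.
Thanks @Scottrichie. This helps a lot with several of the topics I need to evaluate in this dataset. I did something here that also helps with my evaluation. Since the comment box has a character limit, I'll answer my own question with what I did a few minutes ago.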