R: Gathering duplicated columns

Sample data:

    df1 <- structure(list(Name = structure(c(3L, 2L, 1L), .Label = c("Bob", 
"Joe", "Mike"), class = "factor"), Location = structure(c(1L, 
1L, 2L), .Label = c("CA", "WA"), class = "factor"), Title = structure(c(2L, 
3L, 1L), .Label = c("CEO", "Manager", "VP"), class = "factor"), 
    Class = structure(c(1L, 2L, 2L), .Label = c("Class1", "Class2"
    ), class = "factor"), Month = c(1, 2, 3), Class.1 = structure(c(3L, 
    2L, 1L), .Label = c("Class1", "Class2", "Class4"), class = "factor"), 
    Month.1 = c(3, 3, 2), Objective = structure(1:3, .Label = c("Obj1", 
    "Obj2", "Obj3"), class = "factor"), Month.2 = c(2, 7, 7), 
    Category = c("x", "y", "z"), Objective.1 = structure(c(3L, 
    2L, 1L), .Label = c("Obj1", "Obj7", "Obj9"), class = "factor"), 
    Month.3 = c(4, 5, 5), Category2 = c("z", "r", "q")), .Names = c("Name", 
"Location", "Title", "Class", "Month", "Class.1", "Month.1", 
"Objective", "Month.2", "Category", "Objective.1", "Month.3", 
"Category2"), class = "data.frame", row.names = c(NA, -3L))

  Name Location   Title  Class Month Class.1 Month.1 Objective Month.2 Category Objective.1 Month.3 Category2
1 Mike       CA Manager Class1     1  Class4       3      Obj1       2        x        Obj9       4         z
2  Joe       CA      VP Class2     2  Class2       3      Obj2       7        y        Obj7       5         r
3  Bob       WA     CEO Class2     3  Class1       2      Obj3       7        z        Obj1       5         q
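
I have tried some similar examples from Stack Overflow using gather, spread, etc., but I can't figure out how to keep the Class-Month groups and the Objective-Month groups together.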

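In my real data set there are 100 columns and 8 ID columns. Rather than just Class-Month or Objective-Month pairs, the first half of the columns come in groups of four and the second half in groups of eight. An example of a group of four is Class, Month, Cost, Date.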

Example output for Mike:

  Name Location   Title Variable Value Value.2
1 Mike       CA Manager   Class1     1    <NA>
2 Mike       CA Manager   Class4     3    <NA>
3 Mike       CA Manager     Obj1     2       x
4 Mike       CA Manager     Obj9     4       z

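The duplicates are fine, but you need to specify which columns are combined together (in the example, "Class" and "Objective") to get the OP's output: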

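library(data.table)
# setDT() converts df1 to a data.table by reference.
# Each pattern defines one value column; the columns it matches are stacked in order.
melt(setDT(df1),
  measure.vars = patterns("Class|Objective", "Month", "Category")
)[order(Name)]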

    Name Location   Title variable value1 value2 value3
 1:  Bob       WA     CEO        1 Class2      3      z
 2:  Bob       WA     CEO        2 Class1      2      q
 3:  Bob       WA     CEO        3   Obj3      7     NA
 4:  Bob       WA     CEO        4   Obj1      5     NA
 5:  Joe       CA      VP        1 Class2      2      y
 6:  Joe       CA      VP        2 Class2      3      r
 7:  Joe       CA      VP        3   Obj2      7     NA
 8:  Joe       CA      VP        4   Obj7      5     NA
 9: Mike       CA Manager        1 Class1      1      x
10: Mike       CA Manager        2 Class4      3      z
11: Mike       CA Manager        3   Obj1      2     NA
12: Mike       CA Manager        4   Obj9      4     NA
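
It does not matter whether the column names are exact duplicates or have been disambiguated with check.names=TRUE, because patterns only matches a pattern within the names. See ?regex for more on how to specify patterns if needed.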
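
To see which columns each pattern picks up, a rough base-R check with grep on the column names can help. This is only a sketch of the matching idea, not how patterns is implemented internally; the vector nms simply repeats the column names from the sample data.

```r
# Column names of the sample data (after check.names disambiguation)
nms <- c("Name", "Location", "Title", "Class", "Month", "Class.1", "Month.1",
         "Objective", "Month.2", "Category", "Objective.1", "Month.3", "Category2")

# One regular expression per output value column
pats <- c("Class|Objective", "Month", "Category")

# For each pattern, list the column names it matches
matched <- lapply(pats, grep, x = nms, value = TRUE)

matched[[1]]  # "Class" "Class.1" "Objective" "Objective.1"
matched[[2]]  # "Month" "Month.1" "Month.2" "Month.3"
matched[[3]]  # "Category" "Category2"
```

The third pattern matches only two columns while the others match four, which is why value3 is NA for variable 3 and 4 in the output above.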


Other arguments to melt (see ?melt.data.table) can be used to give the columns in the result custom names (instead of "value1", "value2", ...).
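
As a minimal sketch of that, the value.name and variable.name arguments of melt can carry the names from the question's expected output. The data frame below is a simplified rebuild of the sample data using character columns instead of factors, and the name "Group" for the index column is just an illustrative choice.

```r
library(data.table)

# Simplified copy of the sample data (character columns instead of factors)
df1 <- data.frame(
  Name = c("Mike", "Joe", "Bob"), Location = c("CA", "CA", "WA"),
  Title = c("Manager", "VP", "CEO"),
  Class = c("Class1", "Class2", "Class2"), Month = c(1, 2, 3),
  Class.1 = c("Class4", "Class2", "Class1"), Month.1 = c(3, 3, 2),
  Objective = c("Obj1", "Obj2", "Obj3"), Month.2 = c(2, 7, 7),
  Category = c("x", "y", "z"),
  Objective.1 = c("Obj9", "Obj7", "Obj1"), Month.3 = c(4, 5, 5),
  Category2 = c("z", "r", "q"),
  stringsAsFactors = FALSE
)

long <- melt(
  as.data.table(df1),                       # copy, leaves df1 untouched
  id.vars = c("Name", "Location", "Title"),
  measure.vars = patterns("Class|Objective", "Month", "Category"),
  variable.name = "Group",                  # hypothetical name for the index column
  value.name = c("Variable", "Value", "Value.2")
)[order(Name)]

names(long)
# "Name" "Location" "Title" "Group" "Variable" "Value" "Value.2"
```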

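Right, so for Mike you would have 4 rows, with Variable = c(Class1, Class4, Obj1, Obj9) and Value = c(1, 3, 2, 4).

Could you add some example rows to the expected output?

Do you have control over the non-unique column naming, or is that the only reason for this question?

@r2evans I have added example output for Mike. I can certainly set check.names=TRUE when reading the data, but I don't know whether that helps.

If you use check.names=TRUE you will have Class.1, Class.2, etc., but they still need to end up in one column.

Does each "group" of columns always start with a column such as Objective?

Awesome, that does it. It takes a bit of work to clean it up with the full 100 columns, but it keeps everything together correctly. Thanks!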
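
On the cleanup: in the single-melt output, Category and Category2 end up next to the Class rows, whereas the expected output for Mike has them next to the Objective rows. One possible way to line them up, sketched here under the assumption that the Class and Objective groups can be told apart by anchored patterns, is to melt each group separately and stack the results with rbindlist(fill = TRUE). The data frame is again a simplified character-column rebuild of the sample data, and the specific regular expressions are assumptions tied to these column names.

```r
library(data.table)

# Simplified copy of the sample data (character columns instead of factors)
df1 <- data.frame(
  Name = c("Mike", "Joe", "Bob"), Location = c("CA", "CA", "WA"),
  Title = c("Manager", "VP", "CEO"),
  Class = c("Class1", "Class2", "Class2"), Month = c(1, 2, 3),
  Class.1 = c("Class4", "Class2", "Class1"), Month.1 = c(3, 3, 2),
  Objective = c("Obj1", "Obj2", "Obj3"), Month.2 = c(2, 7, 7),
  Category = c("x", "y", "z"),
  Objective.1 = c("Obj9", "Obj7", "Obj1"), Month.3 = c(4, 5, 5),
  Category2 = c("z", "r", "q"),
  stringsAsFactors = FALSE
)

dt  <- as.data.table(df1)
ids <- c("Name", "Location", "Title")

# Class groups: Class/Month and Class.1/Month.1
classes <- melt(dt, id.vars = ids,
  measure.vars = patterns("^Class", "^Month(\\.1)?$"),
  value.name = c("Variable", "Value"))

# Objective groups: Objective/Month.2/Category and Objective.1/Month.3/Category2
objectives <- melt(dt, id.vars = ids,
  measure.vars = patterns("^Objective", "^Month\\.[23]$", "^Category"),
  value.name = c("Variable", "Value", "Value.2"))

# Stack both results; Value.2 is filled with NA for the Class rows
res <- rbindlist(list(classes, objectives), fill = TRUE)[order(Name)]
res[, variable := NULL]   # drop the within-group index column

res[Name == "Mike"]
```

With the real data, each group of four or eight columns would need its own set of patterns, so this is meant as a starting point rather than a finished solution.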