In R, how do I split/subset a data frame by factors in multiple columns?


My data looks like this:

ID   Test Type       Subject  Marks
1    Unit test 1     English   85
2    Unit test 1     English   75
3    Unit test 1     English   78
1    Unit test 2     English   85
2    Unit test 2     English   75
3    Unit test 2     English   78
1    Unit test 1     Maths     78
2    Unit test 1     Maths     79
3    Unit test 1     Maths     98
1    Unit test 2     Maths     95
2    Unit test 2     Maths     98
3    Unit test 2     Maths     88

I want to split the data by "Test Type" and "Subject". Which function should I use? The result I expect is:

data frame 1:
    ID   Test Type       Subject  Marks
    1    Unit test 1     English   85
    2    Unit test 1     English   75
    3    Unit test 1     English   78

data frame 2:
    ID   Test Type       Subject  Marks
    1    Unit test 2     English   85
    2    Unit test 2     English   75
    3    Unit test 2     English   78

data frame 3:
    ID   Test Type       Subject  Marks
    1    Unit test 1     Maths     78
    2    Unit test 1     Maths     79
    3    Unit test 1     Maths     98

data frame 4:
    ID   Test Type       Subject  Marks
    1    Unit test 2     Maths     95
    2    Unit test 2     Maths     98
    3    Unit test 2     Maths     88

You can use split() (thanks to DrDom for the improvement), where df is the original data:

df <- structure(list(ID = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L), Test.Type = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 
1L, 1L, 2L, 2L, 2L), .Label = c("Unit test 1", "Unit test 2"), class = "factor"),
    Subject = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
    2L, 2L, 2L), .Label = c("English", "Maths"), class = "factor"), 
    Marks = c(85L, 75L, 78L, 85L, 75L, 78L, 78L, 79L, 98L, 95L, 
    98L, 88L)), .Names = c("ID", "Test.Type", "Subject", "Marks"
), class = "data.frame", row.names = c(NA, -12L))
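As a minimal sketch of the split() approach described above: passing a list of two grouping vectors as the second argument makes split() cut the data frame on every combination of them. The df below is rebuilt with data.frame() so the block runs on its own, and the name parts is only illustrative.

```r
# Rebuild the example data so this block is self-contained.
df <- data.frame(
  ID        = rep(1:3, times = 4),
  Test.Type = rep(rep(c("Unit test 1", "Unit test 2"), each = 3), times = 2),
  Subject   = rep(c("English", "Maths"), each = 6),
  Marks     = c(85, 75, 78, 85, 75, 78, 78, 79, 98, 95, 98, 88),
  stringsAsFactors = TRUE
)

# A list of grouping vectors splits on the interaction of both columns.
parts <- split(df, list(df$Test.Type, df$Subject))

# Element names join the two levels with "." by default.
names(parts)

# Pull out one piece by name.
parts[["Unit test 2.Maths"]]
```

The result is a named list of four data frames with three rows each. If some combinations never occur in the data, split(..., drop = TRUE) leaves out the empty pieces.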

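The following code computes the average mark for each test type/subject combination:
# assumes df has columns named testtype, subject and marks;
# with the df defined above these would be Test.Type, Subject and Marks

> library(plyr)
> ddply(df, .(testtype, subject), summarize, avgmark = round(mean(marks), 0))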

Result:
     testtype subject avgmark
1 Unit Test 1 English      79
2 Unit Test 1   Maths      85
3 Unit Test 2 English      79
4 Unit Test 2   Maths      94

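The ddply function computes avgmark for each group and returns the result as a data frame. You can replace the mean() used for avgmark with any aggregate function you need. You can also add more aggregates after avgmark. For more information, see this article.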
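To illustrate adding a second aggregate after avgmark, here is a small sketch that assumes the plyr package is installed. It uses the column names of the df from the first answer (Test.Type, Subject, Marks) rather than the lowercase names in the call above; maxmark and res are names chosen only for this example.

```r
library(plyr)

# Rebuild the example data so this block is self-contained.
df <- data.frame(
  ID        = rep(1:3, times = 4),
  Test.Type = rep(rep(c("Unit test 1", "Unit test 2"), each = 3), times = 2),
  Subject   = rep(c("English", "Maths"), each = 6),
  Marks     = c(85, 75, 78, 85, 75, 78, 78, 79, 98, 95, 98, 88),
  stringsAsFactors = TRUE
)

# One output row per Test.Type/Subject group, with two summary columns.
res <- ddply(df, .(Test.Type, Subject), summarize,
             avgmark = round(mean(Marks), 0),
             maxmark = max(Marks))
res
```

Each extra name = expression pair after summarize becomes one more column in the returned data frame.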
Another simple solution is to use by:
list.df <- by(df, INDICES =  list(df$Test.Type, df$Subject), FUN = data.frame)

You can then access each individual data frame with list.df[[1]] through list.df[[4]].
(Thanks to Richard Scriven for the input data in his answer.)
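The following sketch shows how the pieces returned by by() might be accessed. It rebuilds df so it runs on its own. The positional order assumes the first index (Test.Type) varies fastest, which matches the printed output of list.df in this thread.

```r
# Rebuild the example data so this block is self-contained.
df <- data.frame(
  ID        = rep(1:3, times = 4),
  Test.Type = rep(rep(c("Unit test 1", "Unit test 2"), each = 3), times = 2),
  Subject   = rep(c("English", "Maths"), each = 6),
  Marks     = c(85, 75, 78, 85, 75, 78, 78, 79, 98, 95, 98, 88),
  stringsAsFactors = TRUE
)

list.df <- by(df, INDICES = list(df$Test.Type, df$Subject), FUN = data.frame)

# The result is a 2 x 2 array of data frames; dimnames hold the group labels.
dimnames(list.df)

# Access by position (4th cell is Unit test 2 / Maths) ...
list.df[[4]]

# ... or by the two group labels.
list.df[["Unit test 1", "Maths"]]
```

Indexing by label avoids having to remember the order in which the groups were laid out.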
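Probably split(), but please show your desired result.
Richard's answer will do what you asked. However, if you have to do aggregation on the grouped data frames, you should consider using ddply.
@RichardScriven, according to the prototype of the split function, split(x, f, drop = FALSE, ...), can't I use split(data, data, c("Test Type", "Subject")) to split the data by "Test Type" and "Subject"?
No. Did you try it? You have to select the columns to split on.
Either of the answers below will work, but you almost certainly want dplyr rather than split, as @TimBiegeleisen said, because you will be doing some processing on each group and then combining or summarizing the results back into a data frame or summary statistics.
dplyr is a very nice and scalable take on the split-apply-combine paradigm; check it out. It will solve all your problems :)
A split(df, list(df[, 2], df[, 3])) call would be simpler and returns the same output.
Oh wow, I never knew you could use a list in split. Thanks @DrDom, very nice. Learn something new every day!
Just as in aggregate, you can use a list for the subsetting.
Using a list in split is crazy indeed!
Another fun variation: split(df, do.call(paste0, df[2:3]))
@user2745366, don't forget to accept (green check mark) the answer you choose, whichever it is...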
list.df <- by(df, INDICES =  list(df$Test.Type, df$Subject), FUN = data.frame)

> list.df
: Unit test 1
: English
  ID   Test.Type Subject Marks
1  1 Unit test 1 English    85
2  2 Unit test 1 English    75
3  3 Unit test 1 English    78
-------------------------------------------------------------------------------------------------- 
: Unit test 2
: English
  ID   Test.Type Subject Marks
4  1 Unit test 2 English    85
5  2 Unit test 2 English    75
6  3 Unit test 2 English    78
-------------------------------------------------------------------------------------------------- 
: Unit test 1
: Maths
  ID   Test.Type Subject Marks
7  1 Unit test 1   Maths    78
8  2 Unit test 1   Maths    79
9  3 Unit test 1   Maths    98
-------------------------------------------------------------------------------------------------- 
: Unit test 2
: Maths
   ID   Test.Type Subject Marks
10  1 Unit test 2   Maths    95
11  2 Unit test 2   Maths    98
12  3 Unit test 2   Maths    88
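
Several comments above recommend dplyr when the goal is to process each group and combine the results. As a rough sketch, assuming the dplyr package (version 1.0 or later) is installed, the same grouping could be written as follows; avg and pieces are illustrative names only.

```r
library(dplyr)

# Rebuild the example data so this block is self-contained.
df <- data.frame(
  ID        = rep(1:3, times = 4),
  Test.Type = rep(rep(c("Unit test 1", "Unit test 2"), each = 3), times = 2),
  Subject   = rep(c("English", "Maths"), each = 6),
  Marks     = c(85, 75, 78, 85, 75, 78, 78, 79, 98, 95, 98, 88),
  stringsAsFactors = TRUE
)

# Split-apply-combine in one pipeline: one summary row per group.
avg <- df %>%
  group_by(Test.Type, Subject) %>%
  summarise(avgmark = round(mean(Marks)), .groups = "drop")
avg

# If the separate data frames themselves are needed, group_split() returns a list.
pieces <- df %>% group_split(Test.Type, Subject)
length(pieces)
```

group_split() returns an unnamed list, so split() or by() may still be the better fit when the pieces need to be looked up by group label.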