R: frequency counts for all possible bins

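I have a data frame. I want to create a frequency table that shows the bin frequencies by "Group". If a bin contains 0 entities, I want the table to show that the bin has 0 entities.

If I use the `table()` function, I get frequency counts for all the bins that occur in my data frame, but not broken down by "Group". It also doesn't tell me that, for example, I have no rows in Group 1, Bin 3. I also looked at `tabulate()`, but that doesn't seem to be what I need either. Somehow I need to tell it what the possible bins actually are.

Here is some sample code: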

    df = as.data.frame(rbind(c(1,1.2), c(1,1.4), c(1,2.1), c(1,2.5), c(1,2.7), c(1,4.1), c(2,1.6), c(2,4.5), c(2,4.3), c(2,4.8), c(2,4.9)))
    colnames(df) = c("Group", "Value")
    df.in = split(df, df$Group)

    FindBin = function(df){
      maxbin = max(ceiling(df$Value), na.rm=TRUE) + 1 # What is the maximum bin value?
      bin = seq(from=0, to=maxbin, by=1) # Specify your bins: 0 to the maximum value by increments of 1
      df$bin_index = findInterval(df$Value, bin, all.inside = TRUE) # Determine which bin the value is in
      return(df)
    }

    df.out = lapply(names(df.in), function(x) FindBin(df.in[[x]]))
    df.out2 = do.call(rbind.data.frame, df.out) #Row bind the list of dataframes to one dataframe

The output of df.out2 looks like this:

        Group Value bin_index
    1      1   1.2         2
    2      1   1.4         2
    3      1   2.1         3
    4      1   2.5         3
    5      1   2.7         3
    6      1   4.1         5
    7      2   1.6         2
    8      2   4.5         5
    9      2   4.3         5
    10     2   4.8         5
    11     2   4.9         5

In addition to the output above, I would also like a summary of my results that looks like this:

    Group     Bin     Freq
    1         1       0
    1         2       2
    1         3       3
    1         4       0
    1         5       1
    2         1       0
    2         2       1
    2         3       0
    2         4       0
    2         5       4

Any ideas?

Doesn't `table` do what you want for the first question?

    # Assumes df already has the bin_index column (e.g. df.out2 from the question)
    df$bin_index <- factor(df$bin_index, levels=1:5)
    table(df[, c("Group", "bin_index")])
    #       bin_index
    # Group 1 2 3 4 5
    #     1 0 2 3 0 1
    #     2 0 1 0 0 4
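
To get the long Group / Bin / Freq layout asked for in the question, one option is to pass the table to `as.data.frame()`. The following is only a sketch: it rebuilds the sample data from the question and assumes unit-width bins starting at 0, as in the question's `FindBin()`; the name `freq` is made up for illustration.

```r
# Rebuild the question's sample data
df <- data.frame(
  Group = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  Value = c(1.2, 1.4, 2.1, 2.5, 2.7, 4.1, 1.6, 4.5, 4.3, 4.8, 4.9)
)

# Assumption: unit-width bins from 0 to 6, matching FindBin() for this data
df$bin_index <- findInterval(df$Value, 0:6, all.inside = TRUE)

# Declare every possible bin as a factor level so empty bins are counted as 0
df$bin_index <- factor(df$bin_index, levels = 1:5)

# as.data.frame() on a table gives one row per Group/Bin combination
freq <- as.data.frame(table(Group = df$Group, Bin = df$bin_index))

# Sort by Group, then Bin, to match the layout requested in the question
freq <- freq[order(freq$Group, freq$Bin), ]
rownames(freq) <- NULL
print(freq)
```

Note that `Group` and `Bin` come back as factors here; convert them with `as.numeric(as.character(...))` if numeric columns are needed.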

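Unrelated: why don't you just use df$bin_index?

Thanks, this largely does what I need and gets me further than I was before. Converting to a factor and then using table is a good idea. My groups all have different numbers of bin_index values. For example, Group 1 might have bins up to 130 while Group 2 has bins up to 105, and so on. If a bin index is larger than the maximum bin index for that group, I can probably drop those rows by group number. Thanks.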
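
The idea in the last comment (count over all bins, then drop the rows whose bin exceeds that group's own maximum) could be sketched as below. The data are hypothetical and much smaller than the 130 and 105 bins mentioned; the sketch also assumes each group's maximum bin is the largest bin observed in that group, which may not hold if the real upper limit comes from elsewhere. The names `all_bins`, `max_bin`, and `freq` are illustrative only.

```r
# Hypothetical data: Group 1 has bins up to 5, Group 2 only up to 3
df <- data.frame(
  Group     = c(1, 1, 1, 1, 1, 1, 2, 2, 2),
  bin_index = c(2, 2, 3, 3, 3, 5, 1, 3, 3)
)

# Count over the union of all bins so that empty bins show up as 0
all_bins <- seq_len(max(df$bin_index))
freq <- as.data.frame(table(
  Group = df$Group,
  Bin   = factor(df$bin_index, levels = all_bins)
))

# as.data.frame() returns factors; convert back to numbers for comparison
freq$Group <- as.numeric(as.character(freq$Group))
freq$Bin   <- as.integer(as.character(freq$Bin))

# Assumption: a group's maximum bin is the largest bin observed in that group
max_bin <- tapply(df$bin_index, df$Group, max)

# Keep only the bins that are possible for each group
keep <- freq$Bin <= unname(max_bin[as.character(freq$Group)])
freq <- freq[keep, ]
freq <- freq[order(freq$Group, freq$Bin), ]
rownames(freq) <- NULL
print(freq)
```

If the per-group limits are known in advance, `max_bin` can be replaced by a named vector of those limits, keyed by group.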