R: Group by number of locations

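I would like to find out how orders are distributed by the number of locations they involve: how many order numbers have one location, how many have two, and so on.

For example: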

Nr <- c("x1", "x2", "x2", "x2", "x3", "x4", "x4", "x4", "x5", "x5", "x5", 
"x6")
location <- c("a", "b", "a", "b", "c", "a", "a", "a", "a", "b", "c", "d")
(test <- data.frame(cbind(Nr, location)))

> test
   Nr location
1  x1        a
2  x2        b
3  x2        a
4  x2        b
5  x3        c
6  x4        a
7  x4        a
8  x4        a
9  x5        a
10 x5        b
11 x5        c
12 x6        d
I sorted the table by order number and location, including the corresponding quantity:

# A tibble: 9 x 3
# Groups:   Nr [6]
  Nr    location quantity
  <fct> <fct>       <int>
1 x1    a             1
2 x2    a             1
3 x2    b             2
4 x3    c             1
5 x4    a             3
6 x5    a             1
7 x5    b             1
8 x5    c             1
9 x6    d             1

Unfortunately, I don't know how to go on from here. Can anyone help me?

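One option is to create a column holding the number of distinct elements of 'location' for each 'Nr', and then count the values of that column: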

library(dplyr)
test %>% 
  group_by(Nr) %>% 
  mutate(n_Order = n_distinct(location)) %>% 
  ungroup %>% 
  count(n_Order)
# A tibble: 3 x 2
#  n_Order     n
#    <int> <int>
#1       1     6
#2       2     3
#3       3     3
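
The pipeline above uses mutate() rather than summarise(), so every row keeps its order's distinct-location count and count() then tallies rows, not orders. The following base R sketch (not part of the original answer; the names loc_per_order, rows_by_n and orders_by_n are made up for illustration) spells out both tallies so the difference is visible:

```r
# Sample data from the question, built directly with data.frame()
test <- data.frame(
  Nr = c("x1", "x2", "x2", "x2", "x3", "x4", "x4", "x4", "x5", "x5", "x5", "x6"),
  location = c("a", "b", "a", "b", "c", "a", "a", "a", "a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Step 1: number of distinct locations for each order number
loc_per_order <- vapply(
  split(test$location, test$Nr),
  function(x) length(unique(x)),
  integer(1)
)

# Step 2a: tally rows, i.e. look up each row's order and count the results
rows_by_n <- table(loc_per_order[test$Nr])

# Step 2b: tally orders instead, one entry per order number
orders_by_n <- table(loc_per_order)

print(rows_by_n)    # counts 6, 3, 3 for 1, 2, 3 locations
print(orders_by_n)  # counts 4, 1, 1 for 1, 2, 3 locations
```

The row-based tally reproduces the 6, 3, 3 shown above; the order-based tally is what you would get if each order number were counted only once.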

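Here is output that matches your numbers (note that in this answer the order-number column is named order rather than Nr):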

library(data.table)

test <- as.data.table(test)

> str(test)
Classes ‘data.table’ and 'data.frame':  12 obs. of  2 variables:
 $ order   : chr  "x1" "x2" "x2" "x2" ...
 $ location: chr  "a" "b" "a" "b" ...
 - attr(*, ".internal.selfref")=<externalptr> 

> test[, .(num_locations = length(unique(location)), total_qty = .N), by = .(order)][, .(observations = sum(total_qty)), by = .(num_locations)]
   num_locations observations
1:             1            6
2:             2            3
3:             3            3

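Comments:

Please try
test %>% group_by(Nr) %>% summarise(number = n_distinct(location))

Do you need
test2 %>% group_by(quantity) %>% summarise(number = n())

As a side note, data.frame(cbind(Nr, location)) is a worse way of writing data.frame(Nr, location). In this case the result is equivalent because everything starts out as character, but if you had numeric columns, cbind() would first coerce everything, numbers included, to a character matrix, and data.frame() would then turn each column into a factor (or leave it as character in R 4.0 and later).

I'm afraid not. I tried to show my intention in the figure below:

Almost. Something is wrong with the number of observations (or am I counting incorrectly?). The result should be n_Order: 1, 2, 3; n: 6, 3, 3.

@fuul It's not clear from the comments. Do you need to replicate rows?

@fuul Based on the example, the count is 7, 2, 3.

OK, here is another figure of what it should look like: . Green stands for one location, purple for two locations, yellow for three (different) locations.

@aktrun Thank you very much for your effort!

Thanks for the alternative solution! I'm a bit scared of data.table's syntax, but I think I should take a look at the package.

@fuul It is similar to an SQL query. Here is a good introduction: . I find it a very fast, useful package that generally helps me get the job done in fewer lines of code. Definitely take a look!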
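
The side note about cbind() can be checked with a small experiment. This is only a sketch: the qty column is a hypothetical numeric column that does not appear in the question, added purely to show the coercion.

```r
Nr  <- c("x1", "x2", "x3")
qty <- c(10, 20, 30)  # hypothetical numeric column, not from the question

# A matrix holds a single type, so cbind() turns qty into character
m <- cbind(Nr, qty)

# Going through the matrix loses the numeric type:
# qty ends up as factor (R < 4.0) or character (R >= 4.0)
via_cbind <- data.frame(m)

# Passing the vectors directly keeps qty numeric
direct <- data.frame(Nr, qty)

str(via_cbind)
str(direct)
```

Comparing the two str() outputs shows that only the direct call preserves qty as a number.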
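# base R alternative; as.character() guards against location being stored as a factor
with(test, table(ave(as.character(location), Nr, FUN = function(x) length(unique(x)))))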
# intermediate data.table test2
> test2 <- test[, .(num_locations = length(unique(location)), total_qty = .N), by = .(order)]

> test2[, .(observations = sum(total_qty)), by = .(num_locations)]
   num_locations observations
1:             1            6
2:             2            3
3:             3            3
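
The data.table answer works on a column called order, while the question's data frame uses Nr. A minimal sketch of the same two-step aggregation with the question's column names might look like this; it assumes the data.table package is installed, swaps length(unique()) for data.table's uniqueN(), and uses keyby so the result comes back sorted. The name res is made up for illustration.

```r
library(data.table)

# Sample data from the question, with its original column names
test <- data.table(
  Nr = c("x1", "x2", "x2", "x2", "x3", "x4", "x4", "x4", "x5", "x5", "x5", "x6"),
  location = c("a", "b", "a", "b", "c", "a", "a", "a", "a", "b", "c", "d")
)

# First [ ]: distinct locations and row count per order number
# Second [ ]: sum the row counts for each number of locations
res <- test[, .(num_locations = uniqueN(location), total_qty = .N), by = Nr][
  , .(observations = sum(total_qty)), keyby = num_locations]

print(res)
```

Chaining the two bracket calls mirrors the intermediate test2 step shown above, just without storing the intermediate table.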