R: column is NULL when printed as a single field or column, but populated when the whole data frame is printed
When I create a count column with dplyr, it appears to be populated correctly until I try to use the count column on its own. Example: I create this data frame:
V1 <- c("TEST", "test", "tEsT", "tesT", "TesTing", "testing","ME-TESTED", "re tested", "RE testing")
V2 <- c("othertest", "anothertest", "testing", "123", "random stuff", "irrelevant", "tested", "re-test", "tests")
V3 <- c("type1", "type2", "type1", "type2", "type3", "type2", "type2", "type2", "type1")
df <- data.frame(V1, V2, V3)
However, when I try to use the counts.count column in any way, the result is NULL:
> df$counts.count
NULL
The result is the same for the other columns created by dplyr.
But the rest of the data frame seems fine:
> df$V1
[1] TEST test tEsT tesT TesTing testing ME-TESTED re tested RE testing
Levels: ME-TESTED re tested RE testing test tesT tEsT TEST testing TesTing
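I can't figure out why printing the whole df gives me different output from printing just the column I'm interested in. What am I missing here?

If you rewind and recreate the data frame, then skip the assignment and just print the result to the screen, you see: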
df %>% group_by(V3) %>% mutate(count = n())
Source: local data frame [9 x 4]
Groups: V3 [3]
V1 V2 V3 count
<fctr> <fctr> <fctr> <int>
1 TEST othertest type1 3
2 test anothertest type2 5
3 tEsT testing type1 3
4 tesT 123 type2 5
5 TesTing random stuff type3 1
6 testing irrelevant type2 5
7 ME-TESTED tested type2 5
8 re tested re-test type2 5
9 RE testing tests type1 3
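If you now do that assignment, the structure ends up fairly messed up. I think that if V1 or V2 had had fewer unique values, you might have gotten a more informative error: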
df$counts <- df %>% group_by(V3) %>% mutate(count = n())
# snipped what you already showed
str(df)
#-----
'data.frame': 9 obs. of 4 variables:
$ V1 : Factor w/ 9 levels "ME-TESTED","re tested",..: 7 4 6 5 9 8 1 2 3
$ V2 : Factor w/ 9 levels "123","anothertest",..: 4 2 8 1 5 3 7 6 9
$ V3 : Factor w/ 3 levels "type1","type2",..: 1 2 1 2 3 2 2 2 1
$ counts:Classes ‘grouped_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 9 obs. of 4 variables:
..$ V1 : Factor w/ 9 levels "ME-TESTED","re tested",..: 7 4 6 5 9 8 1 2 3
..$ V2 : Factor w/ 9 levels "123","anothertest",..: 4 2 8 1 5 3 7 6 9
..$ V3 : Factor w/ 3 levels "type1","type2",..: 1 2 1 2 3 2 2 2 1
..$ count: int 3 5 3 5 1 5 5 5 3
..- attr(*, "vars")=List of 1
.. ..$ : symbol V3
..- attr(*, "labels")='data.frame': 3 obs. of 1 variable:
.. ..$ V3: Factor w/ 3 levels "type1","type2",..: 1 2 3
.. ..- attr(*, "vars")=List of 1
.. .. ..$ : symbol V3
.. ..- attr(*, "drop")= logi TRUE
..- attr(*, "indices")=List of 3
.. ..$ : int 0 2 8
.. ..$ : int 1 3 5 6 7
.. ..$ : int 4
..- attr(*, "drop")= logi TRUE
..- attr(*, "group_sizes")= int 3 5 1
..- attr(*, "biggest_group_size")= int 5
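The str output above suggests that counts is a single top-level column holding a whole data frame. A minimal base R sketch of that situation follows. It assumes a hypothetical object named inner, built with ave as a stand-in for the dplyr result, so it illustrates the nesting rather than reproducing the exact grouped_df.

```r
# Rebuild the example data frame from the question.
V1 <- c("TEST", "test", "tEsT", "tesT", "TesTing", "testing", "ME-TESTED", "re tested", "RE testing")
V2 <- c("othertest", "anothertest", "testing", "123", "random stuff", "irrelevant", "tested", "re-test", "tests")
V3 <- c("type1", "type2", "type1", "type2", "type3", "type2", "type2", "type2", "type1")
df <- data.frame(V1, V2, V3)

# Hypothetical stand-in for the dplyr result: a copy of df plus a per-group count.
inner <- df
inner$count <- ave(seq_along(df$V3), df$V3, FUN = length)

# Assigning a whole data frame to one column nests it inside df.
df$counts <- inner

names(df)                  # top-level names only; "counts.count" is not among them
is.null(df$counts.count)   # TRUE: no top-level column has that name
df$counts$count            # the counts live one level down, inside df$counts
```

Names such as counts.count appear only when the data frame is printed; they are display labels for the nested columns, not names you can use with $.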
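The format you are seeing is how R displays a matrix embedded in a data frame. Objects of class table (and perhaps tbl?) inherit from the matrix class. Here, the str output shows that counts is itself a data frame (a grouped_df), and R prints a nested data frame the same way, prefixing each inner column name with the outer name, as in counts.count.

Why df$counts?

I misunderstood the syntax and thought I had to do that to create a new column. If it created a data frame inside a data frame, that would explain it, but I still don't understand why it looks like a regular column when I print df, yet shows up as NULL when I print df$counts.count.

That is how R prints a column that contains a data.frame.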
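If the goal was simply to add a count column, one possible fix is to assign the whole result of the pipeline back to df instead of to df$counts. This is a sketch that assumes dplyr is installed. The ungroup() call is my addition, not part of the original code, and is only there so that later operations are not silently grouped by V3.

```r
library(dplyr)

# Rebuild the example data frame from the question.
V1 <- c("TEST", "test", "tEsT", "tesT", "TesTing", "testing", "ME-TESTED", "re tested", "RE testing")
V2 <- c("othertest", "anothertest", "testing", "123", "random stuff", "irrelevant", "tested", "re-test", "tests")
V3 <- c("type1", "type2", "type1", "type2", "type3", "type2", "type2", "type2", "type1")
df <- data.frame(V1, V2, V3)

# mutate() returns the full data frame with the new column added,
# so the result replaces df rather than being stored in one of its columns.
df <- df %>%
  group_by(V3) %>%
  mutate(count = n()) %>%
  ungroup()   # assumption: drop the grouping once the count has been computed

# count is now an ordinary top-level column.
df$count
```

With this form the column is called count, so it is reached as df$count rather than df$counts.count.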