Getting the count of consecutive occurrences of a string in a dataframe column based on another column
Tags: r, aggregate
I need to know how many times a value occurs in a column of a data frame. The main logic is to get the number of occurrences of a particular string based on another column. For example:
df <- data.frame(fruits = c("apples", "apples", "orange", "pears", "apples", "pears", "pears", "papaya", "papaya"),
                 veggies = c("beans", "carrots", "carrots", "carrots", "brinjal", "carrots", "brinjal", "brinjal", "beans"),
                 branches = c("Area1", "Area1", "Area1", "Area2", "Area2", "Area2", "Area2", "Area3", "Area3"))
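The usual output shows the total for apples and the other fruits across all branches. I need the exact count for each branch.
My desired output should be based on the column df$branches: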
for Area1
apples-2 orange-1,
for Area2
pears-3 apples-1
for Area3
papaya-2
Try this:
library(data.table)
setDT(df)[,list(count=.N),list(branches, fruits)]
#   branches fruits count
#1:    Area1 apples     2
#2:    Area1 orange     1
#3:    Area2  pears     3
#4:    Area2 apples     1
#5:    Area3 papaya     2
Maybe just use ftable:
> ftable(fruits ~ branches, data = df)
         fruits apples orange papaya pears
branches
Area1                2      1      0     0
Area2                1      0      0     3
Area3                0      0      2     0
> ftable(veggies ~ branches, data = df)
         veggies beans brinjal carrots
branches
Area1                1       0       2
Area2                0       2       2
Area3                1       1       0
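If you would rather have one row per branch/fruit pair than the wide ftable layout, the same counts can be reshaped in base R. This is only a sketch: the names tab and long are made up for illustration, and it drops the zero cells on the assumption that you only care about combinations that actually occur.

```r
# Same sample data as in the question (veggies left out for brevity).
df <- data.frame(
  fruits = c("apples", "apples", "orange", "pears", "apples",
             "pears", "pears", "papaya", "papaya"),
  branches = c("Area1", "Area1", "Area1", "Area2", "Area2",
               "Area2", "Area2", "Area3", "Area3")
)

# Cross-tabulate branches against fruits.
tab <- table(df$branches, df$fruits, dnn = c("branches", "fruits"))

# Turn the contingency table into long format, one row per cell.
long <- as.data.frame(tab, responseName = "count", stringsAsFactors = FALSE)

# Keep only the combinations that occur, ordered by branch.
long <- long[long$count > 0, ]
long <- long[order(long$branches, -long$count), ]
print(long, row.names = FALSE)
```

The filtered result has one row for each branch/fruit pair present in the data, which is closer to the layout asked for in the question.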
I don't know your expected output, but you can use the dplyr package to get the counts. For example:
library(dplyr)
df %>% count(fruits, branches)
# OR
count(df, fruits, branches)
Output:
Source: local data frame [5 x 3]
Groups: fruits
  fruits branches n
1 apples    Area1 2
2 apples    Area2 1
3 orange    Area1 1
4 papaya    Area3 2
5  pears    Area2 3
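Thanks for the reply, Colonel... but it throws an error for me in setDT(df): Cannot convert 'df' to data.table by reference because binding is locked. It is very likely that 'df' resides within a package (or an environment) that is locked to prevent modifying its variable bindings. Try copying the object to your current environment, ex: var
Can you do df1 = df and then apply the above to df1?
@Neha, the error message you pasted tells you exactly what to do! Another reason to use data.table: the error messages are very detailed!
Can we get counts based on a timestamp column for a specific time interval? For example, can we get the count of repeated strings within 10-minute intervals? Is that possible?
@Neha Your example data has no timestamps. If you have a more specific question, please edit your original post to include this information and your desired result.
For example, I have a date-with-time column in a data.frame, running from 05-SEP-14 07.22.13 AM to 05-SEP-14 10.22.13 PM. Can I get the number of apples sold every 30 minutes?
This is called aggregation. Specifically, you want to aggregate fruits or veggies by branch.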
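For what it's worth, the aggregation mentioned in the last comment can be sketched in base R with aggregate(). The second half applies the same idea to the 30-minute question. Note the assumptions: the sales data frame, its sold_at column, and the timestamps in it are invented for illustration (the question's data has no time column), and the helper column count is just a column of ones to sum.

```r
# Sample data from the question (fruits and branches only).
df <- data.frame(
  fruits = c("apples", "apples", "orange", "pears", "apples",
             "pears", "pears", "papaya", "papaya"),
  branches = c("Area1", "Area1", "Area1", "Area2", "Area2",
               "Area2", "Area2", "Area3", "Area3")
)

# Count rows per branch/fruit pair by summing a column of ones.
df$count <- 1L
per_branch <- aggregate(count ~ branches + fruits, data = df, FUN = sum)
per_branch <- per_branch[order(per_branch$branches), ]
print(per_branch, row.names = FALSE)

# Hypothetical sales with timestamps (NOT part of the original data).
sales <- data.frame(
  fruits = c("apples", "apples", "pears", "apples"),
  sold_at = as.POSIXct(c("2014-09-05 07:22:13", "2014-09-05 07:25:00",
                         "2014-09-05 07:45:00", "2014-09-05 08:05:00"),
                       tz = "UTC")
)

# Round each timestamp down to the start of its 30-minute (1800 s) slot.
slot_start <- as.POSIXct(floor(as.numeric(sales$sold_at) / 1800) * 1800,
                         origin = "1970-01-01", tz = "UTC")
sales$slot <- format(slot_start, "%Y-%m-%d %H:%M")

# Same aggregation as before, now grouped by slot and fruit.
sales$count <- 1L
per_slot <- aggregate(count ~ slot + fruits, data = sales, FUN = sum)
print(per_slot, row.names = FALSE)
```

Flooring the numeric timestamp aligns the slots to clock half-hours in UTC. For 10-minute intervals you would use 600 instead of 1800. Real data in the 05-SEP-14 07.22.13 AM format would need to be parsed into POSIXct first, which is not shown here.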