R: How to find the mean of a continuous variable for each categorical variable?

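I am trying to compute the mean duration (a continuous variable) of UFO sightings for each UFO shape (a categorical variable). Basically, what is the average sighting length for each UFO shape?

I tried: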

    a <- aggregate(duration..seconds. ~ shape, data=alien, FUN=mean, na.rm=TRUE)
    barplot(a$duration..seconds., names.arg=a$shape)
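I realize I need to change my data somehow. I would like to simply drop every row that is missing the corresponding value (i.e., we know the shape but the duration is missing, or vice versa), but I don't quite know how to do that.

Thanks for your help.

The column name `duration..seconds.` is correct; that is how it came over from the Excel file.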
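
The doubled dots are what R produces when it sanitises a header that contains spaces and parentheses. A minimal sketch, assuming the spreadsheet header was `duration (seconds)` (a guess; the original header is not shown in the post):

```r
# Hypothetical original header; the actual Excel column title is not shown.
raw_header <- "duration (seconds)"

# make.names() replaces every character that is invalid in an R name with "."
clean_header <- make.names(raw_header)
print(clean_header)
# [1] "duration..seconds."
```

The name is awkward but valid, so it can be used as-is in a formula or renamed with `names()`.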

    shape       duration..seconds.
    us  changing    3600    NA  4/27/2004   29.8830556  
    us  changing    300     NA  12/16/2005  29.38421    
    us  changing    3600    NA  1/21/2008   53.2    
    us  changing    900     NA  1/17/2004   28.9783333  
    ca  changing    1200    NA  1/22/2004   21.4180556  
    us  changing    3600    NA  4/27/2007   36.595  
There are 80,000 UFO sighting records, which is why I am trying to average them. There are 29 different shapes.

Data

df <- read.table(text="
country shape  duration_seconds dummy1 date dummy2
us  changing    3600    NA  4/27/2004   29.8830556  
us  changing    300     NA  12/16/2005  29.38421    
us  changing    3600    NA  1/21/2008   53.2    
us  changing    900     NA  1/17/2004   28.9783333  
ca  changing    1200    NA  1/22/2004   21.4180556  
us  changing    3600    NA  4/27/2007   36.595  
", header = TRUE, stringsAsFactors = FALSE)
names(df) <- c("country", "shape", "duration_seconds", "dummy1", "date", "dummy2")
library(dplyr)
df %>%
  group_by(shape) %>%
  summarize(mean_duration_seconds = mean(duration_seconds))

#   shape    mean_duration_seconds
#   <chr>                    <dbl>
# 1 changing                 2200.

And using the original code:

names(df) <- c("country", "shape", "duration_seconds", "dummy1", "date", "dummy2")
a <- aggregate(duration_seconds ~ shape, data=df, FUN=mean, na.rm=TRUE)
barplot(a$duration_seconds, names.arg=a$shape)

a
#   shape    duration_seconds
# 1 changing             2200

Soooo... can we see some sample data? This duplicates the first hit when searching for "[r] mean for each categorical variable". Learn to search before posting. Downvoted for "lack of effort".
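
On the missing-data part of the question, one possible approach is to filter on the two columns of interest with `complete.cases()` before aggregating. This is only a sketch: `alien_demo` and its values are made up for illustration and stand in for the real `alien` data frame.

```r
# Toy data standing in for the real `alien` data frame (values are invented).
alien_demo <- data.frame(
  shape = c("disk", "disk", NA, "light", "light"),
  duration_seconds = c(60, NA, 30, 120, 180),
  stringsAsFactors = FALSE
)

# Keep only the rows where both shape and duration are present.
keep <- complete.cases(alien_demo[, c("shape", "duration_seconds")])
alien_complete <- alien_demo[keep, ]

# Mean duration per shape, computed on the filtered rows.
means <- aggregate(duration_seconds ~ shape, data = alien_complete, FUN = mean)
print(means)
```

Note that the formula interface of `aggregate()` already drops rows with `NA` in the formula variables by default (`na.action = na.omit`), so the explicit filter mainly helps when the cleaned data frame is reused elsewhere.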