R: loess method fails on a data frame because several series do not have enough data points


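I have a data frame like this:

dput(xx)

The data frame is huge and contains many hosts. My problem is that when a host like the one above does not have enough data points, the ggplot call below fails, essentially complaining that there are not enough data points to draw the plot: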

ggplot(xx, aes(TimeStamp, Max, group=Host, colour=Host)) + geom_point() + geom_smooth(method="loess")

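How can I check whether a particular host in this data frame has more than 10 data points and, if so, use method="loess"?

If a host has fewer than 10 data points, I want to use method="lm" instead.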

Yes, it was hard to find, but it does seem to be possible:

# for reproducibility
set.seed(42)
# The idea is to first split the data to < 10 and >= 10 points
# I use data.table for that
require(data.table)
dt <- data.frame(Host = rep(paste("Host", 1:10, sep=""), sample(1:20, 10)), 
         stringsAsFactors = FALSE)
dt <- transform(dt, x=sample(1:nrow(dt)), y = 15*(1:nrow(dt)))
dt <- data.table(dt, key="Host")
dt1 <- dt[, .SD[.N >= 10], by = Host]
dt2 <- dt[, .SD[.N < 10], by = Host]

# on to plotting now    
require(ggplot2)
# Now, dt1 has all Hosts with >= 10 observations and dt2 the other way round
# plot now for dt1
p <- ggplot(data=dt1, aes(x = x, y = y, group = Host)) + geom_line() + 
         geom_smooth(method="loess", se=T)
# plot geom_line for dt2 by telling the data and aes
# The TRICKY part: add geom_smooth by telling data=dt2
p <- p + geom_line(data = dt2, aes(x=x, y=y, group = Host)) + 
            geom_smooth(data = dt2, method="lm", se=T)

p
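
The split above relies on data.table. As a rough sketch of the same check in base R, the per-host counts can be attached to every row with ave() and then used to subset. The toy data, the host names and the helper column n_points are made up for illustration; the >= 10 threshold follows the code above.

```r
# Toy data standing in for xx (hypothetical hosts and values)
set.seed(42)
xx <- data.frame(
  Host = rep(c("HostA", "HostB", "HostC"), times = c(12, 3, 10)),
  TimeStamp = c(1:12, 1:3, 1:10)
)
xx$Max <- rnorm(nrow(xx))

# Number of rows per host, repeated on every row of that host
xx$n_points <- ave(seq_len(nrow(xx)), xx$Host, FUN = length)

# Quick overview of the counts per host
table(xx$Host)

# Same threshold as above: >= 10 points go to loess, fewer go to lm
enough  <- xx[xx$n_points >= 10, ]
too_few <- xx[xx$n_points < 10, ]
```

enough and too_few can then play the roles of dt1 and dt2 in the plotting code.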

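In addition to Arun's excellent answer, I think you just need to distinguish the two fits visually, for example with a solid line for loess and a dashed line for lm: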
p <- ggplot(data=dt1, aes(x = x, y = y, group = Host)) + geom_line() + 
         geom_smooth(method='loess', linetype='solid', se=T)

p <- p + geom_line(data = dt2, aes(x=x, y=y, group = Host)) + 
            geom_smooth(data = dt2, method='lm', linetype='dashed', se=T)

p

The warning messages can be avoided by replicating the data points and setting the span parameter of geom_smooth. For example:
data <- rbind(dt1, dt2)
p <- ggplot(data=dt1, aes(x = x, y = y, group = Host)) + geom_line() + 
         geom_smooth(method='loess', span = 1.4, se=T)

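Hi @Arun, I need all of my lines in a single chart. Is it possible to set this inside geom_smooth based on the number of data points for that host?

No, I would like to use geom_smooth like this: Host > 10, method="loess", Host...

@Arun: Great answer. As long as you can tell them apart visually, that sounds good. For example, a solid line for loess and a dashed line for lm.

What if you want to use facet_wrap(~host) to create a separate plot for each host and you also want to use geom_smooth()?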
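
For the comments asking whether a single geom_smooth() call can pick the method per host, one possible approach is to pass a function as method, which the geom_smooth() documentation allows. The sketch below is untested against the original data. smooth_by_n, min_n and the toy data are hypothetical names introduced here, and it assumes the function is called once per group with a formula, the group's data and a weights argument.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): fit loess when a group has at
# least `min_n` rows, otherwise fall back to a straight-line lm fit.
# `weights` is accepted because geom_smooth() supplies it; it is ignored here.
smooth_by_n <- function(formula, data, weights, min_n = 10, ...) {
  if (nrow(data) >= min_n) {
    stats::loess(formula, data = data, ...)
  } else {
    stats::lm(formula, data = data, ...)
  }
}

# Toy data: one host with 15 points, one with 4
set.seed(42)
toy <- data.frame(
  Host = rep(c("HostA", "HostB"), times = c(15, 4)),
  x = c(1:15, 1:4)
)
toy$y <- 2 * toy$x + rnorm(nrow(toy))

p <- ggplot(toy, aes(x, y, group = Host, colour = Host)) +
  geom_point() +
  geom_smooth(method = smooth_by_n, formula = y ~ x, se = TRUE)
```

Because the rule lives inside the smoothing function, adding facet_wrap(~ Host) to p should apply it panel by panel without splitting the data first.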