R: Plotting a linear regression in ggplot2

I have this data frame:

> head(data)
    sx   yd  sl
1   male 35  36350
2   male 22  35350
3   male 23  28200
4 female 27  26775
5   male 30  33696
6   male 21  28516
where "sx" is sex, "yd" is the number of years since earning the degree, and "sl" is salary. Using ggplot or plot, I can easily draw a scatter plot:

palette(c("pink", "blue"))
plot(data$yd, data$sl, col = factor(data$sx), xlab = "Years Since Earned Highest Degree", ylab = "Salary (dollars)", main = "Salary Increases with Experience", pch = 19)
legend("topleft", legend = unique(data$sx), col = c("blue", "pink"), pch=19)

library(ggplot2)
ggplot(data, aes(x=yd,y=sl)) + 
    geom_point(shape=21, aes(col=sx, bg=sx)) + 
    xlab("Years Since Earned Highest Degree") + 
    ylab("Salary (dollars)") + 
    ggtitle("Salary Increases with Experience") + 
    scale_color_discrete(guide=FALSE) + 
    labs(fill="sex")
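However, I have also fitted a linear model to the data:

quad <- lm(sl ~ sx * poly(yd, 2), data = data)
My attempt to add this fit to the plot returns this error: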

Error in model.frame.default(formula = formula, data = data, weights = weight,  : 
  variable lengths differ (found for '(weights)')
Error in if (nrow(layer_data) == 0) return() : argument is of length zero

I couldn't find a ggplot way to do this, so here is the base plot approach that does it:

palette(c("pink", "blue"))
plot(data$yd, data$sl, col = factor(data$sx), xlab = "Years Since Earned Highest Degree", ylab = "Salary (dollars)", main = "Salary Increases with Experience", pch = 19)
legend("topleft", legend = unique(data$sx), col = c("blue", "pink"), pch=19)
lines(seq(0,25,0.1), predict.lm(quad, data.frame(yd = seq(0,25,0.1), sx = "female", stringsAsFactors = TRUE)),col="pink", lwd = 5)
lines(seq(0,25,0.1), predict.lm(quad, data.frame(yd = seq(0,25,0.1), sx = "male", stringsAsFactors = TRUE)),col="blue", lwd = 5)
The two calls to lines are the solution. If anyone can do this the ggplot way, I would be very grateful, because ggplot looks much nicer.

data = data.frame(sx = c("male", "male", "male", "female", "male", "male"),
              yr = c(35, 22, 23, 27, 30, 21),
              sl = c(36350, 35350, 28200, 26775, 33696, 28516))
ggplot(data, aes(x=yr,y=sl)) + 
  geom_point(shape=21, aes(col=sx, bg=sx)) + 
  geom_smooth(aes(color = sx), se = FALSE, method = "lm", formula = y ~ poly(x, 2)) + 
  xlab("Years Since Earned Highest Degree") + 
  ylab("Salary (dollars)") + 
  ggtitle("Salary Increases with Experience") +     
  scale_color_discrete(guide=FALSE)+ labs(fill="sex")
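Is this what you want? If you had more data for females, you would get a separate fit for each group. Right now
sum(data$sx == 'female')
is 1. A quadratic cannot be fitted to a single point.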
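A minimal sketch of that check, using the six-row data frame from the code above. The name `fit_female` is made up for illustration, and the quadratic is written as `yr + I(yr^2)` rather than `poly(yr, 2)` only so that `lm()` runs on a single row and shows which coefficients it cannot estimate:

```r
# Six-row data from the answer: only one "female" observation
data <- data.frame(sx = c("male", "male", "male", "female", "male", "male"),
                   yr = c(35, 22, 23, 27, 30, 21),
                   sl = c(36350, 35350, 28200, 26775, 33696, 28516))

# Number of rows available in each group
table(data$sx)

# A quadratic has three coefficients, so a single observation cannot
# determine it; lm() reports the coefficients it cannot estimate as NA
fit_female <- lm(sl ~ yr + I(yr^2), data = subset(data, sx == "female"))
coef(fit_female)
```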
For example, try:

data = data.frame(sx = c("male", "male", "male", "female", "male", "male", "female", "female", "female"),
                  yr = c(35, 22, 23, 27, 30, 21, 25, 18, 29),
                  sl = c(36350, 35350, 28200, 26775, 33696, 28516, 27402, 31492, 23195))

That should work.
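If the goal is to draw the interaction model from the question itself rather than let geom_smooth refit each group, one possible approach is to predict over a grid and add the result with geom_line. This is only a sketch: it assumes the nine-row data and the `yr` column name used in this answer, and `grid` and `p` are names invented here for illustration.

```r
library(ggplot2)

# Nine-row data from this answer
data <- data.frame(
  sx = c("male", "male", "male", "female", "male", "male", "female", "female", "female"),
  yr = c(35, 22, 23, 27, 30, 21, 25, 18, 29),
  sl = c(36350, 35350, 28200, 26775, 33696, 28516, 27402, 31492, 23195)
)

# The interaction model from the question, fitted once on all rows
mod <- lm(sl ~ sx * poly(yr, 2), data = data)

# Prediction grid: 50 evenly spaced years per sex, within the observed range
grid <- expand.grid(
  yr = seq(min(data$yr), max(data$yr), length.out = 50),
  sx = unique(data$sx),
  stringsAsFactors = FALSE
)
grid$sl <- predict(mod, newdata = grid)

# Points from the raw data, one fitted curve per sex from the grid
p <- ggplot(data, aes(x = yr, y = sl, colour = sx)) +
  geom_point() +
  geom_line(data = grid) +
  xlab("Years Since Earned Highest Degree") +
  ylab("Salary (dollars)") +
  ggtitle("Salary Increases with Experience")
p
```

Because the model interacts sx with every polynomial term, these curves should coincide with the per-group quadratic fits that geom_smooth draws; the grid approach mainly helps when the model to be plotted differs from what geom_smooth can fit on its own.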

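I think you should be able to use geom_smooth with method = "lm" and a formula.
@埃里克米特曼: I tried that approach but got a strange error. Do you see the problem?
Something is off: Error in terms(object): object 'quad' not found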