R: How to combine multiple data sources in ggplot using split and sapply?
Tags: r, ggplot2, split, sapply, mapply

This question is related to one previously answered by @Rui Barradas and @Duck, but I need some more help. Previous link: [link]

Basically, I need to combine 3 datasets into one plot with a secondary y-axis. All datasets need to be split by SITENAME and will be facet-wrapped by Sampling.Year. I am using split and sapply. With the facet wrap, the plot looks like this: [image]

However, I am now trying to add two other data sources to the plot, like this: [image]

But I am struggling to add the other two data sources and have them split by SITENAME. Here is my code so far. The plot format is written as a function to be applied to the split list of data frames (ideally, "df" would be added as a geom_line with a secondary y-axis, and the FF start dates would be added as vertical dashed lines):
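SITENAME_plot …

You can try the following code. I used the data you shared. Pay attention to the names in all the datasets. Ideally, key columns such as DATE and Sampling.Year should be present in all data frames before splitting. Some variables, such as Risk, are also not present, so I added a sample variable with the same name. In this code I added a function for the plot you want: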
library(tidyverse)
library(readxl)
#Data
df1 <- read_excel('Sample data.xlsx',1)
#Create var
df1$Risk <- c(rep(c("Very Low","Low","Moderate","High","Very High"),67),"Very High")
#Other data
df2 <- read_excel('Sample data.xlsx',2)
df3 <- read_excel('Sample data.xlsx',3)
#Split 1
L1 <- split(df1,df1$SITENAME)
L2 <- split(df2,df2$SITENAME)
L3 <- split(df3,df3$`Site Name`)
#Function to create plots
myplot <- function(x,y,z)
{
#Merge x and y
#Check for duplicates and avoid column
y <- y[!duplicated(paste(y$DATE,y$Sampling.Year)),]
y$SITENAME <- NULL
xy <- merge(x, y, by = c('Sampling.Year','DATE'), all.x = TRUE)
#Format to dates
xy$DATE <- as.Date(xy$DATE)
#Scale factor (na.rm = TRUE because the left join can leave Height as NA)
scaleFactor <- max(xy$Daily.Ave.PAF, na.rm = TRUE) / max(xy$Height, na.rm = TRUE)
#Rename for consistency in names
names(z)[4] <- 'DATE'
#Format date
z$DATE <- as.Date(z$DATE)
#Plot
G <- ggplot(xy, aes(DATE, Daily.Ave.PAF)) +
geom_point(aes(colour = Risk), size = 3) +
scale_colour_manual(values=c("Very Low" = "dark green","Low" = "light green",
"Moderate" = "yellow", "High" = "orange", "Very High" = "red"), drop = FALSE) +
scale_x_date(date_breaks = "1 month", date_labels = "%b %Y") +
geom_line(aes(x=DATE,y=Height*scaleFactor))+
scale_y_continuous(name="Total PAF (% affected)", sec.axis=sec_axis(~./scaleFactor, name="Water level (m)"))+
labs(x = "Month") +
geom_vline(data = z,aes(xintercept = DATE),linetype="dashed")+
facet_wrap(~Sampling.Year, ncol = 1, scales = "free")+
#theme_bw() is a complete theme, so it goes before the theme() tweaks; otherwise it resets them
theme_bw() +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1),
legend.text = element_text(size = 15),
axis.text = element_text(size = 15),
axis.title = element_text(size = 15, face = "bold")) +
guides(color = guide_legend(reverse = TRUE)) +
ggtitle(unique(xy$SITENAME))
return(G)
}
#Create a list of plots
Lplots <- mapply(FUN = myplot,x=L1,y=L2,z=L3,SIMPLIFY = FALSE)
#Now format names
vnames <- paste0(names(Lplots),'.png')
mapply(ggsave, Lplots,filename = vnames,width = 30,units = 'cm')
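The merge and rescaling steps inside myplot can be tried on their own without ggplot2. The following is only a sketch on made-up toy data (the values are hypothetical and not taken from 'Sample data.xlsx'); it shows why the duplicate check matters before the merge, and how the scale factor maps Height onto the primary axis and back. Using na.rm = TRUE is an assumption about how missing heights should be handled.

```r
# Hypothetical toy data standing in for one site's rows from sheets 1 and 2
x <- data.frame(
  Sampling.Year = c(2019, 2019, 2019),
  DATE = as.Date(c("2019-01-01", "2019-01-02", "2019-01-03")),
  Daily.Ave.PAF = c(10, 40, 20)
)
y <- data.frame(
  Sampling.Year = c(2019, 2019, 2019),
  DATE = as.Date(c("2019-01-01", "2019-01-01", "2019-01-02")),
  Height = c(1.0, 1.1, 2.0)
)

keys <- c("Sampling.Year", "DATE")

# Without de-duplication, the repeated key in y duplicates a row of x
naive <- merge(x, y, by = keys, all.x = TRUE)

# Keep the first row per DATE/Sampling.Year pair, as myplot does, then left-join
y_unique <- y[!duplicated(paste(y$DATE, y$Sampling.Year)), ]
xy <- merge(x, y_unique, by = keys, all.x = TRUE)

# 2019-01-03 has no Height after the join, so max() needs na.rm = TRUE
scaleFactor <- max(xy$Daily.Ave.PAF, na.rm = TRUE) / max(xy$Height, na.rm = TRUE)

# Values drawn against the primary axis, and the inverse that
# sec_axis(~ . / scaleFactor) applies to label the secondary axis
scaled <- xy$Height * scaleFactor
recovered <- scaled / scaleFactor

nrow(naive)
nrow(xy)
```

With this toy data the naive merge has one row more than x, while the de-duplicated merge keeps one row per date.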
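Duck to the rescue again! You are amazing, thank you so much! I am trying it out now.
@CatN Be careful with the data; I found duplicates in your sample!
One of the variables (water level = df2 in the code above) has more data points than the other one plotted on the x-axis. Does that matter? I keep getting this message but can't find where to fix it in the code: Warning message: In mapply(FUN = myplot, x = L1, y = L2, z = L3, SIMPLIFY = FALSE) : longer argument not a multiple of length of shorter.
Actually, ignore my last comment - I realised what I was doing wrong. Thanks again Duck, this code works. You saved my day! I learned something new :)
@CatN Great that it worked, and always happy to help :)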
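The warning quoted in the comments is the one mapply gives when the lists it receives have different lengths. mapply pairs list elements by position, not by name, so one possible cause is that the three split lists do not cover the same sites. The sketch below uses hypothetical toy data frames (only the column names SITENAME and `Site Name` come from the code above) and shows one way to restrict the lists to the sites they share before calling mapply.

```r
# Hypothetical toy stand-ins for the three sheets
df1 <- data.frame(SITENAME = c("A", "A", "B", "C"), Daily.Ave.PAF = c(1, 2, 3, 4))
df2 <- data.frame(SITENAME = c("A", "B", "B"), Height = c(0.5, 0.7, 0.9))
df3 <- data.frame(`Site Name` = c("B", "A"), check.names = FALSE)

L1 <- split(df1, df1$SITENAME)
L2 <- split(df2, df2$SITENAME)
L3 <- split(df3, df3$`Site Name`)

# Sites present in all three lists, and sites of L1 that would be left out
common <- Reduce(intersect, list(names(L1), names(L2), names(L3)))
dropped <- setdiff(names(L1), common)

# Subsetting by name gives the three lists the same length and the same order
L1 <- L1[common]
L2 <- L2[common]
L3 <- L3[common]

# Stand-in for myplot: report how many rows each site contributes per source
sizes <- mapply(function(x, y, z) c(nrow(x), nrow(y), nrow(z)),
                L1, L2, L3, SIMPLIFY = FALSE)

common
dropped
```

Whether dropping the unmatched sites is acceptable depends on the data; printing dropped first makes it visible which sites would lose their plot.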
# Code from the question: write each plot in the list to a file
SITENAME_plot_write <- function(name, g, dir = "N:/abc/"){
# png() is the device in use, so the file gets a matching .png extension
flname <- file.path(dir, paste0(name, ".png"))
png(filename = flname, width = 1500, height = 1000)
print(g)
dev.off()
flname
}
sp1 <- split(AllDates_TPAF, AllDates_TPAF$SITENAME)
gg_list <- sapply(sp1, SITENAME_plot, simplify = FALSE)
# Each call opens and closes its own device, so no extra dev.off() is needed afterwards
mapply(SITENAME_plot_write, names(gg_list), gg_list, MoreArgs = list(dir = getwd()))