Generating multiple Excel files in R based on aggregated data (tags: r, dplyr)

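I am working on a small project in R. My goal is to create multiple Excel files, one for each site in my data frame. The data frame consists of comments from a survey, where each row represents a response for a given site. There are 10 columns in total: the first holds the site, and the other 9 hold the comments on each topic.
These comment columns can be grouped into the following blocks:
Block 1: Overall = Seating + Decor + Reception + Toilets
Block 2: Comfort & Speed = Comfort + Speed
Block 3: Operations = Efficiency + Courtesy + Responsiveness
A reproducible data frame is shown below.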
#Load libraries
 library(dplyr)
 library(xlsx)
 
#Reproducible Data Frame

df=data.frame(Site=c("Tokyo Harbor","Tokyo Harbor","Tokyo Harbor","Arlington","Arlington","Cairo Skyline","Cairo Skyline"),
       Seating=c("comfy never a problem to find","difficult","ease and quick","nobody to help","nice n comfy","old seats","nt bad"),
         Decor=c("very beautiful","i loved it!!!","nice","great","nice thanks","no response","yea nice"),
     Reception=c("always neat","I wasn't happy with the decor on this site","great!","immaculate","happy very helpful","","I wont bother again"),
       Toilets=c("well maintained","nicely managed","long queues could do better","","cleaner toilets needed!","no toilet roll in the mens loo","flush for god's sake!!!"),
       Comfort=c("very comfortable and heated","I felt like I was home","","couldn't be better","very nice and kush","not comment","fresh eyes needed"),
         Speed=c("rapid service","no delays ever got everything I needed on time","","","I have grown accustomed to the speed of service","machines","super duper quick"),
    Efficiency=c("very efficient, the servers were great","spot on","","I was quite disappointed in the efficiency","clockwork","parfait",""),
      Courtesy=c("Staff were very polite","smiling faces everywhere, loved it","very welcoming and kind","the hostess was a bit rude","trés impoli","noo",""),
Responsiveness=c("On the ball all the time","super quick whenever help was needed","","","","want more service like this",""))

#Transform all columns with empty cells to NAs

df[df==""]  <- NA 

I then group the comments according to these blocks and filter out the missing values.
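###########################
#STEP 1: Define the blocks

#Block 1: Overall = Seating + Decor + Reception + Toilets
BlockOverall=c(names(df)[2],names(df)[3],names(df)[4],names(df)[5])

#Block 2: Comfort & Speed = Comfort + Speed
BlockComfortSpeed=c(names(df)[6],names(df)[7])

#Block 3: Operations = Efficiency + Courtesy + Responsiveness
BlockOps=c(names(df)[8],names(df)[9],names(df)[10])

###############################################
#STEP 2: Group comments based on defined blocks

#Group Overall
Data_Overall= df %>%
select(all_of(BlockOverall))

#Stack the block's columns into one column and drop the NA rows
Data_Overall = Data_Overall %>%
do(.,data.frame(Comments_Overall=unlist(Data_Overall,use.names = F))) %>%
filter(complete.cases(.))

#Group Comfort & Speed
Data_ComfortSpeed= df %>%
select(all_of(BlockComfortSpeed))

Data_ComfortSpeed = Data_ComfortSpeed %>%
do(.,data.frame(Comments_ComfortSpeed=unlist(Data_ComfortSpeed,use.names = F))) %>%
filter(complete.cases(.))

#Group Operations
Data_Operations= df %>%
select(all_of(BlockOps))

Data_Operations = Data_Operations %>%
do(.,data.frame(Comments_Operations=unlist(Data_Operations,use.names = F))) %>%
filter(complete.cases(.))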

Finally, I write the data to Excel.
#Write each group to an individual tab in an Excel file

 library(xlsx)
 write.xlsx(Data_Overall,"Comments_Global_2017.xlsx",sheetName = 
'Overall',row.names = F) #Tab 1
 write.xlsx(Data_ComfortSpeed,"Comments_Global_2017.xlsx",sheetName = 
'Comfort_&_Speed',row.names = F,append = T) #Tab 2
 write.xlsx(Data_Operations,"Comments_Global_2017.xlsx",sheetName = 
'Operations',row.names = F,append = T) #Tab 3

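At the global level this works fine. However, I don't know how to turn it into a for loop that iterates over all the sites in the data frame and generates site-level Excel files.
As a novice programmer, I would greatly appreciate any advice.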
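If you use purrr from the tidyverse, you can avoid the for loop. Take the code above, wrap it in a basic function, and use purrr::map to iterate that function over each site name.

It works fine with the reproducible data frame I provided, but when I tested it on another data frame with the same structure and more sites, the first few Excel files were generated fine and then I got this error: Error in mapply(setCellValue, cells[seq_len(nrow(cells)), colIndex[ic]], : zero-length inputs cannot be mixed with those of non-zero length

Hm, you could check which site it hangs on and see whether you can step through the process for that site. If I had to guess, Data_Overall, Data_ComfortSpeed, or Data_Operations has no rows for the site it hangs on, and that makes the function throw the error during the loop.

I think I've solved it and it works now. I adapted write.xlsx into a custom write.xlsx function that can handle sheets with empty rows. Thanks for your help!

Your setup: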
#Load libraries
library(dplyr)
library(xlsx)
library(purrr)

#Reproducible Data Frame

df=data.frame(Site=c("Tokyo Harbor","Tokyo Harbor","Tokyo Harbor","Arlington","Arlington","Cairo Skyline","Cairo Skyline"),
              Seating=c("comfy never a problem to find","difficult","ease and quick","nobody to help","nice n comfy","old seats","nt bad"),
              Decor=c("very beautiful","i loved it!!!","nice","great","nice thanks","no response","yea nice"),
              Reception=c("always neat","I wasn't happy with the decor on this site","great!","immaculate","happy very helpful","","I wont bother again"),
              Toilets=c("well maintained","nicely managed","long queues could do better","","cleaner toilets needed!","no toilet roll in the mens loo","flush for god's sake!!!"),
              Comfort=c("very comfortable and heated","I felt like I was home","","couldn't be better","very nice and kush","not comment","fresh eyes needed"),
              Speed=c("rapid service","no delays ever got everything I needed on time","","","I have grown accustomed to the speed of service","machines","super duper quick"),
              Efficiency=c("very efficient, the servers were great","spot on","","I was quite disappointed in the efficiency","clockwork","parfait",""),
              Courtesy=c("Staff were very polite","smiling faces everywhere, loved it","very welcoming and kind","the hostess was a bit rude","trés impoli","noo",""),
              Responsiveness=c("On the ball all the time","super quick whenever help was needed","","","","want more service like this",""))

#Transform all columns with empty cells to NAs

df[df==""]  <- NA 

export_site_data <- function(site.name){
  ###########################
  #STEP 0: filter by block site
  df <- df %>% filter(Site %in% site.name)


  ###########################
  #STEP 1: Define the blocks

  #Block 1: Overall = Seating + Decor + Reception + Toilets
  BlockOverall=c(names(df)[2],names(df)[3],names(df)[4],names(df)[5])

  #Block 2: Comfort & Speed = Comfort + Speed
  BlockComfortSpeed=c(names(df)[6],names(df)[7])

  #Block 3: Operations = Efficiency + Courtesy + Responsiveness
  BlockOps=c(names(df)[8],names(df)[9],names(df)[10])



  ###############################################
  #STEP 2: Group comments based on defined blocks

  #Group Overall
  Data_Overall= df %>%
    select(all_of(BlockOverall))

  Data_Overall = Data_Overall %>%
    do(.,data.frame(Comments_Overall=unlist(Data_Overall,use.names = F))) %>%
    filter(complete.cases(.))

  #Group Comfort & Speed
  Data_ComfortSpeed= df %>%
    select(all_of(BlockComfortSpeed))

  Data_ComfortSpeed = Data_ComfortSpeed %>%
    do(.,data.frame(Comments_ComfortSpeed=unlist(Data_ComfortSpeed,use.names = F))) %>%
    filter(complete.cases(.))

  #Group Operations
  Data_Operations= df %>%
    select(all_of(BlockOps))

  Data_Operations = Data_Operations %>%
    do(.,data.frame(Comments_Operations=unlist(Data_Operations,use.names = F))) %>%  filter(complete.cases(.))

  library(xlsx)
  write.xlsx(Data_Overall, paste0("Comments_",site.name,"_2017.xlsx"), sheetName = 
               'Overall',row.names = F) #Tab 1
  write.xlsx(Data_ComfortSpeed, paste0("Comments_",site.name,"_2017.xlsx"), sheetName = 
               'Comfort_&_Speed',row.names = F,append = T) #Tab 2
  write.xlsx(Data_Operations, paste0("Comments_",site.name,"_2017.xlsx"), sheetName = 
               'Operations',row.names = F,append = T) #Tab 3
}

site.name <- unique(df$Site)
site.name %>% map(export_site_data )

list.files(pattern = "Comments_")
[1] "Comments_Arlington_2017.xlsx"     "Comments_Cairo Skyline_2017.xlsx"
[3] "Comments_Tokyo Harbor_2017.xlsx"
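
The zero-length error quoted in the comments is consistent with a block that has no comments left for some site once the NA values are filtered out. The custom write.xlsx the asker ended up with is not shown, so what follows is only a minimal sketch of one possible guard, written in base R. Both `collect_block` and `pad_empty_sheet` are hypothetical helper names, and the "No comments" placeholder text is an assumption.

```r
# Hypothetical helper: stack the comment columns of one block into a single
# column and drop the missing comments.
collect_block <- function(data, cols, out_name) {
  comments <- unlist(data[cols], use.names = FALSE)
  out <- data.frame(comments[!is.na(comments)], stringsAsFactors = FALSE)
  names(out) <- out_name
  out
}

# Hypothetical helper: return the sheet unchanged if it has rows, otherwise
# return a one-row data frame with the same column names and a placeholder.
pad_empty_sheet <- function(sheet, placeholder = "No comments") {
  if (nrow(sheet) > 0) {
    return(sheet)
  }
  padded <- as.data.frame(
    matrix(placeholder, nrow = 1, ncol = ncol(sheet)),
    stringsAsFactors = FALSE
  )
  names(padded) <- names(sheet)
  padded
}

# A made-up site whose Operations block is entirely empty.
site_df <- data.frame(
  Site = c("Arlington", "Arlington"),
  Efficiency = c(NA_character_, NA_character_),
  Courtesy = c(NA_character_, NA_character_),
  stringsAsFactors = FALSE
)

ops <- collect_block(site_df, c("Efficiency", "Courtesy"), "Comments_Operations")
nrow(ops)                   # 0 rows: the case suspected of breaking the export
nrow(pad_empty_sheet(ops))  # 1 row holding the placeholder
```

Under these assumptions, each block could be passed through `pad_empty_sheet()` just before the corresponding write.xlsx call inside export_site_data, so that every site still gets all three tabs.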