R: Iterating over data and creating a new data frame


I am working with a data frame that comes from a database and looks like this:

username    elements
username1   """interfaces"".""dual()"""
username1   """interfaces"".""f_capitalaccrualcurrentyear"""
username2   """interfaces"".""dnow_completion"",""interfaces"".""dnow_s_daily_prod_ta"""
username2   """interfaces"".""dnow_completion"",""interfaces"".""dnow_s_daily_prod_ta"""
username2   """interfaces"".""dnow_completion"",""interfaces"".""dnow_s_daily_prod_ta"""
username4   """interfaces"".""dnow_s_downtime_stat_with_lat_long"""
username3   """interfaces"".""dnow_completion"",""interfaces"".""dnow_s_daily_prod_ta"""

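So there are two columns, username and elements. A user can use one or more elements in a single transaction. When there is more than one element, they are separated by commas within the transaction. I need to split the elements so that there is one per row, each still tagged with its username. In the end, I would like this: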

username    elements
username1   """interfaces"".""dual()"""
username1   """interfaces"".""f_capitalaccrualcurrentyear"""
username2   """interfaces"".""dnow_completion""
username2   ""interfaces"".""dnow_s_daily_prod_ta"""
username2   """interfaces"".""dnow_completion""
username2   ""interfaces"".""dnow_s_daily_prod_ta"""
username2   """interfaces"".""dnow_completion""
username2   ""interfaces"".""dnow_s_daily_prod_ta"""
username4   """interfaces"".""dnow_s_downtime_stat_with_lat_long"""
username3   """interfaces"".""dnow_completion""
username3   ""interfaces"".""dnow_s_daily_prod_ta"""

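I have been trying to iterate over the data frame, split the elements that contain commas, and then put them back together with the corresponding username.

I have been trying the code below, but it is very inefficient. I am new to R, so I assume there is a more efficient way to do this.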

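# Start with an empty data frame that accumulates the results
interface.data <- data.frame(
    username = character(),
    elements = character()
)
for (row in seq_len(nrow(input))) { ## input is the frame that comes from the database
     myrowbrk <- as.character(input[row, "elements"])
     # Split the comma-separated string into one entry per element
     myrowelements <- strsplit(myrowbrk, ",", fixed = TRUE)[[1]]
     user <- input[row, "username"]
     interface.newdata <- data.frame(
         username = user,
         elements = myrowelements
     )
     # Append to the accumulated result (growing a data frame row by row is slow)
     interface.data <- rbind(interface.data, interface.newdata)
}

output <- interface.data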

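You can use the tidyr package to do this. My solution takes two steps to get the data into the desired format: (1) split the elements column on the comma character, and (2) reshape the result from wide to long format.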

library(tidyr)
library(dplyr)  # provides select(), used in the last step

#Separate the 'elements' column from your 'df' data frame using the comma character
#Set the new variable names as a sequence of 1 to the max number of expected columns
df2 <- separate(data = df, 
                   col = elements, 
                   into = as.character(seq(1,2,1)),
                   sep = ",")
#This code gives a warning because not every row has a string with a comma. 
#Empty entries are filled with NA

#Then change from wide to long format, dropping NA entries
#Drop the column that indicates the name of the column from which the elements entry was obtained (i.e., 1 or 2)
df2 <- df2 %>%
  pivot_longer(cols = "1":"2",
               values_to = "elements",
               values_drop_na = TRUE) %>%
  select(-name)
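
The call to separate() above hard-codes two output columns, which assumes no row holds more than two elements. As a rough sketch, that number could instead be derived from the data with base R. The sample `df` and the names `pieces`, `n_max` and `into_names` are hypothetical and only serve as illustration.

```r
# Hypothetical sample with up to three comma-separated elements per row
df <- data.frame(
  username = c("username1", "username2", "username3"),
  elements = c("a", "b,c", "d,e,f"),
  stringsAsFactors = FALSE
)

# Count the pieces in each row, then take the largest count
pieces <- strsplit(df$elements, ",", fixed = TRUE)
n_max <- max(lengths(pieces))

# Column names that could be passed as the 'into' argument of separate()
into_names <- as.character(seq_len(n_max))
print(into_names)
```

If this approach is used, the `cols` argument of pivot_longer() would need to cover the same names, for example with `cols = all_of(into_names)`.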

Try tidyr::separate_rows(input, elements, sep = ",")?

Awesome! That solved the problem. Thank you very much.

Thanks for your reply! Much appreciated!
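
The one-liner suggested in the comments can be sketched on a small example. The `input` data frame below is a hypothetical stand-in for the database result, with the quoting simplified to one pair of double quotes per identifier; only the `separate_rows()` call comes from the thread.

```r
library(tidyr)

# Hypothetical stand-in for the data frame returned by the database
input <- data.frame(
  username = c("username1", "username2", "username4"),
  elements = c(
    '"interfaces"."dual()"',
    '"interfaces"."dnow_completion","interfaces"."dnow_s_daily_prod_ta"',
    '"interfaces"."dnow_s_downtime_stat_with_lat_long"'
  ),
  stringsAsFactors = FALSE
)

# Split 'elements' on commas and emit one row per piece,
# repeating the username for every piece
output <- separate_rows(input, elements, sep = ",")
print(output)
```

This does the split and the reshape in a single step, so there is no need to know the maximum number of elements per row in advance.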