Reformatting a data frame in R
I have multiple columns and want to reshape the data frame so that it has fewer columns. Here is my attempt:
# Dataframe
df <- data.frame(
~Location, ~Product_Name, ~Category, ~Machine1, ~Machine2 ~Machine1_adds, ~Machine2_adds, ~Sales1, ~Saless2, Spoils1, Spoils2
A, "Snickers", Candy, 0, 1, $2.5, $3, 2, 1
A, "Kitcat", Candy, 0, 1, $3, $3, 2, 1
A, "Pepsi", Bev, 1, 1, $5, $4, 3, 0
B, "Coke", Bev, 1, 0, $5, $6.45, 1, 1
B, "Gatoraid", Bev, 0, 1, $4, $4.45, 1, 0
B, "Sprite", Bev, 1, 1, $8, $6, 1, 0
)
df
Here is one option, using melt from data.table:
library(data.table)
melt(setDT(df), measure = patterns("^Machine", "^Sales", "^Spoils"),
value.name = c("Machine_adds", "Sales", "Spoils"))[, variable := NULL][]
# Location Product_Name Category Machine_adds Sales Spoils
# 1: A Snickers Candy 0 $2.5 2
# 2: A Kitcat Candy 0 $3 2
# 3: A Pepsi Bev 1 $5 3
# 4: B Coke Bev 1 $5 1
# 5: B Gatoraid Bev 0 $4 1
# 6: B Sprite Bev 1 $8 1
# 7: A Snickers Candy 1 $3 1
# 8: A Kitcat Candy 1 $3 1
# 9: A Pepsi Bev 1 $4 0
#10: B Coke Bev 0 $6.45 1
#11: B Gatoraid Bev 1 $4.45 0
#12: B Sprite Bev 1 $6 0
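The melt() call above drops the variable column, which holds the index of the column group (1 or 2). If that index is needed as a machine label, a minimal sketch could keep it under a different name instead. The column name machine_id is a hypothetical choice, and the data below is copied from the dput output in this thread:

```r
library(data.table)

# Sample data with the same values as the dput output in this thread
df <- data.frame(
  Location = c("A", "A", "A", "B", "B", "B"),
  Product_Name = c("Snickers", "Kitcat", "Pepsi", "Coke", "Gatoraid", "Sprite"),
  Category = c("Candy", "Candy", "Bev", "Bev", "Bev", "Bev"),
  Machine1_adds = c(0, 0, 1, 1, 0, 1),
  Machine2_adds = c(1, 1, 1, 0, 1, 1),
  Sales1 = c("$2.5", "$3", "$5", "$5", "$4", "$8"),
  Sales2 = c("$3", "$3", "$4", "$6.45", "$4.45", "$6"),
  Spoils1 = c(2, 2, 3, 1, 1, 1),
  Spoils2 = c(1, 1, 0, 1, 0, 0),
  stringsAsFactors = FALSE
)

long <- melt(
  as.data.table(df),
  measure.vars = patterns("^Machine", "^Sales", "^Spoils"),
  variable.name = "machine_id",  # hypothetical name for the group index
  value.name = c("Machine_adds", "Sales", "Spoils")
)

# With several measure groups, melt() returns the index as a factor
# with levels "1", "2"; convert it to a plain integer
long[, machine_id := as.integer(as.character(machine_id))]
long
```

The rows come out in the same order as in the output above, with machine_id equal to 1 for the first six rows and 2 for the last six.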
Update
Based on the OP's updated example, if there are both "Machine" and "Machine_adds" columns, we can change the patterns slightly:
# creating new columns in the dataset
df[c('Machine1', 'Machine2')] <- df[c("Machine1_adds", "Machine2_adds")]
melt(setDT(df), measure = patterns("^Machine\\d+$",
"^Machine\\d+_adds$", "^Sales", "^Spoils"),
value.name = c("Machine", "Machine_adds", "Sales", "Spoils"))[,
variable := NULL][]
From what you have, you are looking for the reshape function, where the varying columns are grouped as list(c(4,5), c(6,7), c(8,9)). You can use:
reshape(df,t(matrix(4:ncol(df),2)),idvar = 1:3,dir="long")
or
reshape(df,list(c(4,5),c(6,7),c(8,9)),idvar = 1:3,dir="long")
To get the names you have, I would use the v.names argument:
reshape(df,list(c(4,5),c(6,7),c(8,9)),idvar = 1:3,dir="long",
v.names = c("Machine_adds","Sales","Spoils"))[-4]# -4 removes the time variable.
Location Product_Name Category Machine_adds Sales Spoils
A.Snickers.Candy.1 A Snickers Candy 0 $2.5 2
A.Kitcat.Candy.1 A Kitcat Candy 0 $3 2
A.Pepsi.Bev.1 A Pepsi Bev 1 $5 3
B.Coke.Bev.1 B Coke Bev 1 $5 1
B.Gatoraid.Bev.1 B Gatoraid Bev 0 $4 1
B.Sprite.Bev.1 B Sprite Bev 1 $8 1
A.Snickers.Candy.2 A Snickers Candy 1 $3 1
A.Kitcat.Candy.2 A Kitcat Candy 1 $3 1
A.Pepsi.Bev.2 A Pepsi Bev 1 $4 0
B.Coke.Bev.2 B Coke Bev 0 $6.45 1
B.Gatoraid.Bev.2 B Gatoraid Bev 1 $4.45 0
B.Sprite.Bev.2 B Sprite Bev 1 $6 0
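The reshape() calls above rely on column positions and on partial matching of dir to direction. As a sketch of the same idea with every argument spelled out by name, the version below keeps the time variable instead of dropping it. The name machine_id is a hypothetical choice, and the data is copied from the dput output in this thread as a plain data.frame:

```r
# Sample data with the same values as the dput output in this thread
df <- data.frame(
  Location = c("A", "A", "A", "B", "B", "B"),
  Product_Name = c("Snickers", "Kitcat", "Pepsi", "Coke", "Gatoraid", "Sprite"),
  Category = c("Candy", "Candy", "Bev", "Bev", "Bev", "Bev"),
  Machine1_adds = c(0, 0, 1, 1, 0, 1),
  Machine2_adds = c(1, 1, 1, 0, 1, 1),
  Sales1 = c("$2.5", "$3", "$5", "$5", "$4", "$8"),
  Sales2 = c("$3", "$3", "$4", "$6.45", "$4.45", "$6"),
  Spoils1 = c(2, 2, 3, 1, 1, 1),
  Spoils2 = c(1, 1, 0, 1, 0, 0),
  stringsAsFactors = FALSE
)

long <- reshape(
  df,
  # one vector of wide column names per long column
  varying = list(
    c("Machine1_adds", "Machine2_adds"),
    c("Sales1", "Sales2"),
    c("Spoils1", "Spoils2")
  ),
  v.names = c("Machine_adds", "Sales", "Spoils"),
  timevar = "machine_id",  # hypothetical name for the index column
  times = 1:2,
  idvar = c("Location", "Product_Name", "Category"),
  direction = "long"
)

# Replace the generated row names (e.g. "A.Snickers.Candy.1") with 1..n
rownames(long) <- NULL
long
```

Naming the columns makes the call robust to a change in column order, at the cost of being more verbose than the index-based version.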
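Your example gives an error. Can you paste the output of dput(df) here?
Does this answer your question?
It would be easier to help if your sample data were in a format that can be run directly. You have set this up like a call to tribble, but with syntax that does not actually work, e.g. unquoted strings and $2.5.
Everything works except getting the machine label (1, 2, or 3)... it just shows variable.
@Dinho It is variable and not variable
@Dinho Got it. I think you have both Machine and Machine_adds columns. I used an example with only the Machine_adds column.
@Dinho You can change the code to melt(setDT(df), measure = patterns("^Machine\\d+$", "^Machine\\d+_adds$", "^Sales", "^Spoils"), value.name = c("Machine", "Machine_adds", "Sales", "Spoils"))[, variable := NULL][]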
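Update
A tidyverse option, using pivot_longer:
library(dplyr)
library(tidyr)
library(stringr)

df %>%
  # move the index to the end of the name: Machine1_adds -> Machine_adds1
  rename_at(vars(matches('^Machine.*adds$')), ~
      str_replace(., '(\\d+)_(\\w+)$', '_\\2\\1')) %>%
  # separate the index with ":" so that names_sep can split on it: Sales1 -> Sales:1
  rename_at(3:ncol(.), ~ str_replace(., "(\\d+)_?.*", ":\\1")) %>%
  pivot_longer(cols = matches("^(Machine|Sales|Spoils)"),
      names_to = c(".value", "group"), names_sep = ":") %>%
  select(-group)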
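The rename-then-pivot idea above could also be sketched with rename_with() and the names_pattern argument of pivot_longer(), which avoids the intermediate ":" separator. This is an illustration under the assumption that, after the rename, every measure column ends in its numeric index. The name machine_id is hypothetical, and the data is copied from the dput output in this thread:

```r
library(dplyr)
library(tidyr)

# Sample data with the same values as the dput output in this thread
df <- tibble(
  Location = c("A", "A", "A", "B", "B", "B"),
  Product_Name = c("Snickers", "Kitcat", "Pepsi", "Coke", "Gatoraid", "Sprite"),
  Category = c("Candy", "Candy", "Bev", "Bev", "Bev", "Bev"),
  Machine1_adds = c(0, 0, 1, 1, 0, 1),
  Machine2_adds = c(1, 1, 1, 0, 1, 1),
  Sales1 = c("$2.5", "$3", "$5", "$5", "$4", "$8"),
  Sales2 = c("$3", "$3", "$4", "$6.45", "$4.45", "$6"),
  Spoils1 = c(2, 2, 3, 1, 1, 1),
  Spoils2 = c(1, 1, 0, 1, 0, 0)
)

long <- df %>%
  # Move the index to the end of the name: Machine1_adds -> Machine_adds1
  rename_with(~ sub("^(Machine)(\\d+)_adds$", "\\1_adds\\2", .x)) %>%
  pivot_longer(
    cols = matches("\\d+$"),                   # every column ending in a digit
    names_to = c(".value", "machine_id"),      # machine_id is a hypothetical name
    names_pattern = "^(\\D+)(\\d+)$",          # non-digit prefix, then the index
    names_transform = list(machine_id = as.integer)
  )
long
```

Unlike the melt and reshape results, pivot_longer() orders the output by original row first, so the two machines of each product end up on adjacent rows.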
Data
df <- structure(list(
  Location = c("A", "A", "A", "B", "B", "B"),
  Product_Name = c("Snickers", "Kitcat", "Pepsi", "Coke", "Gatoraid", "Sprite"),
  Category = c("Candy", "Candy", "Bev", "Bev", "Bev", "Bev"),
  Machine1_adds = c(0, 0, 1, 1, 0, 1),
  Machine2_adds = c(1, 1, 1, 0, 1, 1),
  Sales1 = c("$2.5", "$3", "$5", "$5", "$4", "$8"),
  Sales2 = c("$3", "$3", "$4", "$6.45", "$4.45", "$6"),
  Spoils1 = c(2, 2, 3, 1, 1, 1),
  Spoils2 = c(1, 1, 0, 1, 0, 0)
), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))