R: creating variable interactions as a routine
Tags: r, regression, interaction, lasso-regression
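From my dataset df, the variable size can be treated as numeric (it can be converted with small = 1, medium = 2, large = 3).
Any clues?

One option is to reshape to 'long' format with pivot_longer, recode the values of 'size', multiply the 'dest' and 'via' columns by 'size', reshape back to 'wide' format, and join the result with the original data.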
library(dplyr)
library(tidyr)

# columns were all character, so convert them to their natural types
df <- df %>%
  type.convert(as.is = TRUE)

df %>%
  # reshape to long format
  pivot_longer(cols = dest1:via2, names_to = c(".value", "grp"),
               names_sep = "(?<=[a-z])(?=[0-9])") %>%
  # recode the size column
  mutate(size = setNames(1:3, c("small", "medium", "large"))[size],
         # loop over the 'dest', 'via' columns, multiply with size
         across(c(dest, via), ~ . * size, .names = "size_{.col}")) %>%
  # remove the columns not needed
  select(-size, -dest, -via) %>%
  # reshape to wide format
  pivot_wider(names_from = grp, values_from = c(size_dest, size_via)) %>%
  # join with the original dataset
  right_join(df) %>%
  # reorder the columns in select
  select(names(df), everything())
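The least obvious part of the pipeline is probably the names_sep pattern. The base-R sketch below only illustrates where that pattern matches; it is not how tidyr performs the split internally. The variable names cols, marked, and parts are made up for the illustration.

```r
# Column names that the pipeline reshapes.
cols <- c("dest1", "dest2", "via1", "via2")

# The pattern matches the zero-width position that is preceded by a
# lowercase letter and followed by a digit. Insert a visible marker there.
marked <- sub("(?<=[a-z])(?=[0-9])", "|", cols, perl = TRUE)
print(marked)

# Split at the marker to see the two pieces each name is cut into.
parts <- do.call(rbind, strsplit(marked, "|", fixed = TRUE))
print(parts)
```

With names_to = c(".value", "grp"), the first piece ("dest" or "via") becomes a column name in the long data and the second piece ("1" or "2") is stored in grp, which is why the final columns end in _1 and _2.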
Comments:

Please read, include any solutions you have tried and your expected output, and update your question. As currently written, it is too unfocused for people to answer.

I have included the solution I tried. I have also included a reproducible example of my problem and what the expected result looks like. The question was edited anyway, even though, judging from the first comment I received, it seemed focused enough for readers. @Lengreski

All good until: "Error: Can't subset columns that don't exist. x Column `size_dest` doesn't exist." It appears at the line "# // reshape to wide format pivot_wider(names_from = grp, values_from = c(size_dest, size_via)) %>%". Any ideas?

@vog Not sure about that error. I used the same data from the input. Do you have a different version of tidyr?
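To follow up on the version question, the installed package versions can be checked directly. This is a minimal sketch, and it does not establish that a version mismatch caused the error above. It relies on two facts: across() was added in dplyr 1.0.0, and pivot_longer() / pivot_wider() were added in tidyr 1.0.0. The helper meets_minimum is a hypothetical name, not part of either package.

```r
# Hypothetical helper: TRUE if `pkg` is installed at `min_version` or newer.
meets_minimum <- function(pkg, min_version) {
  requireNamespace(pkg, quietly = TRUE) &&
    packageVersion(pkg) >= min_version
}

# Report the status of the two packages the pipeline depends on.
for (pkg in c("dplyr", "tidyr")) {
  cat(pkg, ">= 1.0.0:", meets_minimum(pkg, "1.0.0"), "\n")
}
```

If either line prints FALSE, updating that package is a reasonable first thing to try before looking for other causes.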
# Output of the pipeline:
# A tibble: 5 x 10
#      id size   dest1 dest2  via1  via2 size_dest_1 size_dest_2 size_via_1 size_via_2
#   <int> <chr>  <int> <int> <int> <int>       <int>       <int>      <int>      <int>
# 1     1 small      1     0     1     1           1           0          1          1
# 2     2 large      0     1     1     0           0           3          3          0
# 3     3 small      1     1     0     1           1           1          0          1
# 4     4 small      0     0     0     0           0           0          0          0
# 5     5 medium     1     1     0     1           2           2          0          2
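As a cross-check that does not depend on dplyr or tidyr, the same interaction columns can be computed in base R. This is a sketch of an alternative, not the answer's method. The input values are copied from the first six columns of the table above, and size_num, cols, interactions, and result are names chosen for the illustration.

```r
# Input columns copied from the result table above.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
id size   dest1 dest2 via1 via2
1  small  1     0     1    1
2  large  0     1     1    0
3  small  1     1     0    1
4  small  0     0     0    0
5  medium 1     1     0    1
")

# Recode size with a named lookup vector (same mapping as the pipeline).
size_num <- unname(c(small = 1L, medium = 2L, large = 3L)[df$size])

# Multiply every dest/via column by the recoded size, row by row.
cols <- c("dest1", "dest2", "via1", "via2")
interactions <- df[cols] * size_num

# Name the new columns like the pipeline output, e.g. dest1 -> size_dest_1.
names(interactions) <- sub("([a-z]+)([0-9]+)", "size_\\1_\\2", cols)

result <- cbind(df, interactions)
print(result)
```

Because the multiplication is applied to the wide data directly, no reshaping or join is needed; the trade-off is that the column naming has to be handled by hand.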