R: How do I generate multiple dummy variables?

data1 %>%
  spread(zone1, zone1yes) %>%
  spread(time, timeyes) %>%
  spread(name, nameyes) %>%
  spread(zone2, zone2yes)

But I get some errors and I don't know why. How can I achieve this?

Tags: r
Using dplyr and tidyr you can:
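library(dplyr)
library(tidyr)

# Add one indicator column per variable, then spread each variable
# so that its values become column names
dums <- data1 %>%
  select("locate", everything()) %>%
  mutate(zone1yes = 1,
         timeyes = 1,
         zone2yes = 1,
         nameyes = 1) %>%
  spread(zone1, zone1yes) %>%
  spread(time, timeyes) %>%
  spread(name, nameyes) %>%
  spread(zone2, zone2yes)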
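Because the columns hold different types of data, we first need to convert them all to a single type, character. We then reshape them to long format and back to wide format, assigning 1 where a value is present and 0 otherwise: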
library(dplyr)
library(tidyr)

data1 %>%
  # convert every column to character so the values can share one column
  mutate(across(everything(), as.character)) %>%
  pivot_longer(cols = -locate) %>%
  pivot_wider(names_from = value, values_fill = 0, id_cols = locate,
              values_fn = function(x) as.integer(length(x) > 0))
#   locate     A     a apple `2000`  pear  pine banana     B     b orange `2001`
#   <chr>  <int> <int> <int>  <int> <int> <int>  <int> <int> <int>  <int>  <int>
# 1 poor       1     1     1      1     1     0      0     0     0      0      0
# 2 room       1     1     0      1     0     1      1     1     1      1      1
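The long-then-wide idea described above can also be sketched in base R, without any packages. This is only an illustration under stated assumptions: the data frame is copied from the data posted in this thread, and the names `value_cols`, `long` and `dummies` are made up for the example. It uses `table()` in place of `pivot_wider()`, so the columns come out in a different order from the tidyr result.

```r
# Example data, copied from the data posted in this thread
data1 <- data.frame(
  zone1  = c("A", "A", "A", "A", "B"),
  zone2  = c("a", "a", "a", "a", "b"),
  name   = c("apple", "pear", "pine", "banana", "orange"),
  locate = c("poor", "poor", "room", "room", "room"),
  time   = c(2000, 2000, 2000, 2000, 2001)
)

# Every column except the id column contributes values
value_cols <- setdiff(names(data1), "locate")

# Long format: one (locate, value) pair per cell, all values as character.
# unlist() stacks the columns one after another, so locate is repeated
# once per value column to stay aligned.
long <- data.frame(
  locate = rep(data1$locate, times = length(value_cols)),
  value  = unlist(lapply(data1[value_cols], as.character), use.names = FALSE)
)

# Wide format: count each pair, then reduce the counts to 0/1 indicators
dummies <- +(table(long$locate, long$value) > 0)
dummies
```

The result has one row per `locate` and one column per distinct value, with 1 where that value occurs for that `locate` and 0 otherwise.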
Data:

data1 <- structure(list(zone1 = c("A", "A", "A", "A", "B"), zone2 = c("a",
"a", "a", "a", "b"), name = c("apple", "pear", "pine", "banana",
"orange"), locate = c("poor", "poor", "room", "room", "room"),
    time = c(2000, 2000, 2000, 2000, 2001)), class = "data.frame",
    row.names = c(NA, -5L))

Comment on the question: The last vector in data1 (time) has 6 elements while the rest have 5. Is that intentional? Are you trying to do something like one-hot encoding? I believe you can create a dummy variable simply by converting the desired column to a factor, but it looks like you are doing more than that here.

Another option is melt/dcast from data.table:
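library(data.table)

# melt to long format, then cast back to wide: a value gets 1 if it
# occurs for that locate and 0 otherwise
dcast(melt(setDT(data1), id.vars = 'locate'), locate ~ value,
      fun.aggregate = function(x) +(length(x) > 0))
#    locate 2000 2001 A B a apple b banana orange pear pine
# 1:   poor    1    0 1 0 1     1 0      0      0    1    0
# 2:   room    1    1 1 1 1     0 1      1      1    0    1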
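On the comment's point that converting a column to a factor is enough to get dummy variables: a minimal sketch of that idea for a single column, using base R's `model.matrix()`. The two-column data frame is a cut-down copy of the thread's data, and the names `dummies` and `per_locate` are hypothetical. The extra `rowsum()` step is an assumption about what the question wants, namely one row per `locate` as in the answers above.

```r
# Cut-down copy of the thread's data: one value column plus the id column
data1 <- data.frame(
  zone1  = c("A", "A", "A", "A", "B"),
  locate = c("poor", "poor", "room", "room", "room")
)

# One indicator column per factor level, one row per observation.
# "- 1" drops the intercept so that every level gets its own column.
dummies <- model.matrix(~ factor(zone1) - 1, data = data1)
colnames(dummies) <- levels(factor(data1$zone1))

# Collapse to one row per locate: 1 if the level occurs at all, else 0
per_locate <- +(rowsum(dummies, data1$locate) > 0)
per_locate
```

This handles one column at a time, so covering zone1, zone2, name and time together would need a loop or the reshape-based answers above.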