R: How to generate multiple dummy variables?

data1 %>% spread(zone1, zone1yes) %>% spread(time, timeyes) %>% spread(name, nameyes) %>% spread(zone2, zone2yes)


However, I ran into some errors and I don't know why. How can I achieve this?

Using dplyr and tidyr, you can:

library(dplyr)
library(tidyr)

dums <- data1 %>%
   select("locate", everything()) %>%
   mutate(zone1yes = 1,
          timeyes = 1,
          zone2yes = 1,
          nameyes = 1) %>%
   spread(zone1, zone1yes) %>%
   spread(time, timeyes) %>%
   spread(name, nameyes) %>%
   spread(zone2, zone2yes)
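Since the columns have different types, first convert them all to a single type, character. Then reshape the data to long format and back to wide format, assigning 1 where a value is present for a given locate and 0 otherwise.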

library(dplyr)
library(tidyr)

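data1 %>%
  # convert every column to character so the values can share one column
  mutate(across(everything(), as.character)) %>%
  # long format: one row per (locate, column name, value)
  pivot_longer(cols = -locate) %>%
  # wide format: one column per distinct value, 1 if present, else 0
  pivot_wider(names_from = value, values_fill = 0, id_cols = locate, 
              values_fn = function(x) as.integer(length(x) > 0))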

#  locate     A     a apple `2000`  pear  pine banana     B     b orange `2001`
#  <chr>  <int> <int> <int>  <int> <int> <int>  <int> <int> <int>  <int>  <int>
#1 poor       1     1     1      1     1     0      0     0     0      0      0
#2 room       1     1     0      1     0     1      1     1     1      1      1
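The long-then-wide idea can also be sketched in base R, without dplyr or tidyr. This is only an illustration under stated assumptions, not part of the answer above: the names value_cols, long, and dummies are made up here, and data1 is rebuilt from the sample data so the snippet runs on its own.

```r
# Sample data, rebuilt here so the snippet is self-contained
data1 <- data.frame(
  zone1  = c("A", "A", "A", "A", "B"),
  zone2  = c("a", "a", "a", "a", "b"),
  name   = c("apple", "pear", "pine", "banana", "orange"),
  locate = c("poor", "poor", "room", "room", "room"),
  time   = c(2000, 2000, 2000, 2000, 2001),
  stringsAsFactors = FALSE
)

# Every column except the id column contributes values
value_cols <- setdiff(names(data1), "locate")

# Long format: stack the value columns (as character) one after another,
# repeating locate once per stacked column so the rows stay aligned
long <- data.frame(
  locate = rep(data1$locate, times = length(value_cols)),
  value  = unlist(lapply(data1[value_cols], as.character), use.names = FALSE),
  stringsAsFactors = FALSE
)

# Wide format: cross-tabulate locate against value, then turn counts into 0/1
dummies <- +(table(long$locate, long$value) > 0)
dummies
```

The result is a matrix-like table with one row per locate and one column per distinct value, which mirrors what the pivot_longer/pivot_wider pipeline produces as a tibble.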

With melt/dcast from data.table:

library(data.table)

# melt to long format (one row per locate/value pair), then cast to wide;
# the aggregate function returns 1 when a value occurs for a locate, else 0
dcast(melt(setDT(data1), id.vars = 'locate'), locate ~ value, function(x) +(length(x) > 0))
#   locate 2000 2001 A B a apple b banana orange pear pine
#1:   poor    1    0 1 0 1     1 0      0      0    1    0
#2:   room    1    1 1 1 1     0 1      1      1    0    1

Data

data1 <- structure(list(zone1 = c("A", "A", "A", "A", "B"), zone2 = c("a", 
"a", "a", "a", "b"), name = c("apple", "pear", "pine", "banana", 
"orange"), locate = c("poor", "poor", "room", "room", "room"), 
    time = c(2000, 2000, 2000, 2000, 2001)), class = "data.frame", 
row.names = c(NA, -5L))

Comments

The last vector in data1 (time) has 6 elements while the others have 5. Is this intentional? Are you trying to do something like one-hot encoding? I believe dummy variables can be created simply by converting the required columns to factors, but it looks like you are doing more than that here.
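On the one-hot encoding question raised in the comments: a minimal sketch of per-row dummy columns built from factors with base R's model.matrix. Note that this is a different result from the answers above, which collapse the data to one row per locate. The helper one_hot and the column prefixes are hypothetical names introduced only for this illustration.

```r
# Sample data, rebuilt here so the snippet is self-contained
data1 <- data.frame(
  zone1  = c("A", "A", "A", "A", "B"),
  zone2  = c("a", "a", "a", "a", "b"),
  name   = c("apple", "pear", "pine", "banana", "orange"),
  locate = c("poor", "poor", "room", "room", "room"),
  time   = c(2000, 2000, 2000, 2000, 2001),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one 0/1 column per level of x, named prefix_level.
# The "- 1" drops the intercept so that no level is left out.
one_hot <- function(x, prefix) {
  f <- factor(x)
  m <- model.matrix(~ f - 1)
  colnames(m) <- paste0(prefix, "_", levels(f))
  m
}

# One row per original row, one column per level of zone1 and name
row_dummies <- cbind(one_hot(data1$zone1, "zone1"),
                     one_hot(data1$name, "name"))
row_dummies
```

Each row has exactly one 1 per encoded column, so this form suits modelling on the original rows, while the grouped 0/1 table in the answers suits a per-locate summary.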