Convert binary columns into one column in R
I'm very new to R and only have basic skills so far. Even though I have looked at functions such as melt() and gather(), I could not get them to work for me.
What I want to do is transform this data. The housing options (own house, renting, homeless) only take the values 1 and 0, and at most one of them can be 1 for a given passenger (you cannot be renting and homeless at the same time).
For example, I would like the data to look like this:
Passenger ID   Housing         Age   Gender
1              Has own house   21    Male
2              Renting         24    Female
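When it comes to prediction, could you tell me whether the layout above (with binary factors) is more efficient in terms of speed, or would putting all the factors into one column be the better solution?
Try this: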
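library(tidyverse)
# importing your data
df <- read_table("Passenger_ID Has_Own_House Renting Homeless Age Gender
1 1 0 0 21 Male
2 0 1 0 24 Female")
# assumed reshaping step, inferred from the output: stack the flags, keep the 1s
df %>%
  gather(Housing, Value, Has_Own_House:Homeless) %>%
  filter(Value == 1) %>%
  select(-Value)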
The output is:
# A tibble: 2 x 4
#   Passenger_ID   Age Gender Housing
#          <int> <int> <chr>  <chr>
# 1            1    21 Male   Has_Own_House
# 2            2    24 Female Renting
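On recent versions of tidyr, gather() is superseded by pivot_longer(), so the same reshaping can be sketched as below. This is only an illustration: the data is rebuilt by hand from the two example passengers, and the names housing_cols, flag and result are made up for the sketch.

```r
library(tidyr)
library(dplyr)

# The two example passengers, rebuilt by hand
df <- tibble(
  Passenger_ID  = 1:2,
  Has_Own_House = c(1L, 0L),
  Renting       = c(0L, 1L),
  Homeless      = c(0L, 0L),
  Age           = c(21L, 24L),
  Gender        = c("Male", "Female")
)

# Hypothetical helper name: the indicator columns to collapse
housing_cols <- c("Has_Own_House", "Renting", "Homeless")

result <- df %>%
  # stack the indicator columns: one row per passenger/option pair
  pivot_longer(all_of(housing_cols), names_to = "Housing", values_to = "flag") %>%
  # keep only the option that is switched on for each passenger
  filter(flag == 1) %>%
  select(-flag)

result
```

A passenger whose three indicators are all 0 would disappear from result, so it may be worth comparing nrow(result) with nrow(df) afterwards.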
In base R with ifelse:
# Load Data
dat <- structure(list(Passenger_ID = 1:2, Has_Own_House = c(1L, 0L),
Renting = 0:1, Homeless = c(0L, 0L), Age = c(21L, 24L), Gender = structure(c(2L,
1L), .Label = c("Female", "Male"), class = "factor")), .Names = c("Passenger_ID",
"Has_Own_House", "Renting", "Homeless", "Age", "Gender"), class = "data.frame", row.names = c(NA,
-2L))
# Assign new column "Housing" based on testing nested ifelse statements:
dat2 <- within(dat, Housing <- ifelse(Has_Own_House==1, "Has_Own_House",
ifelse(Renting==1, "Renting",
ifelse(Homeless==1, "Homeless", NA))))
# Remove extra columns
dat2$Has_Own_House <- NULL
dat2$Renting <- NULL
dat2$Homeless <- NULL
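In base R, you can simply assign a new column by applying, to every row of the data frame (the 1 argument), a function that returns the name of the column whose value is 1 (thanks to which):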
You should provide a reproducible example. If df is your data.frame, you can do something like df$HOUSING = apply(df[, 2:4], 1, function(x) names(df)[2:4][which(x == 1)]).
Printing dat2 from the ifelse approach gives:
> dat2
  Passenger_ID Age Gender       Housing
1            1  21   Male Has_Own_House
2            2  24 Female       Renting
A complete example of the apply approach:
df = data.frame('Passenger ID' = 1:5,
                'Has Own House' = c(1,0,0,1,0),
                'Renting' = c(0,1,0,0,0),
                'Homeless' = c(0,0,1,0,1),
                'Age' = 21:25,
                'Gender' = c('Male', 'Female', 'Male', 'Female', 'Male'))
# for each row, pick the name of the housing column whose value is 1
df$HOUSING = apply(df[, 2:4], 1, function(x) names(df)[2:4][which(x==1)])
df
#   Passenger.ID Has.Own.House Renting Homeless Age Gender       HOUSING
# 1            1             1       0        0  21   Male Has.Own.House
# 2            2             0       1        0  22 Female       Renting
# 3            3             0       0        1  23   Male      Homeless
# 4            4             1       0        0  24 Female Has.Own.House
# 5            5             0       0        1  25   Male      Homeless
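The apply/which approach relies on every row having exactly one 1. If a row has none or several, which() returns a vector of a different length and apply() may hand back a list instead of a character vector. A minimal sketch of a vectorised variant that checks this assumption first is shown below. It swaps in max.col(), which is not part of the answers above, and the names housing_cols and flags are made up for the illustration.

```r
# The five passengers from the apply() example (indicator columns only)
df <- data.frame(
  Passenger.ID  = 1:5,
  Has.Own.House = c(1, 0, 0, 1, 0),
  Renting       = c(0, 1, 0, 0, 0),
  Homeless      = c(0, 0, 1, 0, 1)
)

# Hypothetical helper names for this sketch
housing_cols <- c("Has.Own.House", "Renting", "Homeless")
flags <- as.matrix(df[housing_cols])

# Guard the assumption: exactly one indicator is set in every row
stopifnot(all(rowSums(flags) == 1))

# max.col() returns, per row, the position of the largest value (the single 1)
df$HOUSING <- housing_cols[max.col(flags, ties.method = "first")]
df$HOUSING
```

Because the work is done on a matrix rather than row by row, this avoids calling an R function once per passenger, which may matter on larger data.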