R: create additional columns with "yes"/"no" values
I have a dataset in which I want to create new columns named small, medium and large. These depend on the value in 2014, where 1:5 = small, 6:9 = medium and 10:14 = large. Each new column should contain yes or no, depending on the value for that year.
Here is what my dataset looks like:
# A tibble: 330 x 4
LOC_ID season `2014` `2015`
<chr> <chr> <int> <int>
1 LOC1002793 Summer 12 NA
2 LOC1002793 Winter 6 NA
3 LOC1004001 Winter NA 1
4 LOC1004488 Winter 8 NA
5 LOC1012349 Summer 12 12
6 LOC1012349 Winter 11 12
7 LOC1019836 Summer 14 10
8 LOC1019836 Winter 12 12
9 LOC1022032 Winter NA 1
10 LOC1034172 Summer 13 11
# ... with 320 more rows
Here is a tidyverse approach:
library(dplyr)
library(tidyr)

data %>%
  mutate(names = cut(`2014`,  # cut converts a numeric vector into a factor
                     breaks = c(1, 5, 9, 14),
                     labels = c("small", "medium", "large")),
         values = if_else(is.na(names), "NA", "yes")) %>%  # adds your "yes" values
  pivot_wider(names_from = names,
              values_from = values) %>%  # makes the cut labels into columns
  select(-`NA`) %>%  # an undesired `NA` column was created and needs to be removed
  mutate(across(large:small,
                ~replace_na(., "no")))  # replace NA values with "no"

# A tibble: 50 x 7
LOC_ID season `2014` `2015` large medium small
<chr> <chr> <int> <int> <chr> <chr> <chr>
1 LOC1002793 Summer 12 NA yes no no
2 LOC1002793 Winter 6 NA no yes no
3 LOC1004001 Winter NA 1 no no no
4 LOC1004488 Winter 8 NA no yes no
5 LOC1012349 Summer 12 12 yes no no
6 LOC1012349 Winter 11 12 yes no no
7 LOC1019836 Summer 14 10 yes no no
8 LOC1019836 Winter 12 12 yes no no
9 LOC1022032 Winter NA 1 no no no
10 LOC1034172 Summer 13 11 yes no no
# ... with 40 more rows
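
One thing to watch with cut(): its intervals are right-closed by default, so breaks = c(1, 5, 9, 14) give (1,5], (5,9] and (9,14], and a value of exactly 1 falls outside every bin and becomes NA. The following is a minimal base R sketch of that behaviour, not part of the original answer; the vector x is made up for illustration, and include.lowest = TRUE is one possible way to keep the value 1 in the small bin.

```r
# Made-up values for illustration, covering both edges (1 and 14) and an NA
x <- c(1L, 5L, 6L, 9L, 10L, 14L, NA)

# Default: right-closed intervals (1,5], (5,9], (9,14], so 1 is not covered
default_bins <- cut(x,
                    breaks = c(1, 5, 9, 14),
                    labels = c("small", "medium", "large"))

# include.lowest = TRUE closes the first interval on the left: [1,5]
closed_bins <- cut(x,
                   breaks = c(1, 5, 9, 14),
                   labels = c("small", "medium", "large"),
                   include.lowest = TRUE)

# Compare the two binnings side by side
print(data.frame(x, default_bins, closed_bins))
```

Using breaks = c(0, 5, 9, 14) instead should have the same effect for integer data, since 1 then lies inside (0,5].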
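Can you run dput(head(data, 10)) and paste the output into the question? That makes it easier to copy your data into R so that we can give you the best answer.
Oops, I was going to do that. Having a look now.
Should NAs trigger a blank "" rather than "no" in the 3 "size" columns?
Elegant answer anyway.
This worked really well for 2014, although for 2015 it fails to give yes for a value of 1, and it also returns many NAs in the medium column instead of no. If you want to take a look, I have updated the dput code with 2015 only. I used this: year_2015[which(year_2015[,4]==1,),6]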
# Replicate the years and name those replicates test1 = 2014 and test2 = 2015
# A tibble: 330 x 6
LOC_ID season test1 `2014` test2 `2015`
<chr> <chr> <int> <int> <int> <int>
1 LOC1002793 Summer 12 12 NA NA
2 LOC1002793 Winter 6 6 NA NA
3 LOC1004001 Winter NA NA 1 1
4 LOC1004488 Winter 8 8 NA NA
5 LOC1012349 Summer 12 12 12 12
6 LOC1012349 Winter 11 11 12 12
7 LOC1019836 Summer 14 14 10 10
8 LOC1019836 Winter 12 12 12 12
9 LOC1022032 Winter NA NA 1 1
10 LOC1034172 Summer 13 13 11 11
# ... with 320 more rows
ld <- d %>%
  mutate(test1 = recode(test1, `1:5` = 'low', `6:9` = 'medium', `10:14` = 'high')) %>%
  pivot_wider(names_from = test1, values_from = '2014')
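
As far as I can tell, recode() looks up individual values, so names such as `1:5` are not expanded into ranges. A range test per column is one way around that. The sketch below is an illustration in base R under stated assumptions: in_range() is a hypothetical helper, the data frame holds only the first four rows of the example, and NA is mapped to "no", which is a guess at the desired output.

```r
# Hypothetical helper: "yes" when x lies in [lo, hi], otherwise "no".
# NA is treated as "no" here, which is an assumption about the desired output.
in_range <- function(x, lo, hi) {
  ifelse(!is.na(x) & x >= lo & x <= hi, "yes", "no")
}

# First four rows of the example data (2014 column only)
d <- data.frame(
  LOC_ID = c("LOC1002793", "LOC1002793", "LOC1004001", "LOC1004488"),
  season = c("Summer", "Winter", "Winter", "Winter"),
  `2014` = c(12L, 6L, NA, 8L),
  check.names = FALSE  # keep the non-syntactic column name "2014"
)

# One yes/no column per size class
d$small  <- in_range(d$`2014`, 1, 5)
d$medium <- in_range(d$`2014`, 6, 9)
d$large  <- in_range(d$`2014`, 10, 14)

print(d)
```

The same helper could be called inside dplyr's mutate(), once per new column, which avoids the pivot step altogether.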
structure(list(LOC_ID = c("LOC1002793", "LOC1002793", "LOC1004001",
"LOC1004488", "LOC1012349", "LOC1012349", "LOC1019836", "LOC1019836",
"LOC1022032", "LOC1034172", "LOC1034172", "LOC1039789", "LOC1040038",
"LOC1040038", "LOC1047222314194", "LOC1047222314194", "LOC1048553080056",
"LOC1049318", "LOC1049318", "LOC1049970899816", "LOC1049970899816",
"LOC1066628", "LOC1066628", "LOC1071566", "LOC1071566", "LOC1071569",
"LOC1071569", "LOC1073191", "LOC1073191", "LOC1073423", "LOC1073423",
"LOC1079978", "LOC1079978", "LOC1083442", "LOC1083442", "LOC1086293",
"LOC1086293", "LOC1087213", "LOC1087213", "LOC1088795", "LOC1088795",
"LOC1122438", "LOC1122438", "LOC1139319877260", "LOC1139319877260",
"LOC1153084541859", "LOC1153084541859", "LOC1155749", "LOC1163128",
"LOC1163128", "LOC1234081", "LOC1234081", "LOC1289919", "LOC1289919",
"LOC1294966340210", "LOC1300602115", "LOC1300602115", "LOC1300602122",
"LOC1300602122", "LOC1300602135", "LOC1300602135", "LOC1300602161",
"LOC1300602161", "LOC1300602184", "LOC1300602184", "LOC1300602196",
"LOC1300602196", "LOC1300602243", "LOC1300602243", "LOC1300602306",
"LOC1300602306", "LOC1300604079", "LOC1300604079", "LOC1300604135",
"LOC1300604135", "LOC1300604635", "LOC1300604635", "LOC1300604699",
"LOC1300604699", "LOC1300604713"), season = c("Summer", "Winter",
"Winter", "Winter", "Summer", "Winter", "Summer", "Winter", "Winter",
"Summer", "Winter", "Winter", "Summer", "Winter", "Summer", "Winter",
"Summer", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter",
"Summer", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer",
"Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter",
"Summer", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer",
"Winter", "Summer", "Winter", "Winter", "Summer", "Winter", "Summer",
"Winter", "Summer", "Winter", "Winter", "Summer", "Winter", "Summer",
"Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter",
"Summer", "Winter", "Summer", "Winter", "Summer", "Winter", "Summer",
"Winter", "Summer", "Winter", "Summer", "Winter", "Summer", "Winter",
"Summer"), `2015` = c(NA, NA, 1L, NA, 12L, 12L, 10L, 12L, 1L,
11L, 12L, NA, 2L, 5L, 11L, 9L, NA, 5L, 7L, 13L, 12L, 9L, 11L,
9L, 11L, 11L, 7L, 7L, 12L, 8L, 12L, 12L, 7L, 13L, 12L, 12L, 12L,
4L, 8L, 7L, 10L, 7L, 4L, 12L, 12L, 2L, 5L, NA, NA, 9L, 12L, 7L,
11L, 4L, 8L, 11L, 12L, 13L, 12L, 10L, 12L, 12L, 12L, 11L, 4L,
13L, 12L, 12L, 12L, 10L, 10L, 11L, 10L, 12L, 11L, 9L, 11L, 9L,
10L, 13L)), row.names = c(NA, -80L), class = c("tbl_df", "tbl",
"data.frame"))