R: vertical lookup within a specific range
I have a data frame that looks like this; these are the first 24 rows:

row  Name         Age
1    John         22
2    Alice        29
3    Michael      33
4    Briefing     NA
5    Class        A
6    Year         2016
7    Observation  0
8    End          NA
9    Steward      35
10   Louis        20
11   Josh         22
12   Marie        39
13   Briefing     NA
14   Observation  2
15   Year         2017
16   End          NA
17   Adam         27
18   Joseph       26
19   Andrew       26
20   Briefing     NA
21   Observation  2
22   Year         2017
23   Class        B
24   End          NA
Basically, you have two data frames stacked on top of each other by row. We split them apart, then join them back together by column.
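To see how the two stacked frames can be told apart, here is a minimal base R sketch on a made-up six-element vector. The vector `name` and its values are illustrative only, not the asker's data, and the sketch assumes, as in the question, that every lookup block opens with "Briefing" and closes with "End".

```r
# Hypothetical toy version of the Name column:
# two data rows, one lookup block, then another data row
name <- c("John", "Alice", "Briefing", "Year", "End", "Steward")

# The running count of "End" rows increases each time a block closes,
# so data rows and the lookup block that follows them share a number
group <- cumsum(name == "End")

# A row is inside a lookup block while more "Briefing" than "End"
# markers have been seen; the "End" row itself is flagged as well
is_lookup <- cumsum(name == "Briefing") > cumsum(name == "End") | name == "End"

flags <- data.frame(name, group, is_lookup)
print(flags)
```

Rows with `is_lookup` equal to `TRUE` belong to the lookup frame, and the rest are the actual data. The "End" row already carries the next group number, which does no harm because marker rows are dropped before the join.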
library(dplyr)
library(tidyr)

# flag rows as being part of the "lookup" data frame
# and add a grouping so we know which lookup values belong with which data values
df = mutate(df,
            group = cumsum(Name == "End"),
            is_lookup = cumsum(Name == "Briefing") > cumsum(Name == "End") | Name == "End") %>%
  select(-row)

# break off the lookup data and make it wide
lookup = filter(df, is_lookup,
                !Name %in% c("Briefing", "End")) %>%
  spread(key = Name, value = Age)

# break off the non-lookup data and join it to the wide lookup
df %>% filter(!is_lookup) %>%
  select(Name, group) %>%
  left_join(lookup) %>%
  select(-group, -is_lookup)
# Joining, by = "group"
#       Name Class Observation Year
# 1     John     A           0 2016
# 2    Alice     A           0 2016
# 3  Michael     A           0 2016
# 4  Steward  <NA>           2 2017
# 5    Louis  <NA>           2 2017
# 6     Josh  <NA>           2 2017
# 7    Marie  <NA>           2 2017
# 8     Adam     B           2 2017
# 9   Joseph     B           2 2017
# 10  Andrew     B           2 2017
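In tidyr 1.0.0 and later, `spread()` is superseded by `pivot_wider()`. As a hedged sketch, the same split-and-join could be written as follows. The small data frame `toy` is made up for illustration and only mimics the layout of the question's data, and the join key is named explicitly so that no "Joining, by" message is printed.

```r
library(dplyr)
library(tidyr)

# Hypothetical miniature of the stacked layout (not the asker's data)
toy <- data.frame(
  Name = c("John", "Alice", "Briefing", "Class", "Year", "End",
           "Steward", "Briefing", "Year", "End"),
  Age  = c("22", "29", NA, "A", "2016", NA,
           "35", NA, "2017", NA),
  stringsAsFactors = FALSE
)

# Flag lookup rows and number the blocks, as in the answer above
flagged <- toy %>%
  mutate(
    group = cumsum(Name == "End"),
    is_lookup = cumsum(Name == "Briefing") > cumsum(Name == "End") | Name == "End"
  )

# Lookup rows without the markers, one row per group, one column per key
lookup <- flagged %>%
  filter(is_lookup, !Name %in% c("Briefing", "End")) %>%
  select(-is_lookup) %>%
  pivot_wider(names_from = Name, values_from = Age)

# Data rows joined to their group's lookup values
result <- flagged %>%
  filter(!is_lookup) %>%
  select(Name, group) %>%
  left_join(lookup, by = "group") %>%
  select(-group)

print(result)
```

Keys that are missing from a block, such as `Class` in the second block of `toy`, come out as `NA`, which matches the behaviour of the `spread()` version.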
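Could you post your data with dput so we can work with it?
Hi Gabriel. Absolutely. I've edited the post and added the dput. Thanks.
Please confirm whether you are reading the data from a csv or an Excel sheet.
Hi Ossan, csv. Thanks.
Hi, I've updated the example and the dput. I believe it is explained better now. Thanks for your help!
Hi Gregor, sorry for the late reply. Thank you for helping with this task. The procedure works very well, because it picks up any column that needs to be transposed out of the messy data frame.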
df <- structure(list(row = 1:24, Name = structure(c(7L, 2L, 12L, 4L,
5L, 15L, 13L, 6L, 14L, 10L, 9L, 11L, 4L, 13L, 15L, 6L, 1L, 8L,
3L, 4L, 13L, 15L, 5L, 6L), .Label = c("Adam", "Alice", "Andrew",
"Briefing", "Class", "End", "John", "Joseph", "Josh", "Louis",
"Marie", "Michael", "Observation", "Steward", "Year"), class = "factor"),
Age = structure(c(6L, 9L, 10L, NA, 13L, 4L, 1L, NA, 11L,
3L, 6L, 12L, NA, 2L, 5L, NA, 8L, 7L, 7L, NA, 2L, 5L, 14L,
NA), .Label = c("0", "2", "20", "2016", "2017", "22", "26",
"27", "29", "33", "35", "39", "A", "B"), class = "factor")), .Names = c("row",
"Name", "Age"), class = "data.frame", row.names = c(NA, -24L))