How to fill NA with the next row in R?
I want to fill NA values from the next row. Here is the data set:
structure(list(timestamp = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L), .Label = c("2019-07-07 00:00:00", "2019-07-07 00:00:01", "2019-07-07 00:00:02", "2019-07-07 00:00:03", "2019-07-07 00:00:04", "2019-07-07 00:00:05", "2019-07-07 00:00:06", "2019-07-07 00:00:07", "2019-07-07 00:00:08", "2019-07-07 00:00:09", "2019-07-07 00:00:10"), class = "factor"), source = structure(c(NA, NA, NA, 1L, NA, NA, 1L, NA, NA, NA, NA, 2L, NA, NA, 2L, NA, NA, NA, NA, 2L, NA), .Label = c("USER_A", "USER_B"), class = "factor"), value = c(NA, NA, NA, 1L, NA, NA, 1L, NA, NA, NA, NA, NA, 1L, NA, 1L, NA, 1L, NA, NA, NA, 2L, NA, NA, 3L, NA)), class = "data.frame", row.names = c(NA, -22L))
The table cycles through the same timestamps for each source. In this case, sources A and B each have a fixed block of rows from 00:00:00 to 00:00:10.
Here is the expected result table:
timestamp source value
1 2019-07-07 00:00:00 <NA> NA
2 2019-07-07 00:00:01 <NA> NA
3 2019-07-07 00:00:02 <NA> NA
4 2019-07-07 00:00:03 USER_A 1
5 2019-07-07 00:00:04 USER_A 1
6 2019-07-07 00:00:05 USER_A 1
7 2019-07-07 00:00:06 USER_A 1
8 2019-07-07 00:00:07 <NA> NA
9 2019-07-07 00:00:08 <NA> NA
10 2019-07-07 00:00:09 <NA> NA
11 2019-07-07 00:00:10 <NA> NA
12 2019-07-07 00:00:00 <NA> NA
13 2019-07-07 00:00:01 USER_B 1
14 2019-07-07 00:00:02 USER_B 1
15 2019-07-07 00:00:03 USER_B 1
16 2019-07-07 00:00:04 USER_B 2
17 2019-07-07 00:00:05 USER_B 2
18 2019-07-07 00:00:06 USER_B 2
19 2019-07-07 00:00:07 USER_B 3
20 2019-07-07 00:00:08 USER_B 3
21 2019-07-07 00:00:09 USER_B 3
22 2019-07-07 00:00:10 <NA> NA
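For USER_A, the value and source in rows 5 and 6 are replaced with the value and source from row 7. The USER_B rows are replaced in the same way, each taking its value from the next non-missing row.
How can I do this in R?
Here is one approach using dplyr, which works because each source has a fixed number of rows. We first create a group for every n rows and add a new column group2, which is 1 only between the minimum and maximum index of the non-NA values in that group and 0 elsewhere. Then we group_by group2 and use tidyr::fill to replace the missing values with the previous non-missing value within each group.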
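n <- 11  # number of rows per source
library(dplyr)
df %>%
  # group1: one group per block of n consecutive rows
  group_by(group1 = gl(n() / n, n)) %>%
  # group2: 1 from the first to the last non-NA source in each block, else 0
  mutate(group2 = 0,
         group2 = replace(group2, min(which(!is.na(source))) :
                                  max(which(!is.na(source))), 1)) %>%
  # regroup by group2 (this replaces the group1 grouping), then fill downward
  group_by(group2) %>%
  tidyr::fill(source, value) %>%
  ungroup() %>%
  select(-group1, -group2)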
# A tibble: 22 x 3
# timestamp source value
# <fct> <fct> <int>
# 1 2019-07-07 00:00:00 NA NA
# 2 2019-07-07 00:00:01 NA NA
# 3 2019-07-07 00:00:02 NA NA
# 4 2019-07-07 00:00:03 USER_A 1
# 5 2019-07-07 00:00:04 USER_A 1
# 6 2019-07-07 00:00:05 USER_A 1
# 7 2019-07-07 00:00:06 USER_A 1
# 8 2019-07-07 00:00:07 NA NA
# 9 2019-07-07 00:00:08 NA NA
#10 2019-07-07 00:00:09 NA NA
# … with 12 more rows
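To make the group2 step easier to follow, here is a minimal base R sketch of the same masking idea on a single toy vector. The vector `x` and the names `idx` and `mask` are made up for illustration and are not part of the answer above.

```r
# Hypothetical toy vector standing in for one source's block of rows.
x <- c(NA, NA, "a", NA, NA, "a", NA)

# Positions of the non-missing entries.
idx <- which(!is.na(x))

# 1 from the first to the last non-missing position, 0 elsewhere.
mask <- replace(rep(0, length(x)), min(idx):max(idx), 1)
mask
```

Only the positions inside the mask are candidates for filling, so the leading and trailing NA rows of each block are left untouched.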
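Thank you for your answer. Actually, I added the "up" direction option to the fill function. Anyway, thank you very much.
@Juhyeon The default direction is down, but in this case I think it gives the same answer regardless of direction.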
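As a rough illustration of when the direction matters, here is a small sketch using the `.direction` argument of `tidyr::fill`. The data frame `toy` is made up for this example. When the values on both sides of a gap are equal (as with the USER_A rows), both directions agree; when they differ, "up" takes the next value and "down" takes the previous one.

```r
library(tidyr)

# Hypothetical toy data: the gaps sit between different values.
toy <- data.frame(value = c(1L, NA, NA, 2L, NA, 3L))

# Fill each NA from the previous non-missing value (the default).
down <- fill(toy, value, .direction = "down")$value

# Fill each NA from the next non-missing value.
up <- fill(toy, value, .direction = "up")$value

down  # 1 1 1 2 2 3
up    # 1 2 2 2 3 3
```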