R: conditionally replace NA with values from other rows


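I have a large dataset in which one variable has a fairly large number of missing values. Because I know that this variable depends on a temporal and a spatial dimension, I can easily impute a missing value by taking the value from another row whose temporal and spatial values match exactly. Suppose the data are generated as follows: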

temporal <- c("Monday", "Monday", "Tuesday", "Tuesday","Wednesday", "Wednesday", "Thursday", "Thursday", "Friday", "Friday","Monday", "Monday", "Tuesday", "Tuesday","Wednesday", "Wednesday", "Thursday", "Thursday", "Friday", "Friday")
spatial <- c("North", "South","North", "South","North", "South","North", "South","North", "South", "North", "South","North", "South","North", "South","North", "South","North", "South")
value <- c(NA,2,3,4,5,6,7,NA,9,10,1,NA,3,4,5,6,7,8,9,NA)

df <- as.data.frame(cbind(temporal, spatial, value))
In this example, I want to replace every NA in value with the value from another row that has matching values of spatial and temporal.

The final result should therefore look like this:

    temporal spatial value
1     Monday   North     1
2     Monday   South     2
3    Tuesday   North     3
4    Tuesday   South     4
5  Wednesday   North     5
6  Wednesday   South     6
7   Thursday   North     7
8   Thursday   South     8
9     Friday   North     9
10    Friday   South    10
11    Monday   North     1
12    Monday   South     2
13   Tuesday   North     3
14   Tuesday   South     4
15 Wednesday   North     5
16 Wednesday   South     6
17  Thursday   North     7
18  Thursday   South     8
19    Friday   North     9
20    Friday   South    10
I tried to do this with the group_by function from the tidyverse:

library(tidyverse)
df <- df %>%
  group_by(temporal, spatial) %>%
  mutate(value, unique(value[is.na(value)]))
Am I approaching this the right way? If so, why does my code not work (as far as I can tell)? If not, what would be a suitable approach?

Thanks! :)

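Here is a dplyr approach. We group by temporal and spatial, then arrange by temporal, spatial and value, because arrange places NA values after any non-NA values. We then use mutate to set value to the entry in the first row of value within each group.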

library(dplyr)

df %>%
  group_by(temporal, spatial) %>% 
  arrange(temporal, spatial, value) %>% 
  mutate(value = value[1])
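
The sorting step above reorders the rows. As a minimal sketch of the same idea without sorting, the first non-NA entry of each group can be picked directly. This is a variation, not the answer's own code: the data are rebuilt here with data.frame() and rep() (an assumption, so that value stays numeric), and the name filled is only illustrative.

```r
library(dplyr)

# Toy data equivalent to the question's vectors, built with data.frame()
# so that value is numeric (assumption: the question used cbind() instead).
df <- data.frame(
  temporal = rep(rep(c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday"),
                     each = 2), times = 2),
  spatial  = rep(c("North", "South"), times = 10),
  value    = c(NA, 2, 3, 4, 5, 6, 7, NA, 9, 10, 1, NA, 3, 4, 5, 6, 7, 8, 9, NA)
)

filled <- df %>%
  group_by(temporal, spatial) %>%
  # First non-NA value of the group, recycled to every row of that group.
  # If a group holds only NAs, the result for that group is NA.
  mutate(value = value[!is.na(value)][1]) %>%
  ungroup()
```

Because nothing is sorted, the rows of filled stay in the same order as in df.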
A more concise approach that keeps the original row order uses tidyr::fill:

library(tidyverse)

df %>%
  group_by(temporal, spatial) %>% 
  fill(value, .direction = "downup")

# A tibble: 20 x 3
# Groups:   temporal, spatial [10]
   temporal  spatial value
   <chr>     <chr>   <chr>
 1 Monday    North   1    
 2 Monday    South   2    
 3 Tuesday   North   3    
 4 Tuesday   South   4    
 5 Wednesday North   5    
 6 Wednesday South   6    
 7 Thursday  North   7    
 8 Thursday  South   8    
 9 Friday    North   9    
10 Friday    South   10   
11 Monday    North   1    
12 Monday    South   2    
13 Tuesday   North   3    
14 Tuesday   South   4    
15 Wednesday North   5    
16 Wednesday South   6    
17 Thursday  North   7    
18 Thursday  South   8    
19 Friday    North   9    
20 Friday    South   10   
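
The value column prints as <chr> above because the question builds df with cbind(), which turns everything into a character matrix first. If numeric values are wanted, a hedged sketch is to build the data frame with data.frame() instead; df_chr, df_num and filled are illustrative names that do not appear in the original thread.

```r
library(dplyr)
library(tidyr)

temporal <- rep(rep(c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday"),
                    each = 2), times = 2)
spatial  <- rep(c("North", "South"), times = 10)
value    <- c(NA, 2, 3, 4, 5, 6, 7, NA, 9, 10, 1, NA, 3, 4, 5, 6, 7, 8, 9, NA)

# As in the question: cbind() coerces all columns to one common type,
# so value is no longer numeric afterwards.
df_chr <- as.data.frame(cbind(temporal, spatial, value))

# data.frame() keeps each column's own type.
df_num <- data.frame(temporal, spatial, value)

filled <- df_num %>%
  group_by(temporal, spatial) %>%
  # Fill downwards first, then upwards, within each group.
  fill(value, .direction = "downup") %>%
  ungroup()
```

With this construction the filled value column stays numeric, so it can be used directly in arithmetic.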

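Your mutate() does not work because it never assigns the result to a column. It should look like mutate(value = unique(value[!is.na(value)])); note the !, since value[is.na(value)] selects only the missing entries, and note that this assumes each group has exactly one distinct non-NA value. That is not how I would approach it, though. What I do below is build a lookup table of the distinct non-NA values and then join it back to the original dataset. The valuedis column should hold the values you want.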

temporal <- c("Monday", "Monday", "Tuesday", "Tuesday","Wednesday", "Wednesday", "Thursday", "Thursday", "Friday", "Friday","Monday", "Monday", "Tuesday", "Tuesday","Wednesday", "Wednesday", "Thursday", "Thursday", "Friday", "Friday")
spatial <- c("North", "South","North", "South","North", "South","North", "South","North", "South", "North", "South","North", "South","North", "South","North", "South","North", "South")
value <- c(NA,2,3,4,5,6,7,NA,9,10,1,NA,3,4,5,6,7,8,9,NA)

df <- as.data.frame(cbind(temporal, spatial, value))

library(dplyr)


dfdis <- df %>% 
          filter(!is.na(value)) %>% 
          distinct(temporal,spatial,value) %>% 
          rename(valuedis = value)

df2 <- left_join(df,dfdis, by = c("temporal","spatial"))
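
The join above leaves the filled values in a separate valuedis column. One possible way to fold them back into value is dplyr::coalesce, sketched below. This step is an addition and not part of the original answer, and the data are rebuilt with data.frame() (an assumption) so that value is numeric.

```r
library(dplyr)

df <- data.frame(
  temporal = rep(rep(c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday"),
                     each = 2), times = 2),
  spatial  = rep(c("North", "South"), times = 10),
  value    = c(NA, 2, 3, 4, 5, 6, 7, NA, 9, 10, 1, NA, 3, 4, 5, 6, 7, 8, 9, NA)
)

# Lookup table: one row per (temporal, spatial) pair with its known value.
dfdis <- df %>%
  filter(!is.na(value)) %>%
  distinct(temporal, spatial, value) %>%
  rename(valuedis = value)

df2 <- df %>%
  left_join(dfdis, by = c("temporal", "spatial")) %>%
  # Keep value where it is present, otherwise take the looked-up valuedis.
  mutate(value = coalesce(value, valuedis)) %>%
  select(-valuedis)
```

This relies on every temporal/spatial pair having a single distinct non-NA value. If a pair had several, dfdis would contain several rows for it and the join would duplicate rows in df2.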
