Replace NA's in R with a specific condition


If 2017 is NA and the 2015 and 2016 columns both have values, I want to assign their average, computed within the same row, to 2017.

Index   2015            2016            2017
1       NA              6355698         10107023
2       13000000        73050000        NA
4       NA              NA              NA
5       10500000        NA              8000000
6       331000000       659000000       1040000000
7       55500000        NA              32032920
8       NA              NA              20000000
9       2521880         5061370         7044288
...

This is what I tried, but it did not work:

ind <- which(is.na(df), arr.ind=TRUE)
df[ind] <- rowMeans(df,  na.rm = TRUE)[ind[,1]]

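Disclaimer: I'm not entirely clear on what your expected output is. My solution below assumes that you want to replace NA values with either the mean of all values per year or the mean of all values per Index.

Here is a tidyverse option where we first reshape from wide to long, replace NAs with the mean per year, and finally convert back from long to wide.
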
library(tidyverse)
df %>%
    gather(year, value, -Index) %>%
    group_by(year) %>%
    mutate(value = ifelse(is.na(value), mean(value, na.rm = T), value)) %>%
    spread(year, value)
## A tibble: 8 x 4
#  Index     `2015`     `2016`      `2017`
#  <int>      <dbl>      <dbl>       <dbl>
#1     1 115507293.   6355698.   10107023.
#2     2  13000000. 223472356.  186197372.
#3     4 115507293. 223472356.  186197372.
#4     5 115507293. 223472356.    8000000.
#5     6 331000000. 659000000. 1040000000.
#6     7 115507293. 223472356.   32032920.
#7     8 115507293. 223472356.   20000000.
#8     9   2521880.   5061370.    7044288.
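
In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The following is a minimal sketch of the same per-year replacement with the newer functions, not part of the original answer. It assumes the dplyr and tidyr packages are installed; the result name `filled` is introduced here for illustration, and the data is the earlier version of the question's sample data, on which the numbers above are based.

```r
library(dplyr)
library(tidyr)

# Earlier version of the question's sample data (before it was edited).
df <- tibble(
  Index  = c(1, 2, 4, 5, 6, 7, 8, 9),
  `2015` = c(NA, 13000000, NA, NA, 331000000, NA, NA, 2521880),
  `2016` = c(6355698, NA, NA, NA, 659000000, NA, NA, 5061370),
  `2017` = c(10107023, NA, NA, 8000000, 1040000000, 32032920, 20000000, 7044288)
)

filled <- df %>%
  # Wide to long: one row per (Index, year) pair.
  pivot_longer(-Index, names_to = "year", values_to = "value") %>%
  group_by(year) %>%
  # Replace each NA with the mean of the non-missing values of that year.
  mutate(value = ifelse(is.na(value), mean(value, na.rm = TRUE), value)) %>%
  ungroup() %>%
  # Long back to wide: one column per year again.
  pivot_wider(names_from = year, values_from = value)
```

Note that this fills with column (per-year) means, so it does not yet do the row-wise replacement the question asks for.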


Update

To replace NAs only in the 2017 column with the row-wise mean of the 2015 and 2016 values, you can do:

df <- read_table("Index   2015            2016            2017
1       NA              6355698         10107023
2       13000000        73050000        NA
4       NA              NA              NA
5       10500000        NA              8000000
6       331000000       659000000       1040000000
7       55500000        NA              32032920
8       NA              NA              20000000
9       2521880         5061370         7044288")


df %>%
    mutate(`2017` = ifelse(is.na(`2017`), 0.5 * (`2015` + `2016`), `2017`))
## A tibble: 8 x 4
#  Index    `2015`    `2016`      `2017`
#  <int>     <int>     <int>       <dbl>
#1     1        NA   6355698   10107023.
#2     2  13000000  73050000   43025000.
#3     4        NA        NA         NA
#4     5  10500000        NA    8000000.
#5     6 331000000 659000000 1040000000.
#6     7  55500000        NA   32032920.
#7     8        NA        NA   20000000.
#8     9   2521880   5061370    7044288.
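
For readers who prefer to avoid the tidyverse, the same conditional replacement can be sketched in base R. This is a minimal sketch under stated assumptions, not part of the original answer: the logical vector `fill` is a name introduced here for illustration, and the data frame is rebuilt with data.frame() from the values in the question.

```r
# Sample data from the question, built with base R only.
# check.names = FALSE keeps the column names "2015", "2016", "2017" as-is.
df <- data.frame(
  Index  = c(1, 2, 4, 5, 6, 7, 8, 9),
  `2015` = c(NA, 13000000, NA, 10500000, 331000000, 55500000, NA, 2521880),
  `2016` = c(6355698, 73050000, NA, NA, 659000000, NA, NA, 5061370),
  `2017` = c(10107023, NA, NA, 8000000, 1040000000, 32032920, 20000000, 7044288),
  check.names = FALSE
)

# Rows where 2017 is missing but both earlier years are present.
fill <- is.na(df$`2017`) & !is.na(df$`2015`) & !is.na(df$`2016`)

# Assign the row-wise mean of 2015 and 2016 to 2017 for exactly those rows.
df$`2017`[fill] <- rowMeans(df[fill, c("2015", "2016")])
```

With this data only the row with Index 2 qualifies. The row with Index 4 keeps its NA, because it has no 2015 or 2016 value to average.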



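Shouldn't it be group_by(Index)? As the OP seems to be looking for rowMeans?

@Jimbou Hm, perhaps yes. I was a bit confused because in that case Index = 4 would contain only NAs. I'll make a note of it.

Thank you guys! I edited the sample data because the previous version did not contain the case I wanted to show as an example.

@kimi-finn379 So you only want to replace the NAs in 2017, based on 2015 and 2016?

To clarify, the mean of the columns must also be based on the same row.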

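Sample data (the version from before the question was edited, on which the first output above is based)

df <- read_table("Index   2015            2016            2017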
1       NA              6355698         10107023
2       13000000        NA              NA
4       NA              NA              NA
5       NA              NA              8000000
6       331000000       659000000       1040000000
7       NA              NA              32032920
8       NA              NA              20000000
9       2521880         5061370         7044288")