R: Replace NA based on a condition


I am doing an econometric analysis and have run into a problem. I am using RStudio.

My dataset consists of 1408 observations (704 of Type 1 and 704 of Type 2) and 49 variables:

Gender    Period   Matching group   Group  Type  Overcharging
1           1            73            1       1    NA
0           2            73            1       1    NA
1           1            77            2       1    NA
1           2            77            2       1    NA
...        ...          ...           ...     ...   ...
0           1            73            1       2    1
0           2            73            1       2    0
1           1            77            2       2    0
1           2            77            2       2    1
...        ...          ...           ...     ...   ...

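As you can see, the NA values depend on the agent type: they occur when Type is 1. What I want to do is this: if a Type 1 agent belongs to the same Matching group, Group, and Period as a Type 2 agent, replace the NA with the value of that Type 2 agent (for each row).

Thank you for your time and consideration! Any help is much appreciated.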

Here is a solution with data.table:

library("data.table")
dt <- fread(header=TRUE,
'Gender    Period   Matching.group   Group  Type  Overcharging
1           1            73            1       1    NA
0           2            73            1       1    NA
1           1            77            2       1    NA
1           2            77            2       1    NA
0           1            73            1       2    1
0           2            73            1       2    0
1           1            77            2       2    0
1           2            77            2       2    1')

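# Lookup table: Overcharging of the non-Type-1 rows, grouped by Group and Period
d2 <- dt[Type!=1, Overcharging, .(Group,Period)]
# Update join: copy the matching value into the Type 1 rows, then re-attach the other rows
rbind(dt[Type==1][d2, on=.(Group, Period), Overcharging:=i.Overcharging],dt[Type!=1])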

# > rbind(dt[Type==1][d2, on=.(Group, Period), Overcharging:=i.Overcharging],dt[Type!=1])
#    Gender Period Matching.group Group Type Overcharging
# 1:      1      1             73     1    1            1
# 2:      0      2             73     1    1            0
# 3:      1      1             77     2    1            0
# 4:      1      2             77     2    1            1
# 5:      0      1             73     1    2            1
# 6:      0      2             73     1    2            0
# 7:      1      1             77     2    2            0
# 8:      1      2             77     2    2            1

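Or, if the Group and Period order of the Type != 1 rows is the same as that of the Type == 1 rows, the values can be assigned directly without a join.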
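Since the question asks for a match on Matching group, Group, and Period together, the join can be extended to all three keys. The following is a minimal sketch under that assumption, not the answerer's code: the names lookup and fill_value are made up for illustration, and fcoalesce() needs a reasonably recent data.table release. It updates dt by reference and leaves any value that is already non-NA untouched.

```r
library(data.table)

dt <- data.table(
  Gender         = c(1, 0, 1, 1, 0, 0, 1, 1),
  Period         = c(1, 2, 1, 2, 1, 2, 1, 2),
  Matching.group = c(73, 73, 77, 77, 73, 73, 77, 77),
  Group          = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type           = c(1, 1, 1, 1, 2, 2, 2, 2),
  Overcharging   = c(NA, NA, NA, NA, 1, 0, 0, 1)
)

# Lookup table (hypothetical name): the Type 2 value for each key combination
lookup <- dt[Type == 2, .(Matching.group, Group, Period, fill_value = Overcharging)]

# Update join on all three keys; fcoalesce() keeps existing values
# and only replaces NA with the value found in the lookup table
dt[lookup,
   on = .(Matching.group, Group, Period),
   Overcharging := fcoalesce(Overcharging, i.fill_value)]

print(dt)
```

Rows without a Type 2 partner simply find no match in the join and keep their NA.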

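We can use functions from dplyr and tidyr (both part of the tidyverse) for this task. The fill function from tidyr imputes missing values based on the previous or the next row. So the idea is to arrange the data frame first and then use fill to fill in all the NA values in the Overcharging column: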

library(tidyverse)

dt <- read.csv(text = "Gender,Period,Matching.group,Group,Type,Overcharging
1,1,73,1,1,NA
0,2,73,1,1,NA
1,1,77,2,1,NA
1,2,77,2,1,NA
0,1,73,1,2,1
0,2,73,1,2,0
1,1,77,2,2,0
1,2,77,2,2,1",
               stringsAsFactors = FALSE)

dt2 <- dt %>%
  mutate(ID = 1:n()) %>%                             # Record the original row order in an ID column
  arrange(Period, `Matching.group`, Group, Type) %>% # Sort the rows so each Type 1 row sits directly above its Type 2 match
  fill(Overcharging, .direction = c("up")) %>%       # Fill each NA with the value from the row below ("up")
  arrange(ID) %>%                                    # Restore the original row order
  select(-ID)                                        # Remove the ID column
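One caveat with the ungrouped fill: a Type 1 row that has no Type 2 partner would pick up the value of whichever row follows it after sorting. A possible variant, sketched here under the assumption that Period, Matching.group, and Group together identify a pair, fills within groups instead, so no sorting or ID column is needed. The .direction = "downup" option needs tidyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

dt <- data.frame(
  Gender         = c(1, 0, 1, 1, 0, 0, 1, 1),
  Period         = c(1, 2, 1, 2, 1, 2, 1, 2),
  Matching.group = c(73, 73, 77, 77, 73, 73, 77, 77),
  Group          = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type           = c(1, 1, 1, 1, 2, 2, 2, 2),
  Overcharging   = c(NA, NA, NA, NA, 1, 0, 0, 1)
)

dt2 <- dt %>%
  group_by(Period, Matching.group, Group) %>%     # One group per matched pair
  fill(Overcharging, .direction = "downup") %>%   # Fill NA from the other row in the same group
  ungroup()                                       # Drop the grouping; row order is unchanged

print(dt2)
```

Because the fill never crosses a group boundary, an unmatched Type 1 row keeps its NA rather than borrowing a value from a neighbouring group.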


Thank you for your answer. There is something in the code I do not understand (yes, I really am a beginner!). What does the "." before (Group, Period) stand for?

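.() is shorthand for list() when you are using the data.table package.

Sorry, I get an error message: "Error in if (drop) warning("drop ignored") : argument is not interpretable as logical. In addition: Warning message: In if (drop) warning("drop ignored") : the condition has length > 1 and only the first element will be used". Do you know what this means?

My code runs without any errors. Perhaps you are using different data. Please edit your question and show your data using dput().

I will post another question asking how to solve the problem, because your solution did not run on my computer (that is my fault, I do not have enough background to fix this...). Thanks :)!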
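To illustrate the reply about .(): inside the square brackets of a data.table, .() and list() are interchangeable. A small sketch with made-up data (toy is a hypothetical table, not the asker's data):

```r
library(data.table)

toy <- data.table(
  Group        = c(1, 1, 2),
  Period       = c(1, 2, 1),
  Overcharging = c(1, 0, 1)
)

# Selecting columns: both calls build the same two-column data.table
a <- toy[, .(Group, Period)]
b <- toy[, list(Group, Period)]
print(all.equal(a, b))

# Grouping: by = .(Group) means the same as by = list(Group)
totals <- toy[, .(total = sum(Overcharging)), by = .(Group)]
print(totals)
```

The alias only has this meaning inside a data.table's square brackets, so the object has to be a data.table for the syntax to work.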

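Direct assignment with data.table, valid only when the Type != 1 rows are in the same Group and Period order as the Type == 1 rows:

dt[Type==1, Overcharging:=dt[Type!=1, Overcharging]]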
