R: !is.na creates NAs in other columns

r dataframe

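While merging several datasets, I am trying to drop every row of a data frame that has a missing value in one particular variable (for now I want to keep the NAs in some of the other columns). I used the following statement: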

data.frame <- data.frame[!is.na(data.frame$year),]

Am I using is.na incorrectly? Is there an alternative to is.na for this case? Any help would be greatly appreciated.

Edit: here is code that reproduces the problem:

#data
tc <- read.csv("http://dl.dropbox.com/u/4115584/tc2008.csv")
frame <- read.csv("http://dl.dropbox.com/u/4115584/frame.csv")

#standardize NA codes
tc[tc == "."] <- NA
tc[tc == -9] <- NA

#standardize spatial units
colnames(frame)[1] <- "loser"
colnames(frame)[2] <- "gainer"
frame$dyad <- paste(frame$loser,frame$gainer,sep="")
tc$dyad <- paste(tc$loser,tc$gainer,sep="")
drops <- c("loser","gainer")
tc <- tc[,!names(tc) %in% drops]
frame <- frame[,!names(frame) %in% drops]
rm(drops)

#merge tc into frame
data <- merge(tc, frame, by.x = "year", by.y = "dyad", all.x=T, all.y=T) #year column is duplicated in this process. I haven't had this problem with nearly identical code using other data.

rm(tc,frame)

#the first column in the new data frame is the duplicate year, which does not actually contain years. I'll rename it.
colnames(data)[1] <- "double"

summary(data$year) #shows 833 NA's

summary(data$procedur) #note that at this point there are non-NA values

#later, I want to create 20 year windows following the events in the tc data. For simplicity, I want to remove cases with NA in the year column.

new.data <- data[!is.na(data$year),]

#now let's see what the above operation did
summary(new.data$year) #missing years were successfully removed
summary(new.data$procedur) #this variable is now entirely NA's

Try complete.cases:
data.frame.clean <- data.frame[complete.cases(data.frame$year),]
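For a single column, complete.cases and !is.na select the same rows, and neither touches the values in other columns. A minimal sketch on a made-up data frame (the name df and its values are assumptions for illustration, not the asker's data):

```r
# Toy data: 'year' and 'procedur' each have their own missing values.
df <- data.frame(
  year     = c(1990, NA, 2001, NA),
  procedur = c(1, 2, NA, 4)
)

# Two ways to keep only the rows where 'year' is present.
by_isna <- df[!is.na(df$year), ]
by_cc   <- df[complete.cases(df$year), ]

# Both approaches return the same rows.
identical(by_isna, by_cc)

# The NA in 'procedur' was already in the kept row; filtering did not create it.
by_isna
```

If a column turns entirely NA after such a filter, the rows that held its values were the ones with a missing year, which points at an earlier step rather than at is.na.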

I think the actual problem is with your merge.

After merging and storing the result in data, if you run:
# > table(data$procedur, useNA="always")

#   1      2      3      4      5      6   <NA> 
# 122    112    356     59     39     19 192258 

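So, basically, every value of procedur sits in a row where year is NA. When you drop the rows with NA in year, all the non-NA values of procedur are dropped along with them.
To fix this, I think you should call merge as: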
merge(tc, frame, all=T) # it'll automatically calculate common columns
# also this will not result in duplicated year column.

Check whether this merge gives you the result you want.
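The mechanism can be sketched on toy data. The tables below, including the columns event_id and the values in them, are hypothetical stand-ins and not the asker's files. Merging on keys that never match, with all = TRUE, stacks the two tables instead of joining them, so each row carries values from only one side:

```r
# Hypothetical stand-ins for the two tables.
tc <- data.frame(
  event_id = c("e1", "e2"),
  dyad     = c("AB", "CD"),
  procedur = c(1, 3),
  stringsAsFactors = FALSE
)
frame <- data.frame(
  dyad = c("AB", "CD", "EF"),
  year = c(1990, 1995, 2000),
  stringsAsFactors = FALSE
)

# Mismatched keys: no event_id equals any dyad, so nothing matches.
bad <- merge(tc, frame, by.x = "event_id", by.y = "dyad", all = TRUE)
nrow(bad)                                  # rows of tc plus rows of frame
bad_kept <- bad[!is.na(bad$year), ]
all(is.na(bad_kept$procedur))              # procedur is gone after filtering

# Letting merge use the shared column ('dyad') joins the rows properly.
good <- merge(tc, frame, all = TRUE)
good_kept <- good[!is.na(good$year), ]
sum(!is.na(good_kept$procedur))            # procedur values survive
```

In the bad merge, is.na behaves exactly as intended; the rows it removes simply happen to be the only ones that hold procedur.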
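Please provide reproducible data. And please don't name your data.frame data.frame, because there is already a function called data.frame.
@Arun but can he give his data.frame a function, or is there already a function called data.frame? :) :) My head is spinning.
Sorry, I thought this might be a conceptual question. I have edited in code and data that should reproduce the problem.
@davy, did you check your data after the merge step? Your use of is.na is correct, so I don't think this would make any difference.
Thanks for the suggestion. However, the result is exactly the same.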
> all(is.na(data$year[!is.na(data$procedur)]))
# [1] TRUE # every value of procedur occurs where year = NA
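A cross-tabulation of the two missingness indicators gives the same diagnosis at a glance. This is a small sketch on a made-up data frame (merged and its values are assumptions chosen to mimic the pattern described above):

```r
# Made-up merge result: 'year' and 'procedur' are never present together.
merged <- data.frame(
  year     = c(NA, NA, 1990, 1995, 2000),
  procedur = c(1, 3, NA, NA, NA)
)

# Rows: is 'year' missing? Columns: is 'procedur' missing?
miss <- table(
  year_missing     = is.na(merged$year),
  procedur_missing = is.na(merged$procedur)
)
miss

# Same check as above: every non-NA procedur sits in a row with NA year.
all(is.na(merged$year[!is.na(merged$procedur)]))
```

A zero in the cell where both indicators are FALSE means no row has both variables, so any filter on year will necessarily empty procedur.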