Incorrect interpretation of #N/A using `fread`


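I am using the data.table fread() function to read some data that contains missing values. The data was generated in Excel, so the missing-value string is "#N/A". However, even when I pass the na.strings argument, str() of the resulting table still shows every column as character. To reproduce this, here are the code and data.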

Data:

Date,a,b,c,d,e,f,g
1/1/03,#N/A,0.384650146,0.992190069,0.203057232,0.636296656,0.271766148,0.347567706
1/2/03,#N/A,0.461486974,0.500702057,0.234400718,0.072789936,0.060900352,0.876749487
1/3/03,#N/A,0.573541006,0.478062582,0.840918789,0.061495666,0.64301024,0.939575302
1/4/03,#N/A,#N/A,#N/A,#N/A,#N/A,#N/A,#N/A
1/5/03,#N/A,#N/A,#N/A,#N/A,#N/A,#N/A,#N/A
1/6/03,#N/A,0.66678429,0.897482818,0.569609033,0.524295691,0.132941158,0.194114347
1/7/03,#N/A,0.576835985,0.982816576,0.605408973,0.093177815,0.902145012,0.291035649
1/8/03,#N/A,0.100952961,0.205491093,0.376410642,0.775917986,0.882827749,0.560508499
1/9/03,#N/A,0.350174456,0.290225065,0.428637309,0.022947911,0.7422805,0.354776101
1/10/03,#N/A,0.834345466,0.935128099,0.163158666,0.301310627,0.273928596,0.537167776
1/11/03,#N/A,#N/A,#N/A,#N/A,#N/A,#N/A,#N/A
1/12/03,#N/A,#N/A,#N/A,#N/A,#N/A,#N/A,#N/A
1/13/03,#N/A,0.325914633,0.68192633,0.320222677,0.249631582,0.605508964,0.739263677
1/14/03,#N/A,0.715104989,0.639040211,0.004186366,0.351412982,0.243570606,0.098312443
1/15/03,#N/A,0.750380716,0.264929325,0.782035411,0.963814327,0.93646428,0.453694758
1/16/03,#N/A,0.282389354,0.762102103,0.515151803,0.194083842,0.102386764,0.569730516
1/17/03,#N/A,0.367802161,0.906878948,0.848538256,0.538705673,0.707436236,0.186222899
1/18/03,#N/A,#N/A,#N/A,#N/A,#N/A,#N/A,#N/A
1/19/03,#N/A,#N/A,#N/A,#N/A,#N/A,#N/A,#N/A
1/20/03,#N/A,0.79933188,0.214688799,0.37011313,0.189503843,0.294051763,0.503147404
1/21/03,#N/A,0.620066341,0.329949446,0.123685075,0.69027192,0.060178071,0.599825005
(Data saved as temp.csv.) Code:

library(data.table)

a <- fread("temp.csv", na.strings="#N/A")
str(a)

The ?fread documentation for na.strings reads:

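na.strings: A character vector of strings to convert to NA_character_. By default, for columns read as type character, ",," is read as a blank string ("") and ",NA," is read as NA_character_. Typical alternatives might be na.strings=NULL or perhaps na.strings=c("NA","N/A","").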

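I suppose you have to convert them to numeric yourself afterwards. At least that is what I understand from the documentation.

Something like this: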

cbind(a[, 1], a[, lapply(.SD[, -1], as.numeric)])
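The same conversion can also be done in place with := and .SDcols, which avoids rebuilding the table with cbind. This is only a minimal sketch: it assumes a small sample in the same layout as the question's data, written to a temporary file, and csv_path, dt and num_cols are names made up for the example.

```r
library(data.table)

# Hypothetical sample file in the same layout as the question's data.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Date,a,b,c",
  "1/1/03,#N/A,0.384650146,0.992190069",
  "1/2/03,#N/A,0.461486974,0.500702057",
  "1/4/03,#N/A,#N/A,#N/A"
), csv_path)

dt <- fread(csv_path, na.strings = "#N/A")

# Every column except Date is expected to hold numbers.
num_cols <- setdiff(names(dt), "Date")

# Convert those columns in place; as.numeric() keeps NA as NA.
dt[, (num_cols) := lapply(.SD, as.numeric), .SDcols = num_cols]

str(dt)
```

Because the conversion goes through as.numeric(), it should behave the same whether fread returned those columns as character, logical (an all-NA column) or already numeric.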

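Yes, but I find converting the rest to numeric somewhat unintuitive. (Or rather, I could not figure it out in 15 minutes. Could you provide code? I tried temp[,a:=as.numeric(a)] but it gave me some errors.) So I rolled back to read.csv, which I suppose ordinary users would do as well, and considering the potential benefits I find that unfortunate.

@Tomaskrelik Yes, that is unfortunate. Thanks for raising it. Now filed here:

@MatthewDowle base::read.table has an as.is argument, and as.is=FALSE usually guarantees that numbers are read in as numbers. Does fread have a similar argument?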
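On the last question: fread does have a colClasses argument, which lets you request column types at read time instead of converting afterwards. The sketch below assumes a reasonably recent data.table release (named colClasses vectors may not be available in older ones) and uses a made-up two-row sample; csv_path and dt are hypothetical names.

```r
library(data.table)

# Hypothetical sample file in the same layout as the question's data.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Date,a,b",
  "1/1/03,#N/A,0.384650146",
  "1/4/03,#N/A,#N/A"
), csv_path)

# Ask for the column types up front; "#N/A" cells become NA.
dt <- fread(
  csv_path,
  na.strings = "#N/A",
  colClasses = c(Date = "character", a = "numeric", b = "numeric")
)

sapply(dt, class)
```

If the installed version does not honour colClasses together with na.strings, converting after the read remains the fallback.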
Output of str() on the table returned by fread:

Classes ‘data.table’ and 'data.frame':  144 obs. of  8 variables:
 $ Date: chr  "1/1/03" "1/2/03" "1/3/03" "1/4/03" ...
 $ a   : chr  NA NA NA NA ...
 $ b   : chr  "0.384650146" "0.461486974" "0.573541006" NA ...
 $ c   : chr  "0.992190069" "0.500702057" "0.478062582" NA ...
 $ d   : chr  "0.203057232" "0.234400718" "0.840918789" NA ...
 $ e   : chr  "0.636296656" "0.072789936" "0.061495666" NA ...
 $ f   : chr  "0.271766148" "0.060900352" "0.64301024" NA ...
 $ g   : chr  "0.347567706" "0.876749487" "0.939575302" NA ...
 - attr(*, ".internal.selfref")=<externalptr> 

The read.csv call mentioned above:

a <- read.csv("temp.csv", header=TRUE, na.strings="#N/A")