R: reading a table with "right-aligned" data


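I have a text file to read into R, but the file does not appear to be tab-delimited. Its only structure is that each column always ends at the same position (that is, the columns are right-aligned).

So, first, does this kind of data layout have a name? And second, how can I read it in R?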

    2.37      2.03                          2.38
   5,397     5,082                         5,609
    13.0      21.6          15.2            15.2
   128.0     103.1         134.2           133.4

Simply using read.table() does not work: where a value is missing, the values that follow are not placed in the correct columns.

# download data:
tmp <- tempfile()
f <- download.file("http://usda.mannlib.cornell.edu/usda/waob/wasde//1990s/1995/wasde-01-12-1995.txt", tmp)
D <- file(tmp)
data_enc <- readLines(D, warn=FALSE)
close(D)
dat <- sapply(strsplit(data_enc[232:236], ":"), function(x) x[2])
writeLines(dat, tmp)

## try to read data:
read.table(tmp, fill = TRUE, sep ="", header=FALSE)
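
To see why this fails without downloading anything, here is a minimal sketch that rebuilds the four lines of the excerpt above with sprintf. The column widths (9, 10, 14, 16) are illustrative assumptions, not the real file's layout. With sep = "", read.table splits on runs of whitespace, so a blank field is never seen and the values after it shift left:

```r
# Rebuild the question's excerpt as right-aligned text.
# The widths 9/10/14/16 are assumptions chosen for illustration only.
rows <- list(
  c("2.37",  "2.03",  "",      "2.38"),
  c("5,397", "5,082", "",      "5,609"),
  c("13.0",  "21.6",  "15.2",  "15.2"),
  c("128.0", "103.1", "134.2", "133.4")
)
sample_lines <- vapply(
  rows,
  function(r) sprintf("%9s%10s%14s%16s", r[1], r[2], r[3], r[4]),
  character(1)
)

# Whitespace-delimited parsing: the empty third field disappears entirely.
shifted <- read.table(text = sample_lines, fill = TRUE, sep = "", header = FALSE)

# In rows 1 and 2 the last value lands in V3, and V4 is padded with NA.
print(shifted)
```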

You can try read.fwf, which reads a table of fixed-width formatted data:

widths <- gregexpr("\\.\\d", readLines(tmp)[5])[[1]] + 1L # line 5 looks complete; positions just after each decimal point
widths <- c(widths[1], diff(widths)) # differences between those end positions give the column widths
read.fwf(tmp, widths = widths)
#         V1         V2    V3               V4
# 1     2.37       2.03    NA             2.38
# 2    5,397      5,082    NA            5,609
# 3     13.0       21.6  15.2             15.2
# 4    128.0      103.1 134.2            133.4
# 5    146.4      130.9 156.5            155.7
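
The width calculation can be checked on a small self-contained sample. This is only a sketch: the lines are rebuilt with sprintf using assumed widths (9, 10, 14, 16), and the trick relies on the reference line having exactly one decimal digit in every field. The strip.white = TRUE argument is an addition here; read.fwf passes it on to read.table so that text columns lose their padding.

```r
# Right-aligned sample; the widths 9/10/14/16 are assumptions for illustration.
rows <- list(
  c("2.37",  "2.03",  "",      "2.38"),
  c("5,397", "5,082", "",      "5,609"),
  c("13.0",  "21.6",  "15.2",  "15.2"),
  c("128.0", "103.1", "134.2", "133.4")
)
sample_lines <- vapply(
  rows,
  function(r) sprintf("%9s%10s%14s%16s", r[1], r[2], r[3], r[4]),
  character(1)
)
path <- tempfile()
writeLines(sample_lines, path)

# Use a line with no missing fields as the reference (line 4 here).
# gregexpr() gives the position of each "." followed by a digit;
# adding 1 gives the last character of each right-aligned column.
ends <- gregexpr("\\.\\d", sample_lines[4])[[1]] + 1L

# The first width is the first end position; the rest are the gaps between ends.
widths <- c(ends[1], diff(ends))

fixed <- read.fwf(path, widths = widths, strip.white = TRUE)
print(fixed)
unlink(path)
```

Columns that contain comma-grouped values such as 5,397 are read as text rather than numbers, so they still need cleaning (for example, removing the commas before calling as.numeric).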

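Oh, great! So this still counts as fixed-width data? I had been misled into thinking that fixed width requires not only the same end point for each column but also the same start point.

Then perhaps Hadley's package readr makes this really simple: read_fwf(tmp, fwf_empty(tmp))

... Yes, even better. :-)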
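
The readr suggestion can be tried on a small sample as follows. This is a sketch under assumptions: the sample lines are rebuilt with sprintf using illustrative widths, the readr package must be installed (the code is skipped otherwise), and the column types readr guesses may differ from those of read.fwf.

```r
# Right-aligned sample; the widths 9/10/14/16 are assumptions for illustration.
rows <- list(
  c("2.37",  "2.03",  "",      "2.38"),
  c("5,397", "5,082", "",      "5,609"),
  c("13.0",  "21.6",  "15.2",  "15.2"),
  c("128.0", "103.1", "134.2", "133.4")
)
sample_lines <- vapply(
  rows,
  function(r) sprintf("%9s%10s%14s%16s", r[1], r[2], r[3], r[4]),
  character(1)
)
path <- tempfile(fileext = ".txt")
writeLines(sample_lines, path)

if (requireNamespace("readr", quietly = TRUE)) {
  # fwf_empty() guesses column boundaries from character positions
  # that are blank in every row it inspects.
  positions <- readr::fwf_empty(path)
  dat_readr <- readr::read_fwf(path, col_positions = positions)
  print(dat_readr)
}
unlink(path)
```

The guess only works when at least one row has a value in every column; a column that is blank throughout the inspected rows would not be detected.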