R: importing a specified range of values from a CSV file

I am trying to read a CSV file but get the following error:

Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
  line 1097 did not have 5 elements
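On closer inspection of the CSV file, I found that around line 1097 the monthly rows break off and a new header starts a table of annualized data (for now I am only interested in the monthly data).

... line 1100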

 Annual Factors: January-December 
,Mkt-RF,SMB,HML,RF
  1927,   29.47,   -2.46,   -3.75,    3.12
  1928,   35.39,    4.20,   -6.15,    3.56

I used read.csv rather than read.table:

French <- read.csv("F-F_Research_Data_Factors.CSV", sep = ",", skip = 3,
                   header = TRUE)

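The first table in the file lies between the first two zero-length lines, so this reads that table without the junk before and after it, and then subsets it to the indicated dates: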

# read first table in file
Lines <- readLines("F-F_Research_Data_Factors.CSV")
ix <- which(Lines == "")
DF0 <- read.csv(text = Lines[ix[1]:ix[2]])  # all rows in first table

# subset it to indicated dates
DF <- subset(DF0, X >= 192607 & X <= 195007)
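To try the approach above without the real file, here is a minimal sketch on a small toy file with the same layout (description lines, a blank line, the monthly table, a blank line, then the annual table). The monthly values and the temporary file are made-up assumptions for illustration only. Note that read.csv names the empty first header field X and turns Mkt-RF into Mkt.RF.

```r
# Toy file mimicking the layout described above; monthly values are made up
sample_lines <- c(
  "Description line one",
  "Description line two",
  "",
  ",Mkt-RF,SMB,HML,RF",
  "192607,    1.00,    0.10,   -0.10,    0.20",
  "192608,    2.00,    0.20,   -0.20,    0.20",
  "195007,    3.00,    0.30,   -0.30,    0.10",
  "195008,    4.00,    0.40,   -0.40,    0.10",
  "",
  " Annual Factors: January-December ",
  ",Mkt-RF,SMB,HML,RF",
  "  1927,   29.47,   -2.46,   -3.75,    3.12",
  "  1928,   35.39,    4.20,   -6.15,    3.56"
)
tmp <- tempfile(fileext = ".csv")
writeLines(sample_lines, tmp)

# Same steps as in the answer, applied to the toy file
Lines <- readLines(tmp)
ix <- which(Lines == "")                     # positions of the zero-length lines
DF0 <- read.csv(text = Lines[ix[1]:ix[2]])   # blank lines are skipped by default
DF <- subset(DF0, X >= 192607 & X <= 195007)

names(DF0)   # first column is named "X"
nrow(DF)     # rows inside the requested date range
```

Because the date column is read as an integer of the form YYYYMM, a plain numeric comparison is enough to select the range.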

You can use scan to probe the data and find the lines to read for 1926-1950.

@user113156 Glad to help :)

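# read every table in the file into a list of data frames
# (assumes each table's header line starts with a comma and each table is followed by a zero-length line)
st <- grep("^,", Lines)  # starting line numbers
en <- which(Lines == "")[-1]  # ending line numbers
L <- Map(function(st, en) read.csv(text = Lines[st:en]), st, en)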
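
As a follow-up sketch, the list produced this way holds one data frame per table, in file order. The toy lines, the footer line, and the names monthly and annual below are assumptions for illustration. The approach relies on there being exactly one zero-length line after each table, so that st and en have the same length.

```r
# Toy content with two tables, each followed by a zero-length line (monthly values are made up)
Lines <- c(
  "Description line",
  "",
  ",Mkt-RF,SMB,HML,RF",
  "192607,    1.00,    0.10,   -0.10,    0.20",
  "192608,    2.00,    0.20,   -0.20,    0.20",
  "",
  " Annual Factors: January-December ",
  ",Mkt-RF,SMB,HML,RF",
  "  1927,   29.47,   -2.46,   -3.75,    3.12",
  "  1928,   35.39,    4.20,   -6.15,    3.56",
  "",
  "Footer line"
)

st <- grep("^,", Lines)          # header line of each table
en <- which(Lines == "")[-1]     # zero-length line closing each table
stopifnot(length(st) == length(en))  # guard against silent recycling in Map

L <- Map(function(st, en) read.csv(text = Lines[st:en]), st, en)
names(L) <- c("monthly", "annual")   # hypothetical labels, in file order

# Pick the monthly table and restrict it to the requested dates
monthly <- subset(L$monthly, X >= 192607 & X <= 195007)
```

Naming the list elements makes it harder to pick the annual table by mistake when only the monthly data is wanted.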