Efficient way to read an irregularly formatted text file without using FOR loops in R

r loops text

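I have an input file with the following pattern, repeated N times:
- The first two lines contain the model parameters plus the month and year. I only need the month and year from these two lines.
- The next block contains the actual data: 172 rows by 256 columns.

The other file is an ArcGIS ASCII grid. Its first 6 lines are ArcGIS parameters, followed by a block of 172 rows x 256 columns that equals 1 inside the region and -9999 everywhere else.

What I want to do is read the data and compute the mean for each time step, restricted to the mask grid. The only way I can think of at the moment is to use 3 nested loops: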

for (i in 1:N) {
   for (j in 1:172) {
      for (k in 1:256) {
           # read the data
           # do the calculation
           # write the result out
      }
   }
}

Is there a better way that avoids such complicated FOR loops? Any suggestion would be greatly appreciated.

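Edit: Following agstudy's suggestion and other sources, I now read the data line by line and use strsplit to extract the values. My final code is below: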

########################## Read the grid first ########################
con <- file(gisGrid) 
open(con);

# Read the first 6 lines (the ArcGIS header) from the open connection
gisData <- readLines(con, n = 6)

# Extract value
gisPara <- matrix(unlist(strsplit(gisData,' +')), ncol=2, byrow=TRUE)
summary(gisPara)

# Read the mask grid:
gridValue <- read.table(file = gisGrid, header = FALSE, skip=6)
gridValueVec <- as.vector(as.matrix(gridValue))

close(con)

######################### Read the data file #########################
con <- file(dataFile) 
open(con);

# Define number of timesteps, nRow, nCol
nTime <- 3
nRow <- 172
nCol <- 256

# Read the whole file in
data.lines <- scan(con, what=character(), sep='\n')
# Preallocate one list slot per timestep
data <- list(
  timestep = vector("list", nTime),  # nRow x nCol matrix for each timestep
  month    = vector("list", nTime),
  year     = vector("list", nTime),
  multiply = vector("list", nTime)   # masked mean for each timestep
)

# Loop over all timestep
for (i in 1:nTime) {
  # Drop the first header line; the second one holds the month and year
  data.lines <- data.lines[-1]

  data.line2 <- strsplit(data.lines[1],' ')
  data$month[[i]] <- data.line2[[1]][24]
  data$year[[i]] <- as.numeric(data.line2[[1]][25])
  data.lines <- data.lines[-1]

  dataRead <- matrix(nrow=nRow, ncol=nCol)
  for(j in 1:nRow) 
  {
    dataRead[j,] <- as.numeric(strsplit(data.lines[1],' ')[[1]])
    data.lines <- data.lines[-1]
  }         

  # Multiply data with the mask grid
  dataReadVector <- as.vector(dataRead)
  gridMultiplication <- ifelse(gridValueVec==-9999, NA,
                               dataReadVector * gridValueVec )

  data$multiply[[i]] <- mean(gridMultiplication, na.rm=TRUE)
  data$timestep[[i]] <- dataRead
}

close(con)
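
The loop over rows above can be avoided altogether. The following is only a sketch on a synthetic 3 x 4 grid with 2 timesteps, not the code from the post: `make_block`, the header text, and the assumption that the year is the last token of the second header line (with the month just before it) are all hypothetical. It parses every data line with a single `scan()` call and computes all masked means with one `colMeans()` call.

```r
nTime <- 2; nRow <- 3; nCol <- 4

# Synthetic stand-in for the data file (hypothetical content):
# 2 header lines followed by nRow data lines for each timestep
make_block <- function(t) {
  c("model parameters",
    paste("header Jan", 2000 + t),
    apply(matrix(t * 1:(nRow * nCol), nRow, nCol, byrow = TRUE),
          1, paste, collapse = " "))
}
data.lines <- unlist(lapply(seq_len(nTime), make_block))

# Synthetic mask: 1 inside the region, -9999 elsewhere
gridValue <- matrix(c(    1,     1, -9999, -9999,
                          1,     1, -9999, -9999,
                      -9999, -9999, -9999, -9999),
                    nRow, nCol, byrow = TRUE)

# Positions of the two header lines of every block
hdr1 <- seq(1, by = nRow + 2, length.out = nTime)
hdr2 <- hdr1 + 1

# Month and year, assuming they are the last two tokens of header line 2
year  <- as.numeric(sub(".* ", "", data.lines[hdr2]))
month <- sub(".* (\\S+) \\S+$", "\\1", data.lines[hdr2])

# Parse all data lines at once; values arrive row by row within each block
vals <- scan(text = data.lines[-c(hdr1, hdr2)], quiet = TRUE)
arr  <- array(vals, dim = c(nCol, nRow, nTime))  # arr[col, row, time]

# Flatten the mask in the same order (column index varies fastest)
inside <- as.vector(t(gridValue)) != -9999

# One column per timestep; keep only the cells inside the region
flat  <- matrix(vals, ncol = nTime)
means <- colMeans(flat[inside, , drop = FALSE])
```

Because the mask is 1 inside the region, multiplying by it changes nothing, so selecting the inside cells and averaging gives the same result as the `ifelse()` step. If a timestep is needed as a matrix, `t(arr[, , i])` has the original nRow x nCol orientation.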

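For the first file, you can use read.table(..., skip=2) to read the data block while skipping the two header lines.

Use readLines to parse the date, reading only the first 2 lines.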
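
As a rough illustration of this suggestion, the sketch below writes a miniature file to a temporary path and reads it back. The file content, the header wording, and the assumption that the month and year are the last two tokens of the second line are hypothetical; only `readLines(n = 2)` and `read.table(skip = 2)` come from the advice above.

```r
# Hypothetical miniature file: 2 header lines followed by a 3 x 4 data block
tmp <- tempfile(fileext = ".txt")
writeLines(c("model parameters",
             "run for Jan 2001",
             "1 2 3 4",
             "5 6 7 8",
             "9 10 11 12"), tmp)

# Read only the two header lines and pull the date from the second one
header <- readLines(tmp, n = 2)
tokens <- strsplit(header[2], " +")[[1]]
month  <- tokens[length(tokens) - 1]
year   <- as.numeric(tokens[length(tokens)])

# Skip the header and read exactly one block of data rows
block <- as.matrix(read.table(tmp, header = FALSE, skip = 2, nrows = 3))

unlink(tmp)
```

For a file with several timesteps, the same call could presumably be repeated with `skip = (i - 1) * (nRow + 2) + 2` and `nrows = nRow` for block `i`, although that re-reads the file from the start each time.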

For the second file, you can use strsplit to read the parameters in the first 6 lines of the file:

dd <- readLines(textConnection('ncols         256
nrows         172
xllcorner     730000
yllcorner     227000
cellsize      1320
NODATA_value  -9999
'),n=6)
matrix(unlist(strsplit(dd,' +')),ncol=2,byrow=TRUE)

    [,1]           [,2]    
[1,] "ncols"        "256"   
[2,] "nrows"        "172"   
[3,] "xllcorner"    "730000"
[4,] "yllcorner"    "227000"
[5,] "cellsize"     "1320"  
[6,] "NODATA_value" "-9999"
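
The character matrix is easier to work with once it is turned into a named numeric vector. The sketch below does that and then uses the NODATA value to blank out a toy 2 x 2 mask; `para`, `gisPara`, `mask`, and `vals` are illustrative names and values, not part of the original answer.

```r
dd <- c("ncols         256",
        "nrows         172",
        "xllcorner     730000",
        "yllcorner     227000",
        "cellsize      1320",
        "NODATA_value  -9999")

# Two-column character matrix: parameter name, parameter value
para <- matrix(unlist(strsplit(dd, " +")), ncol = 2, byrow = TRUE)

# Named numeric vector, so parameters can be looked up by name
gisPara <- setNames(as.numeric(para[, 2]), para[, 1])
nCol   <- gisPara[["ncols"]]
nRow   <- gisPara[["nrows"]]
nodata <- gisPara[["NODATA_value"]]

# Toy mask and data (hypothetical values) to show how NODATA is used
mask <- matrix(c(1, -9999, 1, -9999), 2, 2)
vals <- matrix(c(10, 20, 30, 40), 2, 2)
mask[mask == nodata] <- NA              # cells outside the region become NA
maskedMean <- mean(vals * mask, na.rm = TRUE)
```

Taking the grid size and NODATA value from the header avoids hard-coding 172, 256, and -9999 elsewhere in the script.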

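You may want to include the code you use to read the data; the other parts may not be relevant, though.

Thanks Thomas! The code is included now!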

 read.table(...,skip=2)
