Conflict between the comment character and the header when importing a data frame with read.table

r import

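How can I import a file that:

starts with an undefined number of comment lines
is followed by a header line in which some names contain the same comment character used to mark the comment lines above

For example, for a file like this: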

# comment 1
# ...
# comment X
c01,c#02,c03,c04
1,2,3,4
5,6,7,8

Then:

Error in read.table(myfile, sep = ",", header = T) : more columns than column names

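The obvious problem is that # is used as the comment character to mark the comment lines, but it also appears in the header (admittedly bad practice, but I have no control over it).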
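To make the failure concrete, here is a minimal sketch that uses scan() on the header line alone. It assumes that scan()'s comment.char argument cuts the line in the same way read.table's default comment.char = "#" does; the variable names are purely illustrative.

```r
header_line <- "c01,c#02,c03,c04"

# With "#" treated as a comment character, everything after the first "#"
# on the line is discarded, so only part of the header is read.
with_comment <- scan(text = header_line, what = "", sep = ",",
                     comment.char = "#", quiet = TRUE)

# With comment handling switched off, all four names survive.
no_comment <- scan(text = header_line, what = "", sep = ",",
                   comment.char = "", quiet = TRUE)

length(with_comment)  # fewer names than the four data columns
length(no_comment)
```

The truncated header ends up with fewer names than the data rows have fields, which is what the "more columns than column names" message is complaining about.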

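The number of comment lines is not known in advance, so I cannot even use the skip argument. Also, I do not know the column names (or even how many there are) before importing, so I really need to read them from the file.

Is there any solution other than editing the file by hand?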

It may be easy enough to count the number of lines that start with a comment and then skip them.

csvfile <- "# comment 1
# ...
# comment X
c01,c#02,c03,c04
1,2,3,4
5,6,7,8"

# return a logical for whether the line starts with a comment.
# remove everything from the first FALSE and afterward
# take the sum of what's left
start_comment <- grepl("^#", readLines(textConnection(csvfile)))
start_comment <- sum(head(start_comment, which(!start_comment)[1] - 1))

# skip the lines that start with the comment character
Data <- read.csv(textConnection(csvfile),
                 skip = start_comment,
                 stringsAsFactors = FALSE)

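Use readLines to import the whole thing as character strings, then clean it into a standard format. Alternatively, clean the file before importing it into R; perhaps you can go to the source and deal with it there.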
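The readLines suggestion could be sketched roughly as follows. The helper name read_after_comments is hypothetical, and replacing "#" with "_" in the header is just one possible choice of "standard format", not something the answer prescribes.

```r
csvfile <- "# comment 1
# ...
# comment X
c01,c#02,c03,c04
1,2,3,4
5,6,7,8"

# Hypothetical helper: drop the leading comment lines, tidy the header,
# then hand the cleaned text to read.csv.
read_after_comments <- function(lines, comment = "#") {
  is_comment <- startsWith(lines, comment)
  header_idx <- which(!is_comment)[1]
  if (is.na(header_idx)) stop("no header line found")
  body <- lines[header_idx:length(lines)]
  # Assumption: "_" is an acceptable stand-in for "#" in column names.
  body[1] <- gsub(comment, "_", body[1], fixed = TRUE)
  read.csv(text = paste(body, collapse = "\n"), stringsAsFactors = FALSE)
}

con <- textConnection(csvfile)
lines <- readLines(con)
close(con)

Data <- read_after_comments(lines)
names(Data)
```

For a file on disk, readLines(myfile) would take the place of the text connection.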

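# Alternative that keeps "#" in the column names: read the header and the
# data separately. First count the leading comment lines, as before.
start_comment <- grepl("^#", readLines(textConnection(csvfile)))
start_comment <- sum(head(start_comment, which(!start_comment)[1] - 1))

# Get the headers by themselves. comment.char = "" stops "#" from
# truncating the header line; stringsAsFactors = FALSE keeps the names
# as plain character strings.
Head <- read.table(textConnection(csvfile),
                   skip = start_comment,
                   header = FALSE,
                   sep = ",",
                   comment.char = "",
                   stringsAsFactors = FALSE,
                   nrows = 1)

# Read the data rows, skipping the comment lines and the header line.
Data <- read.table(textConnection(csvfile),
                   sep = ",",
                   header = FALSE,
                   skip = start_comment + 1,
                   stringsAsFactors = FALSE)

# apply column names to Data
names(Data) <- unlist(Head)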
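If the goal is only to keep "#" in the names, a shorter variant may be enough. This sketch relies on read.csv defaulting to comment.char = "" and on check.names = FALSE switching off the make.names conversion; n_comment is an illustrative name, and the sample text is repeated so the snippet runs on its own.

```r
csvfile <- "# comment 1
# ...
# comment X
c01,c#02,c03,c04
1,2,3,4
5,6,7,8"

con <- textConnection(csvfile)
lines <- readLines(con)
close(con)

# Number of comment lines before the first non-comment (header) line.
n_comment <- which(!grepl("^#", lines))[1] - 1

# Single call: skip the comments and keep the header names untouched.
Data <- read.csv(text = csvfile,
                 skip = n_comment,
                 check.names = FALSE,
                 stringsAsFactors = FALSE)
names(Data)
```

Names such as c#02 are not syntactically valid in R, so they need backticks or [[ ]] when used, for example Data[["c#02"]].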