Automatically building complex headers in R
I would like to ask for a script that detects and merges header rows in R when the header spans multiple rows, as in the example below. A general answer should:
1. Identify the number of header rows (two or more).
2. Fill the gaps in the header (see the NAs in the example).
3. Merge all header rows into a single row.
I can only do this manually, see below. Ideally the approach would work for headers with any number of rows.
text1<-"NA h_row1a NA NA NA h_row1b NA NA NA
NA h_row2a NA h_row2b NA h_row2c NA h_row2d NA
NA h_row3a h_row3b h_row3c h_row3d h_row3e h_row3f h_row3g h_row3h
element1 2 24% 25 40 23 44% 76 34
element2 3 26% 40 86 233 12% 55 12"
table1<-read.table(text=text1, skip=3,header=FALSE)
cat(text1, file = "ex.data")
header<-scan("ex.data", nlines = 1, what = character(), sep="", na.strings = "NA")
library(zoo)
header<-na.locf(header, na.rm=FALSE) # this fills the header gaps
header2 <- scan("ex.data", skip = 1, nlines = 1, what = character(), sep="", na.strings = "NA")
header2<-na.locf(header2, na.rm=FALSE)
header3 <- scan("ex.data", skip = 2, nlines = 1, what = character(), sep="", na.strings = "NA")
names(table1) <- paste0(header, header2, header3)
table1
# NANANA h_row1ah_row2ah_row3a h_row1ah_row2ah_row3b h_row1ah_row2bh_row3c h_row1ah_row2bh_row3d h_row1bh_row2ch_row3e h_row1bh_row2ch_row3f, etc.
#1 element1 2 24% 25 40 23 44%, etc.
#2 element2 3 26% , etc.
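The three scan() calls above follow the same pattern, so for a known number of header rows they can be folded into a loop. The following is only a minimal sketch under stated assumptions: `n_header` is a name introduced here and is set by hand (nothing is detected automatically), and `fill_forward` is a hypothetical base-R stand-in for zoo::na.locf(x, na.rm = FALSE), used so the block runs without extra packages. Unlike the manual version, it also fills gaps in the last header row, which changes nothing for this data.

```r
# Same sample data as in the question.
text1 <- "NA h_row1a NA NA NA h_row1b NA NA NA
NA h_row2a NA h_row2b NA h_row2c NA h_row2d NA
NA h_row3a h_row3b h_row3c h_row3d h_row3e h_row3f h_row3g h_row3h
element1 2 24% 25 40 23 44% 76 34
element2 3 26% 40 86 233 12% 55 12"

n_header <- 3  # assumption: the number of header rows is known in advance

# Hypothetical helper: each NA takes the last non-NA value before it;
# leading NAs stay NA (same idea as zoo::na.locf with na.rm = FALSE).
fill_forward <- function(x) {
  idx <- cumsum(!is.na(x))        # position of the most recent non-NA value
  c(NA, x[!is.na(x)])[idx + 1]
}

# Read each header row as a character vector and fill its gaps.
header_rows <- lapply(seq_len(n_header), function(i) {
  row <- scan(text = text1, skip = i - 1, nlines = 1,
              what = character(), na.strings = "NA", quiet = TRUE)
  fill_forward(row)
})

# Read the data rows and paste the header rows together column by column.
table1 <- read.table(text = text1, skip = n_header, header = FALSE,
                     stringsAsFactors = FALSE)
names(table1) <- do.call(paste0, header_rows)
names(table1)
```

The first column has no header in any row, so its name still comes out as "NANANA", just as in the manual version.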
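What have you tried so far? What specific problem did you run into when implementing it? Show us your code!
You could do it like this. It uses rle to see how many leading rows cannot be coerced to numeric, and assumes those are the header. I have also set the first column as the rownames; not sure whether you need that. After this process you may also want to convert the remaining values to numeric; at this point they are still character.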
library(zoo) # for na.locf
tab <- read.table(text = text1, header = FALSE, stringsAsFactors = FALSE)
# estimate the number of header rows: the first run of rows with no value coercible to numeric
is_data <- suppressWarnings(apply(tab, 1, function(x) any(!is.na(as.numeric(x)))))
headrows <- rle(is_data)$lengths[1]
# fill in blanks in headers
tab[1:headrows, ] <- t(apply(tab[1:headrows, ], 1, na.locf, na.rm = FALSE))
# merge the header rows into one name per column
names(tab) <- apply(tab[1:headrows, ], 2, paste0, collapse = "_")
tab <- tab[-(1:headrows), ] # remove header rows (now set as column names)
rownames(tab) <- tab[, 1]
tab <- tab[, -1] # remove first column (now set as rownames)
tab
h_row1a_h_row2a_h_row3a h_row1a_h_row2a_h_row3b h_row1a_h_row2b_h_row3c h_row1a_h_row2b_h_row3d
element1 2 24% 25 40
element2 3 26% 40 86
h_row1b_h_row2c_h_row3e h_row1b_h_row2c_h_row3f h_row1b_h_row2d_h_row3g h_row1b_h_row2d_h_row3h
element1 23 44% 76 34
element2 233 12% 55 12
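As the answer notes, the values are still character at this point. One possible way to convert them is sketched below. It assumes that a trailing "%" should simply be dropped (so "24%" becomes 24, not 0.24), which may or may not be what you want. `to_number` is a hypothetical helper name, and the small `tab` here is a stand-in for the real result so that the block runs on its own.

```r
# Small stand-in for the cleaned table: every column is still character.
tab <- data.frame(
  a = c("2", "3"),
  b = c("24%", "26%"),
  c = c("25", "40"),
  row.names = c("element1", "element2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: strip a trailing percent sign, then convert.
# as.numeric() gives NA (with a warning) for anything still not a number.
to_number <- function(x) as.numeric(sub("%$", "", x))

# tab[] <- ... replaces the column contents but keeps names and rownames.
tab[] <- lapply(tab, to_number)
str(tab)
```

If the percentages should become fractions instead, the percent columns would need to be divided by 100 separately.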