
Handling .svm files in R


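I have a multi-label classification problem. The dataset is provided at the following link:

This dataset originally comes from the SIAM 2007 competition. It consists of aviation safety reports describing problems that occurred on certain flights. It is a multi-label, high-dimensional problem with 21519 rows and 30438 columns.

The dataset comes as a file in .svm format. I read the file in R with read.delim. After that, I get the following output:

head(data[,1])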

18 2:0.136082763488 6:0.136082763488 7:0.136082763488 12:0.136082763488 20:0.136082763488 23:0.136082763488 32:0.136082763488 37:0.136082763488 39:0.136082763488 43:0.136082763488 53:0.136082763488 57:0.136082763488 58:0.136082763488 59:0.136082763488 60:0.136082763488 61:0.136082763488 62:0.136082763488 63:0.136082763488 64:0.136082763488 65:0.136082763488 66:0.136082763488 67:0.136082763488 68:0.136082763488 69:0.136082763488 70:0.136082763488 71:0.136082763488 72:0.136082763488 73:0.136082763488 74:0.136082763488 75:0.136082763488 76:0.136082763488 77:0.136082763488 78:0.136082763488 79:0.136082763488 80:0.136082763488 81:0.136082763488 82:0.136082763488 83:0.136082763488 84:0.136082763488 85:0.136082763488 86:0.136082763488 87:0.136082763488 88:0.136082763488 89:0.136082763488 90:0.136082763488 91:0.136082763488 92:0.136082763488 93:0.136082763488 94:0.136082763488 95:0.136082763488 96:0.136082763488 97:0.136082763488 98:0.136082763488 99:0.136082763488
[2] 1,12,13,18,20 2:0.0916698497028 4:0.0916698497028 6:0.0916698497028 12:0.0916698497028 14:0.0916698497028 16:0.0916698497028 19:0.0916698497028 23:0.0916698497028 26:0.0916698497028 31:0.0916698497028 32:0.0916698497028 33:0.0916698497028 37:0.0916698497028 53:0.0916698497028 57:0.0916698497028 66:0.0916698497028 71:0.0916698497028 72:0.0916698497028 81:0.0916698497028 83:0.0916698497028 84:0.0916698497028 86:0.0916698497028 90:0.0916698497028 92:0.0916698497028 100:0.0916698497028 101:0.0916698497028 102:0.0916698497028 103:0.0916698497028 104:0.0916698497028 105:0.0916698497028 106:0.0916698497028 107:0.0916698497028 108:0.0916698497028 109:0.0916698497028 110:0.0916698497028 111:0.0916698497028 112:0.0916698497028 113:0.0916698497028 114:0.0916698497028 115:0.0916698497028 116:0.0916698497028 117:0.0916698497028 118:0.0916698497028 119:0.0916698497028 120:0.0916698497028 121:0.0916698497028 122:0.0916698497028 123:0.0916698497028 124:0.0916698497028 125:0.0916698497028 126:0.0916698497028 127:0.0916698497028 128:0.0916698497028 129:0.0916698497028 130:0.0916698497028 131:0.0916698497028 132:0.0916698497028 133:0.0916698497028 134:0.0916698497028 135:0.0916698497028 136:0.0916698497028 137:0.0916698497028 138:0.0916698497028 139:0.0916698497028 140:0.0916698497028 141:0.0916698497028 142:0.0916698497028 143:0.0916698497028 144:0.0916698497028 145:0.0916698497028 146:0.0916698497028 147:0.0916698497028 148:0.0916698497028 149:0.0916698497028 150:0.0916698497028 151:0.0916698497028 152:0.0916698497028 153:0.0916698497028 154:0.0916698497028 155:0.0916698497028 156:0.0916698497028 157:0.0916698497028 158:0.0916698497028 159:0.0916698497028 160:0.0916698497028 161:0.0916698497028 162:0.0916698497028 163:0.0916698497028 164:0.0916698497028 165:0.0916698497028 166:0.0916698497028 167:0.0916698497028 168:0.0916698497028 169:0.0916698497028 170:0.0916698497028 171:0.0916698497028 172:0.0916698497028 173:0.0916698497028 174:0.0916698497028 175:0.0916698497028 
176:0.0916698497028 177:0.0916698497028 178:0.0916698497028 179:0.0916698497028 180:0.0916698497028 181:0.0916698497028 182:0.0916698497028 183:0.0916698497028 184:0.0916698497028 185:0.0916698497028 186:0.0916698497028 187:0.0916698497028 188:0.0916698497028 189:0.0916698497028 190:0.0916698497028 191:0.0916698497028 192:0.0916698497028 193:0.0916698497028 194:0.0916698497028

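How can I convert this into a regular dataset?

Any other way of reading an .svm file in R, apart from read.delim, would also help.

The solution below contains a lot of loops, but it solved my problem.

Here is the R code: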

rm(list=ls())

# Each record is read as a single string into column V1
data <- read.delim(file.choose(),header=FALSE)

# Now using the strsplit function to create a regular dataset

# Split each record on spaces: the first token holds the labels,
# the remaining tokens are "index:value" feature pairs
temp <- strsplit(as.character(data$V1), " ")

# Labels of each record (the first token, comma-separated)
response <- lapply(temp, function(x) as.numeric(strsplit(x[1], ",")[[1]]))

# Now working for responses: number of labels per record
l.response <- lengths(response)

col.names <- paste0("R", 1:22)

l.r <- length(temp)

# One 0/1 indicator column per label (R1 to R22), one row per record
df.response <- as.data.frame(matrix(0, nrow=l.r, ncol=22, dimnames=list(NULL,col.names)))

# Labels are taken to start at 0, so label k is stored in column k+1
for(i in 1:length(response)){
  df.response[i,(response[[i]]+1)] <- 1
}

v.list <- list()

# For every record, collect the feature indices (f.vec) and values (v.vec)
for(i in 1:length(temp)){

  # Reset per record so entries from a longer previous record do not carry over
  f.vec <- numeric(0)
  v.vec <- numeric(0)

  # Skip the first token (the labels); the rest are "index:value" pairs
  for(j in seq_along(temp[[i]])[-1]){

    pair <- as.numeric(strsplit(temp[[i]][j],":")[[1]])
    f.vec[j-1] <- pair[1]
    v.vec[j-1] <- pair[2]

  }

  v.list[[i]] <- data.frame(f.vec,v.vec)

}

feature.name <- paste(rep("V",30438),1:30438,sep="")

v.l <- 21519

variables <- data.frame(temp = rep(0,v.l))

for(i in 1:length(feature.name)){

variables[,feature.name[i]] <- rep(0,v.l)

}


variables <- variables[,-1]

# Fill in the feature values. Only the first 100 records are processed here.
for(i in 1:100){

  pos <- v.list[[i]][,"f.vec"]
  replace <- v.list[[i]][,"v.vec"]

  # If a feature index occurs more than once in a record,
  # keep only its last occurrence
  keep <- !duplicated(pos, fromLast=TRUE)
  pos <- pos[keep]
  replace <- replace[keep]

  variables[i,pos] <- replace

}


dim(df.response)
dim(variables)
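
final.data <- cbind(variables[1:100,],df.response[1:100,])

Judging from the data at the URL you posted, it looks like the svm files are meant to be read with the software referenced at the top of the page you downloaded the data from. Have you tried reading the file with the LIBSVM software?

Yes, I tried, but it did not work. The website provides only a few datasets, and it did not help this time. @伦格里斯基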
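
As an alternative to the loop-based conversion above, here is a minimal vectorized sketch. It relies on several assumptions: every record has the form "label,label,... index:value index:value ...", labels start at 0 (as the +1 shift in the code above implies), feature indices start at 1, and the Matrix package is installed. The function name parse_svm_lines and its arguments are hypothetical, and the sample records are toy data, not values from the actual dataset. A dense table of 21519 x 30438 doubles needs roughly 5 GB, so the features are stored in a sparse matrix instead.

```r
library(Matrix)  # provides sparseMatrix()

# Hypothetical helper: turn records like "1,12 2:0.1 4:0.2" into
# a sparse feature matrix X and a 0/1 label indicator matrix Y.
parse_svm_lines <- function(lines, n_features, n_labels) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]
  tokens <- strsplit(lines, "[[:space:]]+")

  # First token: comma-separated labels (assumed to start at 0)
  labels <- lapply(tokens, function(tk) as.integer(strsplit(tk[1], ",")[[1]]))
  # Remaining tokens: "index:value" pairs (indices assumed to start at 1)
  pairs <- lapply(tokens, function(tk) tk[-1])

  flat <- unlist(pairs)
  row <- rep(seq_along(pairs), times = lengths(pairs))
  col <- as.integer(sub(":.*$", "", flat))
  val <- as.numeric(sub("^[^:]*:", "", flat))

  # If an index repeats within a record, keep only its last occurrence
  keep <- !duplicated(cbind(row, col), fromLast = TRUE)
  X <- sparseMatrix(i = row[keep], j = col[keep], x = val[keep],
                    dims = c(length(lines), n_features))

  # Label indicator matrix: label k goes to column k + 1
  Y <- matrix(0L, nrow = length(lines), ncol = n_labels,
              dimnames = list(NULL, paste0("R", seq_len(n_labels))))
  lab_row <- rep(seq_along(labels), times = lengths(labels))
  Y[cbind(lab_row, unlist(labels) + 1L)] <- 1L

  list(X = X, Y = Y)
}

# Toy records in the same layout as the output shown in the question
sample_lines <- c(
  "18 2:0.5 6:0.25",
  "1,12,13 2:0.1 4:0.2 7:0.3"
)
res <- parse_svm_lines(sample_lines, n_features = 8, n_labels = 22)
dim(res$X)
which(res$Y[2, ] == 1)
```

For the real file, the records could be read with readLines(file.choose()) and passed in with n_features = 30438 and n_labels = 22. If a dense data frame is needed for a subset of rows, as.matrix(res$X[1:100, ]) converts just that slice.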