Remove a pattern from a dataset in R
Here is my sample dataset:
> head(d3)
V1 V2 V3 V4 V5 V6
2 Bacteria(100) Proteobacteria(100) Gammaproteobacteria(100) Pseudomonadales(100) Pseudomonadaceae(100) Pseudomonas(98)
3 Bacteria(100) Bacteroidetes(100) Bacteroidia(93) Bacteroidales(93) unclassified(93) unclassified(93)
4 Bacteria(100) Firmicutes(100) Bacilli(100) Lactobacillales(100) Streptococcaceae(100) Streptococcus(100)
5 Bacteria(100) Proteobacteria(100) Gammaproteobacteria(100) Pasteurellales(100) Pasteurellaceae(100) unclassified(68)
6 Bacteria(100) Proteobacteria(100) Gammaproteobacteria(100) Enterobacteriales(100) Enterobacteriaceae(100) unclassified(90)
7 Bacteria(100) Bacteroidetes(100) Bacteroidia(100) Bacteroidales(100) Porphyromonadaceae(100) unclassified(100)
I am trying to remove (100) from every string.
I tried:
> d3 <- gsub("[(0-9)]", "", d3)
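This gave me "invalid factor level, NA generated" and a scrambled dataset with almost everything replaced by NA! I couldn't find an existing question that matches what I'm after.
Here is one approach: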
d3[] <- sapply(d3,function(x){
gsub("\\(\\d+\\)","",as.character(x))
})
##
> d3
V1 V2 V3 V4 V5 V6
2 Bacteria Proteobacteria Gammaproteobacteria Pseudomonadales Pseudomonadaceae Pseudomonas
3 Bacteria Bacteroidetes Bacteroidia Bacteroidales unclassified unclassified
4 Bacteria Firmicutes Bacilli Lactobacillales Streptococcaceae Streptococcus
5 Bacteria Proteobacteria Gammaproteobacteria Pasteurellales Pasteurellaceae unclassified
6 Bacteria Proteobacteria Gammaproteobacteria Enterobacteriales Enterobacteriaceae unclassified
7 Bacteria Bacteroidetes Bacteroidia Bacteroidales Porphyromonadaceae unclassified
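A minimal sketch of why passing the whole data frame to gsub() misbehaves, together with an lapply()-based variant of the same fix. It assumes d3 is a data frame whose columns are factors or character vectors; the small data frame `toy` is made up for illustration. gsub() coerces its input with as.character(), so a data frame yields one string per column rather than one per cell. The character class [(0-9)] in the original attempt also matches any single parenthesis or digit, wherever it occurs in the string.

```r
# Hypothetical two-column stand-in for d3 (names and values are illustrative)
toy <- data.frame(
  V1 = c("Bacteria(100)", "Bacteria(100)"),
  V2 = c("Pseudomonas(98)", "unclassified(93)"),
  stringsAsFactors = TRUE
)

# Passing the data frame itself: gsub() sees one string per column, not per cell
whole <- gsub("\\(\\d+\\)", "", toy)
length(whole)

# Column-wise replacement; lapply() returns a list with one element per column,
# and assigning into toy[] keeps the data frame structure
toy[] <- lapply(toy, function(x) gsub("\\(\\d+\\)", "", as.character(x)))
toy
```

Converting each column with as.character() first means the replacement works on the labels rather than on the factor codes. The columns come back as character vectors.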
If you only want to remove the literal (100) and keep the other values, match it exactly:
d3[] <- sapply(d3,function(x){
gsub("\\(100\\)","",as.character(x))
})
##
> d3
V1 V2 V3 V4 V5 V6
2 Bacteria Proteobacteria Gammaproteobacteria Pseudomonadales Pseudomonadaceae Pseudomonas(98)
3 Bacteria Bacteroidetes Bacteroidia(93) Bacteroidales(93) unclassified(93) unclassified(93)
4 Bacteria Firmicutes Bacilli Lactobacillales Streptococcaceae Streptococcus
5 Bacteria Proteobacteria Gammaproteobacteria Pasteurellales Pasteurellaceae unclassified(68)
6 Bacteria Proteobacteria Gammaproteobacteria Enterobacteriales Enterobacteriaceae unclassified(90)
7 Bacteria Bacteroidetes Bacteroidia Bacteroidales Porphyromonadaceae unclassified
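As a sketch of an alternative for the literal case, gsub() also accepts fixed = TRUE, which treats the pattern as a plain string, so the parentheses need no escaping. The vector `x` below is made up for illustration.

```r
# Illustrative values, not taken from d3
x <- c("Bacteria(100)", "Pseudomonas(98)", "unclassified(100)")

# fixed = TRUE matches "(100)" literally; no regex escaping is required
cleaned <- gsub("(100)", "", x, fixed = TRUE)
cleaned
```

The same call can replace the regex version inside the per-column function when only one exact string has to go.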