Remove a pattern from a dataset in R


Here is my sample dataset:

 > head(d3)
V1                  V2                       V3                     V4                      V5                     V6
2 Bacteria(100) Proteobacteria(100) Gammaproteobacteria(100)   Pseudomonadales(100)           Pseudomonadaceae(100)    Pseudomonas(98)
3 Bacteria(100)  Bacteroidetes(100)          Bacteroidia(93)      Bacteroidales(93)        unclassified(93)   unclassified(93)
4 Bacteria(100)     Firmicutes(100)             Bacilli(100)   Lactobacillales(100)   Streptococcaceae(100) Streptococcus(100)
5 Bacteria(100) Proteobacteria(100) Gammaproteobacteria(100)    Pasteurellales(100)    Pasteurellaceae(100)   unclassified(68)
6 Bacteria(100) Proteobacteria(100) Gammaproteobacteria(100) Enterobacteriales(100) Enterobacteriaceae(100)   unclassified(90)
7 Bacteria(100)  Bacteroidetes(100)         Bacteroidia(100)     Bacteroidales(100) Porphyromonadaceae(100)  unclassified(100)
I am trying to remove the (100) from each string. I tried:

>d3 <- gsub("[(0-9)]", "", d3)
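This gave me "invalid factor level, NA generated" and a messed-up dataset in which almost everything was replaced with NA! I could not find any existing question that matches what I am looking for.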

Here is one approach:

d3[] <- sapply(d3,function(x){
  gsub("\\(\\d+\\)","",as.character(x))
})
##
> d3
        V1             V2                  V3                V4                 V5            V6
2 Bacteria Proteobacteria Gammaproteobacteria   Pseudomonadales   Pseudomonadaceae   Pseudomonas
3 Bacteria  Bacteroidetes         Bacteroidia     Bacteroidales       unclassified  unclassified
4 Bacteria     Firmicutes             Bacilli   Lactobacillales   Streptococcaceae Streptococcus
5 Bacteria Proteobacteria Gammaproteobacteria    Pasteurellales    Pasteurellaceae  unclassified
6 Bacteria Proteobacteria Gammaproteobacteria Enterobacteriales Enterobacteriaceae  unclassified
7 Bacteria  Bacteroidetes         Bacteroidia     Bacteroidales Porphyromonadaceae  unclassified
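
A possible variant of the same idea uses lapply instead of sapply, so that each column comes back as a list element and d3[] <- keeps the data frame shape. The sketch below builds a small stand-in data frame called demo (a hypothetical name with made-up rows, assuming the columns are factors, as the warning in the question suggests). It also shows that calling gsub directly on a data frame returns one string per column rather than a cleaned data frame.

```r
# Hypothetical stand-in for d3: factor columns, two rows only
demo <- data.frame(
  V1 = c("Bacteria(100)", "Bacteria(100)"),
  V2 = c("Proteobacteria(100)", "Bacteroidetes(100)"),
  V6 = c("Pseudomonas(98)", "unclassified(93)"),
  stringsAsFactors = TRUE
)

# gsub on the whole data frame coerces it with as.character(),
# giving a character vector with one element per column
whole <- gsub("\\(\\d+\\)", "", demo)
length(whole) == ncol(demo)

# Column by column: convert each factor to character, then strip
# a parenthesised run of digits such as (100) or (98)
demo[] <- lapply(demo, function(x) gsub("\\(\\d+\\)", "", as.character(x)))
demo
```

After this the columns are plain character vectors; wrap the result in factor() inside the function if factors are still needed.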
To remove only the literal (100) and keep the other values such as (98):

d3[] <- sapply(d3,function(x){
  gsub("\\(100\\)","",as.character(x))
})
##
> d3
        V1             V2                  V3                V4                 V5               V6
2 Bacteria Proteobacteria Gammaproteobacteria   Pseudomonadales   Pseudomonadaceae  Pseudomonas(98)
3 Bacteria  Bacteroidetes     Bacteroidia(93) Bacteroidales(93)   unclassified(93) unclassified(93)
4 Bacteria     Firmicutes             Bacilli   Lactobacillales   Streptococcaceae    Streptococcus
5 Bacteria Proteobacteria Gammaproteobacteria    Pasteurellales    Pasteurellaceae unclassified(68)
6 Bacteria Proteobacteria Gammaproteobacteria Enterobacteriales Enterobacteriaceae unclassified(90)
7 Bacteria  Bacteroidetes         Bacteroidia     Bacteroidales Porphyromonadaceae     unclassified
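
To compare the patterns side by side, here is a minimal sketch on a plain character vector (the vector x and the string "Group2(100)" are made-up examples, not taken from the data above). It assumes fixed = TRUE is acceptable for the literal case, which avoids having to escape the parentheses, and it shows why a character class like the one in the question can remove more than intended.

```r
# Made-up example strings
x <- c("Bacteria(100)", "Pseudomonas(98)", "unclassified(100)")

# Literal match: only the exact text "(100)" is removed
only_100 <- gsub("(100)", "", x, fixed = TRUE)

# Regex match: any run of digits wrapped in parentheses is removed
all_scores <- gsub("\\(\\d+\\)", "", x)

# Character class: every single "(", ")" or digit is removed,
# including digits that belong to the name itself
char_class <- gsub("[(0-9)]", "", "Group2(100)")

only_100
all_scores
char_class
```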