Remove an entire column from a data.frame in R


Does anyone know how to remove an entire column from a data.frame in R? For example, if I am given this data.frame:

> head(data)
   chr       genome region
1 chr1 hg19_refGene    CDS
2 chr1 hg19_refGene   exon
3 chr1 hg19_refGene    CDS
4 chr1 hg19_refGene   exon
5 chr1 hg19_refGene    CDS
6 chr1 hg19_refGene   exon

I want to remove the second column.

You can set it to NULL:

> Data$genome <- NULL
> head(Data)
   chr region
1 chr1    CDS
2 chr1   exon
3 chr1    CDS
4 chr1   exon
5 chr1    CDS
6 chr1   exon


As pointed out in the comments, here are some other possibilities:

Data[2] <- NULL    # Wojciech Sobala
Data[[2]] <- NULL  # same as above
Data <- Data[,-2]  # Ian Fellows
Data <- Data[-2]   # same as above
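
As a quick check that these forms are interchangeable, here is a minimal sketch. The data frame below is a hypothetical reconstruction of the six rows printed in the question, not the asker's real data, and the result names (by_name, by_pos, and so on) are made up for illustration.

```r
# Hypothetical stand-in for the data frame shown in the question.
Data <- data.frame(
  chr    = rep("chr1", 6),
  genome = rep("hg19_refGene", 6),
  region = rep(c("CDS", "exon"), 3)
)

by_name  <- Data; by_name$genome <- NULL   # assign NULL by name
by_pos   <- Data; by_pos[2]      <- NULL   # assign NULL by position
by_list  <- Data; by_list[[2]]   <- NULL   # list-style assignment
neg_mat  <- Data[, -2]                     # negative index, matrix style
neg_list <- Data[-2]                       # negative index, list style

# Compare every variant against the first one.
results  <- list(by_pos, by_list, neg_mat, neg_list)
all_same <- all(vapply(results,
                       function(x) isTRUE(all.equal(x, by_name)),
                       logical(1)))
print(all_same)
print(names(by_name))
```

The two assignment forms modify Data in place, while the negative-index forms return a new object that has to be assigned back.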

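To remove one or more columns by name, when the column names are known (as opposed to being determined at run time), I like the subset() syntax. For example, for the data frame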
df <- data.frame(a=1:3, d=2:4, c=3:5, b=4:6)

to remove just the a column you could do
df <- subset(df, select = -a)

and to remove the b and d columns you could do
df <- subset(df, select = -c(d, b))

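As I said above, this syntax works only when the column names are known. It won't work when the column names are determined programmatically (i.e., assigned to a variable). I'll reproduce this warning from the subset documentation:
Warning:
This is a convenience function intended for use interactively.
For programming it is better to use the standard subsetting
functions like '[', and in particular the non-standard evaluation
of argument 'subset' can have unanticipated consequences.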
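
To make the limitation concrete, here is a small sketch of what happens when the names live in a variable. The variable name cols_to_drop is hypothetical, and the setdiff() approach is just one standard-indexing option, not the only one.

```r
df <- data.frame(a = 1:3, d = 2:4, c = 3:5, b = 4:6)

# Hypothetical variable holding the names to drop, e.g. computed at run time.
cols_to_drop <- c("d", "b")

# subset() evaluates -cols_to_drop on a character vector, which is an error.
attempt <- tryCatch(subset(df, select = -cols_to_drop),
                    error = function(e) e)
print(inherits(attempt, "error"))

# Standard indexing with '[' works with names stored in a variable.
kept <- df[, setdiff(names(df), cols_to_drop), drop = FALSE]
print(names(kept))
```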
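
The posted answers are very good when working with data.frames. However, these tasks can be quite inefficient from a memory perspective. With large data, removing a column can take an unusually long time and/or fail with an out-of-memory error. The data.table package helps address this problem with the := operator: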
library(data.table)
> dt <- data.table(a = 1, b = 1, c = 1)
> dt[,a:=NULL]
     b c
[1,] 1 1
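
The same operator also handles several columns at once, and names held in a variable. The following is a minimal sketch assuming the data.table package is installed; the table contents and the variable cols_to_drop are made up for illustration.

```r
library(data.table)

dt <- data.table(a = 1, b = 2, c = 3, d = 4)

# Drop several columns at once, by reference (no copy of dt is made).
dt[, c("a", "b") := NULL]

# Names held in a variable must be wrapped in parentheses on the left of :=,
# otherwise data.table would look for a column literally called cols_to_drop.
cols_to_drop <- "c"   # hypothetical run-time value
dt[, (cols_to_drop) := NULL]

print(names(dt))
```

Because := modifies dt in place, there is no need to assign the result back.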

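I should put together a bigger example to show the difference. I'll update this answer at some point.

(For completeness) If you want to remove columns by name, you can do this: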
cols.dont.want <- "genome"
cols.dont.want <- c("genome", "region") # if you want to remove multiple columns

data <- data[, !names(data) %in% cols.dont.want, drop = FALSE]
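
The drop argument matters when only one column survives. Here is a short sketch of the difference, using a hypothetical reconstruction of the data frame from the question; the names keep, with_drop and without_drop are illustrative only.

```r
# Hypothetical stand-in for the data frame shown in the question.
data <- data.frame(
  chr    = rep("chr1", 6),
  genome = rep("hg19_refGene", 6),
  region = rep(c("CDS", "exon"), 3)
)

cols.dont.want <- c("genome", "region")
keep <- !names(data) %in% cols.dont.want   # logical vector, TRUE for "chr" only

with_drop    <- data[, keep]                 # default drop = TRUE
without_drop <- data[, keep, drop = FALSE]

print(is.data.frame(with_drop))      # the single column collapses to a vector
print(is.data.frame(without_drop))   # the result stays a data frame
```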

With this you can remove the column and store the result in another variable:
df = subset(data, select = -c(genome) )

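Using dplyr::select() and some helper functions, you can remove one or more columns. The helper functions can be useful because some of them do not require naming every specific column to be dropped. Note that to drop columns with select() you need a leading - to negate the column names.
Using the dplyr::starwars sample data for some variety in column names: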
library(dplyr)

starwars %>% 
  select(-height) %>%                  # a specific column name
  select(-one_of('mass', 'films')) %>% # any columns named in one_of()
  select(-(name:hair_color)) %>%       # the range of columns from 'name' to 'hair_color'
  select(-contains('color')) %>%       # any column name that contains 'color'
  select(-starts_with('bi')) %>%       # any column name that starts with 'bi'
  select(-ends_with('er')) %>%         # any column name that ends with 'er'
  select(-matches('^v.+s$')) %>%       # any column name matching the regex pattern
  select_if(~!is.list(.)) %>%          # not by column name but by data type
  head(2)

# A tibble: 2 x 2
homeworld species
  <chr>     <chr>  
1 Tatooine  Human  
2 Tatooine  Droid 
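
In dplyr 1.0.0 and later, one_of() and select_if() are superseded by any_of()/all_of() and where(). A minimal sketch of the same idea with the newer helpers, assuming a recent dplyr is installed; the vector cols_to_drop is hypothetical and deliberately includes a name that does not exist in starwars.

```r
library(dplyr)

# Hypothetical names to drop; "not_a_column" is not present in starwars.
cols_to_drop <- c("mass", "films", "not_a_column")

trimmed <- starwars %>%
  select(-any_of(cols_to_drop)) %>%   # names that are missing are ignored
  select(!where(is.list))             # drop remaining list-columns by type

print(names(trimmed))
```

any_of() is the forgiving choice here; all_of() would raise an error for the missing name, which can be preferable when a typo should not pass silently.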

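Or you could use: Data <- Data[,-2]

...and with the comma you can also control the "drop" argument: if it is FALSE, the data.frame stays a data.frame even when the result contains only one column. Without the comma you always get a data.frame, whether [-2] leaves many columns or just one; drop is ignored for that kind of extraction.

@mdsumner Data[-2] does not need the drop argument because it always returns a data.frame from a data.frame. I think it is a better (and faster) way to select columns (and only columns) in a data.frame. Check cars[-1] (a one-column data.frame) or, better, cars[-(1:2)]: a data frame with 0 columns and 50 rows.

You can also write Data[2] <- NULL

Small hint: to remove multiple columns, Data[c(1,2)] <- NULL

The data.table::set function can be used on data.frames to remove or modify a column instantly, without making copies.

Any ideas on how to remove columns that contain a specific value in any row (rather than by column name, as you suggested above)?

df[, -which(sapply(df, function(x) any(x == a)))] where df is your data frame and a is your specific value, for example: mtcars[, -which(sapply(mtcars, function(x) any(x == 4)))]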
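
One caveat with the -which() idiom from the last comment: when no column contains the value, which() returns an empty vector and the negative index then selects nothing, so every column is dropped. The sketch below, on a made-up data frame, shows this and a logical-index alternative; the names has_value, via_which and via_logical are hypothetical.

```r
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(4, NA, 7))
value <- 99   # hypothetical value that appears in no column

# TRUE for each column that contains the value; na.rm avoids NA results.
has_value <- vapply(df, function(x) any(x == value, na.rm = TRUE), logical(1))

# -which() with no matches is an empty index, so all columns are lost.
via_which <- df[, -which(has_value)]
print(ncol(via_which))

# Negating the logical vector keeps every column when nothing matches.
via_logical <- df[, !has_value, drop = FALSE]
print(names(via_logical))

# With a value that does occur, only the matching columns are dropped.
has_four <- vapply(df, function(x) any(x == 4, na.rm = TRUE), logical(1))
print(names(df[, !has_four, drop = FALSE]))
```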

starwars %>% 
  select(-2, -(4:10)) # column 2 and columns 4 through 10