setnames with duplicate column names in R data.table
For some reason (not relevant here), the column names of a data frame imported from Excel contain the duplicates shown below (DT is a data.table converted from the data frame DF). However, these are meant to be unique column names, so they need to be renamed with setnames.
DF<-structure(list(X1 = c("", "15 May 2014", "16 May 2014", "18 May 2014",
"19 May 2014"), X2 = c(NaN, 746.18, 746.18, 744.34, 739.95),
X3 = c(NaN, 549.9, 549.9, 546.5, 549.65), X1 = c(NaN, 406.57,
406.57, 406.66, 404.73), X1 = c(NaN, 1788.86, 1788.86, 1767.69,
1772.34), X1 = c(NaN, 2286, 2286, 2302.37, 2313.14), X2 = c(NaN,
3639.25, 3639.25, 3622.08, 3569.53), X3 = c(NaN, 1160.13,
1160.13, 1144.77, 1129.72), X1 = c(NaN, 182.83, 182.83, 182.83,
182.83), X2 = c(NaN, 787.13, 787.13, 775.39, 764.82), X1 = c(NaN,
853.2, 853.2, 849.67, 844.49)), .Names = c("X1", "X2", "X3",
"X1", "X1", "X1", "X2", "X3", "X1", "X2", "X1"), class = c("data.table",
"data.frame"), row.names = c(NA, -5L))
DT <- as.data.table(DF)
> DT
X1 X2 X3 X1 X1 X1 X2 X3 X1 X2 X1
1: NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2: 15 May 2014 746.18 549.90 406.57 1788.86 2286.00 3639.25 1160.13 182.83 787.13 853.20
3: 16 May 2014 746.18 549.90 406.57 1788.86 2286.00 3639.25 1160.13 182.83 787.13 853.20
4: 18 May 2014 744.34 546.50 406.66 1767.69 2302.37 3622.08 1144.77 182.83 775.39 849.67
5: 19 May 2014 739.95 549.65 404.73 1772.34 2313.14 3569.53 1129.72 182.83 764.82 844.49
So I resorted to the data.frame way of changing the column names:
names(DT)<-new_names # this doesn't give any error but still gives warnings
Warning message:
In `names<-.data.table`(`*tmp*`, value = c("Date", "BOD", "DO", :
The names(x)<-value syntax copies the whole table. This is due to <- in R itself. Please change to setnames(x,old,new) which does not copy and is faster. See help('setnames'). You can safely ignore this warning if it is inconvenient to change right now. Setting options(warn=2) turns this warning into an error, so you can then use traceback() to find and change your names<- calls.
> DT
Date BOD DO FI HT HY IN MA SE OR RA
1: NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2: 15 May 2014 746.18 549.90 406.57 1788.86 2286.00 3639.25 1160.13 182.83 787.13 853.20
3: 16 May 2014 746.18 549.90 406.57 1788.86 2286.00 3639.25 1160.13 182.83 787.13 853.20
4: 18 May 2014 744.34 546.50 406.66 1767.69 2302.37 3622.08 1144.77 182.83 775.39 849.67
5: 19 May 2014 739.95 549.65 404.73 1772.34 2313.14 3569.53 1129.72 182.83 764.82 844.49
You can simply omit the old names:
setnames(DT, new_names)
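This works as long as new_names contains all the names, in the correct order. From ?setnames:
setnames(x, old, new)
old: When new is provided, character names or numeric positions of column names to change. When new is not provided, the new column names, which must be the same length as the number of columns. See examples.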
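The two calling forms described in the help text can be sketched as follows. This is a minimal illustration on a hypothetical three-column table, not the asker's data; the new names are borrowed from the question only as examples. Renaming by position is shown because it avoids having to refer to a column by a name that several columns share.

```r
library(data.table)

# Hypothetical toy table with a duplicated column name (not the asker's data)
DT <- data.table(X1 = c("15 May 2014", "16 May 2014"),
                 X2 = c(746.18, 746.18),
                 X1 = c(406.57, 406.57))

addr_before <- address(DT)

# Form 1: 'old' omitted, so the vector must supply one name per column,
# in column order
setnames(DT, c("Date", "BOD", "FI"))

# Form 2: 'old' given as a numeric position, renaming only the third column
setnames(DT, 3L, "DO")

names(DT)

# setnames() works by reference, so the table should still live at the
# same memory address
identical(address(DT), addr_before)
```

Because the rename happens in place, there is no need to assign the result back to DT.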
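I had the same problem, but I did not care what new names the columns got; I only needed them to be unique. Combining make.unique or make.names (as suggested) with setnames (as pointed out by @BrodieG) solved my problem: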
# considering your DT object:
setnames(DT, make.unique(names(DT)))
# The new column names are:
names(DT)
## [1] "X1" "X2" "X3" "X1.1" "X1.2" "X1.3" "X2.1" "X3.1" "X1.4" "X2.2" "X1.5"
# Same can be achieved with:
setnames(DT, make.names(names(DT), unique = TRUE))
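The two helpers give the same result for the names in the question, but they are not interchangeable in general. The sketch below uses a hypothetical header vector (the name "flow rate" is invented for illustration) to show the difference: make.unique only appends suffixes to repeats, while make.names also rewrites names that are not syntactically valid in R.

```r
# Hypothetical header row: one repeated valid name, one repeated name
# containing a space (not taken from the asker's data)
raw <- c("X1", "X1", "flow rate", "flow rate")

# Only de-duplicates; the space is left untouched
u <- make.unique(raw)
u
# "X1" "X1.1" "flow rate" "flow rate.1"

# Makes names syntactically valid first, then de-duplicates
v <- make.names(raw, unique = TRUE)
v
# "X1" "X1.1" "flow.rate" "flow.rate.1"
```

Which one to pass to setnames depends on whether the original spelling of the headers should be preserved.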