R: Convert the structure of one data frame to match the structure of another data frame
I currently have two large data frames with more than 300,000 observations and over 100 variables, but for simplicity let's assume I have df1:
> str(df1)
'data.frame': 3000 obs. of 3 variables:
$ Name : chr "AAA" "BBB" "CCC" "DDD" ...
$ DateTime : POSIXct, format: "2014-01-01 00:00:00" "2014-01-01 00:10:00" "2014-01-01 00:20:00" ...
$ Age : num 27 25 27 30 ...
df2:
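Both data frames have the same number of columns and rows, but their structures differ: every column in df2 is a factor.
I would like to convert the structure of df2 so that it matches that of df1. Thanks in advance.

Assuming the columns of the two data frames are in exactly the same order, as described, you can use the class function in a Map approach: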
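df2[] <- Map(function(x, y) {
  # x: a column of df2; y: the class vector of the matching df1 column
  if (any(grepl("POS", y)))
    ISOdate(as.Date(x), 0, 0, 0)   # POSIXct at midnight GMT (time of day is dropped)
  else if (y == "Date")
    as.Date(x)
  else
    `class<-`(as.character(x), y)  # factor labels first, then character/numeric
}, df2, lapply(df1, class))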
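Thanks for the suggestion! However, the HEX and DateTime columns in df2 were changed to numbers (shown as chr in str) and to 1970-01-01, respectively.
I see the problem. Are there more classes, or are chr, num and POSIXct the only classes in df1?
In my original data frame there is also a column of class Date.
We might need exception handling. See if the edit helps.
Great! However, the values in the "Age" column of df2 have changed... (very sorry to bother you)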
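The changed "Age" values reported in the last comment are what one would expect when a factor is converted to numeric directly: R then returns the internal level codes rather than the labels. This is presumably why the code in the answer goes through as.character() first. A minimal sketch, using illustrative values that mirror the sample data in this thread:

```r
# Illustrative values only; they mirror the age column of the sample data
age <- factor(c("30", "27", "25", "28", "23"))

codes  <- as.numeric(age)                # internal level codes, not the ages
values <- as.numeric(as.character(age))  # labels parsed as numbers

print(codes)   # 5 3 2 4 1
print(values)  # 30 27 25 28 23
```

The levels are sorted as "23", "25", "27", "28", "30", so "30" gets code 5, "27" gets code 3, and so on.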
Conversion:
lapply(df1, class)
# $name
# [1] "character"
#
# $date
# [1] "POSIXct" "POSIXt"
#
# $age
# [1] "numeric"
#
# $date2
# [1] "Date"
lapply(df2, class)
# $HEX
# [1] "factor"
#
# $date
# [1] "factor"
#
# $age
# [1] "factor"
#
# $date2
# [1] "factor"
df2[] <- Map(function(x, y) {
  if (any(grepl("POS", y)))
    ISOdate(as.Date(x), 0, 0, 0)
  else if (y == "Date")
    as.Date(x)
  else
    `class<-`(as.character(x), y)
}, df2, lapply(df1, class))
lapply(df2, class)
# $HEX
# [1] "character"
#
# $date
# [1] "POSIXct" "POSIXt"
#
# $age
# [1] "numeric"
#
# $date2
# [1] "Date"
Data:
df1 <- structure(list(name = c("A", "B", "C", "D", "E"), date = structure(c(1577836800,
1580515200, 1583020800, 1585699200, 1588291200), class = c("POSIXct",
"POSIXt")), age = c(30, 27, 25, 28, 23), date2 = structure(c(18262,
18293, 18322, 18353, 18383), class = "Date")), row.names = c(NA,
-5L), class = "data.frame")
df2 <- structure(list(HEX = structure(1:5, .Label = c("A", "B", "C",
"D", "E"), class = "factor"), date = structure(1:5, .Label = c("2020-01-01 01:00:00",
"2020-02-01 01:00:00", "2020-03-01 01:00:00", "2020-04-01 02:00:00",
"2020-05-01 02:00:00"), class = "factor"), age = structure(c(5L,
3L, 2L, 4L, 1L), .Label = c("23", "25", "27", "28", "30"), class = "factor"),
date2 = structure(1:5, .Label = c("2020-01-01", "2020-02-01",
"2020-03-01", "2020-04-01", "2020-05-01"), class = "factor")), row.names = c(NA,
-5L), class = "data.frame")
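One caveat about the conversion: ISOdate(as.Date(x), 0, 0, 0) goes through as.Date(), so it keeps only the calendar date and returns midnight GMT. If the time of day matters, as it would for the 10-minute timestamps in the question, a variant along the following lines may be a starting point. This is only a sketch under assumptions: the helper name match_classes(), the tz = "UTC" default and the small example data frames are hypothetical and not part of the original answer. It also handles only the classes discussed in this thread.

```r
# Hypothetical helper (not from the original answer): coerce each column of
# `target` to the class of the column at the same position in `template`.
match_classes <- function(target, template, tz = "UTC") {
  target[] <- Map(function(x, cls) {
    x <- as.character(x)        # factor -> labels, never level codes
    if ("POSIXct" %in% cls) {
      as.POSIXct(x, tz = tz)    # parses "YYYY-MM-DD HH:MM:SS", keeps the time
    } else if ("Date" %in% cls) {
      as.Date(x)
    } else if ("numeric" %in% cls) {
      as.numeric(x)
    } else {
      x                         # anything else is left as character
    }
  }, target, lapply(template, class))
  target
}

# Made-up example data: the template carries the desired classes,
# the target holds everything as factors
template <- data.frame(
  when = as.POSIXct("2014-01-01 00:00:00", tz = "UTC"),
  age  = 27
)
target <- data.frame(
  when = factor(c("2014-01-01 00:10:00", "2014-01-01 00:20:00")),
  age  = factor(c("25", "30"))
)

result <- match_classes(target, template)
str(result)
```

Like the Map call in the answer, this matches columns by position, so it relies on both data frames having their columns in the same order.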