R: How do I find duplicate rows when only the time part of a POSIXct column differs?
Tags: r, data.table

I have some rows in which the time part of a POSIXct column is missing (that is, it equals 00:00:00). How can I find duplicate rows that differ only in the time? If I use code like this:
dataDuplicates <- data[duplicated(data, by = NULL) | duplicated(data, by = NULL, fromLast = TRUE), ]
dataDuplicates

We can convert the 'ddate' column to the 'Date' class and use it to find the duplicate rows:
d1 <- cbind(data[-3],as.Date(data$ddate))
data[duplicated(d1)|duplicated(d1, fromLast=TRUE),]
#    or   d               ddate rdate changes class price               fdate
#2 VA1 VA2 2014-05-26 14:00:01  <NA>       0     0  2124 2014-05-22 15:03:44
#3 VA1 VA2 2014-05-26 00:00:00  <NA>       0     0  2124 2014-05-22 15:03:44
#  company number minutes               added source
#2    <NA>   <NA>      NA 2014-05-22 12:20:03     s1
#3    <NA>   <NA>      NA 2014-05-22 12:20:03     s1
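The idea behind this answer can be sketched on a smaller, made-up data frame. Converting the POSIXct column to Date drops the time of day, so rows that differ only in time become identical, and combining duplicated() with fromLast = TRUE flags every member of a duplicate group rather than only the later copies. The data frame `df` and its columns `id`, `when`, and `price` are hypothetical, and the time zone is fixed to UTC here only to keep the result predictable.

```r
# Hypothetical toy data: rows 1 and 2 differ only in the time part of 'when'.
df <- data.frame(
  id    = c("A", "A", "A", "B"),
  when  = as.POSIXct(c("2014-05-26 14:00:01", "2014-05-26 00:00:00",
                       "2014-05-27 14:00:01", "2014-05-26 00:00:00"),
                     tz = "UTC"),
  price = c(10L, 10L, 10L, 10L),
  stringsAsFactors = FALSE
)

# Replace the timestamp by its calendar date, so the time no longer matters.
key <- df
key$when <- as.Date(key$when, tz = "UTC")

# duplicated() alone marks only the second and later copies;
# OR-ing with fromLast = TRUE marks the first copy as well.
is_dup <- duplicated(key) | duplicated(key, fromLast = TRUE)

# Index the original data so the full timestamps are still shown.
dups <- df[is_dup, ]
print(dups)
```

Note that the answer's `data[-3]` relies on `data` being a plain data.frame with 'ddate' as its third column. On a data.table, `data[-3]` would drop the third row instead, so the column would have to be removed or replaced by name there.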
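Alternatives (note that these compare only the date part of 'ddate', not the remaining columns, and that the month code is %m, since %M means minutes):
data[duplicated(format(data$ddate, "%Y-%m-%d")) | duplicated(format(data$ddate, "%Y-%m-%d"), fromLast = TRUE), ]
data[duplicated(as.Date(ddate)) | duplicated(as.Date(ddate), fromLast = TRUE)]  # data.table syntax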
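Checking the date on its own is not the same as checking whole rows. The sketch below, on a made-up data frame (`df`, `id`, and `when` are hypothetical names), contrasts the two: the date-only test flags rows that share a day even though their other columns differ, while the test on all columns does not.

```r
# Hypothetical toy data: rows 1 and 2 share a date but have different ids.
df <- data.frame(
  id   = c("A", "B", "C"),
  when = as.POSIXct(c("2014-05-26 14:00:01", "2014-05-26 00:00:00",
                      "2014-05-27 09:00:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Calendar day as text; %m is the month (%M would be the minutes).
day <- format(df$when, "%Y-%m-%d")

# Date-only comparison: ignores every other column.
date_only <- duplicated(day) | duplicated(day, fromLast = TRUE)

# Whole-row comparison with the timestamp replaced by its day.
key <- data.frame(id = df$id, day = day, stringsAsFactors = FALSE)
all_cols <- duplicated(key) | duplicated(key, fromLast = TRUE)

print(date_only)
print(all_cols)
```

On data where rows sharing a date always agree in every other column, the two tests return the same rows. Otherwise the whole-row version is the safer choice.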
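While this code may solve the question, including an explanation of how and why it solves the problem would really help to improve the quality of your post, and would probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and give an indication of what limitations and assumptions apply.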
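One assumption worth stating is the time zone. as.Date() on a POSIXct value converts using the time zone it is given through its tz argument, so a timestamp late in the evening local time can land on the next calendar day when viewed in UTC. A minimal sketch, using a made-up timestamp and "America/New_York" purely for illustration:

```r
# Hypothetical timestamp: 23:30 local time in New York (UTC-4 in May).
x <- as.POSIXct("2014-05-26 23:30:00", tz = "America/New_York")

d_utc   <- as.Date(x, tz = "UTC")               # calendar date as seen in UTC
d_local <- as.Date(x, tz = "America/New_York")  # calendar date as seen locally

# format() uses the time zone stored on the object itself.
d_fmt <- format(x, "%Y-%m-%d")

print(d_utc)
print(d_local)
print(d_fmt)
```

If the timestamps were recorded in a local time zone, passing that zone explicitly to as.Date() (or comparing the output of format()) keeps rows from being assigned to the wrong day before duplicated() is applied.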
# Sample data used in the examples
zz <- "or,d,ddate,rdate,changes,class,price,fdate,company,number,minutes,added,source
VA3,VA4,2014-05-24 12:23:00,,0,0,2124,2014-05-22 15:50:16,,,,2014-05-22 12:20:03,ss
VA1,VA2,2014-05-26 14:00:01,,0,0,2124,2014-05-22 15:03:44,,,,2014-05-22 12:20:03,s1
VA1,VA2,2014-05-26 00:00:00,,0,0,2124,2014-05-22 15:03:44,,,,2014-05-22 12:20:03,s1
VA1,VA2,2014-05-27 14:00:01,,0,0,2124,2014-05-22 15:03:44,,,,2014-05-22 12:20:03,s1
VA5,VA6,2014-06-05 18:00:04,,0,0,2124,2014-05-22 15:48:24,,,,2014-05-22 12:20:03,s1
VA7,VA8,2014-06-09 18:00:07,,0,0,2124,2014-05-22 15:37:35,,,,2014-05-22 12:20:03,s2
VA9,VA0,2014-06-16 19:00:20,,0,0,2124,2014-05-22 14:17:33,,,,2014-05-22 12:20:03,ss"
columnClasses <- c("factor", "factor", "POSIXct", "factor", "integer", "factor", "integer", "factor", "factor", "factor", "integer", "factor", "factor")
data <- read.table(text=zz, header = TRUE, sep = ",", comment.char = "", quote = "", na.strings = c(""), colClasses = columnClasses)