Regex: using gsub to create a uniform time format
I am trying to reformat times that are currently stored as character strings so that they are all uniform. Right now they look like this:
[1] "1:00PM" "1:10 PM" "1:10PM" "1:20 PM" "1:30 PM" "1:30PM"
[7] "1:40 PM" "10:00AM" "10:10 AM" "10:10AM" "10:30 AM" "10:30AM"
[13] "10:45 AM" "10:45AM" "10:50 AM" "10:50AM" "10AM" "11:00AM"
[19] "11:10 AM" "11:10AM" "11:40 AM" "11:40AM" "11AM" "12:00PM"
[25] "12:05 PM" "12:10 PM" "12:10PM" "12:25PM" "12:30 PM" "12:30PM"
[31] "12:45 PM" "12:45:30 PM" "12:45PM" "12:50 PM" "12PM" "1PM"
[37] "2:00PM" "2:10 PM" "2:10PM" "2:20PM" "2:30 PM" "2:30PM"
[43] "2:35 PM" "2:45 PM" "2:45PM" "2:55 PM" "2PM" "3:00PM"
[49] "3:05 PM" "3:10 PM" "3:10PM" "3:20 PM" "3:20PM" "3:25 PM"
[55] "3:25PM" "3:30 PM" "3:35 PM" "3:35PM" "3:45 PM" "3:45PM"
[61] "3PM" "4:00PM" "4:10 PM" "4:10PM" "4:30 PM" "4:30PM"
[67] "4:35 PM" "4:35PM" "4PM" "5:00PM" "5:10 PM" "5:10PM"
[73] "5:20 PM" "5:30 PM" "5:30PM" "5:35 PM" "5:35PM" "5:40 PM"
[79] "5:40PM" "5:45 PM" "5:50 PM" "5:50PM" "6:00PM" "6:05PM"
[85] "6:10 PM" "6:10PM" "6:15PM" "6:30 PM" "6:30PM" "6PM"
[91] "7:00PM" "7:10 AM" "7:10 PM" "7:10AM" "7:10PM" "7:30PM"
[97] "7:35 PM" "7:35PM" "7:45 PM" "7:45PM" "7AM" "7PM"
[103] "8:00AM" "8:10 AM" "8:10AM" "8:25 PM" "8:25PM" "8:50 PM"
[109] "8AM" "9:00AM" "9:10 AM" "9:10AM" "9:15 AM" "9:15AM"
[115] "9:20 AM" "9:30 AM" "9:30AM" "9:35AM" "9:40 AM" "9:40AM"
[121] "9:45 AM" "9:45AM" "9AM"
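I want all times in this format:
1:00PM instead of 1PM, and 12:45PM instead of 12:45:30 PM. So basically HH:MM followed by AM or PM.

Ultimately I want to convert the times from character to POSIXct, but that is only possible once the character format is uniform. More specifically: how would you use gsub to change "3PM" to "3:00PM" and "12:45:30 PM" to "12:45PM"? I have trouble understanding some of the regex syntax in gsub, especially how to refer to a specific position, such as position 4 in a string.

We create an index (indx) for the elements that do not contain ":" (10AM, 11AM, etc.). Using sub, we insert ":00" into those elements, so that 10AM, 11AM, ... become 10:00AM, 11:00AM, ....

In the second sub, we match one or more digits at the start of the string followed by ":" and two digits (\\d{2}) and capture that as a group with parentheses. We then match the characters that are not A, M, or P ([^AMP]+), capture the first AM/PM character as the second group, and use the first and second capture groups (\\1\\2) as the replacement. The trailing "M" lies outside the match, so it is left in place. Finally, we use strsplit/sprintf to pad a leading 0 onto the elements whose hour has only one digit.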
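# Elements without a colon ("10AM", "3PM", ...) need ":00" inserted
indx <- !grepl(':', str1)
str1[indx] <- sub('(\\d+)(.*)', '\\1:00\\2', str1[indx])
# Drop the seconds and/or the space between the minutes and AM/PM
str1 <- sub('(^\\d+:\\d{2})[^AMP]+([AMP])', '\\1\\2', str1)
# Zero-pad the hour to two digits
sapply(strsplit(str1, ':'), function(x) paste(sprintf('%02d',
    as.numeric(x[1])), x[2], sep=":"))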
#[1] "01:00PM" "01:10PM" "01:10PM" "01:20PM" "01:30PM" "01:30PM" "01:40PM"
#[8] "10:00AM" "10:10AM" "10:10AM" "10:30AM" "10:30AM" "10:45AM" "10:45AM"
#[15] "10:50AM" "10:50AM" "10:00AM" "11:00AM" "11:10AM" "11:10AM" "11:40AM"
#[22] "11:40AM" "11:00AM" "12:00PM" "12:05PM" "12:10PM" "12:10PM" "12:25PM"
#[29] "12:30PM" "12:30PM" "12:45PM" "12:45PM" "12:45PM" "12:50PM" "12:00PM"
#[36] "01:00PM" "02:00PM" "02:10PM" "02:10PM" "02:20PM" "02:30PM" "02:30PM"
#[43] "02:35PM" "02:45PM" "02:45PM" "02:55PM" "02:00PM" "03:00PM" "03:05PM"
#[50] "03:10PM" "03:10PM" "03:20PM" "03:20PM" "03:25PM" "03:25PM" "03:30PM"
#[57] "03:35PM" "03:35PM" "03:45PM" "03:45PM" "03:00PM" "04:00PM" "04:10PM"
#[64] "04:10PM" "04:30PM" "04:30PM" "04:35PM" "04:35PM" "04:00PM" "05:00PM"
#[71] "05:10PM" "05:10PM" "05:20PM" "05:30PM" "05:30PM" "05:35PM" "05:35PM"
#[78] "05:40PM" "05:40PM" "05:45PM" "05:50PM" "05:50PM" "06:00PM" "06:05PM"
#[85] "06:10PM" "06:10PM" "06:15PM" "06:30PM" "06:30PM" "06:00PM" "07:00PM"
#[92] "07:10AM" "07:10PM" "07:10AM" "07:10PM" "07:30PM" "07:35PM" "07:35PM"
#[99] "07:45PM" "07:45PM" "07:00AM" "07:00PM" "08:00AM" "08:10AM" "08:10AM"
#[106] "08:25PM" "08:25PM" "08:50PM" "08:00AM" "09:00AM" "09:10AM" "09:10AM"
#[113] "09:15AM" "09:15AM" "09:20AM" "09:30AM" "09:30AM" "09:35AM" "09:40AM"
#[120] "09:40AM" "09:45AM" "09:45AM" "09:00AM"
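To make the three steps easier to follow, here is a small trace on a hand-picked sample rather than the full str1. The names x, no_colon, step1, step2, step3, and parts are illustrative only; the regular expressions are the ones used above.

```r
# A small sample covering each irregular shape in the data
x <- c("3PM", "12:45:30 PM", "1:10 PM", "10:00AM")

# Step 1: insert ":00" after the hour where the string has no colon
no_colon <- !grepl(":", x)
x[no_colon] <- sub("(\\d+)(.*)", "\\1:00\\2", x[no_colon])
step1 <- x
# "3:00PM" "12:45:30 PM" "1:10 PM" "10:00AM"

# Step 2: keep "H:MM" (group 1) and the first AM/PM letter (group 2),
# dropping whatever sits between them (seconds and/or a space)
step2 <- sub("(^\\d+:\\d{2})[^AMP]+([AMP])", "\\1\\2", step1)
# "3:00PM" "12:45PM" "1:10PM" "10:00AM"

# Step 3: split on ":" and zero-pad the hour to two digits
parts <- strsplit(step2, ":")
step3 <- sapply(parts, function(p) {
  paste(sprintf("%02d", as.numeric(p[1])), p[2], sep = ":")
})
# "03:00PM" "12:45PM" "01:10PM" "10:00AM"
```

Strings that are already in H:MMAM form, such as "10:00AM", pass through step 2 unchanged because [^AMP]+ needs at least one character between the minutes and the AM/PM marker.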
Or we can use the lubridate package, whose parsing functions accept several candidate format strings:
library(lubridate)
paste0(format(parse_date_time(str1, orders=guess_formats(gsub('[APM]',
'', str1), c('hm', 'hms', 'h'))), '%H:%M'), sub('[^AMP]+', '', str1))
#[1] "01:00PM" "01:10PM" "01:10PM" "01:20PM" "01:30PM" "01:30PM" "01:40PM"
#[8] "10:00AM" "10:10AM" "10:10AM" "10:30AM" "10:30AM" "10:45AM" "10:45AM"
#[15] "10:50AM" "10:50AM" "10:00AM" "11:00AM" "11:10AM" "11:10AM" "11:40AM"
#[22] "11:40AM" "11:00AM" "12:00PM" "12:05PM" "12:10PM" "12:10PM" "12:25PM"
#[29] "12:30PM" "12:30PM" "12:45PM" "12:45PM" "12:45PM" "12:50PM" "12:00PM"
#[36] "01:00PM" "02:00PM" "02:10PM" "02:10PM" "02:20PM" "02:30PM" "02:30PM"
#[43] "02:35PM" "02:45PM" "02:45PM" "02:55PM" "02:00PM" "03:00PM" "03:05PM"
#[50] "03:10PM" "03:10PM" "03:20PM" "03:20PM" "03:25PM" "03:25PM" "03:30PM"
#[57] "03:35PM" "03:35PM" "03:45PM" "03:45PM" "03:00PM" "04:00PM" "04:10PM"
#[64] "04:10PM" "04:30PM" "04:30PM" "04:35PM" "04:35PM" "04:00PM" "05:00PM"
#[71] "05:10PM" "05:10PM" "05:20PM" "05:30PM" "05:30PM" "05:35PM" "05:35PM"
#[78] "05:40PM" "05:40PM" "05:45PM" "05:50PM" "05:50PM" "06:00PM" "06:05PM"
#[85] "06:10PM" "06:10PM" "06:15PM" "06:30PM" "06:30PM" "06:00PM" "07:00PM"
#[92] "07:10AM" "07:10PM" "07:10AM" "07:10PM" "07:30PM" "07:35PM" "07:35PM"
#[99] "07:45PM" "07:45PM" "07:00AM" "07:00PM" "08:00AM" "08:10AM" "08:10AM"
#[106] "08:25PM" "08:25PM" "08:50PM" "08:00AM" "09:00AM" "09:10AM" "09:10AM"
#[113] "09:15AM" "09:15AM" "09:20AM" "09:30AM" "09:30AM" "09:35AM" "09:40AM"
#[120] "09:40AM" "09:45AM" "09:45AM" "09:00AM"
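Whichever approach is used, the result can be checked against the target pattern and then converted to POSIXct, which was the stated goal of the question. The following is only a sketch under some assumptions: the sample vector normalized is made up for illustration, tz = "UTC" is an arbitrary choice, and because %p is locale-dependent the code temporarily switches LC_TIME to the C locale.

```r
# Sample of already-normalized strings plus one deliberately bad value
normalized <- c("01:00PM", "12:45PM", "07:10AM", "12:00PM", "3PM")

# TRUE where the string is exactly HH:MM followed by AM or PM
ok <- grepl("^(0[1-9]|1[0-2]):[0-5][0-9][AP]M$", normalized)
normalized[!ok]
# "3PM"

# %p depends on the locale; force the C locale for the parse, then restore it
old_locale <- Sys.getlocale("LC_TIME")
invisible(Sys.setlocale("LC_TIME", "C"))
times <- as.POSIXct(normalized[ok], format = "%I:%M%p", tz = "UTC")
invisible(Sys.setlocale("LC_TIME", old_locale))

# No date was supplied, so the current date is filled in;
# only the clock time is shown here
clock <- format(times, "%H:%M")
# "13:00" "12:45" "07:10" "12:00"
```

If the times belong to known dates, it is probably better to paste the date onto each string and extend the format accordingly, rather than rely on the current date being filled in.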
Alternatively, once the two sub calls have been applied, the strsplit/sprintf step can be replaced by str_pad from stringr, which left-pads each string with 0 to a width of 7:
library(stringr)
str_pad(str1, pad='0', width=7)

Comment: The first solution, using strsplit and sapply, works perfectly. Thank you very much for the elegant solution.

Data
str1 <- c("1:00PM", "1:10 PM", "1:10PM", "1:20 PM", "1:30 PM", "1:30PM",
"1:40 PM", "10:00AM", "10:10 AM", "10:10AM", "10:30 AM", "10:30AM",
"10:45 AM", "10:45AM", "10:50 AM", "10:50AM", "10AM", "11:00AM",
"11:10 AM", "11:10AM", "11:40 AM", "11:40AM", "11AM", "12:00PM",
"12:05 PM", "12:10 PM", "12:10PM", "12:25PM", "12:30 PM", "12:30PM",
"12:45 PM", "12:45:30 PM", "12:45PM", "12:50 PM", "12PM", "1PM",
"2:00PM", "2:10 PM", "2:10PM", "2:20PM", "2:30 PM", "2:30PM",
"2:35 PM", "2:45 PM", "2:45PM", "2:55 PM", "2PM", "3:00PM", "3:05 PM",
"3:10 PM", "3:10PM", "3:20 PM", "3:20PM", "3:25 PM", "3:25PM",
"3:30 PM", "3:35 PM", "3:35PM", "3:45 PM", "3:45PM", "3PM", "4:00PM",
"4:10 PM", "4:10PM", "4:30 PM", "4:30PM", "4:35 PM", "4:35PM",
"4PM", "5:00PM", "5:10 PM", "5:10PM", "5:20 PM", "5:30 PM", "5:30PM",
"5:35 PM", "5:35PM", "5:40 PM", "5:40PM", "5:45 PM", "5:50 PM",
"5:50PM", "6:00PM", "6:05PM", "6:10 PM", "6:10PM", "6:15PM",
"6:30 PM", "6:30PM", "6PM", "7:00PM", "7:10 AM", "7:10 PM", "7:10AM",
"7:10PM", "7:30PM", "7:35 PM", "7:35PM", "7:45 PM", "7:45PM",
"7AM", "7PM", "8:00AM", "8:10 AM", "8:10AM", "8:25 PM", "8:25PM",
"8:50 PM", "8AM", "9:00AM", "9:10 AM", "9:10AM", "9:15 AM", "9:15AM",
"9:20 AM", "9:30 AM", "9:30AM", "9:35AM", "9:40 AM", "9:40AM",
"9:45 AM", "9:45AM", "9AM")