How to tidy a weekyear variable in a dataset

r date

I have a dataset with a weekyear variable. For example:

Weekyear
12016
22016
32016
...
422016
432016
442016

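As you can probably tell, this causes some difficulty: treating the variable as an integer does not let me sort in descending order.

I would therefore like to change the variable from 12016 to 201601 so that descending sort works. This would be easy if all my values had the same number of characters, but they do not (for example 12016 and 432016).

Does anyone know how to handle this variable? Thanks in advance,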

Diederik

You can use modular arithmetic and integer division to extract the year and the week:

x <- 432016
year <- x %% 10000   # remainder: the last four digits (2016)
week <- x %/% 10000  # integer quotient: the leading digits (43)
week <- sprintf("%02d", week) # make sure single digits have leading zeros
new_x <- paste0(year, week)
new_x <- as.integer(new_x)
new_x
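
The same arithmetic is vectorised, so it can be applied to a whole column rather than a single value. Below is a minimal sketch, assuming a data frame called `df` with a numeric `Weekyear` column (both names are illustrative, only the values come from the question). It builds the key as `year * 100 + week`, which should give the same result as pasting the zero-padded week, and then sorts in descending order:

```r
# Hypothetical data frame; only the Weekyear values come from the question
df <- data.frame(Weekyear = c(12016, 22016, 32016, 422016, 432016, 442016))

year <- df$Weekyear %% 10000   # last four digits
week <- df$Weekyear %/% 10000  # remaining leading digits

# year * 100 + week yields e.g. 201601 without any string handling
df$yearweek <- as.integer(year * 100 + week)

# sort the rows in descending order of the new key
df <- df[order(df$yearweek, decreasing = TRUE), ]
df$yearweek
```

Staying in numeric space avoids the round trip through `sprintf` and `paste0`, at the cost of being slightly less explicit about the zero padding.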

You can use stringr::str_sub to get the desired format:
text <- c(12016, 22016, 32016, 422016, 432016, 442016)

# Getting the year (the last four characters)
years <- stringr::str_sub(text, -4)

# Getting the weeks (everything before the last four characters)
weeks <- stringr::str_sub(text, end = nchar(text) - 4)
weeks <- ifelse(nchar(weeks) == 1, paste0(0, weeks), weeks)

as.integer(paste0(years, weeks))
[1] 201601 201602 201603 201642 201643 201644

Here is a very short approach using regex. No packages are needed.
To make it easier to follow I have split it into two steps, but you can nest the calls.
text <- c(12016, 22016, 32016, 422016, 432016, 442016)

# first add a zero to weeks with one digit
text1 <- gsub("(\\b\\d{5}\\b)", "0\\1", text)

# then change position of first two and last four digits
gsub("([0-9]{2})([0-9]{4})", "\\2\\1", text1)
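
For reference, the nested form mentioned above might look like the following sketch. It reuses the two patterns from the steps above unchanged; the variable name `yearweek` and the final `as.integer` conversion are additions for illustration, so the result can be sorted numerically:

```r
text <- c(12016, 22016, 32016, 422016, 432016, 442016)

# inner gsub pads five-digit values with a leading zero,
# outer gsub swaps the two week digits and the four year digits
yearweek <- gsub("([0-9]{2})([0-9]{4})", "\\2\\1",
                 gsub("(\\b\\d{5}\\b)", "0\\1", text))

# convert the character result to integer for numeric sorting
yearweek <- as.integer(yearweek)
yearweek
```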

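library(stringr)

# pad every value to six characters with leading zeros
text_paded <- str_pad(text, width = 6, side = "left", pad = "0")

# year (last four characters) followed by week (first two characters)
as.integer(paste0(str_sub(text_paded, start = -4), str_sub(text_paded, end = 2)))
[1] 201601 201602 201603 201642 201643 201644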
