R: Simple expansion of rows in a data.table by group ID
For the following dataset:
> df <- data.table(ID=LETTERS[1:4],y_min=c(1970,1973,1976,1971),y_max=c(1974,1975,1980,1974))
> df
ID y_min y_max
1: A 1970 1974
2: B 1973 1975
3: C 1976 1980
4: D 1971 1974
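Thanks in advance.

One option is to create a list column by taking the sequence (:) from 'y_min' to 'y_max', grouped by the 'ID' column, and then unnest that list column.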
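library(data.table)
library(tidyr)
# Add a list column 'year' holding y_min:y_max for each ID (modifies df by reference)
df[, year := .(list(y_min:y_max)), ID]
# Expand the list column so that every year gets its own row
df %>%
  unnest(c(year))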
-output
# A tibble: 17 x 4
# ID y_min y_max year
# <chr> <dbl> <dbl> <int>
# 1 A 1970 1974 1970
# 2 A 1970 1974 1971
# 3 A 1970 1974 1972
# 4 A 1970 1974 1973
# 5 A 1970 1974 1974
# 6 B 1973 1975 1973
# 7 B 1973 1975 1974
# 8 B 1973 1975 1975
# 9 C 1976 1980 1976
#10 C 1976 1980 1977
#11 C 1976 1980 1978
#12 C 1976 1980 1979
#13 C 1976 1980 1980
#14 D 1971 1974 1971
#15 D 1971 1974 1972
#16 D 1971 1974 1973
#17 D 1971 1974 1974
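For comparison, the same expansion can probably be sketched without tidyr by grouping on all three existing columns, so that each input row forms its own group and j returns one year per row. This is only a minimal sketch; it assumes the (ID, y_min, y_max) combinations are unique, and the name long is made up for illustration.

```r
library(data.table)

df <- data.table(ID = LETTERS[1:4],
                 y_min = c(1970, 1973, 1976, 1971),
                 y_max = c(1974, 1975, 1980, 1974))

# Grouping by every column makes each input row its own group;
# inside j the grouping columns are length-1 values, so y_min:y_max
# yields the full run of years for that row.
long <- df[, .(year = y_min:y_max), by = .(ID, y_min, y_max)]
long
```

Unlike the := version, this leaves df untouched and returns a new data.table.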
Or with a self-join
df[df, .(y_min, y_max, year = y_min:y_max), on = .(ID), by = .EACHI]
-output
# ID y_min y_max year
# 1: A 1970 1974 1970
# 2: A 1970 1974 1971
# 3: A 1970 1974 1972
# 4: A 1970 1974 1973
# 5: A 1970 1974 1974
# 6: B 1973 1975 1973
# 7: B 1973 1975 1974
# 8: B 1973 1975 1975
# 9: C 1976 1980 1976
#10: C 1976 1980 1977
#11: C 1976 1980 1978
#12: C 1976 1980 1979
#13: C 1976 1980 1980
#14: D 1971 1974 1971
#15: D 1971 1974 1972
#16: D 1971 1974 1973
#17: D 1971 1974 1974
Thanks. Do you think there might be a pure data.table solution, perhaps using CJ?
@lovestacksflow Updated the post with a data.table solution.
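As a further data.table-only possibility alongside the self-join, the rows could be replicated by the length of each range and the year filled in from a per-ID counter. This is a rough sketch rather than the method from the answer; it does not use CJ, it assumes each ID appears once in df, and the names n and out are made up for illustration.

```r
library(data.table)

df <- data.table(ID = LETTERS[1:4],
                 y_min = c(1970, 1973, 1976, 1971),
                 y_max = c(1974, 1975, 1980, 1974))

# Number of years covered by each row (both endpoints included)
n <- df[, y_max - y_min + 1]

# Repeat each row n times, then count upward from y_min within each ID
out <- df[rep(seq_len(nrow(df)), n)]
out[, year := y_min + rowid(ID) - 1]
out
```

Here year comes out as a double rather than an integer, because it is computed from the numeric y_min column.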