Previous December's value for the whole following year in R

r database dataframe

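I have monthly data like the sample below: the first column is the date and the second column is me.

I need to do something simple: create another variable that, for the whole of 1994, takes the value of me from December 1993; likewise, every month of 1995 gets the December 1994 value, and so on. If the previous December is not available, the result should be NA.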

structure(list(date = structure(c(8673, 8702, 8734, 8765, 8796, 
8824, 8855, 8884, 8916, 8946, 8975, 9008, 9038, 9069, 9099, 9129, 
9161, 9189, 9220, 9248, 9281, 9311, 9342, 9373, 9402, 9434, 9464, 
9493, 9526, 9555, 9584, 9616, 9647, 9675, 9708, 9738, 9769, 9800, 
9829, 9861, 9892, 9920, 9951, 9981, 10011, 10042, 10073, 10102, 
10134, 10165, 10193, 10226, 10256, 10284, 10316, 10346, 10375, 
10407, 10438, 10469, 10499, 10529, 10560, 10591, 10620, 10648, 
10681, 10711, 10739, 10772, 10802, 10834, 10864, 10893, 10925, 
10956, 10987, 11016, 11047, 11075, 11108, 11138, 11169, 11200, 
11229, 11261, 11291, 11320), class = "Date"), me = c(41535, 39458.25, 
38766, 43611.75, 54687.75, 65763.75, 66456, 92069.25, 89300.25, 
82452.125, 81066.375, 76909.125, 70698.75, 79709.375, 77630, 
71391.875, 69312.5, 69312.5, 70542.8125, 52621.125, 46520.125, 
43469.625, 45757.5, 43850.9375, 40492, 32088, 38964, 35149.75, 
32857.375, 35149.75, 29074.75, 26779.375, 27544.5, 32140.5, 32905.75, 
32905.75, 34436.25, 31375.25, 32140.5, 29878.875, 39838.5, 42519.9375, 
42707.25, 40014, 43861.5, 51615.125, 46992.875, 46992.875, 53996.25, 
47053.875, 47053.875, 46706, 50180, 56356, 65641.25, 69116.375, 
65255.125, 60469.5, 62020, 41863.5, 48919.5, 55908, 57461, 57970.3125, 
59137.5, 53301.5625, 68475, 72365.625, 65751.5625, 71587.5, 85982.8125, 
73921.875, 84496.5, 82149.375, 79019.875, 89973.125, 99752.8125, 
106794.1875, 103425.5625, 123669, 143544.375, 143325, 139668.75, 
143325, 139536, 122820.75, 125001, 101933.0625)), .Names = c("date", 
"me"), class = "data.frame", row.names = c(81L, 80L, 79L, 82L, 
87L, 91L, 92L, 88L, 83L, 90L, 94L, 86L, 84L, 93L, 89L, 85L, 102L, 
101L, 95L, 105L, 96L, 106L, 99L, 100L, 104L, 98L, 97L, 103L, 
108L, 107L, 112L, 111L, 109L, 110L, 114L, 117L, 115L, 116L, 118L, 
113L, 123L, 125L, 130L, 128L, 119L, 122L, 127L, 120L, 126L, 129L, 
121L, 124L, 140L, 136L, 139L, 137L, 134L, 132L, 131L, 141L, 133L, 
135L, 138L, 142L, 146L, 153L, 154L, 150L, 148L, 144L, 149L, 152L, 
143L, 145L, 151L, 147L, 165L, 157L, 156L, 163L, 164L, 160L, 161L, 
158L, 155L, 166L, 162L, 159L))

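Here is one possible solution using lubridate and dplyr:

library(dplyr)      # provides %>%, mutate(), group_by() and ungroup()
library(zoo)        # loaded in the original answer; not used in the code shown here
library(lubridate)

We first create some simple helper variables:

d <- d %>% 
  mutate(date = ymd(date),
         month = month(date),
         year = year(date)) %>% 
  group_by(year) %>% # for each year, keep only the December value in new_var
  mutate(new_var = ifelse(month == 12, me, NA)) %>% ungroup()

You only need to decide how to fill these NAs, since the earliest rows have no previous year to take a value from.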
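
One possible way to do that fill is sketched below. It is not necessarily what the answer's author had in mind. It shifts the December-only column down by one row with dplyr::lag() and then carries the last observed value forward with zoo::na.locf(), so that December rows also receive the previous year's value. The data frame toy and the column names dec_only and prev_dec are hypothetical, and the rows are assumed to be sorted by date.

```r
library(dplyr)
library(zoo)

# Hypothetical toy data: month-end dates from Nov 1993 to Feb 1995
toy <- data.frame(
  date = seq(as.Date("1993-12-01"), by = "month", length.out = 16) - 1,
  me   = (1:16) * 10
)

toy <- toy %>%
  arrange(date) %>%
  mutate(
    # keep me only in December rows, NA elsewhere
    dec_only = ifelse(as.integer(format(date, "%m")) == 12L, me, NA_real_),
    # shift by one row so the value starts in January, then fill forward;
    # na.rm = FALSE keeps the leading NAs where no previous December exists
    prev_dec = na.locf(dplyr::lag(dec_only), na.rm = FALSE)
  )
```

With this toy data, the 1993 rows stay NA, every 1994 row gets the December 1993 value, and the 1995 rows get the December 1994 value.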

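Here is another solution using only dplyr, with df as your data.

Create 2 data.frames: one with date, me, year and month = '12', and one with new_var = me, year = {year + 1} and month.

Then merge the 2 data.frames. I used data.table::merge, but dplyr::left_join can be used as well; either is fine.

Then drop year and month.

library(dplyr)

df %>% 
  # x: every row, keyed by its own year and a constant month of '12'
  # y: every row, keyed by the following year and its actual month,
  #    so only December rows of the previous year can match
  {merge(x = transmute(., date, me, year = as.numeric(substr(date, 1, 4)), month = '12'),
         y = transmute(., new_var = me, year = as.numeric(substr(date, 1, 4)) + 1, month = substr(date, 6, 7)), 
         by = c('year', 'month'), 
         all.x = TRUE)} %>%
  select(-year, -month)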
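
The same idea can be sketched with dplyr::left_join instead of merge. This is a minimal illustration under stated assumptions rather than the answer's own code: toy, dec_lookup, result and prev_dec are hypothetical names, and each year is assumed to contain at most one December row.

```r
library(dplyr)

# Hypothetical toy data: month-end dates from Nov 1993 to Feb 1995
toy <- data.frame(
  date = seq(as.Date("1993-12-01"), by = "month", length.out = 16) - 1,
  me   = (1:16) * 10
)

# Lookup table: one row per December, keyed by the year it should apply to
dec_lookup <- toy %>%
  filter(format(date, "%m") == "12") %>%
  transmute(year = as.integer(format(date, "%Y")) + 1L, prev_dec = me)

# Attach the previous December's value to every row of the matching year;
# years without a lookup entry get NA
result <- toy %>%
  mutate(year = as.integer(format(date, "%Y"))) %>%
  left_join(dec_lookup, by = "year") %>%
  select(-year)
```

Unlike merge, left_join keeps the rows of the left table in their original order, so no re-sorting by date is needed afterwards.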

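The me value for the previous December is the value found m rows back, where m is 1 for January, 2 for February and so on; if there is no row m rows back, the result is NA. Computing m and the consecutive row number ix gives the following. This relies on the rows being sorted by date with no missing months. No packages are used.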

m <- as.numeric(format(DF$date, "%m"))
ix <- seq_len(nrow(DF))
transform(DF, me_dec = me[ifelse(ix - m < 1, NA, ix - m)])
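
If the rows might be unsorted or some months might be missing, a year lookup with match() is one base R alternative to the row-offset arithmetic. This is a different technique from the one above, shown as a sketch only. The names toy, yr and is_dec are hypothetical, and each year is assumed to have at most one December row.

```r
# Hypothetical toy data: month-end dates from Nov 1993 to Feb 1995
toy <- data.frame(
  date = seq(as.Date("1993-12-01"), by = "month", length.out = 16) - 1,
  me   = (1:16) * 10
)

yr     <- as.integer(format(toy$date, "%Y"))  # calendar year of each row
is_dec <- format(toy$date, "%m") == "12"      # TRUE for December rows

# For each row, find the December row of the previous year;
# match() returns NA when there is none, and indexing by NA yields NA
toy$me_dec <- toy$me[is_dec][match(yr - 1L, yr[is_dec])]
```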

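@neeraj, if you have multiple years, the solution given by RLave does not work, because the December values will be incorrect. Each December row keeps its own value, when it should have the value from December of the year before; the other rows are NA, so they are filled from it.

You can use his solution if you create a variable equal to the lag of the new variable:

df <- df %>%
   mutate(value = lag(new_var, 1))

That should correct the answer.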

I don't need only the December values; I need to create a new variable so that the whole of year t takes the December value of year t-1.

Thank you very much. I had created a loop, but this is very efficient.

Awesome. Impressive.