R - Day-over-day (DoD) change of grouped data


Suppose I have a raw data set (already in a data frame, which I can easily convert to xts.data.table with as.xts.data.table), DF, that looks like this:

Date | City | State | Country | DailyMinTemperature | DailyMaxTemperature | DailyMedianTemperature
-------------------------
2018-02-03 | New York City | NY | US | 18 | 22 | 19
2018-02-03 | London | LDN |UK | 10 | 25 | 15
2018-02-03 | Singapore | SG | SG | 28 | 32 | 29
2018-02-02 | New York City | NY | US | 12 | 30 | 18
2018-02-02 | London | LDN | UK | 12 | 15 | 14
2018-02-02 | Singapore | SG | SG | 27 | 31 | 30
and so on (more cities and more days).

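I want to use it to show the current day's temperatures along with the day-over-day change relative to the previous day, plus the other information about the city (state, country). That is, the new data frame should keep the columns from the example above and add 3 more columns showing the day-over-day change.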

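Note that the data frame may not have data for every day. I define the change as the difference between the temperature on day t and the temperature on the most recent earlier date for which I have temperature data.

I tried to use the shift function, but R complains about the := sign.

Is there any way to make this work in R?


Thanks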

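You can use dplyr::mutate_at together with the lubridate package to transform the data into the desired format. The data needs to be sorted by date; the dplyr::lag function then gives the difference between the current record and the previous one.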

library(dplyr)
library(lubridate)

df %>% mutate_if(is.character, trimws) %>%        # Trim surrounding whitespace
  mutate(Date = ymd(Date)) %>%                    # Convert to Date
  group_by(City, State, Country) %>%
  arrange(City, State, Country, Date) %>%         # Order by date within each group
  mutate_at(vars(starts_with("Daily")), list(Change = ~ . - lag(.))) %>%  # Change vs. previous row
  filter(!is.na(DailyMinTemperature_Change))      # Drop each group's first row (no previous day)
Result:

# # A tibble: 3 x 10
# # Groups: City, State, Country [3]
# Date       City          State Country DailyMinTemperature DailyMaxTemperature DailyMedianTemperature DailyMinTemperature_Change DailyMaxT~ DailyMed~
#   <date>     <chr>         <chr> <chr>                 <dbl>               <dbl>                  <int>                      <dbl>      <dbl>     <int>
# 1 2018-02-03 London        LDN   UK                     10.0                25.0                     15                      -2.00      10.0          1
# 2 2018-02-03 New York City NY    US                     18.0                22.0                     19                       6.00      -8.00         1
# 3 2018-02-03 Singapore     SG    SG                     28.0                32.0                     29                       1.00       1.00        -1
# 
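
On the point about missing days: lag() works on row position within each group, not on calendar dates, so once the rows are sorted by Date each row should be compared against the most recent earlier date that actually has data. Below is a minimal sketch with made-up numbers. The toy data frame and the DaysSincePrevious column are illustrative assumptions, not part of the original data.

```r
library(dplyr)

# Hypothetical data: London has no reading for 2018-02-02
toy <- data.frame(
  Date = as.Date(c("2018-02-03", "2018-02-01", "2018-02-03", "2018-02-02")),
  City = c("London", "London", "Singapore", "Singapore"),
  DailyMinTemperature = c(10, 12, 28, 27),
  stringsAsFactors = FALSE
)

res <- toy %>%
  group_by(City) %>%
  arrange(Date, .by_group = TRUE) %>%   # sort by date within each city
  mutate(
    # difference against the previous available row, whatever its date
    DailyMinTemperature_Change = DailyMinTemperature - lag(DailyMinTemperature),
    # size of the gap in days, so multi-day gaps stay visible
    DaysSincePrevious = as.numeric(Date - lag(Date))
  ) %>%
  ungroup()

print(res)
```

Keeping the gap size next to the change makes it easy to filter out or flag comparisons that span more than one day, if that matters for the analysis.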
Data:

df <- read.table(text = 
"Date | City | State | Country | DailyMinTemperature | DailyMaxTemperature | DailyMedianTemperature
2018-02-03 | New York City | NY | US | 18 | 22 | 19
2018-02-03 | London | LDN |UK | 10 | 25 | 15
2018-02-03 | Singapore | SG | SG | 28 | 32 | 29
2018-02-02 | New York City | NY | US | 12 | 30 | 18
2018-02-02 | London | LDN | UK | 12 | 15 | 14
2018-02-02 | Singapore | SG | SG | 27 | 31 | 30",
header = TRUE, stringsAsFactors = FALSE, sep = "|")
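
Regarding the := complaint in the question: := is only meant to be used inside the [ ] of a data.table, so a likely cause is that shift was being applied with := to a plain data frame. The following is a hedged sketch of a data.table equivalent, not necessarily what the asker had in mind. The names dt, temp_cols and change_cols are chosen here for illustration, and the sample values are re-entered directly so the block runs on its own.

```r
library(data.table)

# Same sample values as the data above, entered directly
dt <- data.table(
  Date    = as.Date(rep(c("2018-02-03", "2018-02-02"), each = 3)),
  City    = rep(c("New York City", "London", "Singapore"), 2),
  State   = rep(c("NY", "LDN", "SG"), 2),
  Country = rep(c("US", "UK", "SG"), 2),
  DailyMinTemperature    = c(18, 10, 28, 12, 12, 27),
  DailyMaxTemperature    = c(22, 25, 32, 30, 15, 31),
  DailyMedianTemperature = c(19, 15, 29, 18, 14, 30)
)

temp_cols   <- grep("^Daily", names(dt), value = TRUE)
change_cols <- paste0(temp_cols, "_Change")

# Sort so that the previous row within a city is the most recent earlier date
setorder(dt, City, State, Country, Date)

# := adds the columns by reference; shift() returns the previous value per group
dt[, (change_cols) := lapply(.SD, function(x) x - shift(x)),
   by = .(City, State, Country), .SDcols = temp_cols]

# Drop each city's first row, which has no previous day to compare with
print(dt[!is.na(DailyMinTemperature_Change)])
```

If the data starts out as a data frame, converting it first with as.data.table(df) or setDT(df) should make the same expression usable.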

Excellent solution, thank you for paying attention to all the details :)