Using an apply function for calculations on a data.frame without changing the format of my Date column


This is my data.frame:

df<-structure(list(Data = structure(c(18158, 18157, 18156, 18155, 
18152), class = "Date"), A = c(19.46, 19.26, 19.43, 19.44, 
19.1), B = c(49.72, 49.2, 48.45, 47, 51.34), C = c(45.69, 
44.92, 44.12, 43.07, 43), D = c(48.32, 48.02, 47.3, 46.65, 
47.14)), row.names = c(NA, 5L), class = "data.frame")

If we know the position of the date column, drop it by negative indexing before calling apply:

f1 <- function(x) (-diff(x)/x[-length(x)])
apply(df[-1], 2, f1)
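apply converts its input to a matrix, and a matrix can hold only one type. If the Date column is included, it cannot be kept as a Date: as.matrix turns the whole data.frame into a character matrix, so the arithmetic in f1 fails.

The same happens with any character column: all elements are converted to character and the calculation will not work.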
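The coercion described above can be checked directly. This is a minimal sketch on a made-up three-row data.frame (`toy`, `m_all`, `m_num` and `res` are hypothetical names, not part of the original question); it compares the matrix built with and without the date column, then looks at the shape of what apply returns.

```r
# Toy data shaped like the question's df (values taken from its first rows)
toy <- data.frame(
  Data = as.Date(c("2019-09-19", "2019-09-18", "2019-09-17")),
  A = c(19.46, 19.26, 19.43),
  B = c(49.72, 49.2, 48.45)
)

m_all <- as.matrix(toy)      # date column included
m_num <- as.matrix(toy[-1])  # date column dropped

typeof(m_all)  # "character": every cell is now a string
typeof(m_num)  # "double": arithmetic is still possible

f1 <- function(x) -diff(x) / x[-length(x)]
res <- apply(toy[-1], 2, f1)
dim(res)       # 2 2: one row fewer than toy, one column per numeric column
```

The result is a plain numeric matrix, not a data.frame, which is why the date column has to be handled separately.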

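diff returns a vector one element shorter than its input, so assigning the result back to the original columns of the dataset causes a length mismatch. To avoid that, prepend an NA: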

f2 <- function(x) (c(NA, -diff(x)/x[-length(x)]))
df[-1] <- apply(df[-1],2, f2)
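A quick way to see the length issue is to run both functions on a single vector. The sketch below uses the values of column A from the question; `x` is just an illustrative name.

```r
x <- c(19.46, 19.26, 19.43, 19.44, 19.1)

f1 <- function(x) -diff(x) / x[-length(x)]         # no padding
f2 <- function(x) c(NA, -diff(x) / x[-length(x)])  # padded with a leading NA

length(x)      # 5
length(f1(x))  # 4: cannot be assigned back to a 5-row column
length(f2(x))  # 5: lines up with the original rows
```

Because f2 returns one value per row, the matrix produced by apply has the same number of rows as df and can replace the numeric columns in place.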

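With dplyr, mutate_if applies the function only to the numeric columns and leaves Data untouched. Note that this version omits the leading minus sign used in f1 and f2, so its signs are reversed relative to the apply results.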

library(dplyr)
df %>%
   mutate_if(is.numeric, ~ c(NA_real_, diff(.)/.[-n()]))
#      Data            A           B            C            D
#1 2019-09-19           NA          NA           NA           NA
#2 2019-09-18 -0.010277492 -0.01045857 -0.016852703 -0.006208609
#3 2019-09-17  0.008826584 -0.01524390 -0.017809439 -0.014993753
#4 2019-09-16  0.000514668 -0.02992776 -0.023798731 -0.013742072
#5 2019-09-13 -0.017489712  0.09234043 -0.001625261  0.010503751
If we need to create new columns instead, wrap the function in a named list; the list name is used as the suffix of the new column names.

df %>%
    mutate_if(is.numeric, list(diffs = ~ c(NA_real_, diff(.)/.[-n()])))
#        Data     A     B     C     D      A_diffs     B_diffs      C_diffs      D_diffs
#1 2019-09-19 19.46 49.72 45.69 48.32           NA          NA           NA           NA
#2 2019-09-18 19.26 49.20 44.92 48.02 -0.010277492 -0.01045857 -0.016852703 -0.006208609
#3 2019-09-17 19.43 48.45 44.12 47.30  0.008826584 -0.01524390 -0.017809439 -0.014993753
#4 2019-09-16 19.44 47.00 43.07 46.65  0.000514668 -0.02992776 -0.023798731 -0.013742072
#5 2019-09-13 19.10 51.34 43.00 47.14 -0.017489712  0.09234043 -0.001625261  0.010503751
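In dplyr 1.0.0 and later, mutate_if is superseded by across(). The following is a sketch of the same two operations in that style, assuming dplyr 1.0.0 or newer is installed; `toy` and `rel_change` are hypothetical names introduced for illustration.

```r
library(dplyr)

# Toy data shaped like the question's df
toy <- data.frame(
  Data = as.Date(c("2019-09-19", "2019-09-18", "2019-09-17")),
  A = c(19.46, 19.26, 19.43),
  B = c(49.72, 49.2, 48.45)
)

# Same formula as the mutate_if calls: relative change, padded with NA
rel_change <- function(x) c(NA_real_, diff(x) / x[-length(x)])

# Overwrite the numeric columns in place
in_place <- toy %>%
  mutate(across(where(is.numeric), rel_change))

# Keep the originals and add suffixed columns
with_new <- toy %>%
  mutate(across(where(is.numeric), rel_change, .names = "{.col}_diffs"))

names(with_new)
```

where(is.numeric) skips the Date column, so Data keeps its class in both results.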

Note that the column in the example is named Data, not Date.
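In base R, new columns can be created by assigning the result to new names:

df[paste0(names(df)[-1], "_diffs")] <- apply(df[-1], 2, f2)

If the position of the date column is not known, select the numeric columns by type instead:

i1 <- sapply(df, is.numeric)
apply(df[i1], 2, f1)

lapply works column by column and avoids the matrix conversion altogether:

lapply(df[i1], function(x) -diff(x)/x[-length(x)])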
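Selecting by type and updating with lapply can be combined into an in-place update. The sketch below uses a made-up three-row data.frame (`toy` is a hypothetical name) and pads with NA so that the lengths match.

```r
toy <- data.frame(
  Data = as.Date(c("2019-09-19", "2019-09-18", "2019-09-17")),
  A = c(19.46, 19.26, 19.43),
  B = c(49.72, 49.2, 48.45)
)

# is.numeric() is FALSE for Date objects, so Data is excluded
i1 <- sapply(toy, is.numeric)
i1

# lapply returns a list of equal-length vectors, which can be assigned
# straight back into the selected columns
toy[i1] <- lapply(toy[i1], function(x) c(NA, -diff(x) / x[-length(x)]))

class(toy$Data)  # "Date"
```

Since the data.frame is never turned into a matrix here, the date column keeps both its class and its printed format.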