R: averaging daily stock data by month and year and appending the result to the dataset
I have the following ten years of daily data:
library(lubridate)
library(dplyr)
head(infy_close_subset,24)
date INFY.NS.Close
1 2007-01-02 568.162
2 2007-01-03 577.838
3 2007-01-04 571.325
4 2007-01-05 568.763
5 2007-01-08 551.400
6 2007-01-09 547.525
7 2007-01-10 541.112
8 2007-01-11 545.750
9 2007-01-12 555.850
10 2007-01-15 560.737
11 2007-01-16 555.550
12 2007-01-17 551.362
13 2007-01-18 556.037
14 2007-01-19 550.588
15 2007-01-22 563.500
16 2007-01-23 558.787
17 2007-01-24 558.513
18 2007-01-25 560.250
19 2007-01-29 561.100
20 2007-01-31 561.825
21 2007-02-01 567.237
22 2007-02-02 566.388
23 2007-02-05 567.325
24 2007-02-06 568.237
I am trying to create a new column of averages by year and month, as follows:
Infy_monthlyAvg <- infy_close_subset %>%
group_by(yr = year(date), mon = month(date)) %>%
summarize(mean_close = mean(INFY.NS.Close))
head(Infy_monthlyAvg)
mean_close
1 731.6223
I would like to add the mean_close column to the infy_close_subset dataframe, like this:
date INFY.NS.Close yr mon mean_close
<date> <dbl> <dbl> <dbl>
1 2007-01-02 568.162 2007 1 731.6223
2 2007-01-03 577.838 2007 1 731.6223
3 2007-01-04 571.325 2007 1 731.6223
4 2007-01-05 568.763 2007 1 731.6223
5 2007-01-08 551.400 2007 1 731.6223
6 2007-01-09 547.525 2007 1 731.6223
.................
999 2017-09-08 988.400 2007 9 921.3333
1000 2017-09-09 977.525 2007 9 921.3333
If you add yr and mon columns to the original data frame:
infy_close_subset = infy_close_subset %>%
mutate(yr = year(date), mon = month(date))
then you can merge the two result tables by yr and mon:
answer = merge(infy_close_subset, Infy_monthlyAvg, by = c("yr", "mon"))
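As a minimal sketch of the merge step, here is the same idea on made-up data. The data frame `toy`, its values, and the names `toy_monthly` and `toy_answer` are assumptions for illustration only, not the asker's data. Note that `merge()` sorts the result by the key columns by default, so the row order can differ from the original data frame.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data standing in for infy_close_subset
toy <- data.frame(
  date = as.Date(c("2007-01-02", "2007-01-03", "2007-02-01", "2007-02-02")),
  INFY.NS.Close = c(10, 20, 30, 50)
)

# Add the grouping keys to the daily data
toy <- toy %>%
  mutate(yr = year(date), mon = month(date))

# One row per (yr, mon) holding the monthly mean
toy_monthly <- toy %>%
  group_by(yr, mon) %>%
  summarise(mean_close = mean(INFY.NS.Close), .groups = "drop")

# Attach the monthly mean to every daily row
toy_answer <- merge(toy, toy_monthly, by = c("yr", "mon"))
```

If the original row order matters, `dplyr::left_join()` with the same `by` keys keeps the rows of the left table in place.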
I assume you want the monthly means. If you want the overall mean instead, the answer is simpler:
answer = infy_close_subset %>%
mutate(mean_close = mean(infy_close_subset$INFY.NS.Close))
This needs no intermediate grouping, summarizing, or merging steps.

I tend to make a period column:
df <- left_join(
infy_close_subset %>%
mutate(
period = format(date, "%Y-%m"),
yr = year(date),
mon = month(date)
),
infy_close_subset %>%
mutate(period = format(date, "%Y-%m")) %>%
group_by(period) %>%
summarise(mean_close = mean(INFY.NS.Close)
),
by = "period"
) %>%
select(-period)
# date INFY.NS.Close yr mon mean_close
# 1 2007-01-02 568.162 2007 1 558.2987
# 2 2007-01-03 577.838 2007 1 558.2987
# 3 2007-01-04 571.325 2007 1 558.2987
# 4 2007-01-05 568.763 2007 1 558.2987
# 5 2007-01-08 551.400 2007 1 558.2987
# 6 2007-01-09 547.525 2007 1 558.2987
# 7 2007-01-10 541.112 2007 1 558.2987
# 8 2007-01-11 545.750 2007 1 558.2987
# 9 2007-01-12 555.850 2007 1 558.2987
# 10 2007-01-15 560.737 2007 1 558.2987
# 11 2007-01-16 555.550 2007 1 558.2987
# 12 2007-01-17 551.362 2007 1 558.2987
# 13 2007-01-18 556.037 2007 1 558.2987
# 14 2007-01-19 550.588 2007 1 558.2987
# 15 2007-01-22 563.500 2007 1 558.2987
# 16 2007-01-23 558.787 2007 1 558.2987
# 17 2007-01-24 558.513 2007 1 558.2987
# 18 2007-01-25 560.250 2007 1 558.2987
# 19 2007-01-29 561.100 2007 1 558.2987
# 20 2007-01-31 561.825 2007 1 558.2987
# 21 2007-02-01 567.237 2007 2 567.2967
# 22 2007-02-02 566.388 2007 2 567.2967
# 23 2007-02-05 567.325 2007 2 567.2967
# 24 2007-02-06 568.237 2007 2 567.2967
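The join can also be avoided altogether: in dplyr, `mutate()` on a grouped data frame computes the value per group and keeps every row. The following is a sketch under assumptions, using a made-up `toy` data frame rather than the asker's data, and it is an alternative to the approach above rather than the answerer's own method.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data standing in for infy_close_subset
toy <- data.frame(
  date = as.Date(c("2007-01-02", "2007-01-03", "2007-02-01", "2007-02-02")),
  INFY.NS.Close = c(10, 20, 30, 50)
)

# group_by() creates yr and mon as new columns; mutate() then fills
# mean_close with the mean of each (yr, mon) group on every row
toy_with_mean <- toy %>%
  group_by(yr = year(date), mon = month(date)) %>%
  mutate(mean_close = mean(INFY.NS.Close)) %>%
  ungroup()
```

Calling `ungroup()` at the end returns an ordinary tibble, so later operations are not silently applied per group.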

A solution using data.table:
library(data.table)
setDT(infy_close_subset)
infy_close_subset[, mean_close := mean(INFY.NS.Close), by = format(date, "%Y-%m")]
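To make the by-reference behaviour concrete, here is a small sketch on made-up data (the table `toy` and its values are assumptions for illustration). `:=` adds the column to the existing table in place, so no assignment back to a new object is needed, and the `by` expression is used only for grouping and is not stored as a column.

```r
library(data.table)

# Hypothetical toy data standing in for infy_close_subset
toy <- data.table(
  date = as.Date(c("2007-01-02", "2007-01-03", "2007-02-01", "2007-02-02")),
  INFY.NS.Close = c(10, 20, 30, 50)
)

# Add the monthly mean in place; grouping key is the "YYYY-MM" string
toy[, mean_close := mean(INFY.NS.Close), by = format(date, "%Y-%m")]

# Optionally add explicit year and month columns as well,
# using data.table's own year() and month() helpers
toy[, `:=`(yr = year(date), mon = month(date))]
```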

You only get one result because you are grouping by year and month, and the only valid group is 2017-01. Are you using the year and month functions from the lubridate package? Please specify, since they are not part of base R.