Basic R function for loading and cleaning stock data


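I am trying to write a function that performs the following tasks for every ticker I load with getSymbols. I tried using lapply, but the function doesn't seem to work.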

library(quantmod)
getSymbols(c("XLF","VFH","XLI","VIS","RWO","IYR","VNQI","VGT","RYT","VPU","IDU"), src = "yahoo",from="2012-01-01" ) 

#NEED TO FIGURE OUT A FUNCTION FOR THIS
XLI = as.data.frame(XLI)
XLI$date = row.names(XLI)
XLI[,c("XLI.Open","XLI.High", "XLI.Low", "XLI.Adjusted")] = NULL
XLI["ticker"]="XLI"
XLI["industry"]="industrials"
#put date first so the new column names line up with the data
XLI = XLI[,c("date","XLI.Close","XLI.Volume","ticker","industry")]
colnames(XLI) <- c("date","close","volume","ticker","industry")

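Although your desired output uses the close price, I would suggest using the adjusted price column instead, because it is adjusted for corporate actions such as stock splits and dividends.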
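As a rough illustration of that suggestion, quantmod's column extractors `Ad()` and `Vo()` pick the adjusted and volume columns by name. The sketch below builds a small synthetic series so it runs without a network call; `XLF_demo` and every number in it are made up for illustration and are not market data.

```r
library(quantmod)  # also attaches xts and zoo

# Synthetic stand-in for a getSymbols() result (values are invented)
dates = as.Date("2012-01-03") + 0:2
XLF_demo = xts(
  cbind(XLF.Open     = c(13.10, 13.30, 13.20),
        XLF.High     = c(13.40, 13.40, 13.50),
        XLF.Low      = c(13.00, 13.20, 13.10),
        XLF.Close    = c(13.34, 13.30, 13.48),
        XLF.Volume   = c(1000, 2000, 1500),
        XLF.Adjusted = c(11.20, 11.10, 11.30)),
  order.by = dates
)

adj = Ad(XLF_demo)  # column whose name contains "Adjusted"
vol = Vo(XLF_demo)  # column whose name contains "Volume"

# Plain data.frame with the adjusted price in place of the raw close
adjDF = data.frame(date     = index(adj),
                   adjusted = as.numeric(adj),
                   volume   = as.numeric(vol))
```

Selecting by name this way avoids relying on column positions, which can be handy if a data source ever returns columns in a different order.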

I have used a test industry vector; you will need to replace it with the actual values.
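One possible way to supply the real values is a named vector keyed by ticker, so each label is looked up by name rather than by position. The sector labels below are my own assumption about these eleven ETFs and are not taken from the question; please check them against each fund's documentation. `industryMap` is a hypothetical name.

```r
# Assumed sector labels for the eleven ETFs (verify before use)
industryMap = c(
  XLF  = "financials",  VFH = "financials",
  XLI  = "industrials", VIS = "industrials",
  RWO  = "real estate", IYR = "real estate", VNQI = "real estate",
  VGT  = "technology",  RYT = "technology",
  VPU  = "utilities",   IDU = "utilities"
)

# Look up labels by ticker name; the order of the query does not matter
someTickers = c("XLF", "VNQI", "IDU")
someLabels  = unname(industryMap[someTickers])
```

A positional vector such as industryVec works as well, provided its order matches tickerVec exactly.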

You can use new.env and lapply, as follows:

library(quantmod)


tickerVec = c("XLF","VFH","XLI","VIS","RWO","IYR","VNQI","VGT","RYT","VPU","IDU")

#test industry vector, replace with actual sector names
industryVec = c("industrials","financials","materials","energy",
            "materials","energy","financials","technology","industrials","technology","energy")


startDt = as.Date("2012-01-01")

#create new data environment for storing all price timeseries

data.env = new.env()

getSymbols(tickerVec,env=data.env,src = "yahoo",from=startDt )      


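#convert to a named list, in the same order as tickerVec
#(as.list() on an environment returns its elements in arbitrary order,
# which would misalign the tickerVec[x] and industryVec[x] lookups below)

data.env.lst = mget(tickerVec, envir = data.env)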

#create a function to reshape each timeseries into the required shape
#x is the position of the series in data.env.lst

fn_modifyData = function(x) {

  TS = data.env.lst[[x]]

  #xts to data.frame
  TS_DF = data.frame(date=as.Date(index(TS)),coredata(TS),stringsAsFactors=FALSE)

  #retain only required columns: date (1), Close (5), Volume (6)
  TS_DF = TS_DF[,c(1,5,6)]

  TS_DF$ticker = tickerVec[x]
  TS_DF$industry = industryVec[x]
  colnames(TS_DF)  = c("date","close","volume","ticker","industry")
  row.names(TS_DF) = NULL

  return(TS_DF)

}

#apply function to all timeseries using lapply
outList = lapply(seq_along(data.env.lst), fn_modifyData)

Output:


head(outList[[1]])
#        date close    volume ticker    industry
#1 2012-01-03 13.34 103362000    XLF industrials
#2 2012-01-04 13.30  69833900    XLF industrials
#3 2012-01-05 13.48  89935300    XLF industrials
#4 2012-01-06 13.40  83878600    XLF industrials
#5 2012-01-09 13.47  69189600    XLF industrials
#6 2012-01-10 13.71  86035100    XLF industrials
head(outList[[11]])
#        date close volume ticker industry
#1 2012-01-03 50.55   6100    IDU   energy
#2 2012-01-04 50.41   2700    IDU   energy
#3 2012-01-05 50.83   1700    IDU   energy
#4 2012-01-06 50.82   7700    IDU   energy
#5 2012-01-09 51.25   1800    IDU   energy
#6 2012-01-10 51.71   5500    IDU   energy


#if you wish to combine all datasets 
outDF = do.call(rbind,outList)

head(outDF)
#        date close    volume ticker    industry
#1 2012-01-03 13.34 103362000    XLF industrials
#2 2012-01-04 13.30  69833900    XLF industrials
#3 2012-01-05 13.48  89935300    XLF industrials
#4 2012-01-06 13.40  83878600    XLF industrials
#5 2012-01-09 13.47  69189600    XLF industrials
#6 2012-01-10 13.71  86035100    XLF industrials
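
Because fn_modifyData looks up tickerVec[x] and industryVec[x] by position, the labels are only right when the list order matches those vectors. A variation that iterates over ticker names avoids that dependency. This is only a sketch under assumptions: `makeFake`, `demo.env`, `tidyOne` and `industryMap` are hypothetical names, and the data are synthetic so the example runs offline. With real data you would pass the environment filled by getSymbols instead.

```r
library(quantmod)  # also attaches xts and zoo

# Hypothetical helper: fake OHLCV series with quantmod-style column names
makeFake = function(sym, n = 3) {
  cols = paste(sym, c("Open", "High", "Low", "Close", "Volume", "Adjusted"), sep = ".")
  m = matrix(seq_len(n * 6), nrow = n, dimnames = list(NULL, cols))
  xts(m, order.by = as.Date("2012-01-03") + seq_len(n) - 1)
}

# Stand-in for the environment that getSymbols(..., env = ) would fill
demo.env = new.env()
for (sym in c("XLF", "XLI")) assign(sym, makeFake(sym), envir = demo.env)

# Labels keyed by ticker (assumed values, for illustration)
industryMap = c(XLF = "financials", XLI = "industrials")

# Reshape one series; every label is looked up by ticker name
tidyOne = function(sym, env, industries) {
  TS = get(sym, envir = env)
  data.frame(date     = index(TS),
             close    = as.numeric(Cl(TS)),
             volume   = as.numeric(Vo(TS)),
             ticker   = sym,
             industry = industries[[sym]],
             stringsAsFactors = FALSE)
}

demoList = lapply(names(industryMap), tidyOne, env = demo.env, industries = industryMap)
demoDF   = do.call(rbind, demoList)
```

Swapping `Cl(TS)` for `Ad(TS)` in tidyOne would give the adjusted price instead of the raw close.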