Best way to clean data in R and convert it to xts
I am trying to clean some data that I downloaded from the web and convert it to xts. I found some documentation on CRAN about using grepl to clean data, but I am wondering whether there is an easier way than grepl. I am hoping someone can help me clean this data using grepl or another function in R. Thanks in advance for any help you can provide.
[1] "{"
[2] " \"Meta Data\": {"
[3] " \"1. Information\": \"Daily Prices (open, high, low, close) and Volumes\","
[4] " \"2. Symbol\": \"MSFT\","
[5] " \"3. Last Refreshed\": \"2017-06-08 15:15:00\","
[6] " \"4. Output Size\": \"Compact\","
[7] " \"5. Time Zone\": \"US/Eastern\""
[8] " },"
[9] " \"2017-01-19\": {"
[10] " \"1. open\": \"62.2400\","
[11] " \"2. high\": \"62.9800\","
[12] " \"3. low\": \"62.1950\","
[13] " \"4. close\": \"62.3000\","
[14] " \"5. volume\": \"18451655\""
[15] " },"
[16] " \"2017-01-18\": {"
[17] " \"1. open\": \"62.6700\","
[18] " \"2. high\": \"62.7000\","
[19] " \"3. low\": \"62.1200\","
[20] " \"4. close\": \"62.5000\","
[21] " \"5. volume\": \"19670102\""
[22] " },"
[23] " \"2017-01-17\": {"
[24] " \"1. open\": \"62.6800\","
[25] " \"2. high\": \"62.7000\","
[26] " \"3. low\": \"62.0300\","
[27] " \"4. close\": \"62.5300\","
[28] " \"5. volume\": \"20663983\""
[29] " }"
[30] " }"
[31] "}"
The desired final output for this data looks like this:
Open High Low Close Volume
2017-01-17 62.68 62.70 62.03 62.53 20663983
2017-01-18 62.67 62.70 62.12 62.50 19670102
2017-01-19 62.24 62.98 62.195 62.30 18451655
As already suggested, the first thing you need to do is parse the JSON.
Lines <-
"{
\"Meta Data\": {
\"1. Information\": \"Daily Prices (open, high, low, close) and Volumes\",
\"2. Symbol\": \"MSFT\",
\"3. Last Refreshed\": \"2017-06-08 15:15:00\",
\"4. Output Size\": \"Compact\",
\"5. Time Zone\": \"US/Eastern\"
},
\"2017-01-19\": {
\"1. open\": \"62.2400\",
\"2. high\": \"62.9800\",
\"3. low\": \"62.1950\",
\"4. close\": \"62.3000\",
\"5. volume\": \"18451655\"
},
\"2017-01-18\": {
\"1. open\": \"62.6700\",
\"2. high\": \"62.7000\",
\"3. low\": \"62.1200\",
\"4. close\": \"62.5000\",
\"5. volume\": \"19670102\"
},
\"2017-01-17\": {
\"1. open\": \"62.6800\",
\"2. high\": \"62.7000\",
\"3. low\": \"62.0300\",
\"4. close\": \"62.5300\",
\"5. volume\": \"20663983\"
}
}"
parsedLines <- jsonlite::fromJSON(Lines)
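The question shows the data as a character vector with one element per line, which is what readLines() typically returns. A minimal sketch of getting from that shape to a parsed list, assuming a hypothetical vector named rawLines that holds a shortened copy of the data (the name and the trimmed content are illustrative, not from the original post):

```r
library(jsonlite)

# Hypothetical input: one JSON line per element, as readLines() would return
rawLines <- c(
  "{",
  "  \"Meta Data\": {",
  "    \"2. Symbol\": \"MSFT\"",
  "  },",
  "  \"2017-01-19\": {",
  "    \"1. open\": \"62.2400\",",
  "    \"4. close\": \"62.3000\"",
  "  }",
  "}"
)

# Collapse the lines into a single string so fromJSON() sees one document
parsed <- fromJSON(paste(rawLines, collapse = "\n"))

# The top-level names are the metadata key followed by the dates
names(parsed)
```

Each date element comes back as a named list of character strings, so the values still need converting to numbers afterwards.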
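You may have noticed that the first element of parsedLines is the metadata. We can attach that to the final object later. First, though, let's bind all the other elements together into a matrix. We can do that for a list of any length by using do.call: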
ohlcv <- do.call(rbind, parsedLines[-1]) # [-1] removes the first element
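To see what do.call(rbind, ...) is doing, here is a small sketch on a hypothetical two-day list named days, standing in for parsedLines[-1]. The unlist() step is an addition of mine rather than part of the answer: it turns each day into a plain character vector, so the result is an ordinary character matrix instead of a matrix whose cells are list elements.

```r
# Hypothetical stand-in for parsedLines[-1]: one named list per date
days <- list(
  "2017-01-19" = list(open = "62.2400", close = "62.3000"),
  "2017-01-18" = list(open = "62.6700", close = "62.5000")
)

# do.call(rbind, l) is the same as calling rbind(l[[1]], l[[2]], ...),
# so it works no matter how many dates the list contains
m <- do.call(rbind, lapply(days, unlist))

dim(m)       # one row per date, one column per field
rownames(m)  # the list names (the dates) become the row names
```

Because the dates end up as row names, the matrix can later be handed to xts with the row names used as the index.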
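You may want to look into using jsonlite::fromJSON to convert this into an object that is easier to work with in R. It would also help to show your desired output.
@beigel Thanks for your answer. I will look into jsonlite, and I have also added the desired output.
Thanks @Joshua and @Beigel! I really appreciate your time and help with this! I was able to clean the data and can use it now!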
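library(xts) # provides as.xts() and xtsAttributes()
# bind the per-date elements into one matrix
ohlcv <- do.call(rbind, parsedLines[-1]) # [-1] removes the first element
# strip the leading "1." style prefix from the column names (the space after the dot is kept)
colnames(ohlcv) <- gsub("^[[:digit:]]\\.", "", colnames(ohlcv))
# convert the character values to numeric
ohlcv <- type.convert(ohlcv)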
# convert to xts
x <- as.xts(ohlcv, dateFormat = "Date")
# attach attributes
metadata <- parsedLines[[1]]
names(metadata) <- gsub("[[:digit:]]|\\.|[[:space:]]", "", names(metadata))
xtsAttributes(x) <- metadata
# view attributes
str(x)
An 'xts' object on 2017-01-17/2017-01-19 containing:
Data: num [1:3, 1:5] 62.7 62.7 62.2 62.7 62.7 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:5] " open" " high" " low" " close" ...
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
List of 5
$ Information : chr "Daily Prices (open, high, low, close) and Volumes"
$ Symbol : chr "MSFT"
$ LastRefreshed: chr "2017-06-08 15:15:00"
$ OutputSize : chr "Compact"
$ TimeZone : chr "US/Eastern"
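The str() output above shows column names with a leading space (" open", " high", ...), while the desired output in the question uses Open, High, and so on. A minimal sketch of one way to close that gap, using a hypothetical two-row character matrix in place of the real ohlcv. The regular expression, the capitalisation step, and the use of storage.mode() are my own choices here, not part of the answer:

```r
library(xts)

# Hypothetical character matrix shaped like ohlcv after the rbind step
ohlcv <- matrix(
  c("62.2400", "62.6700", "62.3000", "62.5000"),
  nrow = 2,
  dimnames = list(c("2017-01-19", "2017-01-18"), c("1. open", "4. close"))
)

# Drop the "1. " style prefix, including the space after the dot
nm <- sub("^[[:digit:]]+\\.[[:space:]]*", "", colnames(ohlcv))
# Capitalise the first letter: "open" becomes "Open"
colnames(ohlcv) <- paste0(toupper(substring(nm, 1, 1)), substring(nm, 2))

# Convert the character matrix to numeric while keeping its dimnames
storage.mode(ohlcv) <- "numeric"

# Build the xts object from the row names; xts sorts the rows by date
x <- xts(ohlcv, order.by = as.Date(rownames(ohlcv)))
xtsAttributes(x) <- list(Symbol = "MSFT")

colnames(x)
xtsAttributes(x)$Symbol
```

Attributes attached with xtsAttributes() can be read back the same way, for example xtsAttributes(x)$Symbol, which is handy for keeping the ticker with the series.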