Best way to clean data in R and convert it to XTS

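I'm trying to clean up some data downloaded from the web and convert it to XTS. I found some CRAN documentation on cleaning data with grepl, but I'd like to know whether there is an easier way to clean the data than grepl. I'm hoping someone can help me clean this data using grepl or another function in R. Thanks in advance for any help you can give me.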

  [1] "{"                                                                                 
  [2] "    \"Meta Data\": {"                                                              
  [3] "        \"1. Information\": \"Daily Prices (open, high, low, close) and Volumes\","
  [4] "        \"2. Symbol\": \"MSFT\","                                                  
  [5] "        \"3. Last Refreshed\": \"2017-06-08 15:15:00\","                           
  [6] "        \"4. Output Size\": \"Compact\","                                          
  [7] "        \"5. Time Zone\": \"US/Eastern\""     
  [8] "        },"                                                                        
  [9] "        \"2017-01-19\": {"                                                         
 [10] "            \"1. open\": \"62.2400\","                                             
 [11] "            \"2. high\": \"62.9800\","                                             
 [12] "            \"3. low\": \"62.1950\","                                              
 [13] "            \"4. close\": \"62.3000\","                                            
 [14] "            \"5. volume\": \"18451655\""                                           
 [15] "        },"                                                                        
 [16] "        \"2017-01-18\": {"                                                         
 [17] "            \"1. open\": \"62.6700\","                                             
 [18] "            \"2. high\": \"62.7000\","                                             
 [19] "            \"3. low\": \"62.1200\","                                              
 [20] "            \"4. close\": \"62.5000\","                                            
 [21] "            \"5. volume\": \"19670102\""                                           
 [22] "        },"                                                                        
 [23] "        \"2017-01-17\": {"                                                         
 [24] "            \"1. open\": \"62.6800\","                                             
 [25] "            \"2. high\": \"62.7000\","                                             
 [26] "            \"3. low\": \"62.0300\","                                              
 [27] "            \"4. close\": \"62.5300\","                                            
 [28] "            \"5. volume\": \"20663983\""                                           
 [29] "        }"                                                                         
 [30] "    }"                                                                             
 [31] "}"                                  
The final output for this data should look like this:

            Open        High        Low        Close        Volume
2017-01-17  62.68       62.70       62.03       62.53       20663983
2017-01-18  62.67       62.70       62.12       62.50       19670102
2017-01-19  62.24       62.98       62.195      62.30       18451655
As already mentioned, the first thing you need to do is parse the JSON:

Lines <-
"{                                                                                 
  \"Meta Data\": {
    \"1. Information\": \"Daily Prices (open, high, low, close) and Volumes\",
    \"2. Symbol\": \"MSFT\",
    \"3. Last Refreshed\": \"2017-06-08 15:15:00\",
    \"4. Output Size\": \"Compact\",
    \"5. Time Zone\": \"US/Eastern\"
  },
  \"2017-01-19\": {
      \"1. open\": \"62.2400\",
      \"2. high\": \"62.9800\",
      \"3. low\": \"62.1950\",
      \"4. close\": \"62.3000\",
      \"5. volume\": \"18451655\"
  },
  \"2017-01-18\": {
      \"1. open\": \"62.6700\",
      \"2. high\": \"62.7000\",
      \"3. low\": \"62.1200\",
      \"4. close\": \"62.5000\",
      \"5. volume\": \"19670102\"
  },
  \"2017-01-17\": {
      \"1. open\": \"62.6800\",
      \"2. high\": \"62.7000\",
      \"3. low\": \"62.0300\",
      \"4. close\": \"62.5300\",
      \"5. volume\": \"20663983\"
  }
}"
parsedLines <- jsonlite::fromJSON(Lines)
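Now you may have noticed that the first element of parsedLines is the metadata. We can attach it to the final object later. But first, let's rbind all the other elements into one matrix. We can do this for a list of any length using do.call: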

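# [-1] drops the metadata element; unlist() turns each day's list of
# strings into a named character vector, so rbind() yields a character matrix
ohlcv <- do.call(rbind, lapply(parsedLines[-1], unlist))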
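
As a small base-R illustration of why do.call works for a list of any length, here is a sketch on toy data. The object name records and its two-field rows are assumptions made for the example; they only stand in for parsedLines[-1].

```r
# Toy stand-in for parsedLines[-1]: one named character vector per date
records <- list(
  "2017-01-19" = c(open = "62.2400", close = "62.3000"),
  "2017-01-18" = c(open = "62.6700", close = "62.5000"),
  "2017-01-17" = c(open = "62.6800", close = "62.5300")
)

# do.call() passes every list element to rbind() as a separate named
# argument, so the call does not depend on how many elements there are
m <- do.call(rbind, records)

dim(m)       # 3 rows (one per date), 2 columns (one per field)
rownames(m)  # the list names become the row names
```

The list names end up as row names, which is what later lets as.xts pick up the dates from the matrix.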

Then clean up the column names, convert the values to numbers, and build the xts object:

# strip the leading "1. ", "2. ", ... from the column names
colnames(ohlcv) <- gsub("^[[:digit:]]\\.[[:space:]]*", "", colnames(ohlcv))
ohlcv <- type.convert(ohlcv, as.is = TRUE)
# convert to xts (requires the xts package)
library(xts)
x <- as.xts(ohlcv, dateFormat = "Date")
# attach the metadata as xts attributes
metadata <- parsedLines[[1]]
names(metadata) <- gsub("[[:digit:]]|\\.|[[:space:]]", "", names(metadata))
xtsAttributes(x) <- metadata
# view attributes
str(x)

An 'xts' object on 2017-01-17/2017-01-19 containing:
  Data: num [1:3, 1:5] 62.7 62.7 62.2 62.7 62.7 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:5] "open" "high" "low" "close" ...
  Indexed by objects of class: [Date] TZ: UTC
  xts Attributes:  
List of 5
 $ Information  : chr "Daily Prices (open, high, low, close) and Volumes"
 $ Symbol       : chr "MSFT"
 $ LastRefreshed: chr "2017-06-08 15:15:00"
 $ OutputSize   : chr "Compact"
 $ TimeZone     : chr "US/Eastern"

Comments:

You might want to look into using jsonlite::fromJSON to convert this into an object that is easier to work with in R. It would also help to show the desired output.

@beigel Thanks for your answer. I'll look into jsonlite, and I've also added the desired output.

Thanks @Joshua and @Beigel! I really appreciate your time and help with this! I was able to clean the data and it's ready to use now!
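
For comparison, here is a rough base-R sketch of the grepl route the question asks about. It assumes the raw lines look exactly like the printed character vector in the question (abbreviated here to two days), and the names raw, is_date, is_field and prices are made up for the example. It depends on the exact line layout, so parsing the JSON with jsonlite is likely the more robust choice.

```r
# Abbreviated copy of the question's character vector (assumed layout)
raw <- c(
  "{",
  "    \"Meta Data\": {",
  "        \"2. Symbol\": \"MSFT\",",
  "        \"3. Last Refreshed\": \"2017-06-08 15:15:00\"",
  "        },",
  "        \"2017-01-19\": {",
  "            \"1. open\": \"62.2400\",",
  "            \"2. high\": \"62.9800\",",
  "            \"3. low\": \"62.1950\",",
  "            \"4. close\": \"62.3000\",",
  "            \"5. volume\": \"18451655\"",
  "        },",
  "        \"2017-01-18\": {",
  "            \"1. open\": \"62.6700\",",
  "            \"2. high\": \"62.7000\",",
  "            \"3. low\": \"62.1200\",",
  "            \"4. close\": \"62.5000\",",
  "            \"5. volume\": \"19670102\"",
  "        }",
  "    }",
  "}"
)

# Lines that open a daily record: a quoted YYYY-MM-DD key followed by "{"
is_date <- grepl("\"[0-9]{4}-[0-9]{2}-[0-9]{2}\": \\{", raw)
dates <- sub(".*\"([0-9]{4}-[0-9]{2}-[0-9]{2})\".*", "\\1", raw[is_date])

# Lines holding one of the five price/volume fields
is_field <- grepl("\"[1-5]\\. (open|high|low|close|volume)\":", raw)
values <- as.numeric(sub(".*: \"([0-9.]+)\".*", "\\1", raw[is_field]))

# Five fields per day, in file order; then sort rows by date
prices <- matrix(values, ncol = 5, byrow = TRUE,
                 dimnames = list(dates, c("Open", "High", "Low", "Close", "Volume")))
prices <- prices[order(rownames(prices)), , drop = FALSE]
prices
```

The resulting numeric matrix has the dates as row names, so it could be handed to as.xts in the same way as the matrix built from the parsed JSON.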