Reading messy JSON in R

I have a csv file with the following structure:

Input

{"eid":"START","ver":"3.0","ets":1514764800238}}
{"eid":"INTERACT","ver":"3.0","ets":1514764820546}}
{"eid":"IMPRESSION","ver":"3.0","ets":895732}}
{"eid":"IMPRESSION","ver":"3.0","ets":245636}}
{"eid":"INTERACT","ver":"3.0","ets":535235423525}}
As you can see, this is not valid JSON. To make it valid, the structure should look like this:

Expected output

[{"eid":"START","ver":"3.0","ets":1514764800238},
{"eid":"INTERACT","ver":"3.0","ets":1514764820546},
{"eid":"IMPRESSION","ver":"3.0","ets":895732},
{"eid":"IMPRESSION","ver":"3.0","ets":245636},
{"eid":"INTERACT","ver":"3.0","ets":535235423525}]
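Question:

Ideally, I would like to read the file, fix it, and save it as JSON, that is:

  • replace "}}" with "}," on every line except the last, where it becomes "}"
  • add "[" at the beginning of the file and "]" at the end

I tried fromJSON (rjson) and read_delim, but could not get the file to read.

Thanks in advance.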

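Manual find/replace is terrible, terrible, terrible advice for a reproducible workflow.

One option, assuming every line really does end with an extra } and the file is at /tmp/badlines: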
    library(magrittr)
    library(ndjson)
    
    readLines("/tmp/badlines") %>%
      sub("\\}$", "", .) %>% 
      ndjson::flatten(cls = "tbl")
    ## # A tibble: 5 x 3
    ##   eid            ets ver  
    ##   <chr>        <dbl> <chr>
    ## 1 START      1.51e12 3.0  
    ## 2 INTERACT   1.51e12 3.0  
    ## 3 IMPRESSION 8.96e 5 3.0  
    ## 4 IMPRESSION 2.46e 5 3.0  
    ## 5 INTERACT   5.35e11 3.0  
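
The question also asks for the repaired data to be saved as JSON. The following is a minimal sketch of one way to do that, not part of the original answer: it assumes the jsonlite package is installed, uses a temporary file as a stand-in for /tmp/badlines, and writes to a hypothetical output path. It repairs the text line by line, so the large ets values are written out exactly as they appear in the input.

```r
library(jsonlite)

# Stand-in for /tmp/badlines so the sketch runs on its own
in_path <- tempfile(fileext = ".txt")
# Hypothetical output location, not from the original post
out_path <- tempfile(fileext = ".json")

writeLines(c('{"eid":"START","ver":"3.0","ets":1514764800238}}',
             '{"eid":"INTERACT","ver":"3.0","ets":1514764820546}}',
             '{"eid":"IMPRESSION","ver":"3.0","ets":895732}}'),
           in_path)

# Drop the stray trailing brace from every line (same sub() call as above)
fixed <- sub("\\}$", "", readLines(in_path))

# Join the objects with commas and wrap them in brackets to form one array
writeLines(c("[", paste(fixed, collapse = ",\n"), "]"), out_path)

# Read the repaired file back to confirm that it parses
events <- fromJSON(out_path)
```

Here events should be a data frame with one row per input line and the columns eid, ver, and ets.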
    
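Note that this question is almost a duplicate of an earlier question.

Apart from reading the file and running fromJSON (from the jsonlite package), a single line of base R code converts the input to valid JSON (in the variable json):

  • use sub on each input line to replace "}}" with "}"
  • use toString to insert commas between the lines
  • use c to add "[" and "]" around the result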
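
The three steps above can be sketched as follows. This is a reconstruction from the bullet points rather than the answer's own listing. It assumes jsonlite is installed and uses a temporary file as a stand-in for the input file. The explicit paste() before fromJSON and the fixed = TRUE argument are additions made here for safety, not something the answer specifies.

```r
library(jsonlite)

# Stand-in input file with the malformed lines (one object per line,
# each ending in an extra "}")
path <- tempfile(fileext = ".json")
writeLines(c('{"eid":"START","ver":"3.0","ets":1514764800238}}',
             '{"eid":"INTERACT","ver":"3.0","ets":1514764820546}}',
             '{"eid":"IMPRESSION","ver":"3.0","ets":895732}}'),
           path)

# sub() drops the extra brace on each line, toString() joins the lines
# with ", ", and c() puts the brackets around the result
json <- c("[", toString(sub("}}", "}", readLines(path), fixed = TRUE)), "]")

# Collapse the three pieces into one string and parse it
result <- fromJSON(paste(json, collapse = "\n"))
```

With this input, result should be a data frame with three rows and the columns eid, ver, and ets.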

Variation: this can also be expressed as a pipeline that gives the same output:

    library(jsonlite)
    library(magrittr)
    
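    # Read the file first; piping the bare file name into sub() would only
    # edit the string "test.json" itself
    "test.json" %>%
      readLines %>%
      sub("}}", "}", ., fixed = TRUE) %>%  # match the literal text "}}"
      toString %>%
      c("[", ., "]") %>%
      fromJSON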
    
Note: the test input was generated with the following code:

    Lines <- c('{"eid":"START","ver":"3.0","ets":1514764800238}}',
    '{"eid":"INTERACT","ver":"3.0","ets":1514764820546}}',
    '{"eid":"IMPRESSION","ver":"3.0","ets":895732}}',
    '{"eid":"IMPRESSION","ver":"3.0","ets":245636}}',
    '{"eid":"INTERACT","ver":"3.0","ets":535235423525}}')
    
    writeLines(Lines, "test.json")
    

Use VS Code find and replace.

How do we handle the exception for the last line? I can do this in Sublime, but the JSON file is very large.

Are you sure every line ends with an extra }?

@hrbrmstr Yes, but at this point I can't even work out how to read this csv.