Parse error: "trailing garbage" when trying to parse a JSON column in a data frame

Tags: json, r, jsonlite, purrr

I have a log file like the one below. It is a text document that looks like this:

Id,Date,Level,Message
35054,2016-06-17 19:29:43 +0000,INFO,"{
  ""id"": -2,
  ""ipAddress"": ""100.100.100.100"",
  ""howYouHearAboutUs"": null,
  ""isInterestedInOffer"": true,
  ""incomeRange"": 60000,
  ""isEmailConfirmed"": false
}"
35055,2016-06-17 19:36:38 +0000,INFO,"{
  ""id"": -1,
  ""firstName"": ""John"",
  ""lastName"": ""Smith"",
  ""email"": ""john.smith@gmail.com"",
  ""city"": ""Smalltown"",
  ""incomeRange"": 1,
  ""birthDate"": ""1999-12-10T05:00:00Z"",
  ""password"": ""*********"",
  ""agreeToTermsOfUse"": true,
  ""howYouHearAboutUs"": ""Radio"",
  ""isInterestedInOffer"": false
}"
35059,2016-07-19 19:52:08 +0000,INFO,"{
  ""id"": -3,
  ""visitUrl"": ""https://www.website.com/?purpose=X"",
  ""ipAddress"": ""100.200.300.400"",
  ""howYouHearAboutUs"": null,
  ""isInterestedInOffer"": true,
  ""incomeRange"": 100000,
  ""isEmailConfirmed"": true,
  ""isIdentityConfirmed"": false,
  ""agreeToTermsOfUse"": true,
  ""validationResults"": null
}"
I am trying to parse the JSON in the Message column like this:

library(readr)
library(jsonlite)

df <- read_csv("log_file_from_above.csv")
fromJSON(as.character(df$Message))
This fails with a "trailing garbage" parse error. How do I get rid of the "trailing garbage"?

fromJSON() isn't "applying" itself over the character vector; it is trying to convert the whole thing into a single data frame. You can try

purrr::map(df$Message, jsonlite::fromJSON)
or what @Abdou provided (shown further down), or

jsonlite::stream_in(textConnection(gsub("\\n", "", df$Message)))
The latter two will create data frames. The first will create a list that you can add as a column.
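To make the failure mode concrete, here is a minimal sketch using a hypothetical two-row stand-in for the log file (the shortened Message values are made up for illustration). It assumes, as the error suggests, that fromJSON() treats a multi-element character vector as one JSON document, so the parser finds a second object after the first one ends. The sketch uses base lapply() in place of purrr::map() so that jsonlite is the only dependency.

```r
library(jsonlite)

# Hypothetical two-row stand-in for the log file in the question
df <- data.frame(
  Id = c(35054L, 35055L),
  Message = c('{"id": -2, "incomeRange": 60000}',
              '{"id": -1, "firstName": "John"}'),
  stringsAsFactors = FALSE
)

# Parsing the whole vector in one call fails; capture the error message
whole <- tryCatch(fromJSON(df$Message),
                  error = function(e) conditionMessage(e))

# Parsing one element at a time works and yields one list per row,
# which can be stored as a list column next to the original columns
df$parsed <- lapply(df$Message, fromJSON)

# Each entry is a named list holding that row's fields
df$parsed[[2]]$firstName
```

Because the parsed list is stored back on df, the Id, Date, and Level values of each row stay attached to their JSON content.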

You can combine the stream_in() approach with dplyr::bind_cols to create a new data frame containing all of the data:

dplyr::bind_cols(df[,1:3],
                 jsonlite::stream_in(textConnection(gsub("\\n", "", df$Message))))
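The reason for the gsub() call is that stream_in() reads newline-delimited JSON, that is, one complete record per line, while each Message value spans several lines. The sketch below shows the idea on hypothetical, shortened messages; it uses base cbind() instead of dplyr::bind_cols() only to keep the example dependent on jsonlite alone.

```r
library(jsonlite)

# Hypothetical multi-line messages, shaped like the Message column above
df <- data.frame(
  Id = c(35054L, 35055L),
  Message = c('{\n  "id": -2,\n  "incomeRange": 60000\n}',
              '{\n  "id": -1,\n  "firstName": "John"\n}'),
  stringsAsFactors = FALSE
)

# Remove embedded newlines so every record sits on a single line
one_line <- gsub("\\n", "", df$Message)

# stream_in() parses one record per line and simplifies the result into a
# data frame; fields that are absent from a record become NA
parsed <- stream_in(textConnection(one_line), verbose = FALSE)

# Put the original identifier column next to the parsed fields
combined <- cbind(df["Id"], parsed)
combined
```

Note that removing every newline is only safe here because the JSON strings themselves contain no literal line breaks.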
@Abdou also suggested an almost pure base R solution:

cbind(df, do.call(plyr::rbind.fill, lapply(paste0("[",df$Message,"]"), function(x) jsonlite::fromJSON(x))))
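The one-liner above works in two steps: wrapping each message in [ and ] turns it into a JSON array holding one object, which fromJSON() simplifies into a one-row data frame, and plyr::rbind.fill then stacks those rows while filling missing columns with NA. The sketch below illustrates the same steps on hypothetical, shortened messages. The fill_missing helper is made up for this example and stands in for plyr::rbind.fill so that the sketch needs only jsonlite.

```r
library(jsonlite)

# Hypothetical shortened messages with different sets of keys
msgs <- c('{"id": -2, "ipAddress": "100.100.100.100"}',
          '{"id": -1, "firstName": "John"}')

# A JSON array of one object is simplified into a one-row data frame
rows <- lapply(paste0("[", msgs, "]"), fromJSON)

# Illustrative helper (not part of any package): add the columns a row is
# missing as NA and return the columns in a common order
fill_missing <- function(row, cols) {
  for (col in setdiff(cols, names(row))) row[[col]] <- NA
  row[cols]
}

all_cols <- unique(unlist(lapply(rows, names)))
combined <- do.call(rbind, lapply(rows, fill_missing, cols = all_cols))
combined
```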
The full workflow:

library(dplyr)
library(jsonlite)

df <- read.table("http://pastebin.com/raw/MMPMwNZv",
                 quote='"', sep=",", stringsAsFactors=FALSE, header=TRUE)

bind_cols(df[,1:3], stream_in(textConnection(gsub("\\n", "", df$Message)))) %>%
  glimpse()
## 
 Found 3 records...
 Imported 3 records. Simplifying into dataframe...
## Observations: 3
## Variables: 19
## $ Id                  <int> 35054, 35055, 35059
## $ Date                <chr> "2016-06-17 19:29:43 +0000", "2016-06-17 1...
## $ Level               <chr> "INFO", "INFO", "INFO"
## $ id                  <int> -2, -1, -3
## $ ipAddress           <chr> "100.100.100.100", NA, "100.200.300.400"
## $ howYouHearAboutUs   <chr> NA, "Radio", NA
## $ isInterestedInOffer <lgl> TRUE, FALSE, TRUE
## $ incomeRange         <int> 60000, 1, 100000
## $ isEmailConfirmed    <lgl> FALSE, NA, TRUE
## $ firstName           <chr> NA, "John", NA
## $ lastName            <chr> NA, "Smith", NA
## $ email               <chr> NA, "john.smith@gmail.com", NA
## $ city                <chr> NA, "Smalltown", NA
## $ birthDate           <chr> NA, "1999-12-10T05:00:00Z", NA
## $ password            <chr> NA, "*********", NA
## $ agreeToTermsOfUse   <lgl> NA, TRUE, TRUE
## $ visitUrl            <chr> NA, NA, "https://www.website.com/?purpose=X"
## $ isIdentityConfirmed <lgl> NA, NA, FALSE
## $ validationResults   <lgl> NA, NA, NA
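Does
lapply(paste0("[", df$Message, "]"), function(x) jsonlite::fromJSON(x))
yield something?

I copied a chunk of JSON data from an HTML document into a new text file and ran into this error as well. Based on the comment above, I solved it by manually adding an opening bracket ([) at the top of my JSON text file and a closing bracket (]) at the end.

How would I use
purrr::map(df$Message, jsonlite::fromJSON)
within the data frame, so that I don't lose the timestamps?

@hrbrmstr, can you add the following? It uses the plyr package.
cbind(df[,1:3], do.call(plyr::rbind.fill, lapply(paste0("[", df$Message, "]"), function(x) jsonlite::fromJSON(x))))

I still get a parse error with
dplyr::bind_cols(df[,1:3], jsonlite::stream_in(textConnection(gsub("\\n", "", df$Message))))

Are you using the pastebin file with the weird indentation left intact?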