R data frame column with JSON strings - need to create columns from the JSON objects
So the data frame has one column (category) that is JSON. Here is a sample:
{"id":254,"name":"Performances","slug":"dance/performances","position":1,"parent_id":6,"parent_name":"Dance","color":10917369,"urls":{"web":{"discover":"http://www.kickstarter.com/discover/categories/dance/performances"}}}
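I am having a hard time turning some of the JSON objects into features of the data frame.
Example: I want df['parent_cat'] to contain the JSON value of "parent_name" from df['category'].
Below is my attempt with apply, but as you can see, one of the results is returned as a list: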
json <- function(r){
  return(data.frame(jsonlite::fromJSON(txt=r['category']),stringsAsFactors=F)$name)
}
json2 <- function(df){
  data.frame(jsonlite::fromJSON(df$category),stringsAsFactors=F)$parent_name
}
df$child_cat <- apply(df, 1,json)
df$parent_cat <- apply(df,1,json2)
head(df[c("child_cat","parent_cat","category")])
  child_cat    parent_cat
  <chr>        <list>
1 Performances <chr [1]>
2 Hardware     <chr [1]>
3 Software     <chr [1]>
4 Anthologies  <chr [1]>
5 Experimental <chr [1]>
6 Software     <chr [1]>
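I was able to get the mutate operation working with the rowwise() function. I also ran into problems reading the JSON. Sometimes the JSON is incomplete, which causes some columns to return the wrong number of rows and breaks the mutate call. So I wrapped the parsing in a tryCatch block and return "undefined" on any error or warning. I also make sure the requested JSON field exists and is actually a string before returning it; otherwise the function returns "undefined". This code should be useful to anyone who wants to read JSON character data from a data.frame column.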
library(dplyr)
library(jsonlite)

# Return the string field `str` from the JSON text `r`, or "undefined" on failure.
json <- function(r, str){
  df <- tryCatch({
    data.frame(fromJSON(r), stringsAsFactors = FALSE)
  }, warning = function(war) {
    "undefined"
  }, error = function(err) {
    "undefined"
  })
  # Reject failed parses, missing fields, and non-string fields.
  if(!is.data.frame(df) || !str %in% colnames(df) || !is.character(df[[str]])){
    return("undefined")
  }
  return(df[[str]])
}

df1 <- df1 %>%
  rowwise() %>%
  mutate(cat_parent = json(category, "parent_name"),
         cat_child = json(category, "name"),
         city = json(location, "name"))
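The same defensive idea can be sketched without dplyr. This is only a minimal sketch, not the code from the answer: `extract_field` is a hypothetical helper name, the sample strings are shortened for illustration, and returning `NA_character_` instead of "undefined" is an assumption about what is convenient downstream.

```r
library(jsonlite)

# Hypothetical helper: pull one top-level string field out of each JSON string.
extract_field <- function(json_strings, field) {
  vapply(json_strings, function(s) {
    # A parse failure (for example truncated JSON) becomes NULL instead of an error.
    parsed <- tryCatch(fromJSON(s), error = function(e) NULL)
    value <- if (is.list(parsed)) parsed[[field]] else NULL
    # Accept only a single character value; anything else becomes NA.
    if (is.character(value) && length(value) == 1) value else NA_character_
  }, character(1), USE.NAMES = FALSE)
}

categories <- c(
  '{"id":254,"name":"Performances","parent_name":"Dance"}',
  '{"id":254,"name":"Perf'   # truncated JSON, used to exercise the fallback
)
extract_field(categories, "parent_name")
```

Because vapply() insists on exactly one character value per element, the result is always a plain character vector and never a list column.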
The tidyjson package may be what you are looking for:
library(dplyr)
library(tidyjson)
df <- tibble(category = '{"id":254,"name":"Performances","slug":"dance/performances","position":1,"parent_id":6,"parent_name":"Dance","color":10917369,"urls":{"web":{"discover":"http://www.kickstarter.com/discover/categories/dance/performances"}}}')
df <- df %>%
  mutate(cat_parent = category %>%
           spread_all() %>%
           pull(parent_name),
         cat_child = category %>%
           spread_all() %>%
           pull(name))
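If every row is known to hold a valid JSON object, jsonlite alone can parse the whole column in one call by wrapping the strings in a JSON array. This is a sketch of a swapped-in technique, not the tidyjson approach above. The sample strings are shortened, the second row is made up for illustration, and `json_array` and `parsed` are illustrative names.

```r
library(jsonlite)

category <- c(
  '{"id":254,"name":"Performances","parent_name":"Dance"}',
  '{"id":1,"name":"Hardware","parent_name":"Technology"}'   # made-up second row
)
df <- data.frame(category = category, stringsAsFactors = FALSE)

# Join the strings into one JSON array so fromJSON() parses them together
# and returns a data frame with one row per original string.
json_array <- paste0("[", paste(df$category, collapse = ","), "]")
parsed <- fromJSON(json_array)

df$cat_child  <- parsed$name
df$cat_parent <- parsed$parent_name
```

The trade-off is that a single malformed string makes the whole parse fail, so this only suits columns that are already known to be clean.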
This may help to check whether a string in the column is not valid JSON input:
library(dplyr)
library(tidyjson)
library(jsonlite)
df <- tibble(category = c('{"id":254,"name":"Performances","slug":"dance/performances","position":1,"parent_id":6,"parent_name":"Dance","color":10917369,"urls":{"web":{"discover":"http://www.kickstarter.com/discover/categories/dance/performances"}}}',
                          'not_json'))
df <- df %>%
  rowwise() %>%
  mutate(check_json = validate(category),
         cat_parent = ifelse(check_json,
                             category %>%
                               spread_all() %>%
                               pull(parent_name),
                             NA))
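The validity check can also be written without rowwise(). The sketch below assumes that jsonlite::validate() is meant to test one JSON document per call, so it is applied to each string separately with vapply(). `is_valid` is an illustrative name and the sample string is shortened.

```r
library(jsonlite)

category <- c('{"name":"Performances","parent_name":"Dance"}', 'not_json')

# Validate each string on its own; isTRUE() reduces the result to a bare
# TRUE/FALSE regardless of any attributes validate() attaches.
is_valid <- vapply(category, function(s) isTRUE(validate(s)), logical(1),
                   USE.NAMES = FALSE)

# Parse only the valid rows and leave NA everywhere else.
# Assumes every valid row actually contains a "parent_name" string.
cat_parent <- rep(NA_character_, length(category))
cat_parent[is_valid] <- vapply(category[is_valid],
                               function(s) fromJSON(s)$parent_name,
                               character(1), USE.NAMES = FALSE)
```

Invalid rows never reach the parser, so no tryCatch is needed around fromJSON() here.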
Could you add the data using dput so that we can reproduce the problem here?
Hi Chris, for one column, each row contains a JSON string. But some columns are not JSON. That is why I used rowwise() in the answer I provided. Tidyjson looks good too, and I'm sure it would work if used with rowwise().