How do I load a SharePoint list into a tibble in R?
Tags: r, json, dataframe, dplyr, nested-lists
I want to load a SharePoint list into a tibble in R. The problem with what I tried is that every value in the data is wrapped in a list. How can I unwrap each value, or change the data transformation so that the columns contain strings directly instead of lists?
# A tibble: 10 x 6
`__metadata` A B C D E
<list> <list> <list> <list> <list> <list>
1 <list [4]> <list [1]> <chr [1]> <list [2]> <chr [1]> <chr [1]>
2 <list [4]> <list [1]> <chr [1]> <list [2]> <chr [1]> <chr [1]>
3 <list [4]> <list [1]> <chr [1]> <list [2]> <chr [1]> <chr [1]>
4 <list [4]> <list [1]> <chr [1]> <list [2]> <chr [1]> <chr [1]>
5 <list [4]> <list [1]> <chr [1]> <list [2]> <chr [1]> <chr [1]>
6 <list [4]> <list [1]> <chr [1]> <list [2]> <chr [1]> <chr [1]>
7 <list [4]> <list [1]> <chr [1]> <list [2]> <chr [1]> <chr [1]>
...
... and many other combinations of map(), mutate_all(), unnest() and unlist().
I think the problem lies in the way I process the data. The original JSON is formatted as follows:
{
"d": {
"results": [
{
"__metadata": {
"id": "<GUID>",
"uri": "<redacted>",
"etag": "\"42\"",
"type": "SP.Data.DownloadcenterItem"
},
"A": {
"results": [
{
"__metadata": {
"id": "<GUID>",
"type": "SP.Data.UserInfoItem"
},
"Title": "<redacted>"
}
]
},
"C": {
"__metadata": {
"id": "<GUID>",
"type": "SP.Data.UserInfoItem"
},
"EMail": "<redacted>"
},
"B": "<redacted>",
"D": "<redacted>",
"E": "<redacted>"
},
...
],
"__next": "<redacted>"
}
}
The following works for your data. You don't need the purrr package here.
library(dplyr)
library(tibble)
library(tidyr)
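Cool. I think using map_dfr(~ enframe(.x)) is not a good idea here. You need to take a step back and look at the JSON data. Figure out what you need, then extract it. Right now you have a whole bunch of nested lists inside nested lists.
My suggestion is to step away from the problem a little and think about it more generally, with an open mind. If you want us to help, please share dput(my_data$d$results). (@akrun asked for it.)
OK. Let's look at the result of enframe(unlist(current_page$d$results)). There is a lot of metadata, IDs, types and so on. Now you need to decide which rows (data fields) to keep. Then you can move on to the next step, which is reshaping!
I want to throw away all the metadata and keep the rest. current_page$d$results %>% map_dfr(enframe) %>% filter(name != '__metadata') seems to discard the metadata, but unnest(value) produces the same error as before.
enframe(unlist(current_page$d$results)) %>% filter(!grepl("metadata", name, ignore.case = T)) %>% group_by(name) %>% mutate(rid = 1:n()) %>% pivot_wider(-rid, names_from = "name", values_from = "value") %>% unnest
This worked for me on the output of the dput you posted.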
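The advice to decide which fields you need and then extract them can also be followed quite literally. Below is a minimal base-R sketch of that idea, not the method from the answer. The `results` list is a hand-made stand-in for `current_page$d$results` with invented values, the helpers `make_record()` and `pull_chr()` are hypothetical names, and the sketch assumes that every record has exactly one author and no missing fields.

```r
library(tibble)

# Hand-made stand-in for current_page$d$results (all values are invented)
make_record <- function(i) {
  list(
    `__metadata` = list(id = paste0("guid-", i), type = "SP.Data.DownloadcenterItem"),
    dmsAuthor = list(results = list(list(
      `__metadata` = list(id = "u", type = "SP.Data.UserInfoItem"),
      Title = paste("Author", i)
    ))),
    dmsDocumentOwner = list(
      `__metadata` = list(id = "u", type = "SP.Data.UserInfoItem"),
      EMail = paste0("owner", i, "@example.com")
    ),
    dmsDocumentID = paste0("DOC-", i),
    dmsDocVersion = "1.0",
    dmsSPTitle = paste("Title", i)
  )
}
results <- lapply(1:3, make_record)

# Pull one string per record; vapply() stops with an error if any
# extracted value is not a single character string
pull_chr <- function(records, f) vapply(records, f, character(1))

docs <- tibble(
  author  = pull_chr(results, function(r) r$dmsAuthor$results[[1]]$Title),
  owner   = pull_chr(results, function(r) r$dmsDocumentOwner$EMail),
  doc_id  = pull_chr(results, function(r) r$dmsDocumentID),
  version = pull_chr(results, function(r) r$dmsDocVersion),
  title   = pull_chr(results, function(r) r$dmsSPTitle)
)
docs
```

Because each column is built with `vapply(..., character(1))`, the result has plain character columns, and a record with a missing or multi-valued field raises an error instead of silently producing a list column.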
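# Attempt from the question: every value ends up wrapped in a list column
library(purrr)   # map()
library(tibble)  # enframe()
library(tidyr)   # spread(); also re-exports %>%
current_page <- httr::GET('<URL>') %>% httr::content()
my_data <- current_page$d$results %>%
  map(enframe) %>%
  map(~ spread(.x, name, value))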
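To see why this attempt wraps every value in a list, here is a small sketch on a single hand-made record. The `record` object and its values are invented and only mimic the JSON shape shown in the question. `enframe()` applied to a list returns a list-column `value`, and `spread()` carries that type over, so every spread column should still be a list.

```r
library(tibble)
library(tidyr)

# Hypothetical record mimicking one element of current_page$d$results
record <- list(
  `__metadata` = list(id = "1", type = "SP.Data.DownloadcenterItem"),
  A = list(results = list(list(Title = "Alice"))),
  B = "some id"
)

long <- enframe(record)            # columns: name <chr>, value <list>
wide <- spread(long, name, value)  # one row, one column per field

class(long$value)                  # "list"
sapply(wide, class)                # each column is expected to still be a list
```

Even the plain string field `B` ends up as a list with one element, which is why later calls to `unnest()` or `unlist()` have to deal with a mix of strings and nested lists.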
list(list(`__metadata` = list(id = "<redacted>",
uri = "<redacted>",
etag = "<redacted>", type = "<redacted>"), dmsAuthor = list(
results = list(list(`__metadata` = list(id = "<redacted>",
type = "<redacted>"), Title = "<redacted>"))),
dmsDocumentOwner = list(`__metadata` = list(id = "<redacted>",
type = "<redacted>"), EMail = "<redacted>"),
dmsDocumentID = "<redacted>", dmsDocVersion = "<redacted>", dmsSPTitle = "<redacted>"),
list(`__metadata` = list(id = "<redacted>",
uri = "<redacted>",
etag = "<redacted>", type = "<redacted>"),
dmsAuthor = list(results = list(list(`__metadata` = list(
id = "<redacted>", type = "<redacted>"),
Title = "<redacted>"))), dmsDocumentOwner = list(
`__metadata` = list(id = "<redacted>",
type = "<redacted>"), EMail = "<redacted>"),
dmsDocumentID = "<redacted>", dmsDocVersion = "<redacted>", dmsSPTitle = "<redacted>"),
list(`__metadata` = list(id = "<redacted>",
uri = "<redacted>",
etag = "<redacted>", type = "<redacted>"),
dmsAuthor = list(results = list(list(`__metadata` = list(
id = "<redacted>", type = "<redacted>"),
Title = "<redacted>"))), dmsDocumentOwner = list(
`__metadata` = list(id = "<redacted>",
type = "<redacted>"), EMail = "<redacted>"),
dmsDocumentID = "<redacted>", dmsDocVersion = "<redacted>", dmsSPTitle = "<redacted>"),
list(`__metadata` = list(id = "<redacted>",
uri = "<redacted>",
etag = "<redacted>", type = "<redacted>"),
dmsAuthor = list(results = list(list(`__metadata` = list(
id = "<redacted>", type = "<redacted>"),
Title = "<redacted>"))), dmsDocumentOwner = list(
`__metadata` = list(id = "<redacted>",
type = "<redacted>"), EMail = "<redacted>"),
dmsDocumentID = "<redacted>", dmsDocVersion = "<redacted>", dmsSPTitle = "<redacted>"),
list(`__metadata` = list(id = "<redacted>",
uri = "<redacted>",
etag = "<redacted>", type = "<redacted>"),
dmsAuthor = list(results = list(list(`__metadata` = list(
id = "<redacted>", type = "<redacted>"),
Title = "<redacted>"))), dmsDocumentOwner = list(
`__metadata` = list(id = "<redacted>",
type = "<redacted>"), EMail = "<redacted>"),
dmsDocumentID = "<redacted>", dmsDocVersion = "<redacted>", dmsSPTitle = "<redacted>"),
list(`__metadata` = list(id = "<redacted>",
uri = "<redacted>",
etag = "<redacted>", type = "<redacted>"),
dmsAuthor = list(results = list(list(`__metadata` = list(
id = "<redacted>", type = "<redacted>"),
Title = "<redacted>"))), dmsDocumentOwner = list(
`__metadata` = list(id = "<redacted>",
type = "<redacted>"), EMail = "<redacted>"),
dmsDocumentID = "<redacted>", dmsDocVersion = "<redacted>", dmsSPTitle = "<redacted>"),
list(`__metadata` = list(id = "<redacted>",
uri = "<redacted>",
etag = "<redacted>", type = "<redacted>"),
dmsAuthor = list(results = list(list(`__metadata` = list(
id = "<redacted>", type = "<redacted>"),
Title = "<redacted>"))), dmsDocumentOwner = list(
`__metadata` = list(id = "<redacted>",
type = "<redacted>"), EMail = "<redacted>"),
dmsDocumentID = "<redacted>", dmsDocVersion = "<redacted>", dmsSPTitle = "<redacted>"),
list(`__metadata` = list(id = "<redacted>",
uri = "<redacted>",
etag = "<redacted>", type = "<redacted>"),
dmsAuthor = list(results = list(list(`__metadata` = list(
id = "<redacted>", type = "<redacted>"),
Title = "<redacted>"))), dmsDocumentOwner = list(
`__metadata` = list(id = "<redacted>",
type = "<redacted>"), EMail = "<redacted>"),
dmsDocumentID = "<redacted>", dmsDocVersion = "<redacted>", dmsSPTitle = "<redacted>"),
list(`__metadata` = list(id = "<redacted>",
uri = "<redacted>",
etag = "<redacted>", type = "<redacted>"),
dmsAuthor = list(results = list(list(`__metadata` = list(
id = "<redacted>", type = "<redacted>"),
Title = "<redacted>"))), dmsDocumentOwner = list(
`__metadata` = list(id = "<redacted>",
type = "<redacted>"), EMail = "<redacted>"),
dmsDocumentID = "<redacted>", dmsDocVersion = "<redacted>", dmsSPTitle = "<redacted>"),
list(`__metadata` = list(id = "<redacted>",
uri = "<redacted>",
etag = "<redacted>", type = "<redacted>"),
dmsAuthor = list(results = list(list(`__metadata` = list(
id = "<redacted>", type = "<redacted>"),
Title = "<redacted>"))), dmsDocumentOwner = list(
`__metadata` = list(id = "<redacted>",
type = "<redacted>"), EMail = "<redacted>"),
dmsDocumentID = "<redacted>", dmsDocVersion = "<redacted>", dmsSPTitle = "<redacted>"))
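# Flatten all records to name/value pairs, drop metadata, then reshape to wide
enframe(unlist(current_page$d$results)) %>%
  filter(!grepl("metadata", name, ignore.case = TRUE)) %>%
  group_by(name) %>%
  mutate(rid = 1:n()) %>%   # occurrence counter within each field name
  pivot_wider(-rid, names_from = "name", values_from = "value") %>%
  unnest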
#> # A tibble: 10 x 5
#> dmsAuthor.resul~ dmsDocumentOwne~ dmsDocumentID dmsDocVersion dmsSPTitle
#> <chr> <chr> <chr> <chr> <chr>
#> 1 <redacted> <redacted> <redacted> <redacted> <redacted>
#> 2 <redacted> <redacted> <redacted> <redacted> <redacted>
#> 3 <redacted> <redacted> <redacted> <redacted> <redacted>
#> 4 <redacted> <redacted> <redacted> <redacted> <redacted>
#> 5 <redacted> <redacted> <redacted> <redacted> <redacted>
#> 6 <redacted> <redacted> <redacted> <redacted> <redacted>
#> 7 <redacted> <redacted> <redacted> <redacted> <redacted>
#> 8 <redacted> <redacted> <redacted> <redacted> <redacted>
#> 9 <redacted> <redacted> <redacted> <redacted> <redacted>
#> 10 <redacted> <redacted> <redacted> <redacted> <redacted>
#> Warning: Values in `value` are not uniquely identified; output will contain list-cols.
#> * Use `values_fn = list(value = list)` to suppress this warning.
#> * Use `values_fn = list(value = length)` to identify where the duplicates arise
#> * Use `values_fn = list(value = summary_fun)` to summarise duplicates
#> Warning: `cols` is now required.
#> Please use `cols = c(dmsAuthor.results.Title, dmsDocumentOwner.EMail, dmsDocumentID,
#> dmsDocVersion, dmsSPTitle)`
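The two warnings appear to come from `rid` being excluded from the row identifier (so `pivot_wider()` collapses everything into list-columns) and from calling `unnest()` without `cols`. A possible variant that should avoid both is sketched below. It assumes tidyr 1.0 or later and records that all contain the same fields; the `results` list is an invented stand-in for `current_page$d$results`, and `flat` is a hypothetical name. Passing `rid` explicitly as `id_cols` gives one row per record, so each cell holds a single string and no `unnest()` is needed.

```r
library(dplyr)
library(tibble)
library(tidyr)

# Invented stand-in for current_page$d$results (two records)
results <- lapply(1:2, function(i) list(
  `__metadata` = list(id = paste0("guid-", i), type = "SP.Data.DownloadcenterItem"),
  dmsAuthor = list(results = list(list(
    `__metadata` = list(id = "u", type = "SP.Data.UserInfoItem"),
    Title = paste("Author", i)
  ))),
  dmsDocumentID = paste0("DOC-", i),
  dmsSPTitle = paste("Title", i)
))

flat <- enframe(unlist(results)) %>%   # one row per leaf value, dotted names
  filter(!grepl("metadata", name, ignore.case = TRUE)) %>%
  group_by(name) %>%
  mutate(rid = row_number()) %>%       # n-th occurrence of a field = n-th record
  ungroup() %>%
  pivot_wider(id_cols = rid, names_from = name, values_from = value) %>%
  select(-rid)

flat
```

Counting occurrences per field name only lines up with the records when every record has each field exactly once. If a record lacks a field or lists several authors, the rows can shift, and extracting the fields by name is the safer route.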