R: How to convert a list of lists containing NULLs to a data frame


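I have a list created from a JSON object, returned as the output of an e-commerce API; a minimal example is below. I am trying to convert it to a data frame, but without luck.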

my_ls <- list(list(id = 406962L, user_id = 132786L, user_name = "Visitor Account", 
      organization_id = NULL, checkout_at = NULL, currency = "USD", 
      bulk_discount = NULL, coupon_codes = NULL, items = list(list(
        id = 505296L, quantity = 1L, unit_cost = 1295, used = 0L, 
        item_id = 6165L, item_type = "Path", item_name = "Product_2", 
        discount_type = "Percent", discount = NULL, coupon_id = NULL), 
        list(id = 505297L, quantity = 1L, unit_cost = 1295, used = 0L, 
             item_id = 6163L, item_type = "Path", item_name = "Product_1", 
             discount_type = "Percent", discount = NULL, coupon_id = NULL))), 
 list(id = 407178L, user_id = 132786L, user_name = "Visitor Account", 
      organization_id = "00001", checkout_at = NULL, currency = "USD", 
      bulk_discount = NULL, coupon_codes = NULL, items = list(
        list(id = 505744L, quantity = 1L, unit_cost = 1295, 
             used = 0L, item_id = 6163L, item_type = "Path", 
             item_name = "Product_1", 
             discount_type = "Percent", discount = NULL, coupon_id = NULL))))

Is this close to your expected output?
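library(tidyverse)

# Only the first and last lines of this pipeline survive in the source; the
# wrapping in tibble() and the three unnest_auto() calls are reconstructed
# from the messages printed by the pipeline.
result <- tibble(my_ls = my_ls) %>%
  unnest_auto(my_ls) %>%
  unnest_auto(items) %>%
  unnest_auto(items)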
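#> Using `unnest_wider(my_ls)`; elements have 9 names in common
#> Using `unnest_longer(items)`; no element has names
#> Using `unnest_wider(items)`; elements have 10 names in common
#> New names:
#> * id -> id...1
#> * id -> id...5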
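result
#> # A tibble: 3 x 13
#>   id...1 user_id user_name currency id...5 quantity unit_cost  used item_id
#>    <int>   <int> <chr>     <chr>     <int>    <int>     <dbl> <int>   <int>
#> 1 406962  132786 Visitor ~ USD      505296        1      1295     0    6165
#> 2 406962  132786 Visitor ~ USD      505297        1      1295     0    6163
#> 3 407178  132786 Visitor ~ USD      505744        1      1295     0    6163
#> # ... with 4 more variables: item_type, item_name,
#> #   discount_type, organization_id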

Created on 2020-06-14 by the reprex package (v0.3.0)
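The input has 8 order fields besides items and 10 item fields, yet the result has only 13 columns: fields that are NULL in every record, such as checkout_at and coupon_id, do not show up. A likely reason is that NULL has length 0 and so has no value to put in a cell. The following base R sketch illustrates this on a made-up, trimmed record (`order` and `filled` are names chosen for this illustration, not part of the answer):

```r
# Hypothetical, trimmed order record (not the full my_ls).
order <- list(id = 406962L, checkout_at = NULL, currency = "USD")

# NULL has length 0, so it cannot fill a cell in a one-row table.
lengths(order)          # id = 1, checkout_at = 0, currency = 1

# Flattening silently drops the NULL field.
names(unlist(order))    # "id" "currency"

# Replacing NULL with NA (length 1) keeps the field.
filled <- lapply(order, function(v) if (is.null(v)) NA else v)
names(unlist(filled))   # "id" "checkout_at" "currency"
```

Replacing NULL with NA before building the data frame is therefore one way to keep such columns.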
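If you need a version that does not hard-code the nesting depth (here we know the list is two levels deep, but that may change in the future), the following may be more resilient, although it is harder to explain.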
library(tidyverse)

my_ls %>%
  ...  # the remaining steps of this pipeline are missing from the source
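#> Using `unnest_wider(my_ls)`; elements have 9 names in common
#> Using `unnest_longer(items)`; no element has names
#> Using `unnest_wider(items)`; elements have 10 names in common
#> New names:
#> * id -> id...1
#> * id -> id...5
#> Warning: `cols` is now required when using unnest().
#> Please use `cols = c(id...1, user_id, user_name, currency, organization_id)`
#> Warning: `cols` is now required when using unnest().
#> Please use `cols = c(id...1, user_id, user_name, currency, organization_id)`
#> Warning: `cols` is now required when using unnest().
#> Please use `cols = c(id...1, user_id, user_name, currency, organization_id)`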
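result
#> # A tibble: 3 x 15
#>   id...1 user_id user_name currency id...5 quantity unit_cost  used item_id
#>    <int>   <int> <chr>     <chr>     <int>    <int>     <dbl> <int>   <int>
#> 1 406962  132786 Visitor ~ USD      505296        1      1295     0    6165
#> 2 406962  132786 Visitor ~ USD      505297        1      1295     0    6163
#> 3 407178  132786 Visitor ~ USD      505744        1      1295     0    6163
#> # ... with 6 more variables: item_type, item_name,
#> #   discount_type, discount, coupon_id, organization_id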

Created on 2020-06-14 by the reprex package (v0.3.0)

One option involving dplyr, tidyr and purrr:
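library(dplyr)
library(tidyr)
library(purrr)

# Replace NULL with NA in the order-level fields, bind the orders into one
# table, do the same for the item-level fields, then spread items into columns.
map_depth(.x = my_ls, 2, ~ replace(.x, is.null(.x), NA), .ragged = TRUE) %>%
  bind_rows() %>%
  mutate(items = map_depth(items, 2, ~ replace(.x, is.null(.x), NA))) %>%
  rename(`original_id` = id) %>%
  unnest_wider(items)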

 original_id user_id user_name organization_id checkout_at currency bulk_discount
        <int>   <int> <chr>     <chr>           <lgl>       <chr>    <lgl>        
1      406962  132786 Visitor … <NA>            NA          USD      NA           
2      406962  132786 Visitor … <NA>            NA          USD      NA           
3      407178  132786 Visitor … 00001           NA          USD      NA           
# … with 11 more variables: coupon_codes <lgl>, id <int>, quantity <int>, unit_cost <dbl>,
#   used <int>, item_id <int>, item_type <chr>, item_name <chr>, discount_type <chr>,
#   discount <lgl>, coupon_id <lgl>
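For comparison, the same one-row-per-item shape can be sketched in base R, without purrr or tidyr. This is an illustrative alternative, not the answer's method: `orders` is a made-up, trimmed stand-in for my_ls, and `fill_nulls` and `flat` are hypothetical names introduced here.

```r
# Hypothetical helper: replace NULL entries of a flat list with NA.
fill_nulls <- function(x) lapply(x, function(v) if (is.null(v)) NA else v)

# Made-up, trimmed stand-in for my_ls (two orders, three items in total).
orders <- list(
  list(id = 1L, organization_id = NULL, currency = "USD",
       items = list(list(id = 10L, discount = NULL),
                    list(id = 11L, discount = NULL))),
  list(id = 2L, organization_id = "00001", currency = "USD",
       items = list(list(id = 12L, discount = NULL)))
)

rows <- lapply(orders, function(o) {
  # Order-level fields, with the order id renamed to avoid a clash.
  order_fields <- fill_nulls(o[names(o) != "items"])
  names(order_fields)[names(order_fields) == "id"] <- "original_id"
  # One single-row data frame per item, repeating the order-level fields.
  item_rows <- lapply(o$items, function(it) {
    as.data.frame(c(order_fields, fill_nulls(it)), stringsAsFactors = FALSE)
  })
  do.call(rbind, item_rows)
})
flat <- do.call(rbind, rows)
flat
```

Because every NULL becomes NA before the rows are built, all-NULL fields such as discount stay in `flat` as NA columns. The nesting depth is hard-coded here (orders, then items), as in the answer.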

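Have you tried replacing NULL with NA?
No, I haven't, but that would only solve one of the two problems.
Thanks, but some columns are missing from the final data, such as coupon_id.
If your sample data contained a case with a value there, it would show up.
I see, it is just very lazy about which columns it keeps. I'll test on a larger set.
I have also added a version that works on larger lists, but you may still need to dig through the first few levels by hand.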
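library(rrapply)
library(dplyr)
library(tidyr)

# Replace every NULL with NA at any depth, then bind and widen as before.
rrapply(my_ls, f = function(x) if (is.null(x)) NA else x, how = "replace") %>%
  bind_rows() %>%
  rename(`original_id` = id) %>%
  unnest_wider(items)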
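If adding the rrapply dependency is not an option, the NULL-to-NA step can be approximated with a short recursive function in base R. This is a minimal sketch, not how rrapply is implemented; `null_to_na` is a hypothetical helper and `order` is a made-up record loosely modelled on one element of my_ls.

```r
# Recursively replace every NULL with NA, whatever the nesting depth.
null_to_na <- function(x) {
  if (is.null(x)) return(NA)                     # NULL leaf becomes NA
  if (is.list(x)) return(lapply(x, null_to_na))  # descend into lists
  x                                              # any other leaf is kept
}

# Hypothetical nested record with NULLs at two different depths.
order <- list(id = 406962L, checkout_at = NULL,
              items = list(list(id = 505296L, discount = NULL)))

cleaned <- null_to_na(order)
str(cleaned)
```

The cleaned list keeps the names and structure of the input, so it can be passed on to bind_rows() and unnest_wider() in the same way as the rrapply() result.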