R: separating list column values into singular values relative to a condition


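Simplified explanation

Convert from long to wide, filling the missing values in 2019 and 2010 with 17 and 16 respectively, and subtract the pland values of the rows whose landcover matches in both years (i.e. 2019 - 2010). If there is no value for 2019 and it has been filled with 17, make that pland value negative. If a missing value in 2010 has been filled with 16, leave the pland value unchanged, i.e. positive.

The result should look like Table 2.

Table 1: example data frame in long format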
# A tibble: 10 x 4
   year  locality_id landcover  pland
   <chr> <chr>           <int>  <dbl>
 1 2010  L452817             8 0.0968
 2 2010  L452817             9 0.0323
 3 2010  L452817            12 0.613 
 4 2010  L452817            13 0.194 
 5 2010  L452817            14 0.0645
 6 2019  L452817             8 0.0645
 7 2019  L452817             9 0.0645
 8 2019  L452817            12 0.516 
 9 2019  L452817            13 0.194 
10 2019  L452817            14 0.161 
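
For anyone who wants to run the code in this thread, Table 1 can be rebuilt by hand. This is a minimal sketch: the values are copied from the table above, the name t is the one the question's code uses, and the L910180 rows that appear in the later outputs are not shown in Table 1, so they are not included here.

```r
library(tibble)

# Table 1 typed in by hand. The object is called `t` because the code in the
# question refers to it by that name (it shares its name with base::t()).
t <- tribble(
  ~year,  ~locality_id, ~landcover, ~pland,
  "2010", "L452817",     8L, 0.0968,
  "2010", "L452817",     9L, 0.0323,
  "2010", "L452817",    12L, 0.613,
  "2010", "L452817",    13L, 0.194,
  "2010", "L452817",    14L, 0.0645,
  "2019", "L452817",     8L, 0.0645,
  "2019", "L452817",     9L, 0.0645,
  "2019", "L452817",    12L, 0.516,
  "2019", "L452817",    13L, 0.194,
  "2019", "L452817",    14L, 0.161
)
```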

What I tried:
library(dplyr)
library(tidyr)

# copy the values of t into another variable
y <- t
# remove pland (column 4) from the new variable
y <- y[, -4]

# pivot from long to wide, adding the pland differences from t as another column
y %>%
    group_by(year) %>%
    mutate(row = row_number()) %>%
    tidyr::pivot_wider(names_from = year, values_from = landcover) %>%
    select(-row) %>%
    mutate(across(`2010`:`2019`, ~ if (cur_column() == '2019')
        replace_na(.x, 17) else replace_na(.x, 16))) %>%
    mutate(t[t$year %in% 2019, ]$pland - t[t$year %in% 2010, ]$pland)

# A tibble: 11 x 4
   locality_id `2010` `2019` `t[t$year %in% 2019, ]$pland - t[t$year %in% 2010, ]$pland`
   <chr>        <dbl>  <dbl>                                                       <dbl>
 1 L452817          8      8                                                    -0.0323 
 2 L452817          9      9                                                     0.0323 
 3 L452817         12     12                                                    -0.0968 
 4 L452817         13     13                                                     0      
 5 L452817         14     14                                                     0.0968 
 6 L910180          0     17                                                    -0.373  
 7 L910180          8     17                                                    -0.279  
 8 L910180          9     17                                                     0.485  
 9 L910180         10     17                                                     0.162  
10 L910180         11     17                                                     0.0675 
11 L910180         13     17                                                     0.00202


I managed to find an answer, although better suggestions are welcome, especially ones that run without warnings.
# copy the values of t into another variable
y <- t
# remove pland (column 4) from the new variable
y <- y[, -4]

# pivot from long to wide, adding the pland differences from t as another column
y %>%
    group_by(year) %>%
    mutate(row = row_number()) %>%
    tidyr::pivot_wider(names_from = year, values_from = landcover) %>%
    select(-row) %>%
    mutate(across(`2010`:`2019`, ~ if (cur_column() == '2019') replace_na(.x, 17) else replace_na(.x, 16))) %>%
    mutate(ifelse(`2019` == `2010`, t[t$year %in% 2019, ]$pland - t[t$year %in% 2010, ]$pland, -t$pland))


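Breakdown:

The first step uses a borrowed code suggestion: group_by(year) followed by mutate(row = row_number()). This creates an id column relative to the grouping column, and the numbering restarts for each unique value passed to group_by().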
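
The effect of that row-id step is easier to see in isolation. A minimal sketch, assuming a small made-up tibble (toy and with_row are hypothetical names, not the original data):

```r
library(dplyr)

# Hypothetical toy data: three rows for 2010, two rows for 2019.
toy <- tibble::tibble(
  year      = c("2010", "2010", "2010", "2019", "2019"),
  landcover = c(8L, 9L, 12L, 8L, 9L)
)

# row_number() counts within each group, so the numbering restarts per year.
with_row <- toy %>%
  group_by(year) %>%
  mutate(row = row_number()) %>%
  ungroup()

with_row$row  # 1 2 3 1 2
```

The row column is what lets pivot_wider() line up the first 2010 row with the first 2019 row, the second with the second, and so on. That pairing is purely positional, so it only matches landcover values when both years list them in the same order.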


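The next step, also borrowed, is the across() call with cur_column(). It replaces the NAs in the 2010 column with 16 and the NAs in the 2019 column with 17.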
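
The per-column fill can be tried on its own. A minimal sketch on made-up data (wide, filled and filled_df are hypothetical names); it also shows the data-frame form of tidyr::replace_na(), which takes a named list and may be easier to read when only two columns are involved:

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table with one gap per year column.
wide <- tibble::tibble(
  `2010` = c(8, NA, 12),
  `2019` = c(8, 9, NA)
)

# Same idea as in the pipeline: cur_column() tells the lambda which column
# it is currently processing, so each column gets its own fill value.
filled <- wide %>%
  mutate(across(c(`2010`, `2019`),
                ~ replace_na(.x, if (cur_column() == "2019") 17 else 16)))

# Equivalent fill using replace_na() on the whole data frame.
filled_df <- wide %>%
  replace_na(list(`2010` = 16, `2019` = 17))
```

Both versions give 8, 16, 12 for the 2010 column and 8, 9, 17 for the 2019 column.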



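Finally, the ifelse() statement. I was hanging by a thread hoping it would work, and it did.

Where the landcover values in the 2019 and 2010 columns are equal, it computes the difference of their pland values (2019 minus 2010). The rows where they differ are filled with the remaining pland values, negated.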
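
Base R's ifelse() works element by element, so it is easiest to reason about when the test and both branches have the same length. A minimal sketch with made-up, already aligned vectors (x2010, x2019, p2010 and p2019 are hypothetical names, not columns of the original data):

```r
# Hypothetical aligned inputs: one element per output row.
x2010 <- c(8, 9, 0)            # landcover in 2010
x2019 <- c(8, 9, 17)           # landcover in 2019 (17 = filled in)
p2010 <- c(0.50, 0.25, 0.75)   # pland in 2010
p2019 <- c(0.25, 0.50, 0)      # pland in 2019

# Matching landcover: 2019 minus 2010. Otherwise: the negated 2010 value.
res <- ifelse(x2019 == x2010, p2019 - p2010, -p2010)

res  # -0.25  0.25 -0.75
```

In the pipeline above the branches are taken from t rather than from the wide table, so their lengths need not match the number of rows, and values are then recycled or truncated to fit.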

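However, I have not yet figured out how to handle the rows where 16 appears in the 2010 column so that the 2019 pland value stays positive, because as written it is always set to negative.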
Instead of using dummy values to identify what is missing, I took a different approach based on complete(). Here df is your original data frame.
library(dplyr)
library(tidyr)

df %>%
  # add the missing year for every locality/landcover pair, with pland = 0,
  # so the difference can be computed while the data is still in long format
  complete(year, nesting(locality_id, landcover), fill = list(pland = 0)) %>%
  arrange(desc(year)) %>%
  group_by(locality_id, landcover) %>%
  summarize(
    # a pland of 0 marks a year that complete() filled in: flag it as 16 / 17
    X2010 = if_else(pland[year == 2010] == 0, 16L, first(landcover)),
    X2019 = if_else(pland[year == 2019] == 0, 17L, first(landcover)),
    # 2019 minus 2010; a filled-in year contributes 0
    pland = pland[year == 2019] - pland[year == 2010]) %>%
  arrange(locality_id, landcover)

This is the output:
   locality_id landcover X2010 X2019   pland
   <chr>           <int> <int> <int>   <dbl>
 1 L452817             8     8     8 -0.0323
 2 L452817             9     9     9  0.0323
 3 L452817            12    12    12 -0.0968
 4 L452817            13    13    13  0     
 5 L452817            14    14    14  0.0968
 6 L910180             0     0    17 -0.438 
 7 L910180             8     8    17 -0.344 
 8 L910180             9     9    17 -0.0312
 9 L910180            10    10    17 -0.0312
10 L910180            11    11    17 -0.0938
11 L910180            13    13    17 -0.0625
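
One caveat with the pipeline above: it treats pland == 0 as "this year was missing", so a genuine pland of 0 in the data would also be flagged as 16 or 17. A hedged variant that tracks missingness with NA instead is sketched below. It swaps complete() for pivot_wider(), and the data (df_toy, result) is made up for illustration, so treat it as an alternative under those assumptions rather than the answer's method.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: landcover 8 in both years, 9 only in 2010, 12 only in 2019.
df_toy <- tibble::tribble(
  ~year,  ~locality_id, ~landcover, ~pland,
  "2010", "A",           8L, 0.25,
  "2019", "A",           8L, 0.50,
  "2010", "A",           9L, 0.75,
  "2019", "A",          12L, 0.50
)

result <- df_toy %>%
  # one row per locality/landcover pair; a year with no data becomes NA
  pivot_wider(names_from = year, values_from = pland,
              names_prefix = "pland_") %>%
  mutate(
    # NA (not 0) marks the missing year
    X2010 = if_else(is.na(pland_2010), 16L, landcover),
    X2019 = if_else(is.na(pland_2019), 17L, landcover),
    # a missing year contributes 0, so 2019-only rows stay positive
    # and 2010-only rows come out negative
    pland = coalesce(pland_2019, 0) - coalesce(pland_2010, 0)
  ) %>%
  select(locality_id, landcover, X2010, X2019, pland) %>%
  arrange(locality_id, landcover)
```

On this toy input the three rows come out as (8, 8, 0.25), (9, 17, -0.75) and (16, 12, 0.5) for X2010, X2019 and pland, which follows the sign rule stated in the question.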

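I like your approach! I had asked for the missing values to be filled with 16 or 17 because they get converted to characters for naming, so a way to use landcover as the value while keeping the 2010 and 2019 landcover would be great. Again, I like your approach. I tested it to see whether it works for both positive and negative values, and it does! I just need to work out how to bring in 2010 and 2019 with the NAs replaced by 16 and 17.

I updated the answer to include X2019 and X2010.

I was just trying to work out how to turn those zeros into 16 and 17, and you beat me to it. Great answer!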
Output of the ifelse() pipeline:
# A tibble: 11 x 4
   locality_id `2010` `2019` `ifelse(...)`
   <chr>        <dbl>  <dbl>         <dbl>
 1 L452817          8      8       -0.0323
 2 L452817          9      9        0.0323
 3 L452817         12     12       -0.0968
 4 L452817         13     13        0     
 5 L452817         14     14        0.0968
 6 L910180          0     17       -0.438 
 7 L910180          8     17       -0.344 
 8 L910180          9     17       -0.0312
 9 L910180         10     17       -0.0312
10 L910180         11     17       -0.0938
11 L910180         13     17       -0.0625
