R: How to use case_when() to determine whether the previous value in a column is greater than the previous value in an ordered vector

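I am calculating growth for a coral demography dataset and need to compare `Max Diameter (cm)` between consecutive `TimeStep` values to determine at which time steps a coral shrank. I tried using `lag`, but for some reason my new column is all NA, rather than NA only in the rows where the data change to a new coral `ID`. Does anyone know what I need to do so that my `Diff` column contains NAs only where the transition to a new colony occurs?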

Data frame
# A tibble: 20 x 22
   `Taxonomic Code` ID  Date       Year Site_long Shelter `Module #` Side Location Settlement_Area TimeStep size_class `Cover Code` `Max Diameter (cm)` `Max Orthogonal (cm)`
 1 PR               H30 2018-11-27 18   Hanauma…  Low            216 S    D3                 0.759        7          3            2                  22                    17
 2 PR               H30 2019-02-26 19   Hanauma…  Low            216 S    D3                 0.751        8          3            1                  24                    19
 3 PR               H30 2019-05-28 19   Hanauma…  Low            216 S    D3                 0.607        9          3            1                  30                    20
 4 PR               H30 2019-08-27 19   Hanauma…  Low            216 S    D3                 0.615       10          1            1                   8                     8
 5 PR               H30 2019-11-26 19   Hanauma…  Low            216 S    D3                 0.622       11          5            1                  46                    30
 6 PR               H37 2018-09-09 18   Hanauma…  High           215 S    C1                 0.759        6          2            1                  14                    12
 7 PR               H37 2018-11-27 18   Hanauma…  High           215 S    C1                 0.751        7          3            1                  22                    19
 8 PR               H37 2019-03-12 19   Hanauma…  High           215 S    C1                 0.759        8          3            1                  26                    20
 9 PR               H37 2019-05-21 19   Hanauma…  High           215 S    C1                 0.759        9          3            3                  29                    21
10 PR               H37 2019-09-03 19   Hanauma…  High           215 S    C1                 0.683       10          3            1                  30                    26
11 PR               H66 2018-06-05 18   Hanauma…  High           213 N    A1                 0.759        5          2            1                  20                    19
12 PR               H66 2018-09-09 18   Hanauma…  High           213 N    A1                 0.759        6          2            1                  20                    19
13 PR               H66 2018-12-04 18   Hanauma…  High           213 N    A1                 0.653        7          3            1                  24                    22
14 PR               H66 2019-03-05 19   Hanauma…  High           213 N    A1                 0.759        8          3            1                  25                    24
15 PR               H66 2019-05-28 19   Hanauma…  High           213 N    A1                 0.615        9          3            1                  28                    24
16 PR               H66 2019-09-03 19   Hanauma…  High           213 N    A1                 0.531       10          3            1                  23                    20
17 PR               H66 2019-12-03 19   Hanauma…  High           213 N    A1                 0.600       11          3            1                  23                    16
18 PR               H76 2018-09-09 18   Hanauma…  High           213 N    A4                 0.759        6          3            1                  21                    18
19 PR               H76 2018-12-04 18   Hanauma…  High           213 N    A4                 0.653        7          3            1                  24                    12
20 PR               H76 2019-03-05 19   Hanauma…  High           213 N    A4                 0.759        8          3            1                  22                    19
# … with 7 more variables: `Height (cm)`, `Status Code`, area_mm_squared, area_cm_squared, Volume_mm_cubed, Volume_cm_cubed, MD
Data frame code
data_new <- data %>% group_by(ID, TimeStep) %>%
  mutate(Diff = `Max Diameter (cm)` - dplyr::lag(`Max Diameter (cm)`))
Output
data_output

The issue is with the grouping. When we include 'TimeStep', there is only a single row per group, and the `lag` of a single element is `NA`.

library(dplyr)
# Group by colony only, so each group spans all of that colony's time steps
data %>%
   group_by(ID) %>%
   mutate(Diff = `Max Diameter (cm)` - dplyr::lag(`Max Diameter (cm)`))
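To see why the extra grouping variable matters, here is a minimal sketch on a hypothetical toy tibble (`toy`, with made-up `ID`, `TimeStep`, and `size` columns that are not part of the original data). It assumes dplyr is installed. Grouping by both columns leaves one row per group, so `lag()` has no previous value to return. Grouping by `ID` alone limits the NAs to the first row of each colony. The `order_by` argument of `lag()` is used here only as a precaution in case the rows are not already sorted by time.

```r
library(dplyr)

# Hypothetical toy data: two colonies, three time steps each
toy <- tibble(
  ID       = c("A", "A", "A", "B", "B", "B"),
  TimeStep = c(1, 2, 3, 1, 2, 3),
  size     = c(10, 12, 9, 20, 20, 25)
)

# One row per (ID, TimeStep) group, so lag() always returns NA
by_both <- toy %>%
  group_by(ID, TimeStep) %>%
  mutate(Diff = size - lag(size)) %>%
  ungroup()

# One group per colony; order_by makes the lag follow TimeStep
# even if the rows are not already sorted
by_id <- toy %>%
  group_by(ID) %>%
  mutate(Diff = size - lag(size, order_by = TimeStep)) %>%
  ungroup()

by_both$Diff  # NA NA NA NA NA NA
by_id$Diff    # NA  2 -3 NA  0  5
```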

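The reason is that each group has only a single row, and `lag` of a single value returns `NA`, so every element is NA. You can confirm this with `lag(5)`, which returns NA. The group sizes can be checked with `count`: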
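data %>% count(ID, TimeStep) %>% pull(n)
 [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

It is not clear what output you expect. Perhaps you only need to group by 'ID', i.e.

data %>% group_by(ID) %>% mutate(Diff = `Max Diameter (cm)` - dplyr::lag(`Max Diameter (cm)`)) %>% pull(Diff)
 [1]  NA   2   6 -22  38  NA   8   4   3   1  NA   0   4   1   3  -5   0  NA   3  -2

I agree with @akrun's comment above. I would also suggest renaming the diameter column to a name without spaces, perhaps `max_diam_cm`. In addition, R has a `diff()` function, so you do not need to write the calculation out as a subtraction. You do, however, need to prepend the first element of each group so that the result has as many elements as the group has rows (this makes the first difference in each group 0 rather than NA).

data %>% rename(Max_Diam_cm = `Max Diameter (cm)`) %>% group_by(ID) %>% mutate(Diff = diff(c(first(Max_Diam_cm), Max_Diam_cm))) %>% ungroup()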
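The two answers differ in what the first row of each colony receives. The sketch below compares them on the H30 diameters from the question, treated as a plain vector. The names `d_lag`, `d_first`, and `d_na` are illustrative only, and the `c(NA, diff(x))` variant is an alternative that neither answer proposes.

```r
library(dplyr)

# Diameters of colony H30, taken from the question's data
x <- c(22, 24, 30, 8, 46)

# Subtraction against the lagged vector: NA in the first position
d_lag <- x - lag(x)

# diff() with the first element prepended: 0 in the first position
d_first <- diff(c(first(x), x))

# diff() padded with NA: same result as the lag() subtraction
d_na <- c(NA, diff(x))

d_lag    # NA   2   6 -22  38
d_first  #  0   2   6 -22  38
d_na     # NA   2   6 -22  38
```

If the goal is an NA wherever a new colony starts, the lag-based subtraction or the NA-padded `diff()` matches that more directly than prepending the first element.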
akrun and statstew, thank you for your insights! The problem was …
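Once a per-colony difference is available, the shrinkage question in the title can be expressed with `case_when()`. The following is only a sketch on a hypothetical toy tibble. The column names `size` and `Change` and the label strings are assumptions made for illustration and do not come from the original data.

```r
library(dplyr)

# Hypothetical toy data standing in for the coral measurements
toy <- tibble(
  ID       = c("A", "A", "A", "B", "B", "B"),
  TimeStep = c(1, 2, 3, 1, 2, 3),
  size     = c(10, 12, 9, 20, 20, 25)
)

labelled <- toy %>%
  arrange(ID, TimeStep) %>%          # make sure rows are in time order
  group_by(ID) %>%
  mutate(
    Diff   = size - lag(size),
    Change = case_when(
      is.na(Diff) ~ "first record",  # no earlier measurement for this colony
      Diff < 0    ~ "shrank",
      Diff > 0    ~ "grew",
      TRUE        ~ "no change"
    )
  ) %>%
  ungroup()

labelled$Change
# "first record" "grew" "shrank" "first record" "no change" "grew"
```

The `is.na(Diff)` branch comes first because a comparison with NA evaluates to NA, so without it those rows would fall through to the final catch-all branch.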
data <- structure(list(`Taxonomic Code` = c("PR", "PR", "PR", "PR", "PR", 
"PR", "PR", "PR", "PR", "PR", "PR", "PR", "PR", "PR", "PR", "PR", 
"PR", "PR", "PR", "PR"), ID = structure(c(35L, 35L, 35L, 35L, 
35L, 38L, 38L, 38L, 38L, 38L, 55L, 55L, 55L, 55L, 55L, 55L, 55L, 
61L, 61L, 61L), .Label = c("H1051", "H108", "H110", "H1101", 
"H112", "H113", "H116", "H118", "H1188", "H1211", "H122", "H125", 
"H1253", "H1289", "H171", "H172", "H174", "H186", "H187", "H188", 
"H189", "H191", "H192", "H236", "H237", "H244", "H252", "H254", 
"H258", "H274", "H277", "H288", "H292", "H293", "H30", "H332", 
"H366", "H37", "H374", "H396", "H466", "H479", "H484", "H499", 
"H531", "H560", "H580", "H593", "H597", "H625", "H644", "H647", 
"H649", "H653", "H66", "H693", "H695", "H712", "H728", "H737", 
"H76", "H760", "H774", "H854", "H926", "H96", "H963", "H98", 
"H985", "H991", "H996", "W1038", "W1101", "W1152", "W1154", "W1192", 
"W1208", "W1209", "W1214", "W1227", "W1243", "W1245", "W1315", 
"W1345", "W1361", "W1377", "W1399", "W1438", "W1494", "W1495", 
"W1537", "W1557", "W1614", "W1636", "W1655", "W1669", "W1690", 
"W1697", "W1729", "W1741", "W1758", "W1782", "W1785", "W1847", 
"W1919", "W2000", "W2004", "W2011", "W2036", "W2044", "W2046", 
"W2131", "W2133", "W234", "W249", "W251", "W254", "W307", "W355", 
"W359", "W369", "W433", "W450", "W461", "W470", "W480", "W538", 
"W542", "W544", "W584", "W601", "W606", "W781", "W79", "W807", 
"W872", "W874", "W887", "W890", "W891", "W923", "W952"), class = "factor"), 
    Date = structure(c(17862, 17953, 18044, 18135, 18226, 17783, 
    17862, 17967, 18037, 18142, 17687, 17783, 17869, 17960, 18044, 
    18142, 18233, 17783, 17869, 17960), class = "Date"), Year = c("18", 
    "19", "19", "19", "19", "18", "18", "19", "19", "19", "18", 
    "18", "18", "19", "19", "19", "19", "18", "18", "19"), Site_long = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L), .Label = c("Hanauma Bay", "Waikiki"), class = "factor"), 
    Shelter = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("High", 
    "Low"), class = "factor"), `Module #` = c(216, 216, 216, 
    216, 216, 215, 215, 215, 215, 215, 213, 213, 213, 213, 213, 
    213, 213, 213, 213, 213), Side = c("S", "S", "S", "S", "S", 
    "S", "S", "S", "S", "S", "N", "N", "N", "N", "N", "N", "N", 
    "N", "N", "N"), Location = c("D3", "D3", "D3", "D3", "D3", 
    "C1", "C1", "C1", "C1", "C1", "A1", "A1", "A1", "A1", "A1", 
    "A1", "A1", "A4", "A4", "A4"), Settlement_Area = c(0.75902336, 
    0.751433126, 0.607218688, 0.614808922, 0.622399155, 0.75902336, 
    0.751433126, 0.75902336, 0.75902336, 0.683121024, 0.75902336, 
    0.75902336, 0.65276009, 0.75902336, 0.614808922, 0.531316352, 
    0.599628454, 0.75902336, 0.65276009, 0.75902336), TimeStep = c(7, 
    8, 9, 10, 11, 6, 7, 8, 9, 10, 5, 6, 7, 8, 9, 10, 11, 6, 7, 
    8), size_class = c(3, 3, 3, 1, 5, 2, 3, 3, 3, 3, 2, 2, 3, 
    3, 3, 3, 3, 3, 3, 3), `Cover Code` = c(2, 1, 1, 1, 1, 1, 
    1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), `Max Diameter (cm)` = c(22, 
    24, 30, 8, 46, 14, 22, 26, 29, 30, 20, 20, 24, 25, 28, 23, 
    23, 21, 24, 22), `Max Orthogonal (cm)` = c(17, 19, 20, 8, 
    30, 12, 19, 20, 21, 26, 19, 19, 22, 24, 24, 20, 16, 18, 12, 
    19), `Height (cm)` = c(2, 2, 3, 1, 3, 1, 2, 1, 1, 3, 1, 1, 
    1, 2, 2, 2, 2, 1, 1, 1), `Status Code` = c(NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, "B", NA, NA, "PB", NA, NA, 
    NA, NA), area_mm_squared = c(374, 456, 600, 64, 1380, 168, 
    418, 520, 609, 780, 380, 380, 528, 600, 672, 460, 368, 378, 
    288, 418), area_cm_squared = c(3.74, 4.56, 6, 0.64, 13.8, 
    1.68, 4.18, 5.2, 6.09, 7.8, 3.8, 3.8, 5.28, 6, 6.72, 4.6, 
    3.68, 3.78, 2.88, 4.18), Volume_mm_cubed = c(391.651884147528, 
    477.522083345649, 942.477796076938, 33.5103216382911, 2167.69893097696, 
    87.9645943005142, 437.728576400178, 272.271363311115, 318.871654339364, 
    1225.22113490002, 198.967534727354, 198.967534727354, 276.460153515902, 
    628.318530717959, 703.716754404114, 481.710873550435, 385.368698840348, 
    197.920337176157, 150.79644737231, 218.864288200089), Volume_cm_cubed = c(0.391651884147528, 
    0.477522083345649, 0.942477796076938, 0.0335103216382911, 
    2.16769893097696, 0.0879645943005142, 0.437728576400178, 
    0.272271363311115, 0.318871654339364, 1.22522113490002, 0.198967534727354, 
    0.198967534727354, 0.276460153515902, 0.628318530717959, 
    0.703716754404114, 0.481710873550435, 0.385368698840348, 
    0.197920337176157, 0.15079644737231, 0.218864288200089), 
    MD = c(22, 24, 30, 8, 46, 14, 22, 26, 29, 30, 20, 20, 24, 
    25, 28, 23, 23, 21, 24, 22)), row.names = c(NA, -20L), class = c("tbl_df", 
"tbl", "data.frame"))
data_new <- data %>% group_by(ID, TimeStep) %>%
  mutate(Diff = `Max Diameter (cm)` - dplyr::lag(`Max Diameter (cm)`))
data_output <- structure(list(`Taxonomic Code` = c("PR", "PR", "PR", "PR", "PR", 
"PR", "PR", "PR", "PR", "PR", "PR", "PR", "PR", "PR", "PR", "PR", 
"PR", "PR", "PR", "PR"), ID = structure(c(35L, 35L, 35L, 35L, 
35L, 38L, 38L, 38L, 38L, 38L, 55L, 55L, 55L, 55L, 55L, 55L, 55L, 
61L, 61L, 61L), .Label = c("H1051", "H108", "H110", "H1101", 
"H112", "H113", "H116", "H118", "H1188", "H1211", "H122", "H125", 
"H1253", "H1289", "H171", "H172", "H174", "H186", "H187", "H188", 
"H189", "H191", "H192", "H236", "H237", "H244", "H252", "H254", 
"H258", "H274", "H277", "H288", "H292", "H293", "H30", "H332", 
"H366", "H37", "H374", "H396", "H466", "H479", "H484", "H499", 
"H531", "H560", "H580", "H593", "H597", "H625", "H644", "H647", 
"H649", "H653", "H66", "H693", "H695", "H712", "H728", "H737", 
"H76", "H760", "H774", "H854", "H926", "H96", "H963", "H98", 
"H985", "H991", "H996", "W1038", "W1101", "W1152", "W1154", "W1192", 
"W1208", "W1209", "W1214", "W1227", "W1243", "W1245", "W1315", 
"W1345", "W1361", "W1377", "W1399", "W1438", "W1494", "W1495", 
"W1537", "W1557", "W1614", "W1636", "W1655", "W1669", "W1690", 
"W1697", "W1729", "W1741", "W1758", "W1782", "W1785", "W1847", 
"W1919", "W2000", "W2004", "W2011", "W2036", "W2044", "W2046", 
"W2131", "W2133", "W234", "W249", "W251", "W254", "W307", "W355", 
"W359", "W369", "W433", "W450", "W461", "W470", "W480", "W538", 
"W542", "W544", "W584", "W601", "W606", "W781", "W79", "W807", 
"W872", "W874", "W887", "W890", "W891", "W923", "W952"), class = "factor"), 
    Date = structure(c(17862, 17953, 18044, 18135, 18226, 17783, 
    17862, 17967, 18037, 18142, 17687, 17783, 17869, 17960, 18044, 
    18142, 18233, 17783, 17869, 17960), class = "Date"), Year = c("18", 
    "19", "19", "19", "19", "18", "18", "19", "19", "19", "18", 
    "18", "18", "19", "19", "19", "19", "18", "18", "19"), Site_long = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L), .Label = c("Hanauma Bay", "Waikiki"), class = "factor"), 
    Shelter = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("High", 
    "Low"), class = "factor"), `Module #` = c(216, 216, 216, 
    216, 216, 215, 215, 215, 215, 215, 213, 213, 213, 213, 213, 
    213, 213, 213, 213, 213), Side = c("S", "S", "S", "S", "S", 
    "S", "S", "S", "S", "S", "N", "N", "N", "N", "N", "N", "N", 
    "N", "N", "N"), Location = c("D3", "D3", "D3", "D3", "D3", 
    "C1", "C1", "C1", "C1", "C1", "A1", "A1", "A1", "A1", "A1", 
    "A1", "A1", "A4", "A4", "A4"), Settlement_Area = c(0.75902336, 
    0.751433126, 0.607218688, 0.614808922, 0.622399155, 0.75902336, 
    0.751433126, 0.75902336, 0.75902336, 0.683121024, 0.75902336, 
    0.75902336, 0.65276009, 0.75902336, 0.614808922, 0.531316352, 
    0.599628454, 0.75902336, 0.65276009, 0.75902336), TimeStep = c(7, 
    8, 9, 10, 11, 6, 7, 8, 9, 10, 5, 6, 7, 8, 9, 10, 11, 6, 7, 
    8), size_class = c(3, 3, 3, 1, 5, 2, 3, 3, 3, 3, 2, 2, 3, 
    3, 3, 3, 3, 3, 3, 3), `Cover Code` = c(2, 1, 1, 1, 1, 1, 
    1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), `Max Diameter (cm)` = c(22, 
    24, 30, 8, 46, 14, 22, 26, 29, 30, 20, 20, 24, 25, 28, 23, 
    23, 21, 24, 22), `Max Orthogonal (cm)` = c(17, 19, 20, 8, 
    30, 12, 19, 20, 21, 26, 19, 19, 22, 24, 24, 20, 16, 18, 12, 
    19), `Height (cm)` = c(2, 2, 3, 1, 3, 1, 2, 1, 1, 3, 1, 1, 
    1, 2, 2, 2, 2, 1, 1, 1), `Status Code` = c(NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, "B", NA, NA, "PB", NA, NA, 
    NA, NA), area_mm_squared = c(374, 456, 600, 64, 1380, 168, 
    418, 520, 609, 780, 380, 380, 528, 600, 672, 460, 368, 378, 
    288, 418), area_cm_squared = c(3.74, 4.56, 6, 0.64, 13.8, 
    1.68, 4.18, 5.2, 6.09, 7.8, 3.8, 3.8, 5.28, 6, 6.72, 4.6, 
    3.68, 3.78, 2.88, 4.18), Volume_mm_cubed = c(391.651884147528, 
    477.522083345649, 942.477796076938, 33.5103216382911, 2167.69893097696, 
    87.9645943005142, 437.728576400178, 272.271363311115, 318.871654339364, 
    1225.22113490002, 198.967534727354, 198.967534727354, 276.460153515902, 
    628.318530717959, 703.716754404114, 481.710873550435, 385.368698840348, 
    197.920337176157, 150.79644737231, 218.864288200089), Volume_cm_cubed = c(0.391651884147528, 
    0.477522083345649, 0.942477796076938, 0.0335103216382911, 
    2.16769893097696, 0.0879645943005142, 0.437728576400178, 
    0.272271363311115, 0.318871654339364, 1.22522113490002, 0.198967534727354, 
    0.198967534727354, 0.276460153515902, 0.628318530717959, 
    0.703716754404114, 0.481710873550435, 0.385368698840348, 
    0.197920337176157, 0.15079644737231, 0.218864288200089), 
    MD = c(22, 24, 30, 8, 46, 14, 22, 26, 29, 30, 20, 20, 24, 
    25, 28, 23, 23, 21, 24, 22), Diff = c(NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, 
    NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_
    )), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -20L), groups = structure(list(ID = structure(c(35L, 
35L, 35L, 35L, 35L, 38L, 38L, 38L, 38L, 38L, 55L, 55L, 55L, 55L, 
55L, 55L, 55L, 61L, 61L, 61L), .Label = c("H1051", "H108", "H110", 
"H1101", "H112", "H113", "H116", "H118", "H1188", "H1211", "H122", 
"H125", "H1253", "H1289", "H171", "H172", "H174", "H186", "H187", 
"H188", "H189", "H191", "H192", "H236", "H237", "H244", "H252", 
"H254", "H258", "H274", "H277", "H288", "H292", "H293", "H30", 
"H332", "H366", "H37", "H374", "H396", "H466", "H479", "H484", 
"H499", "H531", "H560", "H580", "H593", "H597", "H625", "H644", 
"H647", "H649", "H653", "H66", "H693", "H695", "H712", "H728", 
"H737", "H76", "H760", "H774", "H854", "H926", "H96", "H963", 
"H98", "H985", "H991", "H996", "W1038", "W1101", "W1152", "W1154", 
"W1192", "W1208", "W1209", "W1214", "W1227", "W1243", "W1245", 
"W1315", "W1345", "W1361", "W1377", "W1399", "W1438", "W1494", 
"W1495", "W1537", "W1557", "W1614", "W1636", "W1655", "W1669", 
"W1690", "W1697", "W1729", "W1741", "W1758", "W1782", "W1785", 
"W1847", "W1919", "W2000", "W2004", "W2011", "W2036", "W2044", 
"W2046", "W2131", "W2133", "W234", "W249", "W251", "W254", "W307", 
"W355", "W359", "W369", "W433", "W450", "W461", "W470", "W480", 
"W538", "W542", "W544", "W584", "W601", "W606", "W781", "W79", 
"W807", "W872", "W874", "W887", "W890", "W891", "W923", "W952"
), class = "factor"), TimeStep = c(7, 8, 9, 10, 11, 6, 7, 8, 
9, 10, 5, 6, 7, 8, 9, 10, 11, 6, 7, 8), .rows = list(1L, 2L, 
    3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 
    16L, 17L, 18L, 19L, 20L)), row.names = c(NA, -20L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE))