R: summary statistics for a numeric column in a data frame, grouped by runs of a recurring variable?

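I have a data frame called coom.az with a column for "tons of fertilizer per hectare of crop" (Nopt), ordered by year. In the same data frame I added a column, also tied to the year column, that shows which ENSO phase occurred in that year. For example: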

Year    Nopt           phase
1950    52.5           La Nina
1951    65.2           La Nina
1952    50.0           Neutral
1953    70.9           Neutral
1954    63.4           Neutral
1955    43.3           El Nino
and so on.

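I want to build a new vector holding the median fertilizer value for each "phase change", that is, for each run of consecutive years in the same phase. For example, my first value would be the median of 52.5 and 65.2, because both fall in a La Nina phase. The next value would be the median of 50.0, 70.9, and 63.4, because those fall in a Neutral phase, and so on.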
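To make the target concrete, the first two expected values can be checked by hand from the sample rows above. This is only a small sketch; the variable names `la_nina_1950_51` and `neutral_1952_54` are made up for illustration.

```r
# First two expected values, computed directly from the sample rows
la_nina_1950_51 <- median(c(52.5, 65.2))        # 1950-1951, La Nina
neutral_1952_54 <- median(c(50.0, 70.9, 63.4))  # 1952-1954, Neutral

c(la_nina_1950_51, neutral_1952_54)
```

The desired vector would therefore start with 58.85 and 63.4, followed by one value for each later run of a phase.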

I tried using tidyverse code, as follows:

#data for ENSO
phase_coom = coom.az$ONI

#data for Nopt_coom
Nopt_coom <- coom.az$Nopt_fertN_kg_ha

#creating a test dataset
medians <- data.frame(phase_coom, Nopt_coom)

library(tidyverse)
medians %>%
  group_by(phase_coom) %>% 
  summarise(median = median(Nopt_coom))

#This works to give me:

"phase   median"
La Nina  74.5           
Neutral  86.0           
El Nino  78.0   
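But this is not what I want: it returns one median per phase rather than one per consecutive run of a phase. I'm an amateur at R, so at that point I was just trying to see whether it ran. Any help would be greatly appreciated!
Thanks

I think you are looking for something like this: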

#Input data
df <- data.frame(Year = c(1950, 1951, 1952, 1953, 1954, 1955, 1956, 1957, 1958, 1959, 1960, 1961, 1962, 1963, 1964), 
                  Nopt = c(52.5, 65.2, 50.0, 70.9, 63.4, 43.3, 22.1, 20.0, 84.5, 55.8, 60.0, 22.1, 30.5, 70.8, 55.2), 
                  phase = c("La Nina", "La Nina", "Neutral", "Neutral", "Neutral", "El Nino", "El Nino", "El Nino", "La Nina", "Neutral", "El Nino", "La Nina", "La Nina", "La Nina", "Neutral"), 
                  stringsAsFactors = FALSE)

df

#    Year Nopt   phase
# 1  1950 52.5 La Nina
# 2  1951 65.2 La Nina
# 3  1952 50.0 Neutral
# 4  1953 70.9 Neutral
# 5  1954 63.4 Neutral
# 6  1955 43.3 El Nino
# 7  1956 22.1 El Nino
# 8  1957 20.0 El Nino
# 9  1958 84.5 La Nina
# 10 1959 55.8 Neutral
# 11 1960 60.0 El Nino
# 12 1961 22.1 La Nina
# 13 1962 30.5 La Nina
# 14 1963 70.8 La Nina
# 15 1964 55.2 Neutral

#tmp df to hold phases and their medians
tmpdf <- data.frame(curphase = c(), curmedian = c())

#Initialization of loop variables
i <- 1
j <- 0

#Main loop
while ( i <= nrow(df) ) {
  curphase <- df$phase[i]
  j <- i
  # Check the bound first so df$phase[j] is never read past the last row
  while (j <= nrow(df) && df$phase[j] == curphase) {
    j <- j + 1
  }
  curmedian <- median(df$Nopt[i:(j-1)])
  i <- j
  tmpdf <- rbind(tmpdf, data.frame(curphase, curmedian))
}

tmpdf
#   curphase curmedian
# 1  La Nina     58.85
# 2  Neutral     63.40
# 3  El Nino     22.10
# 4  La Nina     84.50
# 5  Neutral     55.80
# 6  El Nino     60.00
# 7  La Nina     30.50
# 8  Neutral     55.20
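
The same result can probably be reached without an explicit loop. Below is a minimal sketch using base R's `rle()`, which finds runs of identical consecutive values. It assumes the same sample data as above; the names `runs`, `run_id`, and `run_medians` are illustrative and not part of the original answer.

```r
# Sample data copied from the answer above
df <- data.frame(
  Year  = 1950:1964,
  Nopt  = c(52.5, 65.2, 50.0, 70.9, 63.4, 43.3, 22.1, 20.0,
            84.5, 55.8, 60.0, 22.1, 30.5, 70.8, 55.2),
  phase = c("La Nina", "La Nina", "Neutral", "Neutral", "Neutral",
            "El Nino", "El Nino", "El Nino", "La Nina", "Neutral",
            "El Nino", "La Nina", "La Nina", "La Nina", "Neutral"),
  stringsAsFactors = FALSE
)

# rle() returns the length and value of each run of consecutive equal phases
runs <- rle(df$phase)

# Label every row with the index of the run it belongs to (1, 1, 2, 2, 2, ...)
run_id <- rep(seq_along(runs$lengths), times = runs$lengths)

# One median per run, kept in chronological order
run_medians <- data.frame(
  phase  = runs$values,
  median = as.numeric(tapply(df$Nopt, run_id, median)),
  stringsAsFactors = FALSE
)

run_medians
```

Because the grouping key is the run index rather than the phase label, a phase that recurs later (for example the second La Nina run in 1958) gets its own row instead of being merged with the earlier one.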
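I'm confused. You want the median Nopt value grouped by each phase, right? That is what your tidyverse code gives you. Also, "this is obviously not what I want." Maybe post the expected output, so people here can help you better?

Hi, sorry! Yes, I mean the output of the code is correct, but it is not what I am trying to achieve, which is to generate a new vector that holds the median fertilizer value for each phase change, as I explained in the earlier paragraph :) Thanks! Sorry for the confusion!

Yes! Thank you very much, this helped a lot.

Great, I just needed something that works!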
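
The point of confusion in this exchange is that `group_by(phase)` pools every year with the same label, while the goal is one group per consecutive run. The asker's tidyverse attempt could be adapted by grouping on a run identifier instead. The sketch below assumes the sample data from the answer and that dplyr 1.0.0 or later is installed (for the `.groups` argument); the column name `run` and the object `run_medians` are hypothetical.

```r
library(dplyr)

# Sample data copied from the answer
df <- data.frame(
  Year  = 1950:1964,
  Nopt  = c(52.5, 65.2, 50.0, 70.9, 63.4, 43.3, 22.1, 20.0,
            84.5, 55.8, 60.0, 22.1, 30.5, 70.8, 55.2),
  phase = c("La Nina", "La Nina", "Neutral", "Neutral", "Neutral",
            "El Nino", "El Nino", "El Nino", "La Nina", "Neutral",
            "El Nino", "La Nina", "La Nina", "La Nina", "Neutral"),
  stringsAsFactors = FALSE
)

run_medians <- df %>%
  # run goes up by 1 each time phase differs from the previous row
  mutate(run = cumsum(phase != lag(phase, default = first(phase))) + 1) %>%
  # grouping by run (phase is kept only as a label) separates repeated phases
  group_by(run, phase) %>%
  summarise(median = median(Nopt), .groups = "drop")

run_medians
```

In dplyr 1.1.0 and later, `consecutive_id(phase)` could replace the `cumsum()` expression. If only the vector of medians is needed, `run_medians$median` extracts it.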