R: Summary statistics of a numeric column in a data frame, grouped by runs of a cycling variable?
I have a data frame called coom.az with a column of "tons of crop fertiliser per hectare" (Nopt), ordered by year. In the same data frame I added a column, also tied to the year column, that shows the specific ENSO phase that occurred at that time. For example:
Year Nopt phase
1950 52.5 La Nina
1951 65.2 La Nina
1952 50.0 Neutral
1953 70.9 Neutral
1954 63.4 Neutral
1955 43.3 El Nino
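and so on.
I want to make a new vector that holds the median fertiliser value for each "phase change". For example, my first value would be the median of 52.5 and 65.2, because both occurred during a La Nina phase. The next value would be the median of 50.0, 70.9 and 63.4, because they fall in a Neutral phase, and so on.
I tried using tidyverse code, as follows: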
#data for ENSO
phase_coom <- coom.az$ONI
#data for Nopt_coom
Nopt_coom <- coom.az$Nopt_fertN_kg_ha
#creating a test dataset
medians <- data.frame(phase_coom, Nopt_coom)
library(tidyverse)
medians %>%
  group_by(phase_coom) %>%
  summarise(median = median(Nopt_coom))
#This works to give me:
# phase   median
# La Nina 74.5
# Neutral 86.0
# El Nino 78.0
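But this code does not do what I want. I am an amateur with R, so at that point I just wanted to see whether it ran. Any help would be greatly appreciated!
Thanks!

I think you are looking for something like this: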
#Input data
df <- data.frame(Year = c(1950, 1951, 1952, 1953, 1954, 1955, 1956, 1957, 1958, 1959, 1960, 1961, 1962, 1963, 1964),
Nopt = c(52.5, 65.2, 50.0, 70.9, 63.4, 43.3, 22.1, 20.0, 84.5, 55.8, 60.0, 22.1, 30.5, 70.8, 55.2),
phase = c("La Nina", "La Nina", "Neutral", "Neutral", "Neutral", "El Nino", "El Nino", "El Nino", "La Nina", "Neutral", "El Nino", "La Nina", "La Nina", "La Nina", "Neutral"),
stringsAsFactors = FALSE)
df
# Year Nopt phase
# 1 1950 52.5 La Nina
# 2 1951 65.2 La Nina
# 3 1952 50.0 Neutral
# 4 1953 70.9 Neutral
# 5 1954 63.4 Neutral
# 6 1955 43.3 El Nino
# 7 1956 22.1 El Nino
# 8 1957 20.0 El Nino
# 9 1958 84.5 La Nina
# 10 1959 55.8 Neutral
# 11 1960 60.0 El Nino
# 12 1961 22.1 La Nina
# 13 1962 30.5 La Nina
# 14 1963 70.8 La Nina
# 15 1964 55.2 Neutral
#tmp df to hold phases and their medians
tmpdf <- data.frame(curphase = c(), curmedian = c())
#Initialization of loop variables
i <- 1
j <- 0
#Main loop
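while (i <= nrow(df)) {
  curphase <- df$phase[i]
  j <- i
  #Advance j to the first row after the current run of identical phases.
  #The bounds check comes first and uses && so df$phase[j] is never read past the last row.
  while (j <= nrow(df) && df$phase[j] == curphase) {
    j <- j + 1
  }
  #Median of Nopt over rows i..(j-1), i.e. one consecutive run of the same phase
  curmedian <- median(df$Nopt[i:(j - 1)])
  i <- j
  tmpdf <- rbind(tmpdf, data.frame(curphase, curmedian))
}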
tmpdf
# curphase curmedian
# 1 La Nina 58.85
# 2 Neutral 63.40
# 3 El Nino 22.10
# 4 La Nina 84.50
# 5 Neutral 55.80
# 6 El Nino 60.00
# 7 La Nina 30.50
# 8 Neutral 55.20
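
The loop above groups consecutive rows that share a phase. As an alternative, here is a minimal sketch of the same idea in base R using rle(), which run-length encodes a vector. The names run_id and run_medians are my own, and the toy data is the same as in the answer, not the real coom.az.

```r
# Same toy data as in the answer (not the asker's real coom.az)
df <- data.frame(
  Year = 1950:1964,
  Nopt = c(52.5, 65.2, 50.0, 70.9, 63.4, 43.3, 22.1, 20.0,
           84.5, 55.8, 60.0, 22.1, 30.5, 70.8, 55.2),
  phase = c("La Nina", "La Nina", "Neutral", "Neutral", "Neutral",
            "El Nino", "El Nino", "El Nino", "La Nina", "Neutral",
            "El Nino", "La Nina", "La Nina", "La Nina", "Neutral"),
  stringsAsFactors = FALSE
)

# rle() returns the length and value of each run of identical consecutive phases
runs <- rle(df$phase)

# Label every row with the index of the run it belongs to (1, 1, 2, 2, 2, 3, ...)
run_id <- rep(seq_along(runs$lengths), times = runs$lengths)

# One median per run, in the original (chronological) order
run_medians <- data.frame(
  phase = runs$values,
  median_Nopt = as.vector(tapply(df$Nopt, run_id, median)),
  stringsAsFactors = FALSE
)
run_medians
```

This avoids growing a data frame with rbind() inside a loop, which may matter for longer series, and run_medians$median_Nopt is the plain vector asked for in the question.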
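I'm confused. You want the median Nopt value for each phase group, right? That is what your tidyverse code gives you. Also, regarding "this is clearly not what I want": maybe post the expected output, so that people here can help you better?

Hi, sorry! Yes, I meant that the output of the code is correct, but it is not what I am trying to achieve, which is to generate a new vector holding the median fertiliser value for each phase change, as I explained in the earlier paragraph :) Thanks! Sorry for the confusion!

Yes! Thank you very much, this was a big help.

Great, I just needed something that works!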
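
To make the distinction in these comments concrete: group_by(phase) pools every year of a phase together, whereas the question asks for one median per consecutive run of a phase. A minimal dplyr sketch of the run-based version follows, assuming dplyr 1.0.0 or later for the .groups argument. The run_id column and the run_medians_dplyr name are illustrative, and the data is a toy sample, not the real coom.az.

```r
library(dplyr)

# Toy data in the same shape as the question's table
df <- data.frame(
  Year = 1950:1964,
  Nopt = c(52.5, 65.2, 50.0, 70.9, 63.4, 43.3, 22.1, 20.0,
           84.5, 55.8, 60.0, 22.1, 30.5, 70.8, 55.2),
  phase = c("La Nina", "La Nina", "Neutral", "Neutral", "Neutral",
            "El Nino", "El Nino", "El Nino", "La Nina", "Neutral",
            "El Nino", "La Nina", "La Nina", "La Nina", "Neutral"),
  stringsAsFactors = FALSE
)

run_medians_dplyr <- df %>%
  # run_id increases by one every time the phase differs from the previous row
  mutate(run_id = cumsum(phase != lag(phase, default = first(phase)))) %>%
  # Grouping by run_id (not just phase) keeps separate La Nina episodes apart
  group_by(run_id, phase) %>%
  summarise(median_Nopt = median(Nopt), .groups = "drop")

run_medians_dplyr
```

Grouping by phase alone would return three rows here. Grouping by run_id returns one row per episode, in chronological order.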