R: logical check across rows with a varying column index
I have been developing R code for this particular task on a large dataset. A sample data frame looks like this:
mon abb Apr May Jun Jul Aug Sep Oct Nov
5 May 2 4 2 5 0 0 7 0
5 May 6 5 1 1 3 0 6 4
5 May 3 1 0 1 1 2 8 8
7 Jul 5 4 1 0 0 0 9 1
7 Jul 3 3 4 3 4 4 9 9
7 Jul 4 2 3 3 1 2 7 4
7 Jul 4 1 4 2 3 5 4 3
6 Jun 4 0 4 3 3 6 5 5
7 Jul 4 4 5 3 4 8 8 8
5 May 4 -1 6 4 4 9 5 4
7 Jul 4 -2 4 4 2 6 6 9
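For each row, take the month column whose name matches the value in column abb. The number in that cell is compared with the numbers in the columns that follow it, and a column Count is created holding how many of those later values are greater than it. I hope that is clear.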
Output would look like
mon abb Apr May Jun Jul Aug Sep Oct Nov Count
5 May 2 4 2 5 0 0 7 0 2
5 May 6 5 1 1 3 0 6 4 1
5 May 3 1 0 1 1 2 8 8 3
7 Jul 5 4 1 0 0 0 9 1 2
7 Jul 3 3 4 3 4 4 9 9 4
7 Jul 4 2 3 3 1 2 7 4 2
7 Jul 4 1 4 2 3 5 4 3 4
6 Jun 4 0 4 3 3 6 5 5 3
7 Jul 4 4 5 3 4 8 8 8 4
5 May 4 -1 6 4 4 9 5 4 6
7 Jul 4 -2 4 4 2 6 6 9 3
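I created the column index

conhead$b = (match(conhead[, conhead$monthabb], colnames(conhead[, 24:31])) + 23)

but could not get any further. Please share better logic.

Here is an option using tidyverse. Create a sequence column with rownames_to_column, gather the dataset into 'long' format and, after grouping by the sequence column ('rn'), slice the rows starting from the one where 'abb' equals 'key'. Then summarise by taking the sum of the logical expression (val[-1] > first(val)), i.e. count how many values are greater than the first element, which is where the match occurred, and bind the result as a column to the original dataset ('df1').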
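The steps described above can be sketched as follows. This is a minimal reconstruction from the description, not necessarily the exact pipeline the answer used; it assumes dplyr, tidyr and tibble are installed, that the data frame is called df1, and that every value of abb matches one of the month column names. Note that gather() still works but is superseded by pivot_longer() in current tidyr.

```r
library(dplyr)
library(tidyr)
library(tibble)  # provides rownames_to_column()

# Sample data from the question (the name df1 follows the answer's wording)
df1 <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
mon abb Apr May Jun Jul Aug Sep Oct Nov
5 May 2 4 2 5 0 0 7 0
5 May 6 5 1 1 3 0 6 4
5 May 3 1 0 1 1 2 8 8
7 Jul 5 4 1 0 0 0 9 1
7 Jul 3 3 4 3 4 4 9 9
7 Jul 4 2 3 3 1 2 7 4
7 Jul 4 1 4 2 3 5 4 3
6 Jun 4 0 4 3 3 6 5 5
7 Jul 4 4 5 3 4 8 8 8
5 May 4 -1 6 4 4 9 5 4
7 Jul 4 -2 4 4 2 6 6 9
")

counts <- df1 %>%
  rownames_to_column("rn") %>%                     # row id, stored as character
  gather(key, val, Apr:Nov) %>%                    # long format: one row per month cell
  group_by(rn) %>%
  slice(which(abb == key):n()) %>%                 # keep the matched month and everything after it
  summarise(Count = sum(val[-1] > first(val))) %>% # later values greater than the matched one
  arrange(as.integer(rn)) %>%                      # restore the original row order
  pull(Count)

df1$Count <- counts
df1$Count
#[1] 2 1 3 2 4 2 4 3 4 6 3
```

The arrange(as.integer(rn)) step matters because 'rn' is a character column, so the grouped summary comes back in the order "1", "10", "11", "2", and so on.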
In base R, we can use row/column indexing to extract the elements and then build a logical matrix to get the rowSums.
#column index position where the match occurs with 'abb' column and column names
i1 <- match(df1$abb, names(df1)[-(1:2)])
#replace the elements in each row before the match with NA
m1 <- replace(df1[-(1:2)], cbind(rep(seq_along(i1), i1-1), sequence(i1-1)), NA)
#extract the elements where the match occurred and compare them with 'm1'
df1$Count <- rowSums(m1 > df1[-(1:2)][cbind(1:nrow(df1), i1)], na.rm = TRUE)
df1$Count
#[1] 2 1 3 2 4 2 4 3 4 6 3
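To see what the vectorised version computes, the same count can be written one row at a time. This is only an illustrative cross-check under the same assumptions (a data frame named df1 with the month columns starting at position 3, and every abb value present among the column names); the helper name count_later_greater is made up for this sketch.

```r
# Sample data from the question
df1 <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
mon abb Apr May Jun Jul Aug Sep Oct Nov
5 May 2 4 2 5 0 0 7 0
5 May 6 5 1 1 3 0 6 4
5 May 3 1 0 1 1 2 8 8
7 Jul 5 4 1 0 0 0 9 1
7 Jul 3 3 4 3 4 4 9 9
7 Jul 4 2 3 3 1 2 7 4
7 Jul 4 1 4 2 3 5 4 3
6 Jun 4 0 4 3 3 6 5 5
7 Jul 4 4 5 3 4 8 8 8
5 May 4 -1 6 4 4 9 5 4
7 Jul 4 -2 4 4 2 6 6 9
")

month_cols <- names(df1)[-(1:2)]         # "Apr" ... "Nov"
vals  <- as.matrix(df1[month_cols])      # numeric matrix of the month values
start <- match(df1$abb, month_cols)      # column position of each row's month

# Hypothetical helper: for row r, count the values after the matched
# column that are strictly greater than the matched value.
count_later_greater <- function(r) {
  k <- start[r]
  later <- vals[r, seq_len(ncol(vals)) > k]
  sum(later > vals[r, k])
}

df1$Count <- vapply(seq_len(nrow(df1)), count_later_greater, integer(1))
df1$Count
#[1] 2 1 3 2 4 2 4 3 4 6 3
```

For the first row, abb is May and the May value is 4; of the later values 2, 5, 0, 0, 7, 0 only 5 and 7 are greater, so Count is 2. The loop is slower than the matrix-indexing approach on a large dataset, but it is handy for checking a few rows by hand.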
@AlbertRajan Have you loaded the dplyr and tidyr packages? You can just do
df1 %>% rownames_to_column('rn') %>% gather(key, val, Apr:Nov) %>% group_by(rn) %>% slice((which(abb == key)):n()) %>% summarise(Count = sum(val[-1] > first(val))) %>% arrange(as.integer(rn)) %>% pull(Count) %>% bind_cols(df1, Count = .)