How to identify the nth instance of a value in a data frame in R


I have a data frame, projection.hitters, made up of about 5,000 observations:

> head(projection.hitters)
                Name Positions  points PAR
223   Miguel Cabrera        3B 1007.97  NA
227       Mike Trout        OF  962.63  NA
160       Joey Votto        1B  863.27  NA
244 Paul Goldschmidt        1B  841.83  NA
256       Ryan Braun        OF  825.67  NA
28  Andrew McCutchen        OF  823.67  NA

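Suppose I want to find the nth best instance of Positions == "1B" by points, and assign that row a PAR value of 0. Every other 1B would then get a PAR value equal to the difference between its points value and that of the nth 1B, i.e. the number of points above or below the nth 1B, whose PAR is 0. For example, if the nth 1B is Paul Goldschmidt, his PAR == 0 and Joey Votto's PAR == 21.44, the difference between the two points values.

This would be done for each position.

Edit:

I need to identify a different nth instance for each position, for example the 12th best 1B and the 80th best at another position.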

How about using plyr:

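df <- read.table(header = TRUE, text = "Name Positions  points PAR
Miguel Cabrera        3B 1007.97  NA
Mike Trout        OF  962.63  NA
Joey Votto        1B  863.27  NA
Paul Goldschmidt        1B  841.83  NA
Ryan Braun        OF  825.67  NA
Andrew McCutchen        OF  823.67  NA")
# Note: each data row has one more field than the header, so read.table
# uses the first names as row names and Name holds the surname only.

n <- 1  # which instance (row) within each position to use as the baseline

library(plyr)
# points[n] is the nth row in existing order, so this assumes the rows are
# already sorted by points (descending) within each position
ddply(df, .(Positions), mutate, PAR = abs(points - points[n]))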

#         Name Positions  points    PAR
#1       Votto        1B  863.27   0.00
#2 Goldschmidt        1B  841.83  21.44
#3     Cabrera        3B 1007.97   0.00
#4       Trout        OF  962.63   0.00
#5       Braun        OF  825.67 136.96
#6   McCutchen        OF  823.67 138.96
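
The plyr call above takes the nth row in whatever order the data arrives and drops the sign with abs(). As a minimal base R sketch of the definition in the question (nth best by points, signed difference), assuming a single n shared by all positions: the helper name par_vs_nth is hypothetical, and the data frame is re-created by hand from the six rows shown.

```r
# Example data re-created from the six rows shown in the question
df <- data.frame(
  Name      = c("Cabrera", "Trout", "Votto", "Goldschmidt", "Braun", "McCutchen"),
  Positions = c("3B", "OF", "1B", "1B", "OF", "OF"),
  points    = c(1007.97, 962.63, 863.27, 841.83, 825.67, 823.67),
  stringsAsFactors = FALSE
)

n <- 2  # baseline: the 2nd best player at each position

# Hypothetical helper: signed difference from the nth largest value of x.
# If a group has fewer than n players, sort(...)[n] is NA and so is the result.
par_vs_nth <- function(x, n) x - sort(x, decreasing = TRUE)[n]

# ave() applies the helper within each position and keeps the original row order
df$PAR <- ave(df$points, df$Positions, FUN = function(x) par_vs_nth(x, n))
df
```

With n = 2, Goldschmidt is the baseline 1B (PAR 0) and Votto gets 21.44, matching the example in the question. 3B has only one player here, so its PAR is NA.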

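This is easier to do in dplyr, since it provides the nth() function to extract the nth value (in original row order, or ordered by another variable).

This solves part of the problem, but how do I specify the nth best value?

Oh, rats: I missed this. Do you want the nth row for each position, or the value with the nth highest (or lowest) points for each position? Your title and text are ambiguous.

Sorry for the ambiguity. I am looking for the nth row for each position, but n will vary by position. I added an edit to try to make this clearer.

But your edit says "12th best ...", while your comment says "nth row for the position". Unless your data frame is already sorted by points, those are not the same thing.

This is almost what I am looking for, thanks. What if I want to specify a different "n" for each position? For example, say I want the 12th best 1B, as in the updated example, but the 80th best at another position.

Just make n a vector: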
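# n per position as a named vector: 1st 1B, 1st 3B, 2nd OF
n <- c("1B" = 1, "3B" = 1, "OF" = 2)

library(plyr)
# Look up each group's n by position name; as.character() makes the lookup
# work whether Positions is a factor or a character column
ddply(df, .(Positions), mutate,
      PAR = abs(points - points[n[as.character(Positions[1])]]))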

Name Positions  points    PAR
1       Votto        1B  863.27   0.00
2 Goldschmidt        1B  841.83  21.44
3     Cabrera        3B 1007.97   0.00
4       Trout        OF  962.63 136.96
5       Braun        OF  825.67   0.00
6   McCutchen        OF  823.67   2.00    
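
For the "nth best" reading with a different n per position, here is a hedged base R sketch that does not rely on the rows being pre-sorted and keeps the sign of the difference. The name n_by_pos and its values are illustrative assumptions, and the data frame is re-created by hand from the six example rows.

```r
# Example data re-created from the six rows shown in the question
df <- data.frame(
  Name      = c("Cabrera", "Trout", "Votto", "Goldschmidt", "Braun", "McCutchen"),
  Positions = c("3B", "OF", "1B", "1B", "OF", "OF"),
  points    = c(1007.97, 962.63, 863.27, 841.83, 825.67, 823.67),
  stringsAsFactors = FALSE
)

# Assumed baselines: 2nd best 1B, 1st best 3B, 3rd best OF
n_by_pos <- c("1B" = 2, "3B" = 1, "OF" = 3)

# Split points by position; the list names are the position labels
groups <- split(df$points, df$Positions)

# For each position, subtract that position's nth best points value
par <- Map(
  function(x, pos) x - sort(x, decreasing = TRUE)[n_by_pos[[pos]]],
  groups, names(groups)
)

# Put the per-position results back in the original row order
df$PAR <- unsplit(par, df$Positions)
df
```

A position missing from n_by_pos raises a "subscript out of bounds" error here, which is arguably safer than silently producing NA.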

transform(projection.hitters, PAR = ave(points, Positions, 
                                        FUN = function(x) x - min(x)))

                Name Positions  points    PAR
223   Miguel Cabrera        3B 1007.97   0.00
227       Mike Trout        OF  962.63 138.96
160       Joey Votto        1B  863.27  21.44
244 Paul Goldschmidt        1B  841.83   0.00
256       Ryan Braun        OF  825.67   2.00
28  Andrew McCutchen        OF  823.67   0.00

df <- read.csv(text =
"name,position,points
Miguel Cabrera,3B,1007.97
Mike Trout,OF,962.63
Joey Votto,1B,863.27
Paul Goldschmidt,1B,841.83
Ryan Braun,OF,825.67
Andrew McCutchen,OF,823.67", stringsAsFactors = FALSE)

library(dplyr)
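# %.% was removed from dplyr; %>% is the current pipe
df %>%
  group_by(position) %>%
  mutate(
    # order_by = points ranks ascending, so n = 1 is the lowest points value;
    # use order_by = desc(points) to count from the best instead
    offset = nth(points, 1, order_by = points),
    delta = points - offset
  )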

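# Per-position n; named n_by_pos rather than nth to avoid shadowing dplyr::nth()
n_by_pos <- c("OF" = 3, "3B" = 1, "1B" = 2)
df %>%
  group_by(position) %>%
  mutate(
    pos = unname(n_by_pos[position]),  # this row's n, looked up by position
    offset = nth(points, pos[1], order_by = points),
    delta = points - offset
  )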