Selecting a percentage of rows in a Dataframe (R, Dataframe, Percentage)

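Is there a function that selects rows of a data frame by percentage?

For example, a function that selects the first 25% of the rows, then the next 50%, and finally the last 25%.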

head(x, n) and tail(x, n) return the first and last n rows of a data frame x. Setting n = ceiling(nrow(x) * percent / 100) gives the first and last percentage of the rows:
# first percent of a dataframe
head_percent <- function(x, percent) {
   # need validation of input x and percent!!
   head(x, ceiling(nrow(x) * percent / 100))
}

# last percent of a dataframe
tail_percent <- function(x, percent) {
   # need validation of input x and percent!! 
   tail(x, ceiling( nrow(x)*percent/100)) 
}
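The comments in the helpers above ask for input validation. One possible way to add it is sketched below; the names check_percent_args, head_percent_checked and tail_percent_checked are hypothetical, and the accepted range of 0 to 100 is an assumption about what "percent" should mean here.

```r
# Hypothetical validated variants of the helpers above (names are illustrative).
check_percent_args <- function(x, percent) {
  # Assumption: percent is a single number between 0 and 100.
  stopifnot(
    is.data.frame(x),
    is.numeric(percent),
    length(percent) == 1,
    !is.na(percent),
    percent >= 0,
    percent <= 100
  )
}

head_percent_checked <- function(x, percent) {
  check_percent_args(x, percent)
  head(x, ceiling(nrow(x) * percent / 100))
}

tail_percent_checked <- function(x, percent) {
  check_percent_args(x, percent)
  tail(x, ceiling(nrow(x) * percent / 100))
}

nrow(head_percent_checked(iris, 25))  # ceiling(150 * 0.25) = 38 rows
nrow(tail_percent_checked(iris, 25))  # also 38 rows
```

Because both helpers round up, the first 25% and the last 75% of iris would together hold more than 150 rows, so they are best used one at a time rather than to partition a data frame.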

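In practice you cannot always get the slices with absolute precision, because the number of rows to select must be an integer. So, if you are not careful, it is common to end up with some rows repeated across the sets.
To get the first 25% of the rows of a dataset, say iris, you can do the following: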
iris[1:as.integer(nrow(iris)*0.25),]

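The as.integer function keeps only the integer part of the floating-point number, so a slice may fall slightly short of the exact percentage, but truncating the boundary the same way each time ensures that no rows are repeated when you select the next 50%.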
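To make the non-overlap concrete, here is a minimal base R sketch that computes all the truncated boundaries once and then cuts the data frame at them. The helper split_by_cuts and its fractions argument are hypothetical, and the sketch assumes the fractions are sorted and lie between 0 and 1.

```r
# split_by_cuts is a hypothetical helper, not part of base R.
# Assumption: fractions are sorted cumulative shares between 0 and 1.
split_by_cuts <- function(x, fractions = c(0.25, 0.75)) {
  n <- nrow(x)
  # Truncated cut points; 0 and n are added as the outer boundaries.
  cuts <- c(0L, as.integer(n * fractions), n)
  lapply(seq_len(length(cuts) - 1), function(i) {
    # seq_len() handles an empty slice safely, unlike (lo + 1):hi.
    x[cuts[i] + seq_len(cuts[i + 1] - cuts[i]), , drop = FALSE]
  })
}

parts <- split_by_cuts(iris)
sapply(parts, nrow)  # 37 75 38 for the 150-row iris data
```

Each slice starts one row after the previous boundary, so every row ends up in exactly one part, and the last part absorbs the rows lost to truncation.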

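If you are using the tidyverse, you can do the following to get the first 25% and the next 50%:
library(dplyr)

# first 25% of the rows
iris %>% 
  mutate(pctrow = row_number() / n()) %>% 
  filter(pctrow <= 0.25)

# next 50% of the rows; the upper bounds use <= so that a row landing
# exactly on 0.25 or 0.75 is not dropped from every selection
iris %>% 
  mutate(pctrow = row_number() / n()) %>% 
  filter(pctrow > 0.25, pctrow <= 0.75)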
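If all three segments are needed at once, the pctrow column can also be turned into a label in a single pass instead of filtering repeatedly. The sketch below is one way to do that with base R's cut() inside mutate(); the labels first_25, middle_50 and last_25 are illustrative names, not something from the answer above.

```r
library(dplyr)

labelled <- iris %>%
  mutate(
    pctrow = row_number() / n(),
    # cut() uses right-closed intervals by default:
    # (0, 0.25], (0.25, 0.75], (0.75, 1]
    part = cut(pctrow,
               breaks = c(0, 0.25, 0.75, 1),
               labels = c("first_25", "middle_50", "last_25"))
  )

# Number of rows that fell into each segment
count(labelled, part)
```

Since the intervals are right-closed and adjacent, each row gets exactly one label, and a single segment can then be pulled out with filter(part == "middle_50").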

It would be easier to help if you created a small reproducible example along with the expected output.
In dplyr, you can use slice and nrow():
library(dplyr)

# the first 25%:
iris %>%
  slice(1:round(nrow(.)/4,0))

# the first 50%:
iris %>%
  slice(1:round(nrow(.)/2,0))

# the first 75%:
iris %>%
  slice(1:round(nrow(.)/4*3,0))

# the middle 50% (i.e., after the first 25% and before the last 25%).
# The slice starts one row after the first 25% so the two selections do not overlap;
# r_num is only there so you can check that you sliced the right rows:
iris %>%
  mutate(r_num = row_number()) %>% 
  slice((round(nrow(.)/4,0) + 1):round(nrow(.)/4*3,0))