Select half of the rows for each given value in a dataframe column


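I want to select half of the rows of a dataframe for each given value in one of its columns. In other words, from the data frame below, I need to extract half of the rows for each value in column Y: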

DF:
 id1  column Y   value
9830     A         6 
7609     A         0 
9925     B         0 
9922     B         5 
9916     B         6
9917     B         8 
9914     C         2
9914     C         7
9914     C         7
9914     C         2
9914     C         9

The new data frame should look like this:

  NEW DF:
     id1  column Y   value
    9830     A         6 
    9925     B         0 
    9922     B         5 
    9914     C         2
    9914     C         7

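It would also be helpful to know a solution that selects a random half of the rows of dataframe DF for each value in column Y (i.e. not just the first 50%).

Thanks for your help.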

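Assuming you want the first half of the rows in each group that shares the same value in column Y, rounding down for groups with an odd number of rows, we can use filter from dplyr: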

library(dplyr)
df %>% group_by(`column Y`) %>% filter(row_number() <= floor(n()/2))
##Source: local data frame [5 x 3]
##Groups: column Y [3]
##
##    id1 column Y laclen
##  <int>   <fctr>  <int>
##1  9830        A      6
##2  9925        B      0
##3  9922        B      5
##4  9914        C      2
##5  9914        C      7
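For readers without dplyr, the same "first half of each group" selection can be sketched in base R with ave(). This is only an illustration, not the answerer's method: the data frame is rebuilt from the question's table, and the column names `Y` and `value` are assumptions chosen to avoid the space in `column Y`.

```r
# Rebuild the example data from the question (column names are assumptions).
df <- data.frame(
  id1   = c(9830, 7609, 9925, 9922, 9916, 9917, 9914, 9914, 9914, 9914, 9914),
  Y     = c("A", "A", "B", "B", "B", "B", "C", "C", "C", "C", "C"),
  value = c(6, 0, 0, 5, 6, 8, 2, 7, 7, 2, 9)
)

# Position of each row within its Y group, and the size of that group.
pos_in_group <- ave(seq_along(df$Y), df$Y, FUN = seq_along)
group_size   <- ave(seq_along(df$Y), df$Y, FUN = length)

# Keep the first floor(n / 2) rows of each group (odd group sizes round down).
first_half <- df[pos_in_group <= floor(group_size / 2), ]
print(first_half)
```

As with the dplyr version, a group containing a single row contributes no rows, because floor(1 / 2) is 0.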

Great, thank you! Do you know how to select a random 50% of the rows, rather than just the first 50%?

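set.seed(123)  # fix the seed so the random selection is reproducible
# within each group, keep a random floor(n/2) of the row positions
result <- df %>% group_by(`column Y`) %>% filter(row_number() %in% sample(seq_len(n()),floor(n()/2)))
result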
##Source: local data frame [5 x 3]
##Groups: column Y [3]
##
##    id1 column Y laclen
##  <int>   <fctr>  <int>
##1  9830        A      6
##2  9922        B      5
##3  9917        B      8
##4  9914        C      2
##5  9914        C      9
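The random variant can also be sketched in base R by splitting the row indices by group and sampling within each group. Again this is a hedged illustration rather than the answerer's code: the data frame and the column names `Y` and `value` are assumptions taken from the question's table, and the rows actually drawn depend on the seed and the R version.

```r
# Rebuild the example data from the question (column names are assumptions).
df <- data.frame(
  id1   = c(9830, 7609, 9925, 9922, 9916, 9917, 9914, 9914, 9914, 9914, 9914),
  Y     = c("A", "A", "B", "B", "B", "B", "C", "C", "C", "C", "C"),
  value = c(6, 0, 0, 5, 6, 8, 2, 7, 7, 2, 9)
)

set.seed(123)  # make the random draw reproducible

# Split the row numbers by group, then draw floor(n / 2) of them per group.
# Indexing with sample.int() avoids the sample() pitfall where a single
# number x is treated as the range 1:x.
rows_by_group <- split(seq_len(nrow(df)), df$Y)
keep <- unlist(lapply(rows_by_group, function(idx) {
  idx[sample.int(length(idx), floor(length(idx) / 2))]
}))

# Sort the kept indices so the result preserves the original row order.
random_half <- df[sort(keep), ]
print(random_half)
```

Each group contributes floor(n / 2) rows, so with this data the result has one row from A, two from B and two from C, whichever rows happen to be drawn.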