R: Subsetting a data frame based on a non-cumulative row sum

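I want to subset a data frame in R based on a non-cumulative row sum plus further conditions.

For example, I have the following data frame: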

x<-data.frame(x1=c(1,2,3,4,5,6,7,8,9),x2=c(70,1,6,23,98,21,45,8,6))
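I would like a result in which the sum of the x2 values is less than 60 and the x1 values are greater than 2.

Because the solution is dynamic, another possible result could be: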

  x1 x2
7  7 45
8  8  8
9  9  6
Or:

  x1 x2
3  3  6

Once I understand how to implement this, I will narrow the set of possible solutions by adding more conditions.
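To make the condition concrete, here is a minimal sketch of a check for whether a given set of rows qualifies. The helper name `is_valid_subset` and its arguments are hypothetical, introduced only for illustration; the thresholds 2 and 60 come from the example above.

```r
# Example data from the question
x <- data.frame(x1 = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
                x2 = c(70, 1, 6, 23, 98, 21, 45, 8, 6))

# Hypothetical helper: TRUE if every selected row has x1 > x1_value
# and the x2 values of the selected rows sum to less than x2_thresh
is_valid_subset <- function(df, rows, x1_value, x2_thresh) {
  sel <- df[rows, ]
  all(sel$x1 > x1_value) && sum(sel$x2) < x2_thresh
}

is_valid_subset(x, c(7, 8, 9), 2, 60)  # TRUE:  45 + 8 + 6 = 59
is_valid_subset(x, c(3, 4, 5), 2, 60)  # FALSE: 6 + 23 + 98 = 127
is_valid_subset(x, 1, 2, 60)           # FALSE: x1 = 1 is not greater than 2
```

The selected rows do not have to be adjacent, which is why many different subsets can satisfy the same condition.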

Edit for Ronak Shah

With an additional column x3, the data frame x becomes:

x<-data.frame(x1=c(1,2,3,4,5,6,7,8,9),x2=c(70,1,6,23,98,21,45,8,6),x3=c(13,2,31,45,5,6,7,18,0))

We can write a function to subset the data frame:

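subset_df_row <- function(x, x1_value, x2_thresh) {
    # Keep only the rows whose x1 exceeds x1_value
    df1 <- x[x$x1 > x1_value, ]
    # Return an empty result if no remaining row has x2 below the threshold;
    # this guard also keeps the loop below from running forever
    if (!any(df1$x2 < x2_thresh)) return(df1[0, ])
    # Shuffle the rows to get a random result
    df1 <- df1[sample(seq_len(nrow(df1))), ]
    # If the first x2 value already reaches the threshold, shuffle again
    while (df1$x2[1] >= x2_thresh) {
      df1 <- df1[sample(seq_len(nrow(df1))), ]
    }
    # Positions at which the running x2 total reaches the threshold
    hit <- which(cumsum(df1$x2) >= x2_thresh)
    # Return the rows before the first such position,
    # or every row if the threshold is never reached
    n <- if (length(hit) > 0) hit[1] - 1 else nrow(df1)
    df1[seq_len(n), ]
}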
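The cutoff step can be illustrated on its own. The sketch below assumes one particular shuffle order (rows 6, 8, 7, 9 of the example data, chosen by hand purely for illustration) and shows how the running total determines which leading rows are kept.

```r
# x2 values of one hypothetical shuffle order (rows 6, 8, 7, 9)
x2_shuffled <- c(21, 8, 45, 6)
x2_thresh <- 60

# Running total over the shuffled rows: 21 29 74 80
running_total <- cumsum(x2_shuffled)

# First position where the running total reaches the threshold: 3
first_hit <- which(running_total >= x2_thresh)[1]

# Keep everything before that position: 21 8
kept <- x2_shuffled[seq_len(first_hit - 1)]
```

The sum is "non-cumulative" only from the point of view of the original row order: the running total is taken over the shuffled rows. Note that `which(...)[1]` is `NA` when the threshold is never reached, so that case needs separate handling.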

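If you restrict yourself to a "window" of a specific size (n), could you use a rolling sum and extract all subsets of length n?

Good idea! Thanks. Suppose the data frame x now has a third column x3, to which I want to apply a non-cumulative sum condition as I did for x2. Should I add a second while loop, or can I integrate x2 and x3 into the same while loop that shuffles df1?

How would your last line change with the additional condition on x3?

For simplicity I modified your solution; please correct it if it can be improved further.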
x<-data.frame(x1=c(1,2,3,4,5,6,7,8,9),x2=c(70,1,6,23,98,21,45,8,6),x3=c(13,2,31,45,5,6,7,18,0))
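subset_df_row <- function(x, x1_value, x2_thresh, x3_thresh) {
  # Keep only the rows whose x1 exceeds x1_value
  df1 <- x[x$x1 > x1_value, ]
  # Return an empty result if no row can start the subset;
  # this guard also keeps the loop below from running forever
  if (!any(df1$x2 < x2_thresh & df1$x3 < x3_thresh)) return(df1[0, ])
  # Shuffle the rows to get a random result
  df1 <- df1[sample(seq_len(nrow(df1))), ]
  # If the first x2 or x3 value already reaches its threshold, shuffle again
  while (df1$x2[1] >= x2_thresh || df1$x3[1] >= x3_thresh) {
    df1 <- df1[sample(seq_len(nrow(df1))), ]
  }
  # Positions at which either running total reaches its threshold
  hit <- which(cumsum(df1$x2) >= x2_thresh | cumsum(df1$x3) >= x3_thresh)
  # Return the rows before the first such position,
  # or every row if neither threshold is reached
  n <- if (length(hit) > 0) hit[1] - 1 else nrow(df1)
  df1[seq_len(n), ]
}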
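The rolling-window suggestion from the comments could be sketched as follows in base R. This is only one possible reading of that suggestion: it assumes a window size of n = 3 and considers only runs of consecutive rows, so it finds fewer subsets than the shuffle-based function. The variable names are illustrative.

```r
# Example data from the question
x <- data.frame(x1 = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
                x2 = c(70, 1, 6, 23, 98, 21, 45, 8, 6))

n <- 3            # assumed window size
x2_thresh <- 60

# Rows that satisfy the x1 condition
cand <- x[x$x1 > 2, ]

# Rolling sums of x2 over n consecutive rows, via differences of cumulative sums
cs <- c(0, cumsum(cand$x2))
starts <- seq_len(max(nrow(cand) - n + 1, 0))
window_sums <- cs[starts + n] - cs[starts]   # 127 142 164 74 59

# Extract every window whose sum stays below the threshold
ok <- starts[window_sums < x2_thresh]
windows <- lapply(ok, function(s) cand[s:(s + n - 1), ])
```

For this data only the last window qualifies, which gives the rows with x1 equal to 7, 8 and 9.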
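Example calls of the three-argument version on the original two-column x (the rows are shuffled, so the output differs between runs):

subset_df_row(x, 2, 60)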
#  x1 x2
#6  6 21
#8  8  8

subset_df_row(x, 3, 160)
#  x1 x2
#8  8  8
#5  5 98
#4  4 23
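As a possible variant, the re-shuffle loop can be avoided by discarding in advance every row whose x2 alone reaches the threshold. The sketch below assumes that x2 is never negative, so the running total never decreases; the function name `random_subset` is hypothetical and not part of the answer above.

```r
# Example data from the question
x <- data.frame(x1 = c(1, 2, 3, 4, 5, 6, 7, 8, 9),
                x2 = c(70, 1, 6, 23, 98, 21, 45, 8, 6))

# Hypothetical loop-free variant (assumes x2 >= 0)
random_subset <- function(df, x1_value, x2_thresh) {
  # Keep rows that pass the x1 condition and could fit under the threshold alone
  cand <- df[df$x1 > x1_value & df$x2 < x2_thresh, ]
  # Shuffle the candidate rows
  cand <- cand[sample(seq_len(nrow(cand))), ]
  # Keep the leading rows whose running x2 total stays below the threshold
  cand[cumsum(cand$x2) < x2_thresh, ]
}

set.seed(42)                 # fix the seed so a run can be repeated
res <- random_subset(x, 2, 60)
all(res$x1 > 2)              # TRUE for any shuffle
sum(res$x2) < 60             # TRUE for any shuffle
```

Calling `set.seed()` before the function makes a particular random subset reproducible, which helps when comparing results after adding further conditions.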