Grouping a variable in Python

python r statistics


I have this code in R, but I want to do the same thing in Python, and I don't know how to recode the values of a variable in the DataFrame.

ingresos <- sample(0:100,15,replace = T)

sexo <- sample(0:1,15,replace=T)

base2 <- data.frame(ingresos, sexo)

base2$grupo[as.numeric(base2$ingresos) >= 0 &
              as.numeric(base2$ingresos)<=29] <- 1

base2$grupo[as.numeric(base2$ingresos) >= 30 &
              as.numeric(base2$ingresos)<=49] <- 2

base2$grupo[as.numeric(base2$ingresos) >= 50 &
              as.numeric(base2$ingresos)<=69] <- 3

base2$grupo[as.numeric(base2$ingresos) >= 70] <- 4

base2

Thanks for your time. In R, you can do:

base2$bins = cut(base2$ingresos,breaks=c(0,30,50,70,+Inf),
include.lowest=TRUE,right=FALSE)

   ingresos sexo     bins
1        38    0  [30,50)
2        98    0 [70,Inf]
3        17    1   [0,30)
4        76    1 [70,Inf]
5        54    0  [50,70)
6        91    1 [70,Inf]
7         4    0   [0,30)
8        68    0  [50,70)
9         9    0   [0,30)
10       32    0  [30,50)
11       13    0   [0,30)
12       64    1  [50,70)
13       35    1  [30,50)
14       44    0  [30,50)
15       63    0  [50,70)

In pandas, you can do the following:

import numpy as np
import pandas as pd
base2['bins'] = pd.cut(base2['Ingreso'],
bins=[0, 30, 50, 70, np.inf], include_lowest=True, right=False)  # np.Inf was removed in NumPy 2.0


    Ingreso  Grupo          bins
Id
1        57      1  [50.0, 70.0)
2        71      2   [70.0, inf)
3        25      3   [0.0, 30.0)
4        45      4  [30.0, 50.0)
5        26      5   [0.0, 30.0)
6         1      6   [0.0, 30.0)
7        51      7  [50.0, 70.0)
8        39      8  [30.0, 50.0)
9        67      9  [50.0, 70.0)
10       78     10   [70.0, inf)
11       58     11  [50.0, 70.0)
12       27     12   [0.0, 30.0)
13       48     13  [30.0, 50.0)
14       75     14   [70.0, inf)
15       22     15   [0.0, 30.0)
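
The question asks for group codes 1 to 4 rather than interval labels. A minimal sketch of how that might look, assuming the column names from the R code (ingresos, sexo, grupo) and randomly generated data; the seed and the use of np.random.default_rng are illustrative choices, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Hypothetical data mirroring the R example: 15 incomes in 0..100 and a 0/1 sex flag.
rng = np.random.default_rng(0)  # the seed is an arbitrary choice for reproducibility
base2 = pd.DataFrame({
    "ingresos": rng.integers(0, 101, size=15),  # upper bound is exclusive
    "sexo": rng.integers(0, 2, size=15),
})

# right=False makes each bin closed on the left: [0, 30), [30, 50), [50, 70), [70, inf).
# labels=[1, 2, 3, 4] replaces the interval labels with the group codes from the question.
base2["grupo"] = pd.cut(
    base2["ingresos"],
    bins=[0, 30, 50, 70, np.inf],
    labels=[1, 2, 3, 4],
    right=False,
).astype(int)

print(base2)
```

pd.cut returns a categorical column when labels are given; the astype(int) call turns it into plain integers, which only works here because every income falls inside one of the bins.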

In R, you can use cut to get these groups. It may also help to search along the lines of "create classes continuous variable python" to find an answer.
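
If a condition-by-condition translation of the R code is preferred over binning, something like the following sketch could work. It assumes integer incomes and uses np.select; the sample values and the default of 0 for unmatched rows are illustrative assumptions:

```python
import numpy as np
import pandas as pd

# Small hypothetical sample covering the edges of each range.
base2 = pd.DataFrame({"ingresos": [0, 29, 30, 49, 50, 69, 70, 100]})

ingresos = base2["ingresos"]
conditions = [
    ingresos.between(0, 29),   # between() is inclusive on both ends by default
    ingresos.between(30, 49),
    ingresos.between(50, 69),
    ingresos >= 70,
]
# np.select takes the first matching choice per row; rows matching nothing get the default.
base2["grupo"] = np.select(conditions, [1, 2, 3, 4], default=0)

print(base2["grupo"].tolist())  # [1, 1, 2, 2, 3, 3, 4, 4]
```

Like the R original, these ranges leave gaps for fractional values such as 29.5, which would receive the default here; the cut-based approach avoids that because its bins are contiguous.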
