Python: dropping the duplicates of one column when several columns are duplicated

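I have a DataFrame with several duplicated columns, but I want to remove only the duplicates of the 'class' column while leaving the other duplicated columns intact. As the data below shows, many columns are duplicated. I only want to drop the extra 'class' columns ('class.1', 'class.2') and keep a single copy. All other columns should stay as they are, and the number of rows should not change.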

DataFrame:

import pandas as pd

train=pd.DataFrame({'class':{0:1,
1: 2,
2: 3,
3: 4,
4: 5,
5: 6,
6: 7,
7: 8,
8: 1,
9: 2,
10: 3,
11: 4,
12: 5,
13: 6,
14: 7,
15: 8},
'class.1':{0:1,
1: 2,
2: 3,
3: 4,
4: 5,
5: 6,
6: 7,
7: 8,
8: 1,
9: 2,
10: 3,
11: 4,
12: 5,
13: 6,
14: 7,
15: 8},
'class.2':{0:1,
1: 2,
2: 3,
3: 4,
4: 5,
5: 6,
6: 7,
7: 8,
8: 1,
9: 2,
10: 3,
11: 4,
12: 5,
13: 6,
14: 7,
15: 8},
'x_feature_1':{0:-0.30424321,
1: 1.6273111,
2: 0.66127653,
3: 0.0051847840000000004,
4: 1.2861978,
5: -0.47925246,
6: 1.4743277,
7: 0.30530296,
8: -0.30424321,
9: 1.6273111,
10: 0.66127653,
11: 0.0051847840000000004,
12: 1.2861978,
13: -0.47925246,
14: 1.4743277,
15: 0.30530296},
'x_feature_1.1':{0:-0.30424321,
1: 1.6273111,
2: 0.66127653,
3: 0.0051847840000000004,
4: 1.2861978,
5: -0.47925246,
6: 1.4743277,
7: 0.30530296,
8: -0.30424321,
9: 1.6273111,
10: 0.66127653,
11: 0.0051847840000000004,
12: 1.2861978,
13: -0.47925246,
14: 1.4743277,
15: 0.30530296},
'x_feature_2':{0:-0.30424321,
1: 1.6273111,
2: 0.66127653,
3: 0.0051847840000000004,
4: 1.2861978,
5: -0.47925246,
6: 1.4743277,
7: 0.30530296,
8: -0.30424321,
9: 1.6273111,
10: 0.66127653,
11: 0.0051847840000000004,
12: 1.2861978,
13: -0.47925246,
14: 1.4743277,
15: 0.30530296},
'y_feature_1':{0:-0.30424321,
1: 1.6273111,
2: 0.66127653,
3: 0.0051847840000000004,
4: 1.2861978,
5: -0.47925246,
6: 1.4743277,
7: 0.30530296,
8: -0.30424321,
9: 1.6273111,
10: 0.66127653,
11: 0.0051847840000000004,
12: 1.2861978,
13: -0.47925246,
14: 1.4743277,
15: 0.30530296},
'y_feature_2':{0:-0.30424321,
1: 1.6273111,
2: 0.66127653,
3: 0.0051847840000000004,
4: 1.2861978,
5: -0.47925246,
6: 1.4743277,
7: 0.30530296,
8: -0.30424321,
9: 1.6273111,
10: 0.66127653,
11: 0.0051847840000000004,
12: 1.2861978,
13: -0.47925246,
14: 1.4743277,
15: 0.30530296},
'y_feature_2.1':{0:-0.30424321,
1: 1.6273111,
2: 0.66127653,
3: 0.0051847840000000004,
4: 1.2861978,
5: -0.47925246,
6: 1.4743277,
7: 0.30530296,
8: -0.30424321,
9: 1.6273111,
10: 0.66127653,
11: 0.0051847840000000004,
12: 1.2861978,
13: -0.47925246,
14: 1.4743277,
15: 0.30530296},
'z_feature_1':{0:-0.30424321,
1: 1.6273111,
2: 0.66127653,
3: 0.0051847840000000004,
4: 1.2861978,
5: -0.47925246,
6: 1.4743277,
7: 0.30530296,
8: -0.30424321,
9: 1.6273111,
10: 0.66127653,
11: 0.0051847840000000004,
12: 1.2861978,
13: -0.47925246,
14: 1.4743277,
15: 0.30530296},
'z_feature_1.1':{0:-0.30424321,
1: 1.6273111,
2: 0.66127653,
3: 0.0051847840000000004,
4: 1.2861978,
5: -0.47925246,
6: 1.4743277,
7: 0.30530296,
8: -0.30424321,
9: 1.6273111,
10: 0.66127653,
11: 0.0051847840000000004,
12: 1.2861978,
13: -0.47925246,
14: 1.4743277,
15: 0.30530296},
'z_feature_2':{0:-0.30424321,
1: 1.6273111,
2: 0.66127653,
3: 0.0051847840000000004,
4: 1.2861978,
5: -0.47925246,
6: 1.4743277,
7: 0.30530296,
8: -0.30424321,
9: 1.6273111,
10: 0.66127653,
11: 0.0051847840000000004,
12: 1.2861978,
13: -0.47925246,
14: 1.4743277,
15: 0.30530296}})
Expected:

expected=pd.DataFrame({'class':{0:1,
1: 2,
2: 3,
3: 4,
4: 5,
5: 6,
6: 7,
7: 8,
8: 1,
9: 2,
10: 3,
11: 4,
12: 5,
13: 6,
14: 7,
15: 8},
'x_feature_1':{0:-0.30424321,
1: 1.6273111,
2: 0.66127653,
3: 0.0051847840000000004,
4: 1.2861978,
5: -0.47925246,
6: 1.4743277,
7: 0.30530296,
8: -0.30424321,
9: 1.6273111,
10: 0.66127653,
11: 0.0051847840000000004,
12: 1.2861978,
13: -0.47925246,
14: 1.4743277,
15: 0.30530296},
'x_feature_1.1':{0:-0.30424321,
1: 1.6273111,
2: 0.66127653,
3: 0.0051847840000000004,
4: 1.2861978,
5: -0.47925246,
6: 1.4743277,
7: 0.30530296,
8: -0.30424321,
9: 1.6273111,
10: 0.66127653,
11: 0.0051847840000000004,
12: 1.2861978,
13: -0.47925246,
14: 1.4743277,
15: 0.30530296},
'x_feature_2':{0:-0.30424321,
1: 1.6273111,
2: 0.66127653,
3: 0.0051847840000000004,
4: 1.2861978,
5: -0.47925246,
6: 1.4743277,
7: 0.30530296,
8: -0.30424321,
9: 1.6273111,
10: 0.66127653,
11: 0.0051847840000000004,
12: 1.2861978,
13: -0.47925246,
14: 1.4743277,
15: 0.30530296},
'y_feature_1':{0:-0.30424321,
1: 1.6273111,
2: 0.66127653,
3: 0.0051847840000000004,
4: 1.2861978,
5: -0.47925246,
6: 1.4743277,
7: 0.30530296,
8: -0.30424321,
9: 1.6273111,
10: 0.66127653,
11: 0.0051847840000000004,
12: 1.2861978,
13: -0.47925246,
14: 1.4743277,
15: 0.30530296},
'y_feature_2':{0:-0.30424321,
1: 1.6273111,
2: 0.66127653,
3: 0.0051847840000000004,
4: 1.2861978,
5: -0.47925246,
6: 1.4743277,
7: 0.30530296,
8: -0.30424321,
9: 1.6273111,
10: 0.66127653,
11: 0.0051847840000000004,
12: 1.2861978,
13: -0.47925246,
14: 1.4743277,
15: 0.30530296},
'y_feature_2.1':{0:-0.30424321,
1: 1.6273111,
2: 0.66127653,
3: 0.0051847840000000004,
4: 1.2861978,
5: -0.47925246,
6: 1.4743277,
7: 0.30530296,
8: -0.30424321,
9: 1.6273111,
10: 0.66127653,
11: 0.0051847840000000004,
12: 1.2861978,
13: -0.47925246,
14: 1.4743277,
15: 0.30530296},
'z_feature_1':{0:-0.30424321,
1: 1.6273111,
2: 0.66127653,
3: 0.0051847840000000004,
4: 1.2861978,
5: -0.47925246,
6: 1.4743277
IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).
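The code that produced this error is not shown, so the following is only a guess at one common cause: passing `.loc` a boolean Series whose index does not match the column labels. The small frame and the name `bad_mask` are hypothetical. The sketch reproduces the error and shows that a plain boolean array avoids the alignment step.

```python
import pandas as pd

# Hypothetical miniature of the question's frame.
df = pd.DataFrame({"class": [1, 2], "class.1": [1, 2], "x_feature_1": [0.5, 0.7]})

# A boolean Series built without the column labels gets a default 0..n-1 index,
# which cannot be aligned with the string column labels.
bad_mask = pd.Series([True, False, True])

try:
    df.loc[:, bad_mask]
    error_name = None
except Exception as exc:  # the import path of IndexingError varies across pandas versions
    error_name = type(exc).__name__

# A raw boolean array is applied by position, so no index alignment is attempted.
kept = df.loc[:, bad_mask.to_numpy()]

print(error_name)
print(list(kept.columns))
```

Giving the Series the column labels as its index (`pd.Series(values, index=df.columns)`) should avoid the error as well.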
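# True for every column whose name starts with 'class' (class, class.1, class.2)
m1 = train.columns.str.startswith('class')
# True for a column whose base name (the part before the first '.') has already appeared
m2 = train.columns.str.split('.').str[0].duplicated()
# Drop only the columns that are both 'class' columns and repeats; rows are untouched
train = train.loc[:, ~m1 | ~m2]
print(train)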
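To check the two-mask idea on something small, here is a minimal sketch. It uses a three-row stand-in frame with made-up values instead of the question's data, and the names `is_class`, `is_repeat`, and `result` are illustrative. The combined mask `~(is_class & is_repeat)` is logically the same as `~m1 | ~m2`.

```python
import pandas as pd

# Small stand-in for the question's frame (values are made up for illustration).
train = pd.DataFrame({
    "class": [1, 2, 3],
    "class.1": [1, 2, 3],
    "class.2": [1, 2, 3],
    "x_feature_1": [0.1, 0.2, 0.3],
    "x_feature_1.1": [0.1, 0.2, 0.3],
    "y_feature_2": [0.4, 0.5, 0.6],
    "y_feature_2.1": [0.4, 0.5, 0.6],
})

# Columns whose name starts with 'class'.
is_class = train.columns.str.startswith("class")
# Columns whose base name (text before the first '.') was seen earlier.
is_repeat = train.columns.str.split(".").str[0].duplicated()

# Remove a column only when both conditions hold; all rows are kept.
result = train.loc[:, ~(is_class & is_repeat)]

print(list(result.columns))
print(result.shape)
```

Note that `startswith("class")` would also match an unrelated column such as a hypothetical `classification`. If that matters, comparing the base name for equality with `"class"` is a stricter test.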