Converting a DataFrame with mixed data types to an integer-encoded DataFrame in Python using pandas

I have a DataFrame in Python like this:

  INSTRUMENT_TYPE_CD     RISK_START_DT     ...      FIN_POS_IND PL_FINAL_IND
0     Physical Index  01-03-2017 00:00     ...                0           No
1     Fin Basis Swap  01-09-2018 00:00     ...                0           No
2     Physical Index  01-09-2017 00:00     ...                0           No
3     Physical Index  01-12-2016 00:00     ...                0           No
4     Fin Basis Swap  01-02-2018 00:00     ...                0           No
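As you can see, the values in the columns are repeated and are mostly strings. I want to convert this DataFrame into an integer-encoded DataFrame, in which each unique string in a column is mapped to a unique integer.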

So far I have come up with this (a normalise function), but it does not work:

normalise(dataframe)

def normalise(dataframe):
    for column in dataframe:
        dataframe[column] = dataframe.apply(unique_code_mapper(dataframe[column]))
    return dataframe

def unique_code_mapper(column):
    unique_array = []
    for val in column:
        if val in unique_array:
            column.loc[val] = unique_array.index(val)
        else:
            unique_array.append(val)
            column.loc[val] = unique_array.index(val)
    return column
It returns the following error:

TypeError: ("'Series' object is not callable", 'occurred at index INSTRUMENT_TYPE_CD')
You can use pd.factorize on each string column:


print (df.dtypes)
INSTRUMENT_TYPE_CD            object
RISK_START_DT         datetime64[ns]
FIN_POS_IND                    int64
PL_FINAL_IND                  object
dtype: object

#select only object columns (obviously strings)
#cols = df.select_dtypes('object').columns

#select columns by names
cols = ['INSTRUMENT_TYPE_CD','PL_FINAL_IND']

for c in cols:
    df[c] = pd.factorize(df[c])[0]
print (df)
   INSTRUMENT_TYPE_CD     RISK_START_DT  FIN_POS_IND  PL_FINAL_IND
0                   0  01-03-2017 00:00            0             0
1                   1  01-09-2018 00:00            0             0
2                   0  01-09-2017 00:00            0             0
3                   0  01-12-2016 00:00            0             0
4                   1  01-02-2018 00:00            0             0
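
For reference, the error in the question arises because DataFrame.apply expects a function, while unique_code_mapper(dataframe[column]) has already been called and returns a Series. The following is a minimal, self-contained sketch of the factorize approach that also keeps the unique values, so the integer codes can be mapped back to the original strings. The sample data, the helper name encode_non_numeric_columns, and the choice to encode every non-numeric, non-datetime column are assumptions made for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data modelled on the columns shown in the question.
df = pd.DataFrame({
    "INSTRUMENT_TYPE_CD": ["Physical Index", "Fin Basis Swap", "Physical Index",
                           "Physical Index", "Fin Basis Swap"],
    "FIN_POS_IND": [0, 0, 0, 0, 0],
    "PL_FINAL_IND": ["No", "No", "No", "No", "No"],
})


def encode_non_numeric_columns(frame):
    """Return an integer-encoded copy of frame and the per-column unique values.

    Hypothetical helper: every column that is neither numeric nor datetime
    is encoded with pd.factorize; other columns are left unchanged.
    """
    encoded = frame.copy()
    mappings = {}
    for col in encoded.select_dtypes(exclude=["number", "datetime"]).columns:
        # codes[i] is the position of the i-th value in uniques;
        # missing values receive the code -1.
        codes, uniques = pd.factorize(encoded[col])
        encoded[col] = codes
        mappings[col] = uniques
    return encoded, mappings


encoded_df, mappings = encode_non_numeric_columns(df)
print(encoded_df)

# Decode one column again by looking the codes up in the stored uniques.
decoded = mappings["INSTRUMENT_TYPE_CD"].take(encoded_df["INSTRUMENT_TYPE_CD"])
print(list(decoded))
```

Codes are assigned in order of first appearance within each column, so the same string can receive different integers in different columns. Keeping the uniques per column is what makes the encoding reversible.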