How can I improve the performance of the following code on a Python DataFrame, and, if possible, state its order of complexity?

Tags: python, function, dataframe, one-hot-encoding

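The code below works fine, but I would like to improve its performance.

Can this be done with indexing or some other approach?

I am trying to collapse 40 one-hot-encoded fields into a single column.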

def soil_typ(row):
    if row['Soil_Type1'] == 1:
        return 1
    elif row['Soil_Type2'] == 1:
        return 2
    elif row['Soil_Type3'] == 1:
        return 3
    elif row['Soil_Type4'] == 1:
        return 4
    elif row['Soil_Type5'] == 1:
        return 5
    elif row['Soil_Type6'] == 1:
        return 6
    elif row['Soil_Type7'] == 1:
        return 7
    elif row['Soil_Type8'] == 1:
        return 8
    elif row['Soil_Type9'] == 1:
        return 9
    elif row['Soil_Type10'] == 1:
        return 10
    elif row['Soil_Type11'] == 1:
        return 11
    elif row['Soil_Type12'] == 1:
        return 12
    elif row['Soil_Type13'] == 1:
        return 13
    elif row['Soil_Type14'] == 1:
        return 14
    elif row['Soil_Type15'] == 1:
        return 15
    elif row['Soil_Type16'] == 1:
        return 16
    elif row['Soil_Type17'] == 1:
        return 17
    elif row['Soil_Type18'] == 1:
        return 18
    elif row['Soil_Type19'] == 1:
        return 19
    elif row['Soil_Type20'] == 1:
        return 20
    elif row['Soil_Type21'] == 1:
        return 21
    elif row['Soil_Type22'] == 1:
        return 22
    elif row['Soil_Type23'] == 1:
        return 23
    elif row['Soil_Type24'] == 1:
        return 24
    elif row['Soil_Type25'] == 1:
        return 25
    elif row['Soil_Type26'] == 1:
        return 26
    elif row['Soil_Type27'] == 1:
        return 27
    elif row['Soil_Type28'] == 1:
        return 28
    elif row['Soil_Type29'] == 1:
        return 29
    elif row['Soil_Type30'] == 1:
        return 30
    elif row['Soil_Type31'] == 1:
        return 31
    elif row['Soil_Type32'] == 1:
        return 32
    elif row['Soil_Type33'] == 1:
        return 33
    elif row['Soil_Type34'] == 1:
        return 34
    elif row['Soil_Type35'] == 1:
        return 35
    elif row['Soil_Type36'] == 1:
        return 36
    elif row['Soil_Type37'] == 1:
        return 37
    elif row['Soil_Type38'] == 1:
        return 38
    elif row['Soil_Type39'] == 1:
        return 39
    elif row['Soil_Type40'] == 1:
        return 40
    else:
        return 0
After that, I applied this function to create a new variable, as follows:

data_train['Soil'] = [soil_typ(row_[1]) for row_ in data_train.iterrows()]
The dataset contains nearly 1.5 million records.

The code above works, but I would like to explore how to make it perform better.

There is no need to repeat so much identical code here. The steps are explained in the comments marked with "#" in the code.

n = 40

def soil_typ(row):
    for x in range(1, n + 1):        # loop over the column numbers 1..n
        y = 'Soil_Type%s' % x        # build the column label from the integer
        if row[y] == 1:              # the one-hot flag is set (1 also compares
                                     # equal to True in Python)
            return x                 # number of the first column that is set
    return 0                         # no flag is set in any of the n columns
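
As for the indexing part of the question, here is a minimal vectorized sketch that is not taken from the original answer. It assumes the columns are named Soil_Type1 to Soil_Type40 and hold only 0/1 values. The helper name soil_from_one_hot and the small demo frame are hypothetical, used only for illustration.

```python
import numpy as np
import pandas as pd

N_SOIL_TYPES = 40  # assumption: columns Soil_Type1 .. Soil_Type40 exist
soil_cols = [f"Soil_Type{i}" for i in range(1, N_SOIL_TYPES + 1)]


def soil_from_one_hot(df, cols):
    """Hypothetical helper: 1-based position of the first column equal to 1
    in each row, or 0 when no column is set."""
    flags = df[cols].to_numpy() == 1        # boolean matrix (rows x len(cols))
    first_hit = flags.argmax(axis=1) + 1    # argmax gives the first True per row
    # Rows with no flag set would wrongly get 1 from argmax, so mask them to 0.
    return np.where(flags.any(axis=1), first_hit, 0)


# Tiny illustrative frame standing in for data_train
demo = pd.DataFrame(0, index=range(4), columns=soil_cols)
demo.loc[0, "Soil_Type1"] = 1
demo.loc[1, "Soil_Type22"] = 1
demo.loc[2, "Soil_Type40"] = 1
demo["Soil"] = soil_from_one_hot(demo, soil_cols)
```

On the real data the equivalent call would be data_train['Soil'] = soil_from_one_hot(data_train, soil_cols). Both the row-by-row loop and this version do work proportional to rows times columns, so the order of complexity stays the same. Any speed-up should come from doing the comparisons in NumPy rather than in a Python loop over iterrows(), which is worth timing on the actual dataset.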
