Python: read a CSV using pandas. Reassign values 1 to 40 for soil types and 1 to 7 for forest cover types


The CSV input file has 56 columns. Sample data is shown below. Please excuse my formatting.

Id,Elevation,Aspect,Slope,Horizontal_Distance_To_Hydrology,Vertical_Distance_To_Hydrology,Horizontal_Distance_To_Roadways,Hillshade_9am,Hillshade_Noon,Hillshade_3pm,Horizontal_Distance_To_Fire_Points,Wilderness_Area1,Wilderness_Area2,Wilderness_Area3,Wilderness_Area4,Soil_Type1,Soil_Type2,Soil_Type3,Soil_Type4,Soil_Type5,Soil_Type6,Soil_Type7,Soil_Type8,Soil_Type9,Soil_Type10,Soil_Type11,Soil_Type12,Soil_Type13,Soil_Type14,Soil_Type15,Soil_Type16,Soil_Type17,Soil_Type18,Soil_Type19,Soil_Type20,Soil_Type21,Soil_Type22,Soil_Type23,Soil_Type24,Soil_Type25,Soil_Type26,Soil_Type27,Soil_Type28,Soil_Type29,Soil_Type30,Soil_Type31,Soil_Type32,Soil_Type33,Soil_Type34,Soil_Type35,Soil_Type36,Soil_Type37,Soil_Type38,Soil_Type39,Soil_Type40,Cover_Type
1,2596,51,3,258,0,510,221,232,148,6279,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,5
2,2590,56,2,212,-6,390,220,235,151,6225,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,5
3,2804,139,9,268,65,3180,234,238,135,6121,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2
4,2785,155,18,242,118,3090,238,238,122,6211,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,2
5,2595,45,2,153,-1,391,220,234,150,6172,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,5
6,2579,132,6,300,-15,67,230,237,140,6031,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2
7,2606,45,7,270,5,633,222,225,138,6256,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,5
8,2605,49,4,234,7,573,222,230,144,6228,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,5
9,2617,45,9,240,56,666,223,221,133,6244,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,5
10,2612,59,10,247,11,636,228,219,124,6230,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,5
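I need to transform this data. The requirements are as follows. Remove the groups of columns that hold binary values (0 or 1) and put the corresponding value from a range into a new column. For wilderness areas the range is 1 to 4. For soil types it is 1 to 40.

  • Remove the columns Wilderness_Area1 through Wilderness_Area4. Add a single Wilderness_Area column and assign it 1 to 4 based on the input row. Example, using the first row of the sample input above:

    • if Wilderness_Area1 = 1, Wilderness_Area should now be 1
    • if Wilderness_Area2 = 1, Wilderness_Area should now be 2
    • if Wilderness_Area3 = 1, Wilderness_Area should now be 3
    • if Wilderness_Area4 = 1, Wilderness_Area should now be 4

  • Remove the columns Soil_Type1 through Soil_Type40. Add a single Soil_Type column and assign it 1 to 40 based on the input row. Example, using the first row of the sample input above:

    • if Soil_Type1 = 1, Soil_Type should now be 1
    • if Soil_Type2 = 1, Soil_Type should now be 2
    • if Soil_Type3 = 1, Soil_Type should now be 3
    • if Soil_Type4 = 1, Soil_Type should now be 4

I used the code below, but I still get the 40 soil type columns in my data frame. I need to remove these columns from df. How can I do all of the above?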

import pandas

df = pandas.read_csv(ifname)  # ifname holds the path to the input CSV
df['Soil'] = 0
for i in range(1,41):
    df['Soil'] = df['Soil'] + i*df['Soil_Type'+str(i)]

print(df)

Below is an example of the output I need:

Id,Elevation,Aspect,Slope,Horizontal_Distance_To_Hydrology,Vertical_Distance_To_Hydrology,Horizontal_Distance_To_Roadways,Hillshade_9am,Hillshade_Noon,Hillshade_3pm,Horizontal_Distance_To_Fire_Points,Cover_Type,Soil,Wilderness_Area
1,2596,51,3,258,0,510,221,232,148,6279,5,29,1
2,2590,56,2,212,-6,390,220,235,151,6225,5,29,1
3,2804,139,9,268,65,3180,234,238,135,6121,2,12,1
4,2785,155,18,242,118,3090,238,238,122,6211,2,30,1
5,2595,45,2,153,-1,391,220,234,150,6172,5,29,1
6,2579,132,6,300,-15,67,230,237,140,6031,2,29,1
7,2606,45,7,270,5,633,222,225,138,6256,5,29,1
8,2605,49,4,234,7,573,222,230,144,6228,5,29,1
9,2617,45,9,240,56,666,223,221,133,6244,5,29,1
10,2612,59,10,247,11,636,228,219,124,6230,5,29,1

You were almost there. After assigning the values, you just need to drop the columns:

In [158]:

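# Collect the one-hot columns before the combined columns are created
soil_type_cols = [col for col in df if 'Soil_Type' in col]
wilderness_cols = [col for col in df if 'Wilderness_Area' in col]

# Accumulate i * flag so the single 1 in each row yields its type number
df['Soil'] = 0
for i in range(1,41):
    df['Soil'] += i*df['Soil_Type'+str(i)]

df['Wilderness_Area'] = 0
for i in range(1,5):
    df['Wilderness_Area'] += i*df['Wilderness_Area'+str(i)]

# Drop the original one-hot columns now that their values are combined
df = df.drop(soil_type_cols+wilderness_cols, axis=1)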
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 10 entries, 0 to 9
Data columns (total 14 columns):
Id                                    10 non-null int64
Elevation                             10 non-null int64
Aspect                                10 non-null int64
Slope                                 10 non-null int64
Horizontal_Distance_To_Hydrology      10 non-null int64
Vertical_Distance_To_Hydrology        10 non-null int64
Horizontal_Distance_To_Roadways       10 non-null int64
Hillshade_9am                         10 non-null int64
Hillshade_Noon                        10 non-null int64
Hillshade_3pm                         10 non-null int64
Horizontal_Distance_To_Fire_Points    10 non-null int64
Cover_Type                            10 non-null int64
Soil                                  10 non-null int64
Wilderness_Area                       10 non-null int64
dtypes: int64(14)
memory usage: 1.2 KB
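
As an alternative sketch (not part of the answer above), the same collapse can probably be done without a loop: for each row, take the name of the column that holds the 1 and keep its numeric suffix. This assumes exactly one flag is set per row within each group. The helper name `collapse_one_hot` and the small three-column frame are made up for illustration.

```python
import pandas as pd

def collapse_one_hot(df, prefix, new_col):
    """Replace one-hot columns prefix1..prefixN with one integer column (hypothetical helper)."""
    # Keep only columns that are the prefix followed by digits, e.g. 'Soil_Type29'
    cols = [c for c in df.columns
            if c.startswith(prefix) and c[len(prefix):].isdigit()]
    # Assumption: each row has exactly one 1 among these columns
    if not (df[cols].sum(axis=1) == 1).all():
        raise ValueError("expected exactly one flag set per row for " + prefix)
    # idxmax(axis=1) gives the label of the column holding the row maximum (the 1)
    labels = df[cols].idxmax(axis=1)
    out = df.drop(columns=cols)
    # Strip the prefix and convert the remaining digits to an integer
    out[new_col] = labels.str[len(prefix):].astype(int)
    return out

# Illustrative data only: three soil types instead of forty
demo = pd.DataFrame({
    "Id": [1, 2, 3],
    "Soil_Type1": [0, 1, 0],
    "Soil_Type2": [0, 0, 1],
    "Soil_Type3": [1, 0, 0],
    "Cover_Type": [5, 2, 5],
})
result = collapse_one_hot(demo, "Soil_Type", "Soil")
print(result)
```

On the real data the helper would be called once with `"Soil_Type"` and once with `"Wilderness_Area"`. The explicit check matters because both this approach and the multiply-and-sum loop silently give misleading numbers if a row has zero or several flags set.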

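Cool. I added it to my code and got the following output:

,Id,Elevation,Aspect,Slope,Horizontal_Distance_To_Hydrology,Vertical_Distance_To_Hydrology,Horizontal_Distance_To_Roadways,Hillshade_9am,Hillshade_Noon,Hillshade_3pm,Horizontal_Distance_To_Fire_Points,Cover_Type,Soil,Wilderness_Area
**0**,1,2596,51,3,258,0,510,221,232,148,6279,5,29,1
**1**,2,2590,56,2,212,-6,390,220,235,151,6225,5,29,1
**2**,3,2804,139,9,268,65,3180,234,238,135,6121,2,12,1

I don't understand the first column (in bold) that pandas added. Can you explain it? Is it possible to remove it?

It looks like this is a pandas feature: that column is the DataFrame index. I used
df.to_csv(ofname, index=False)
and now it works.

Did I answer your question? If so, you can accept my answer; there is an empty check mark at the top left of my answer.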
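
To illustrate the index column discussed in the comments, here is a small sketch comparing the two forms of the call. The two-row frame is invented for the example; `to_csv` returns the CSV text as a string when no path is given, which makes the difference easy to see.

```python
import pandas as pd

# Illustrative data only
df = pd.DataFrame({"Id": [1, 2], "Soil": [29, 12]})

# Default: the row index is written as an unnamed first column
with_index = df.to_csv()

# index=False: only the data columns are written
without_index = df.to_csv(index=False)

print(with_index)
print(without_index)
```

The header of the first output starts with an empty field for the index, which is where the extra leading column in the written file comes from.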