How can I define a function in Python to convert object columns to float?


I imported a DataFrame with columns of different types. See below:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1272 entries, 0 to 1271
Columns: 189 entries, Year to HUMAN_rank
dtypes: float64(67), int64(1), object(121)
memory usage: 1.8+ MB



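I would like to come up with a function that iterates over every column of the DataFrame, identifies the type of the values in each column and, if the column has dtype object, converts it to float.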

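To do this only for columns of object dtype, you can select them with select_dtypes('object') and convert them with astype(float).

For example: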

>>> df
  col1      col2 col3  col4
0    1  0.452893    2     8
1    2  0.745232    3     6
2    1  0.374296    3     1
3    3  0.398660    3     4
4    2  0.902737    2     1
5    3  0.940392    3     0
6    3  0.382493    3     0
7    2  0.684829    3     4
8    2  0.506248    3     8
9    1  0.161701    3     3

>>> df.dtypes
col1     object
col2    float64
col3     object
col4      int64
dtype: object

>>> df[df.select_dtypes('object').columns] = df.select_dtypes('object').astype(float)

>>> df.dtypes
col1    float64
col2    float64
col3    float64
col4      int64
dtype: object
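Since the question asks for a function, the one-liner above could be wrapped roughly as follows. This is a minimal sketch: the function name object_columns_to_float and the sample data are made up for illustration, and it returns a new DataFrame through astype with a column-to-dtype mapping instead of assigning in place.

```python
import pandas as pd


def object_columns_to_float(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with every object-dtype column cast to float64.

    Hypothetical helper, not part of pandas.
    """
    # Names of the columns whose dtype is object (typically strings).
    object_cols = df.select_dtypes(include="object").columns
    # astype raises ValueError if any value cannot be parsed as a float.
    return df.astype({col: "float64" for col in object_cols})


# Illustrative data; dtype=object is set explicitly so the string columns
# are object columns regardless of the pandas version.
df = pd.DataFrame({
    "col1": pd.Series(["1", "2", "3"], dtype=object),
    "col2": [0.45, 0.74, 0.37],
    "col3": pd.Series(["2", "3", "3"], dtype=object),
    "col4": [8, 6, 1],
})

converted = object_columns_to_float(df)
print(converted.dtypes)
```

Columns that are already numeric, such as col2 and col4 here, are left as they are, and the original df is not modified.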

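Note: the approach above will not work if some values in those columns cannot be converted to float. In that case you can iterate over the columns and convert them to numeric with pd.to_numeric, using errors='coerce' and downcast='float':
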
>>> df
  col1      col2 col3  col4
0    3  0.594651    2     6
1    3  0.677595    3     3
2    3  0.546434    1     0
3    3  0.454769    2     6
4    x  0.321130    2     3
5    2  0.473391    2     7
6    1  0.207182    2     7
7    2  0.883071    3     1
8    x  0.994372    2     4
9    1  0.052539    3     2

>>> df.dtypes
col1     object
col2    float64
col3     object
col4      int64
dtype: object

>>> for col in df.select_dtypes('object').columns:
...     df[col] = pd.to_numeric(df[col], errors='coerce', downcast='float')

>>> df
   col1      col2  col3  col4
0   3.0  0.594651   2.0     6
1   3.0  0.677595   3.0     3
2   3.0  0.546434   1.0     0
3   3.0  0.454769   2.0     6
4   NaN  0.321130   2.0     3
5   2.0  0.473391   2.0     7
6   1.0  0.207182   2.0     7
7   2.0  0.883071   3.0     1
8   NaN  0.994372   2.0     4
9   1.0  0.052539   3.0     2
>>> df.dtypes
col1    float32
col2    float64
col3    float32
col4      int64
dtype: object
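Because errors='coerce' silently turns unparseable values into NaN, it can help to count how many values were lost per column. The sketch below is one possible way to do that, under some assumptions: the name coerce_object_columns and the sample data are hypothetical, and it keeps float64 instead of downcasting to float32 as the loop above does.

```python
import pandas as pd


def coerce_object_columns(df: pd.DataFrame):
    """Convert object columns to float64 with pd.to_numeric.

    Hypothetical helper. Returns the converted copy and a dict mapping each
    converted column to the number of values that could not be parsed.
    """
    result = df.copy()
    lost = {}
    for col in result.select_dtypes(include="object").columns:
        numeric = pd.to_numeric(result[col], errors="coerce")
        # Values that were present before but are NaN now failed to parse.
        lost[col] = int((numeric.isna() & result[col].notna()).sum())
        # to_numeric may return integers when everything parses; force float.
        result[col] = numeric.astype("float64")
    return result, lost


# Illustrative data with one value ("x") that is not a number.
df = pd.DataFrame({
    "col1": pd.Series(["3", "x", "2"], dtype=object),
    "col2": [0.59, 0.32, 0.47],
})

converted, lost = coerce_object_columns(df)
print(converted.dtypes)
print(lost)  # {'col1': 1}
```

A non-zero count is a hint to inspect that column before relying on the converted values.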
