Python: How do I fill null values with an appropriate value based on each column's data type?

python pandas

I am reading a csv in pandas. Now I need to fill the null values and dump the data to a table. This is what I am doing:

import pandas as pd
from sqlalchemy import create_engine

df = pd.read_csv(file_path)
df.fillna('', inplace=True)
engine = create_engine('postgresql://username:******@localhost:****/database')
df.to_sql("my_table", engine)

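The problem is that for columns with integer values and a few missing values, pandas fills the missing values with an empty string. As a result, when the data is dumped to the table, pandas classifies the column as a string column, and the column gets text as its data type (in postgres). If I do nothing to fill the missing values, the column is correctly classified as integer or double precision (in postgres), which is the correct behavior.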

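For columns with string values and some missing values this is not a problem, because the missing values are assigned an empty string and the column type is not affected.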
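The effect described above can be reproduced on a small frame. This is only a sketch: the column names and values are made up for illustration, and it inspects dtypes in pandas rather than writing to a database.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a numeric column and a string column, each with one gap
df = pd.DataFrame({
    "count": [1, 2, np.nan],
    "name": ["a", None, "c"],
})

# dtypes before and after the blanket fillna('') used in the question
before = df.dtypes.astype(str).to_dict()
after = df.fillna("").dtypes.astype(str).to_dict()

print(before)
print(after)
```

Once the numeric column holds a mix of numbers and `''`, it is no longer a numeric dtype, which is why it ends up as a text column when written out.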

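What I want is a way to fill null values with 0 in columns that hold integer or float values, and with '' (empty string) in columns that hold string values. How can I do this in pandas?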

Note: some columns can also be datetime, and for now I do not intend to fill those with anything.

You can do it like this:

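import numpy as np
import pandas as pd

np.random.seed(47)

# dtype=object keeps np.nan a real missing value; a plain list would be coerced to the string 'nan'
attend = np.array(['yes', 'no', 'some', np.nan], dtype=object)
other = np.array(['a', 'b', 'c', np.nan], dtype=object)

df = pd.DataFrame({'attend' : np.random.choice(attend, 100),
                   'other_random_col' : np.random.choice(other, 100),
                   'int_col' : np.random.sample(100),
                   'none' : [np.nan] * 100})

# select the numeric columns and fill their NaN with 0
num_cols = df.select_dtypes(include='number').columns
df[num_cols] = df[num_cols].fillna(0)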

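The same approach works for strings, with one caveat: to select strings you have to use the object dtype, and that returns every object-dtype column, not only the ones that hold strings.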
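A minimal sketch of the string variant, assuming the string columns are stored with the object dtype (the sample frame and its column names are hypothetical, and the dtype is set explicitly so the example does not depend on how a given pandas version infers strings):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "name" is forced to object dtype on purpose
df = pd.DataFrame({
    "name": pd.Series(["a", None, "c"], dtype=object),
    "score": [1.5, np.nan, 3.0],
})

# Select every object-dtype column and fill its missing values with ''
str_cols = df.select_dtypes(include="object").columns
df[str_cols] = df[str_cols].fillna("")

print(df)
```

The numeric column is not selected, so its missing value is left in place for a separate numeric fill.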

We can use transform to do this.

Case 1: You only have numeric and string columns

Note: this assumes you only want to fill the N/A values of string dtype columns with '' and the rest (the numeric columns) with 0.

df.transform(lambda x: x.fillna('') if x.dtype == 'object' else x.fillna(0))
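As a quick check of the one-liner above, here is a small usage sketch on made-up data (the frame, the column names, and the final cast are illustrative assumptions). A column read with missing values comes in as float64, so if the values are whole numbers it can be cast back to an integer dtype after filling.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "name" is forced to object dtype on purpose
df = pd.DataFrame({
    "name": pd.Series(["a", None, "c"], dtype=object),
    "qty": [1.0, np.nan, 3.0],
})

# transform returns a new frame, so keep the result
filled = df.transform(lambda x: x.fillna('') if x.dtype == 'object' else x.fillna(0))

# Optional: with no NaN left, whole-number floats can become integers again
filled["qty"] = filled["qty"].astype("int64")

print(filled.dtypes)
```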

Case 2: You need a custom function to handle more data types

If you want to handle more data types, you can write your own function and apply it to fill the null values.

from pandas.api.types import is_numeric_dtype, is_object_dtype

def fill_null_values(column):
  # to handle string (object) data type
  if is_object_dtype(column):
    return column.fillna('')

  # to handle numeric (int / float) data type
  if is_numeric_dtype(column):
    return column.fillna(0)

  # add more cases to handle more data type

  # anything else (e.g. datetime) is returned unchanged
  return column

df = df.transform(fill_null_values)
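To see how the per-dtype fill affects the column types of the written table, here is an end-to-end sketch. It deliberately swaps the PostgreSQL engine from the question for an in-memory SQLite connection from the standard library so that it runs without a server, so the type names it reports are SQLite's, not postgres's. The function name `fill_by_dtype`, the sample frame, and the table name are assumptions made for illustration.

```python
import sqlite3

import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype, is_object_dtype


def fill_by_dtype(column):
    """Fill '' in object columns, 0 in numeric columns, leave anything else alone."""
    if is_object_dtype(column):
        return column.fillna("")
    if is_numeric_dtype(column):
        return column.fillna(0)
    return column  # e.g. datetime columns keep their missing values


# Hypothetical sample data with a string, a numeric and a datetime column
df = pd.DataFrame({
    "name": pd.Series(["a", None, "c"], dtype=object),
    "qty": [1.0, np.nan, 3.0],
    "seen": pd.to_datetime(["2021-01-01", None, "2021-01-03"]),
})

filled = df.transform(fill_by_dtype)

# Stand-in for the postgres engine: an in-memory SQLite database
conn = sqlite3.connect(":memory:")
filled.to_sql("my_table", conn, index=False)

# Read back the declared type of each column in the created table
column_types = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(my_table)")}
conn.close()

print(column_types)
```

The numeric column stays numeric in the created table and the datetime column still has its gap. With a real PostgreSQL engine, the same `filled` frame would be passed to `to_sql` as in the question.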