Python Pandas: replace values without keeping old non-matching entries
How do I get my function to produce the values I want?
import pandas as pd
test_csv = """
time,val1,what_new_val1_should_be
2004-07-21 09:00:00,apple,1
2004-07-21 10:00:00,N,
2004-07-21 11:00:00,pear,2
2004-07-21 12:00:00,apple,1
2004-07-21 13:00:00,bread,3
2004-07-21 13:00:00,pear,2
2004-07-21 13:00:00,,
2004-07-21 13:00:00,,
"""
from io import StringIO
test_csv = StringIO(test_csv)
df = pd.read_csv(test_csv)
def coded_val(df):
    """
    Create a new column "new_val1" that has an integer corresponding to the word in val1
    :param df: dataframe. A pandas dataframe with column val1 where the values are food items or N for none or blank for none
    :return: dataframe. A pandas dataframe with a new column "new_val1"
    """
    replacement_dict = {
        'apple': 1,
        'pear': 2,
        'bread': 3
    }
    df['new_val1'] = df['val1'].replace(to_replace=replacement_dict, inplace=False)
    return df
df = coded_val(df=df)
print(df)
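Change replace to map. Series.replace only substitutes the values it finds in the dictionary and leaves everything else (such as N) untouched, which is why the code above prints:
                  time   val1  what_new_val1_should_be  new_val1
0  2004-07-21 09:00:00  apple                      1.0         1
1  2004-07-21 10:00:00      N                      NaN         N
2  2004-07-21 11:00:00   pear                      2.0         2
3  2004-07-21 12:00:00  apple                      1.0         1
4  2004-07-21 13:00:00  bread                      3.0         3
5  2004-07-21 13:00:00   pear                      2.0         2
6  2004-07-21 13:00:00    NaN                      NaN       NaN
7  2004-07-21 13:00:00    NaN                      NaN       NaN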
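To make the difference between replace and map concrete, here is a minimal sketch. It assumes the same replacement_dict as in the question; the short Series is made up for illustration and stands in for the val1 column.

```python
import numpy as np
import pandas as pd

replacement_dict = {'apple': 1, 'pear': 2, 'bread': 3}

# Illustrative stand-in for df['val1']; dtype=object keeps the example
# independent of whichever string dtype pandas picks by default.
val1 = pd.Series(['apple', 'N', 'pear', 'apple', 'bread', 'pear', np.nan, np.nan],
                 dtype=object)

# replace: values found in the dict are swapped, everything else is kept,
# so the unmatched 'N' survives in the result.
replaced = val1.replace(replacement_dict)

# map: every value is looked up in the dict, and values without a key
# (here 'N' and the missing entries) become NaN.
mapped = val1.map(replacement_dict)

print(replaced.tolist())
print(mapped.tolist())
```

Because the mapped result contains NaN, pandas stores it as float64 rather than int, which is what the comments below discuss.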
It needs to be a new column called new_val1, and I would like the column to be of type int.
@Vader You have to fill the NaN values with some int and then call .astype(int); NaN cannot be of type int.
@Vader int and np.nan would create mixed data types, so I suggest keeping it as is. PS: np.nan is a float in pandas. For the new column, just assign the result back…
@YOandBEN_W I'm fine with it being a float, but how do I put it in a new column?
@Vader df['new'] = df['val1'].map(replacement_dict)
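The suggestions in the comments can be sketched as follows. This is only an illustration: the sentinel value -1 is an assumption (any int that cannot collide with a real code would do), and the nullable 'Int64' dtype is an alternative that the thread itself does not mention.

```python
import numpy as np
import pandas as pd

replacement_dict = {'apple': 1, 'pear': 2, 'bread': 3}

# Small illustrative frame standing in for the one read from the CSV.
df = pd.DataFrame({'val1': pd.Series(['apple', 'N', 'pear', np.nan], dtype=object)})

# map returns float64 because unmatched and missing values become NaN.
mapped = df['val1'].map(replacement_dict)

# Option 1 (from the comments): fill NaN with some int, then cast.
# -1 is an assumed sentinel meaning "no code".
df['new_val1'] = mapped.fillna(-1).astype(int)

# Option 2 (not from the thread): pandas' nullable integer dtype keeps
# the missing entries as <NA> while the column stays integer-typed.
df['new_val1_nullable'] = mapped.astype('Int64')

print(df)
print(df.dtypes)
```

Option 1 gives a plain integer column at the cost of a magic value; option 2 keeps "missing" distinct from any real code but uses an extension dtype that some downstream code may not expect.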