Python Pandas: replace values without keeping the old non-matching entries

python pandas csv

How can I get my function to produce the values I want?

import pandas as pd

test_csv = """
time,val1,what_new_val1_should_be
2004-07-21 09:00:00,apple,1
2004-07-21 10:00:00,N,
2004-07-21 11:00:00,pear,2
2004-07-21 12:00:00,apple,1
2004-07-21 13:00:00,bread,3
2004-07-21 13:00:00,pear,2
2004-07-21 13:00:00,,
2004-07-21 13:00:00,,
"""

from io import StringIO
test_csv = StringIO(test_csv)
df = pd.read_csv(test_csv)


def coded_val(df):
    """

    Create a new column "new_val1" that has an integer corresponding to the word in val1
    :param df: dataframe. A pandas dataframe with column val1 where the values are food items, N for none, or blank for none
    :return: dataframe. A pandas dataframe with a new column "new_val1"
    """

    replacement_dict = {
        'apple': 1,
        'pear': 2,
        'bread': 3
    }

    df['new_val1'] = df['val1'].replace(to_replace=replacement_dict, inplace=False)
    return df


df = coded_val(df=df)
print(df)

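Change replace to map. Series.replace leaves any value that is not a key in the dictionary untouched, which is why the code above still shows N in row 1 of new_val1:

                  time   val1  what_new_val1_should_be new_val1
0  2004-07-21 09:00:00  apple                      1.0        1
1  2004-07-21 10:00:00      N                      NaN        N
2  2004-07-21 11:00:00   pear                      2.0        2
3  2004-07-21 12:00:00  apple                      1.0        1
4  2004-07-21 13:00:00  bread                      3.0        3
5  2004-07-21 13:00:00   pear                      2.0        2
6  2004-07-21 13:00:00    NaN                      NaN      NaN
7  2004-07-21 13:00:00    NaN                      NaN      NaN

Series.map, by contrast, returns NaN for every value that has no entry in the dictionary.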
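As a minimal sketch of the suggested change, the snippet below applies map to a small frame that mirrors the val1 column from the question. The frame is rebuilt inline here for illustration only; the column names and replacement_dict come from the question.

```python
import pandas as pd

# Same values as the val1 column in the question (None stands in for the blank cells)
df = pd.DataFrame(
    {"val1": ["apple", "N", "pear", "apple", "bread", "pear", None, None]}
)

replacement_dict = {"apple": 1, "pear": 2, "bread": 3}

# map looks each value up in the dict; anything without a key ("N", missing) becomes NaN
df["new_val1"] = df["val1"].map(replacement_dict)

print(df)
```

Because the result contains NaN, new_val1 comes back as a float column rather than an int column.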

It needs to be a new column called new_val1, and I want that column to be of type int.

@Vader You would have to fill the NaN values with some int and then call .astype(int); NaN cannot be of type int.

@Vader Mixing int and np.nan creates a mixed data type, so I suggest leaving it as is. PS: np.nan is a float in pandas. For the new column, just assign the result back…

@YOandBEN_W I'm fine with it being a float, but how do I put it in a new column?

@Vader df['new'] = df['val1'].map(replacement_dict)
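The two routes to an integer column discussed in the comments can be sketched as follows. The sentinel value -1 is an assumption chosen purely for illustration (the thread only says "some int"), and the nullable "Int64" dtype is a pandas feature that the thread does not mention; it is shown here as one possible alternative.

```python
import pandas as pd

val1 = pd.Series(["apple", "N", "pear", None])
replacement_dict = {"apple": 1, "pear": 2, "bread": 3}

# map yields floats because the unmatched entries become NaN
mapped = val1.map(replacement_dict)

# Option 1 (from the comments): fill NaN with some int, then cast.
# -1 is a hypothetical sentinel; pick one that cannot collide with real codes.
filled = mapped.fillna(-1).astype(int)

# Option 2 (not from the thread): pandas' nullable integer dtype keeps
# missing entries as <NA> while storing the rest as integers.
nullable = mapped.astype("Int64")

print(filled.tolist())
print(nullable)
```

Option 1 gives a plain int column at the cost of a magic value; option 2 keeps the missing entries visibly missing.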