Python: changing values with pandas.map

python pandas dictionary

I am trying to use the map function to convert the strings in my data to numeric values.

Here is the data:

    label   sms_message
0   ham     Go until jurong point, crazy.. Available only ...
1   ham     Ok lar... Joking wif u oni...
2   spam    Free entry in 2 a wkly comp to win FA Cup fina...
3   ham     U dun say so early hor... U c already then say...
4   ham     Nah I don't think he goes to usf, he lives aro...

I am trying to change 'spam' to 1 and 'ham' to 0 with the following:

df['label'] = df.label.map({'ham':0, 'spam':1})

But the result is:

    label   sms_message
0   NaN     Go until jurong point, crazy.. Available only ...
1   NaN     Ok lar... Joking wif u oni...
2   NaN     Free entry in 2 a wkly comp to win FA Cup fina...
3   NaN     U dun say so early hor... U c already then say...
4   NaN     Nah I don't think he goes to usf, he lives aro...

Can anyone identify the problem?

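Your statement is correct. I think you executed the same statement twice, one run after the other. The statements below, executed in the Python interactive terminal, make this clear.

Note: if you pass a dictionary, map() replaces every value in the Series that does not match one of the dictionary's keys with NaN. After the first run the column holds 0 and 1, which are not keys, so a second run turns every label into NaN (I think this is what happened to you).

The documentation says: when arg is a dictionary, values in the Series that are not in the dictionary (as keys) are converted to NaN.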
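To make the documented behaviour concrete, here is a minimal sketch of what happens when the same map() call is applied twice. The three-row Series and the names `labels`, `mapping`, `first` and `second` are made up for illustration and are not the asker's data.

```python
import pandas as pd

# Hypothetical stand-in for the 'label' column.
labels = pd.Series(["ham", "ham", "spam"])
mapping = {"ham": 0, "spam": 1}

# First pass: the string labels match the dictionary keys.
first = labels.map(mapping)

# Second pass: the values are now 0 and 1, which are not keys
# of the dictionary, so every value is converted to NaN.
second = first.map(mapping)

print(first.tolist())
print(second.isna().all())
```

If this matches what you see, reloading the data and running the map() statement only once should give the expected 0/1 labels.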

Other ways to get the same result:

Method 1: use map() with a dictionary argument.
Method 2: use map() with a function.
Method 3: use apply() with a function.
Thanks.
Maybe your problem is in the read_table call.
Try this:
df = pd.read_table('smsspamcollection/SMSSpamCollection',
                   sep='\t', 
                   header=None,
                   names=['label', 'sms_message'])
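As a rough check that a call of this shape produces labels map() can match, here is a small self-contained sketch. It reads from an in-memory string instead of the real file; the two sample rows in `raw` are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for the tab-separated file.
raw = "ham\tOk lar... Joking wif u oni...\nspam\tFree entry in 2 a wkly comp\n"

# header=None keeps the first row as data; names= supplies the column names.
df = pd.read_table(io.StringIO(raw),
                   sep="\t",
                   header=None,
                   names=["label", "sms_message"])

# The labels are plain 'ham'/'spam' strings, so the dictionary keys match.
df["label"] = df["label"].map({"ham": 0, "spam": 1})
print(df["label"].tolist())
```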

That looks fine to me. But the labels become NaN instead of 1 or 0. Please check whether the column contains whitespace.
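One way to check for stray whitespace is to look at the distinct raw values and strip them before mapping. This is only a sketch: the Series below, with its padded labels, is invented to show the effect.

```python
import pandas as pd

# Hypothetical labels with leading/trailing spaces.
labels = pd.Series([" ham", "spam ", "ham"])
mapping = {"ham": 0, "spam": 1}

# Inspect the distinct raw values; padded strings show up here.
print(labels.unique())

# ' ham' and 'spam ' are not keys of the dictionary, so they become NaN.
unmatched = labels.map(mapping)

# Stripping whitespace first lets every label match a key.
cleaned = labels.str.strip().map(mapping)
print(cleaned.tolist())
```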
>>> import pandas as pd
>>>
>>> d = {
...     "label": ['spam', 'ham', 'ham', 'ham', 'spam'],
...     "sms_message": ["M1", "M2", "M3", "M4", "M5"]
... }
>>>
>>> df = pd.DataFrame(d)
>>> df
  label sms_message
0  spam          M1
1   ham          M2
2   ham          M3
3   ham          M4
4  spam          M5
>>>

>>> # Method 1: map() with a dictionary
>>> new_values = {'spam': 1, 'ham': 0}
>>>
>>> df
  label sms_message
0  spam          M1
1   ham          M2
2   ham          M3
3   ham          M4
4  spam          M5
>>>
>>> df.label = df.label.map(new_values)
>>> df
   label sms_message
0      1          M1
1      0          M2
2      0          M3
3      0          M4
4      1          M5
>>>

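>>> # Method 2: map() with a function
>>> df = pd.DataFrame(d)  # rebuild the original string labels first
>>> df.label = df.label.map(lambda v: 0 if v == 'ham' else 1)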
>>> df
   label sms_message
0      1          M1
1      0          M2
2      0          M3
3      0          M4
4      1          M5
>>>

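>>> # Method 3: apply() with a function
>>> df = pd.DataFrame(d)  # rebuild the original string labels first
>>> df.label = df.label.apply(lambda v: 0 if v == "ham" else 1)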
>>>
>>> df
   label sms_message
0      1          M1
1      0          M2
2      0          M3
3      0          M4
4      1          M5
>>>
