Python pandas returning NaN


I have a pandas DataFrame, as shown in this sample:

mydf.head()

    Date        Merchant/Description Debit/Credit
0   10/05/2018  FAKE TRANSACTION 1  -£7.50
1   09/05/2018  FAKE TRANSACTION 2  -£5.79
2   09/05/2018  FAKE TRANSACTION 3  -£28.50
3   08/05/2018  FAKE TRANSACTION 4  -£3.99
4   08/05/2018  FAKE TRANSACTION 5  -£17.99
The dtype of column ['Debit/Credit'] is 'object'; it is a mix of strings and NaN.

I want to convert the strings to numbers. I tried to do this with pandas.to_numeric:

    cols = ['Debit/Credit']
    hsbcraw[cols] = hsbcraw[cols].apply(pd.to_numeric, errors='coerce')
This converts every item in column ['Debit/Credit'] to NaN:

mydf.head()

    Date        Merchant/Description Debit/Credit
0   10/05/2018  FAKE TRANSACTION 1   NaN
1   09/05/2018  FAKE TRANSACTION 2   NaN
2   09/05/2018  FAKE TRANSACTION 3   NaN
3   08/05/2018  FAKE TRANSACTION 4   NaN
4   08/05/2018  FAKE TRANSACTION 5   NaN
What is wrong with my code or approach?

You need to replace £ with an empty string before converting to numeric:

hsbcraw[cols]=hsbcraw[cols].replace('£','', regex=True).apply(pd.to_numeric, errors='coerce')
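As a minimal sketch of why this works, the example below compares the two calls side by side. The sample frame is hypothetical (a few values from the question plus one NaN, since the column is said to contain both), and the names `raw` and `cleaned` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the question: currency strings mixed with NaN.
df = pd.DataFrame({"Debit/Credit": ["-£7.50", "-£5.79", np.nan, "-£3.99"]})
cols = ["Debit/Credit"]

# With "£" still present, no string parses as a number,
# so errors="coerce" turns every value into NaN.
raw = df[cols].apply(pd.to_numeric, errors="coerce")

# Removing "£" first leaves strings such as "-7.50", which do parse.
cleaned = df[cols].replace("£", "", regex=True).apply(pd.to_numeric, errors="coerce")

print(raw["Debit/Credit"].isna().all())
print(cleaned["Debit/Credit"].tolist())
```

The minus sign survives because only the currency symbol is removed, and the original NaN stays NaN.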

You can also use a regex.

Example:

import pandas as pd
df = pd.DataFrame({"Debit/Credit": ["-£7.50", "-£5.79", "-£28.50", "-£3.99", "-£17.99"]})
# Note: the pattern captures only the digits, so the leading minus sign is dropped.
df["Debit/Credit"] = df["Debit/Credit"].str.extract(r"(\d*\.\d+)", expand=True).apply(pd.to_numeric)
print(df)
Output:

   Debit/Credit
0          7.50
1          5.79
2         28.50
3          3.99
4         17.99
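Because the sign sits before the £ symbol, the pattern above returns positive values. If the sign matters, one possible variant (a sketch, not part of the original answer) captures the optional minus and the digits as two groups and joins them before parsing. The `"£28.50"` entry and the name `parts` are hypothetical, added only to show a value without a minus sign.

```python
import pandas as pd

# Hypothetical sample; "£28.50" is included to show a value with no minus sign.
df = pd.DataFrame({"Debit/Credit": ["-£7.50", "-£5.79", "£28.50", "-£3.99"]})

# Group 0: optional leading minus. Group 1: the digits. "£" sits between them.
parts = df["Debit/Credit"].str.extract(r"^(-?)£?(\d*\.?\d+)$")

# Rejoin sign and digits ("-" + "7.50" -> "-7.50"), then parse to float.
df["Debit/Credit"] = pd.to_numeric(parts[0] + parts[1], errors="coerce")
print(df["Debit/Credit"].tolist())
```

Rows that do not match the pattern come back as NaN from `str.extract`, so they remain NaN after conversion.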


I usually convert it to float like this:

df['Debit/Credit'] = df['Debit/Credit'].replace('£', '', regex = True).astype('float')
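One difference from the `errors='coerce'` approach is worth keeping in mind: `astype('float')` passes NaN through but raises if any non-numeric text is left after the replacement. A small sketch, using hypothetical sample data (the `"n/a"` entry is invented to trigger the error):

```python
import numpy as np
import pandas as pd

# Hypothetical sample: currency strings mixed with NaN, as in the question.
s = pd.Series(["-£7.50", np.nan, "-£3.99"], name="Debit/Credit")
as_float = s.replace("£", "", regex=True).astype("float")
print(as_float.dtype)

# A leftover non-numeric string makes astype raise instead of yielding NaN.
dirty = pd.Series(["-£7.50", "n/a"])
raised = False
try:
    dirty.replace("£", "", regex=True).astype("float")
except ValueError as exc:
    raised = True
    print("astype failed:", exc)
```

If the column may contain stray text, `pd.to_numeric(..., errors='coerce')` is the more forgiving choice; `astype('float')` is stricter and surfaces bad values immediately.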


What did you expect? What type of number should it be?