Python: How to replace 'k' or 'm' with 000s in an object column of a dataframe and replace non-numeric values?


python pandas


I have a df that looks like this. The dtype is object and it cannot be converted to int or float:

col1
100
100k
100k-100m
10m
50

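How can I replace k with 000 and m with 000000 in a column of type object?

Also, once I can replace k or m, how can I replace everything that is not a number with zero?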

The new df should have no blank values.

I have tried this code:

 df.col1 = (df.col1.replace(r'[KM]+$', '', regex=True).astype(float) * \
          df.col1.str.extract(r'[\d\.]+([KM]+)', expand=False)
             .fillna(1)
             .replace(['K','M'], [10**3, 10**6]).astype(int))

But the column has to be float.
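One likely reason the attempt fails on this data is that its patterns look for uppercase K and M while the column uses lowercase, so the suffix is never stripped and astype(float) raises on '100k'. Below is a minimal sketch of the same multiplier idea. It assumes lowercase suffixes and that any value which is not a single number with an optional suffix should become 0. The names multipliers, parts and values are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["100", "100k", "100k-100m", "10m", "50"]})

# Hypothetical suffix-to-multiplier table; "" covers plain numbers.
multipliers = {"": 1, "k": 10**3, "m": 10**6}

# Anchored pattern: one number, an optional k/m suffix, and nothing else.
# Rows that do not match as a whole (e.g. "100k-100m") come back as NaN.
parts = df["col1"].str.extract(r"^(?P<num>\d+(?:\.\d+)?)(?P<suffix>[km]?)$")

# Map each suffix to its multiplier; fillna(1) is a safety net for missing suffixes.
factor = parts["suffix"].map(multipliers).fillna(1)
values = pd.to_numeric(parts["num"], errors="coerce") * factor

# Non-numeric rows become 0 instead of NaN.
df["col1"] = values.fillna(0).astype(int)
print(df)
```

Anchoring the pattern with ^ and $ is what keeps the range value from being partially parsed as 100k.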

Create a mapping dictionary and use str.replace:

dct = {'k': '000', 'm': '000000'}

df.col1.str.replace(r'|'.join(dct.keys()), lambda x: dct[x.group()], regex=True)

If you would rather discard the third row's value (it becomes NaN) instead of replacing within it, as in your output:

(pd.to_numeric(df.col1.str.replace(r'|'.join(dct.keys()),
    lambda x: dct[x.group()], regex=True), errors='coerce'))
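To see both steps on the sample data, here is a self-contained sketch of the dictionary approach. The intermediate names expanded and numeric are illustrative. regex=True is passed explicitly on the assumption that a recent pandas version is in use, where a callable replacement is only accepted for regex patterns.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["100", "100k", "100k-100m", "10m", "50"]})
dct = {"k": "000", "m": "000000"}

# Build the alternation "k|m" from the dict keys and look up each match in the dict.
expanded = df["col1"].str.replace("|".join(dct), lambda m: dct[m.group()], regex=True)
print(expanded.tolist())

# Values that are still not plain numbers are coerced to NaN.
numeric = pd.to_numeric(expanded, errors="coerce")
print(numeric)
```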

Similar to @user3483203, but using str.translate instead of str.replace:

df['col1'] = df.col1.str.translate(str.maketrans({'k':'000','m':'000000'}))
>>> df
               col1
0               100
1            100000
2  100000-100000000
3          10000000
4                50

# df['col1'] = pd.to_numeric(df.col1.str.translate(str.maketrans({'k':'000','m':'000000'})),errors='coerce')

#          col1
# 0       100.0
# 1    100000.0
# 2         NaN
# 3  10000000.0
# 4        50.0
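For readers unfamiliar with str.maketrans, the following plain-Python sketch shows what the translation table contains and how it is applied to each value. The loop over sample strings is only for illustration.

```python
# Dict keys must be single characters; values may be strings of any length.
table = str.maketrans({"k": "000", "m": "000000"})

# The resulting table is keyed by Unicode code points (ord("k"), ord("m")).
print(table)

for raw in ["100", "100k", "100k-100m", "10m", "50"]:
    # Every character found in the table is swapped for its replacement string.
    print(raw, "->", raw.translate(table))
```

Because the table works character by character, it needs no regex, but it also cannot distinguish a trailing suffix from a k or m elsewhere in the string.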

Here is what I came up with. Let me know what you think. One extra thing I did was strip the decimal point.

import pandas as pd

df = pd.Series(['100','100k','100k-100m','10m','50'])

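# Expand the suffixes into literal zeros
df = df.str.replace('k', '000', regex=True)
df = df.str.replace('m', '000000', regex=True)
# Anything still non-numeric (e.g. '100000-100000000') becomes NaN
df = pd.to_numeric(df, errors='coerce')
# Drop the '.0' by converting to str and keeping the part before the dot (NaN becomes 'nan')
df = df.apply(str).str.split('.', expand=True).iloc[ : , 0 ]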

print(df)

What should 100k-100m become?

@user3483203 I want to replace it with zero.
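Given that the range value should end up as zero, one possible way to finish any of the approaches above is to coerce to numeric and then fill the resulting NaN with 0. This is a minimal sketch; the names s, expanded and result are illustrative.

```python
import pandas as pd

s = pd.Series(["100", "100k", "100k-100m", "10m", "50"], name="col1")

# Literal (non-regex) replacement of the suffixes with zeros.
expanded = s.str.replace("k", "000", regex=False).str.replace("m", "000000", regex=False)

# errors="coerce" turns "100000-100000000" into NaN; fillna(0) makes it 0,
# and astype(int) drops the trailing ".0" without a round trip through strings.
result = pd.to_numeric(expanded, errors="coerce").fillna(0).astype(int)
print(result)
```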

Output of the pd.to_numeric call with errors='coerce' from the first answer:

0         100.0
1      100000.0
2           NaN
3    10000000.0
4          50.0
Name: col1, dtype: float64
