Convert percent string to float in pandas read_csv

python pandas

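When using read_csv in pandas, is there a way to convert values such as "34%" directly to int or float? I would like "34%" to be read directly as 0.34.

Using this in read_csv did not work:

read_csv(..., dtype={'col':np.float})

After loading the CSV as df, this did not work either, failing with the error "invalid literal for float(): 34%":

df['col'] = df['col'].astype(float)

I ended up using this, which works but is long-winded: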

df['col'] = df['col'].apply(lambda x: np.nan if x in ['-'] else x[:-1]).astype(float)/100

You can define a custom function to convert the percentages to floats:

In [149]:
import io
import pandas as pd

# dummy data
temp1 = """index col 
113 34%
122 50%
123 32%
301 12%"""
# custom function taken from https://stackoverflow.com/questions/12432663/what-is-a-clean-way-to-convert-a-string-percent-to-a-float
def p2f(x):
    # strip the '%' sign, then scale from a percentage to a fraction
    return float(x.strip('%'))/100
# pass to the converters param as a dict mapping column name to function
df = pd.read_csv(io.StringIO(temp1), sep=r'\s+', index_col=[0], converters={'col':p2f})
df
Out[149]:
        col
index      
113    0.34
122    0.50
123    0.32
301    0.12
In [150]:
# check that dtypes really are floats
df.dtypes
Out[150]:
col    float64
dtype: object

The percent-to-float function is courtesy of ashwini's answer to the question linked in the code comment.
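The question's own workaround also maps a '-' placeholder to NaN, which the p2f converter would reject with a ValueError. A minimal sketch of a converter that covers that case follows; the name percent_to_float, the comma-separated sample data, and the choice to treat an empty field as missing are assumptions made for illustration, not part of the original answer.

```python
import io
import math

import pandas as pd

# Hypothetical sample data: one row uses '-' as a "no value" placeholder.
CSV_TEXT = """index,col
113,34%
122,-
123,32%
301,12.5%
"""


def percent_to_float(text):
    """Convert '34%' to 0.34; treat '-' or an empty field as missing (NaN)."""
    text = text.strip()
    if text in ("", "-"):
        return float("nan")
    # rstrip('%') removes only a trailing percent sign, so plain numbers also parse.
    return float(text.rstrip("%")) / 100


# read_csv calls the converter once per cell of 'col', passing the raw string.
df = pd.read_csv(
    io.StringIO(CSV_TEXT),
    index_col=0,
    converters={"col": percent_to_float},
)
print(df)
print(df.dtypes)
```

Because the converter runs in Python for every cell, it may be slower on large files than loading the column as strings and converting it afterwards with vectorized .str methods.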

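Your df attempt was very close. Try changing:

df['col'] = df['col'].astype(float)

to:

df['col'] = df['col'].str.rstrip('%').astype('float') / 100.0
#                     ^ use str funcs to elim '%'     ^ divide by 100
# could also be:     .str[:-1].astype(...

Pandas supports Python's string-processing abilities. Just prefix the string function you want with .str and see whether it does what you need. (This includes string slicing too, of course.)

Above, we use .str.rstrip() to remove the trailing percent sign, then divide the whole array by 100.0 to convert from a percentage to the actual value. For example, 45% becomes 0.45.

Although .str.rstrip('%') could also be written as .str[:-1], I prefer to remove the '%' explicitly rather than blindly dropping the last character, just in case...

I believe you don't need "100.0"; just "100" will do. The column is already float64 by that point.

This seems like the better answer. @EdChum, do you agree?

If the column contains a mix of % strings and floats stored as pandas objects, you need to change the above to the following, so that the floats are not divided by 100:

pct = df['col'].str.contains('%', na=False)
df.loc[pct, 'col'] = df.loc[pct, 'col'].str.rstrip('%').astype('float') / 100.0
df['col'] = df['col'].astype(float)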
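The mixed-column case from the last comment can be tried on a small example. This is a sketch under assumptions: the sample values are invented, and na=False is passed to .str.contains so that entries that are not strings (the real floats) count as "not a percent string" instead of producing NaN in the mask, which boolean indexing would reject.

```python
import pandas as pd

# Hypothetical object column mixing percent strings with already-numeric floats.
df = pd.DataFrame({"col": ["34%", 0.5, "12%", 0.07]})

# True only for string entries that contain '%'; non-strings become False.
is_pct = df["col"].str.contains("%", na=False)

# Convert only the percent strings; the plain floats are left untouched.
df.loc[is_pct, "col"] = df.loc[is_pct, "col"].str.rstrip("%").astype(float) / 100

# Everything is numeric now, so the whole column can be cast to float64.
df["col"] = df["col"].astype(float)
print(df)
```

Only the rows selected by the mask are divided by 100, which is the point of the comment: applying the division to the whole column would also shrink values that were already fractions.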