How to remove the $ sign from column values in Python

python pandas

My dataset has many columns containing dollar values with commas, e.g. $150,000.50. After importing the dataset:

datasets = pd.read_csv('salaries-by-college-type.csv')

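The Imputer object fails because a bunch of the values in these columns are $ values. How can I correct this in my Python program?

Here is my dataset. Except for School Type, all the other columns hold $ values with commas. Is there a generic way to remove the $ signs and commas from these column values?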

School Type                          269 non-null object
Starting Median Salary               269 non-null float64
Mid-Career Median Salary             269 non-null float64
Mid-Career 10th Percentile Salary    231 non-null float64
Mid-Career 25th Percentile Salary    269 non-null float64
Mid-Career 75th Percentile Salary    269 non-null float64
Mid-Career 90th Percentile Salary    231 non-null float64

Here is a sample of my dataset:

School Type Starting Median Salary  Mid-Career Median Salary    Mid-Career 10th Percentile Salary   Mid-Career 25th Percentile Salary   Mid-Career 75th Percentile Salary   Mid-Career 90th Percentile Salary
Engineering $72,200.00  $126,000.00     $76,800.00  $99,200.00  $168,000.00     $220,000.00 
Engineering $75,500.00  $123,000.00     N/A $104,000.00     $161,000.00     N/A
Engineering $71,800.00  $122,000.00     N/A $96,000.00  $180,000.00     N/A
Engineering $62,400.00  $114,000.00     $66,800.00  $94,300.00  $143,000.00     $190,000.00 
Engineering $62,200.00  $114,000.00     N/A $80,200.00  $142,000.00     N/A
Engineering $61,000.00  $114,000.00     $80,000.00  $91,200.00  $137,000.00     $180,000.00

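Suppose you have a CSV that looks like the one below. Note: I don't actually know what your CSV looks like, so make sure to adjust the read_csv parameters accordingly, most specifically the sep parameter.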

h1|h2
a|$1,000.99
b|$500,000.00

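Use the converters parameter of pd.read_csv: pass a dictionary whose keys are the names of the columns to convert and whose values are the functions that perform the conversion.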

pd.read_csv(
    'salaries-by-college-type.csv', sep='|',
    converters=dict(h2=lambda x: float(x.strip('$').replace(',', '')))
)

  h1         h2
0  a    1000.99
1  b  500000.00
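To address the "generic way" part of the question, the same converters idea can be extended to several columns at once. The following is only a minimal sketch: the in-memory CSV text, the '|' separator, the choice of columns, and the helper name parse_money are all assumptions made for illustration, not details of the real file.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for the real file; the '|' separator
# and the reduced set of columns are assumptions for this sketch.
csv_text = """School Type|Starting Median Salary|Mid-Career 10th Percentile Salary
Engineering|$72,200.00|$76,800.00
Engineering|$75,500.00|N/A
"""


def parse_money(text):
    """Turn '$72,200.00' into 72200.0; a blank or 'N/A' cell becomes NaN."""
    cleaned = text.strip().replace('$', '').replace(',', '')
    if cleaned in ('', 'N/A'):
        return float('nan')
    return float(cleaned)


# Columns assumed to hold money strings (everything except 'School Type').
money_columns = ['Starting Median Salary', 'Mid-Career 10th Percentile Salary']

salaries = pd.read_csv(
    io.StringIO(csv_text),
    sep='|',
    # One shared converter function, keyed by column name.
    converters={name: parse_money for name in money_columns},
)
```

Because a converter receives the raw cell text, the helper has to handle 'N/A' itself; read_csv does not substitute NaN before calling it.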

Or, suppose you have already imported the dataframe:

df = pd.read_csv(
    'salaries-by-college-type.csv', sep='|'
)

Then use pd.Series.str.replace:

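# regex=True is required in recent pandas, where str.replace defaults to literal matching
df.h2 = df.h2.str.replace(r'[^\d.]', '', regex=True).astype(float)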

df

  h1         h2
0  a    1000.99
1  b  500000.00
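For a frame shaped like the one in the question, the str.replace step can be repeated over every salary column. This is a hedged sketch rather than a definitive recipe: the sample frame is made up to resemble the question's data, and it assumes that every column other than 'School Type' holds money strings. It uses pd.to_numeric with errors='coerce' instead of astype(float) so that leftover text such as 'N/A' becomes NaN rather than raising.

```python
import pandas as pd

# Hypothetical sample resembling the question's data (two salary columns only).
df = pd.DataFrame({
    'School Type': ['Engineering', 'Engineering'],
    'Starting Median Salary': ['$72,200.00', '$75,500.00'],
    'Mid-Career 10th Percentile Salary': ['$76,800.00', 'N/A'],
})

# Assumption: every column except the label column holds money strings.
money_columns = df.columns.drop('School Type')

for name in money_columns:
    # Remove dollar signs and thousands separators.
    stripped = df[name].str.replace(r'[$,]', '', regex=True)
    # errors='coerce' maps anything that is still not a number to NaN.
    df[name] = pd.to_numeric(stripped, errors='coerce')
```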

Or use pd.DataFrame.replace:

df.replace(dict(h2=r'[^\d.]'), '', regex=True).astype(dict(h2=float))

  h1         h2
0  a    1000.99
1  b  500000.00
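The DataFrame.replace variant also scales to many columns by building the two dictionaries programmatically. The sketch below is illustrative only: the sample frame is invented, missing values are assumed to already be NaN (which is what read_csv produces for 'N/A' by default), and 'School Type' is assumed to be the only non-money column.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; the missing value is already NaN, as read_csv
# would produce for an 'N/A' cell by default.
df = pd.DataFrame({
    'School Type': ['Engineering', 'Engineering'],
    'Starting Median Salary': ['$72,200.00', '$75,500.00'],
    'Mid-Career 10th Percentile Salary': ['$76,800.00', np.nan],
})

# Assumption: every column except 'School Type' holds money strings.
money_columns = [name for name in df.columns if name != 'School Type']

cleaned = (
    # Per-column regex: strip '$' and ',' only in the money columns.
    df.replace({name: r'[$,]' for name in money_columns}, '', regex=True)
      # Then cast just those columns to float; NaN stays NaN.
      .astype({name: float for name in money_columns})
)
```

Both calls return new objects, so df itself keeps the original strings and the result lives in cleaned.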

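df.column = df.column.str.strip("$")

Thanks... 150000.50?

…strip(",")

@fallerenreaper What about the commas in the middle? strip doesn't work that way. It only removes characters at the beginning and end of the string.

That is not the dataset. Those are the data types. I need to see the data with the dollar signs or other symbols.

This is my dataset; except for the first column, all the rest have $ and comma values: School Type 269 non-null object, Starting Median Salary 269 non-null float64, Mid-Career Median Salary 269 non-null float64, Mid-Career 10th Percentile Salary 231 non-null float64, Mid-Career 25th Percentile Salary 269 non-null float64, Mid-Career 75th Percentile Salary 269 non-null float64, Mid-Career 90th Percentile Salary 231 non-null float64

@Kda You need to edit your question and put the data there. If this is a dataset from Kaggle, the problem may be something else. It is a CSV that loads easily in pandas.

After creating df, all you need is df.apply(lambda x: x.str.replace(r'[$,]', '', regex=True))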