Python: removing the trailing "\xa0" marker from headers in a DataFrame

python python-3.x

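I have a dataset in which some of the headers end with a hex code ("\xa0", a non-breaking space) rather than a normal space. Below is my attempt to get rid of it, but it is still there.

Input: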

files=[file1,file2,file3]
for f in files:
    for col in f.columns:
        col = col.replace("\xc2\xa0", "")
        col = col.replace(u'\xa0', u' ')
    print(f.columns.values)

Output:

'Name' 'Date' 'rep_cur' 'Passenger Revenue\xa0' 'Cargo Revenue\xa0'
 'Other Revenue\xa0' 'Total Cargo & Other Revenue' 'Total Revenue\xa0'
 '% inc / (dec) to previous period' 'Employee Costs\xa0' 'Fuel and oil\xa0'

Use str.strip:
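A minimal sketch of that approach, assuming an empty illustrative DataFrame named df whose column names are copied from the question's output (the real data is not shown in the post):

```python
import pandas as pd

# Illustrative frame: column names copied from the question's output.
columns = ['Name', 'Date', 'rep_cur', 'Passenger Revenue\xa0', 'Cargo Revenue\xa0',
           'Other Revenue\xa0', 'Total Cargo & Other Revenue', 'Total Revenue\xa0',
           '% inc / (dec) to previous period', 'Employee Costs\xa0', 'Fuel and oil\xa0']
df = pd.DataFrame(columns=columns)

# str.strip() with no argument removes leading and trailing whitespace,
# and Python treats the non-breaking space '\xa0' as whitespace.
df.columns = [col.strip() for col in df.columns]

print(df.columns.tolist())
```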

Output:

['Name',
 'Date',
 'rep_cur',
 'Passenger Revenue',
 'Cargo Revenue',
 'Other Revenue',
 'Total Cargo & Other Revenue',
 'Total Revenue',
 '% inc / (dec) to previous period',
 'Employee Costs',
 'Fuel and oil']

Reassigning col inside the loop has no effect on the actual column names being iterated over. It is equivalent to:

li = [1, 2, 3]
for n in li:
    n = n + 1
print(li)
# [1, 2, 3]

A decent IDE should show you a warning that n (or col, in your example) is reassigned without ever being used.
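For contrast, here is a small sketch of a loop that does change the list. It writes back through the index instead of rebinding the loop variable; the enumerate-based form is just one way to do it.

```python
li = [1, 2, 3]

# Assigning to li[i] modifies the list itself;
# assigning to n would only rebind the local name.
for i, n in enumerate(li):
    li[i] = n + 1

print(li)
```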

Instead, you should use the tools pandas provides, for example .rename.
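A minimal sketch of the .rename route, assuming a small made-up DataFrame. It passes the function through the columns= keyword rather than axis='columns', since columns= is also accepted by older pandas releases that lack the axis argument:

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({'Name': ['A'], 'Total Revenue\xa0': [1.0]})

# The callable is applied to every column label.
cleaned = df.rename(columns=lambda col: col.replace('\xa0', ''))

print(cleaned.columns.tolist())
print(df.columns.tolist())  # df itself still has the old labels
```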

Note that .rename returns a new DataFrame. You can use inplace=True to modify the original DataFrame instead:

df.rename(lambda col: col.replace('\xa0', ''), axis='columns', inplace=True)

If you don't want to get too fancy, you can replace the column names yourself, which is similar to what your original code tried to do:

df.columns = [col.replace('\xa0', '') for col in df.columns]
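Applied to the list of files from the question, this could look like the sketch below. file1 and file2 here are tiny stand-in DataFrames, not the asker's data. Assigning to f.columns changes the DataFrame object itself, so the loop works even though f is only a loop variable, and no column names have to be typed by hand.

```python
import pandas as pd

# Stand-in frames; the real files are not shown in the question.
file1 = pd.DataFrame(columns=['Name', 'Passenger Revenue\xa0'])
file2 = pd.DataFrame(columns=['Date', 'Cargo Revenue\xa0'])
files = [file1, file2]

for f in files:
    # Attribute assignment mutates the DataFrame that f refers to.
    f.columns = [col.replace('\xa0', '') for col in f.columns]

print(file1.columns.tolist())
print(file2.columns.tolist())
```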

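Hi, it shows the following error: TypeError: rename() got an unexpected keyword argument 'axis'. My code is: file1.rename(lambda col: col.replace('\xa0', ''), axis='columns', inplace=True), the same call on file3, and then print(file1.columns.values). Since I'm new to Python, could you rewrite it based on my code? Thanks

@DucHaNguyen Does type(file1) output pd.DataFrame?

Also, I tried file1.columns = [col.replace('\xa0', '') for col in df.columns], but it shows the error: ValueError: Length mismatch: Expected axis has 99 elements, new values have 1 elements

Hello, so how can I assign all of these new strings to the DataFrame's headers? Simply assigning df.columns = new_l doesn't work. The error is: ValueError: Length mismatch: Expected axis has 99 elements, new values have 1 elements.

Never mind, it was my mistake: I hadn't put a comma between the elements, because I use PyCharm and the printed output had no commas. But this seems like a manual solution. I have more than 90 columns in each file, which makes the code very long. Do you have another solution?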
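The "new values have 1 elements" message fits what Python does with string literals that are not separated by commas: adjacent literals are joined into a single string. A short sketch with made-up names (new_l is the name used in the comment above, with_commas is hypothetical):

```python
# Pasted from a NumPy-style printout, which has no commas:
new_l = ['Name' 'Date' 'rep_cur']
print(len(new_l))  # the three literals were concatenated into one string

# With commas, each name is its own element.
with_commas = ['Name', 'Date', 'rep_cur']
print(len(with_commas))
```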
