Handling special characters when converting Excel to CSV with Python


Hello, I am having trouble handling special characters when converting an Excel sheet to CSV with Python. When I use

else:
    # Encode strings as UTF-8 to preserve the content of the cell
    row_values.append(cell.value.encode("UTF-8").strip())
the special characters come out as
'a'

When I use

else:
    # Encode strings into ISO-8859-1 format to preserve content of cell
    row_values.append(cell.value.encode("iso-8859-1").strip())
the special characters come out as
'�' (that is, a question mark inside a diamond)

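I believe this is related to the encoding, but I am not sure which encoding to use. The characters come from an Excel sheet that is being converted to CSV.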

Here is the code I am using:

def convert_to_csv(excel_file, input_dir, output_dir):
    """Convert an excel file to a CSV file by removing irrelevant data"""
    try:
        sheet = read_excel(excel_file)
    except UnicodeDecodeError:
        print 'File %s is possibly corrupt. Please check again.' % (excel_file)
        sys.exit(1)
    row_num = sheet.get_highest_row()  # Number of rows
    col_num = sheet.get_highest_column()  # Number of columns
    all_rows = []
    # Loop through rows and columns
    for row in range(row_num):
        row_values = []
        for column in range(col_num):
            # Get cell element
            cell = sheet.cell(row=row, column=column)
            # Ignore empty cells
            if cell.value is not None:
                if type(cell.value) == int or type(cell.value) == float:
                    # String encoding not applicable for integers and floating point numbers
                    row_values.append(cell.value)
                else:
                    # Encode strings into ISO-8859-1 format to preserve content of cell
                    row_values.append(cell.value.encode("iso-8859-1").strip())
            else:
                row_values.append('')
        # Append rows only having more than three values each
        if len(set(row_values)-{''}) > 3:
            # print row_values
            all_rows.append(row_values)
    # Saving the data to a csv extension with the same name as the given excel file
    output_path = os.path.join(output_dir, excel_file.split('.')[0] + '.csv')
    with open(output_path, 'wb') as f:
        writer = csv.writer(f, delimiter=";", quoting=csv.QUOTE_ALL)

        writer.writerows(all_rows[1:])
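I am using Python 2.6.9. I would like to know whether we can apply a regular expression before writing to the CSV, or whether there is another way to handle this.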

Thanks in advance.
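Both symptoms described in the question are consistent with the CSV bytes being read back in a different encoding from the one they were written in. The sketch below is only an illustration of that mismatch, not a diagnosis of the asker's file: the sample value `caf\u00e9` is hypothetical, and which encoding the viewing program assumes is an assumption here.

```python
# -*- coding: utf-8 -*-
# Hypothetical cell value containing one non-ASCII character (e with acute accent).
text = u"caf\u00e9"

utf8_bytes = text.encode("utf-8")         # the accented letter becomes two bytes
latin1_bytes = text.encode("iso-8859-1")  # the accented letter becomes one byte

# UTF-8 bytes read by a program that assumes ISO-8859-1:
# each of the two bytes is shown as a separate character.
misread_utf8 = utf8_bytes.decode("iso-8859-1")

# ISO-8859-1 bytes read by a program that assumes UTF-8:
# the lone byte is invalid and is shown as U+FFFD, the replacement character.
misread_latin1 = latin1_bytes.decode("utf-8", "replace")

print(repr(misread_utf8))
print(repr(misread_latin1))
```

If this is what is happening, neither encoding is wrong in itself; the writer and the reader simply need to agree on one.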

We fixed it with:

else:
    # Strip every non-ASCII character from the cell text before writing it
    row_values.append(
        re.sub(r'[^\x00-\x7f]', r'', cell.value).strip())
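Note that this substitution removes the special characters rather than preserving them. If the characters need to survive, one possible alternative is to keep the cell text as Unicode and encode the whole CSV once, as UTF-8 with a byte order mark, which spreadsheet programs commonly use to detect UTF-8. The sketch below assumes Python 3 (the question uses Python 2.6.9, where the `csv` module works on byte strings instead); the helper names `strip_non_ascii` and `rows_to_csv_bytes` and the sample value are hypothetical.

```python
import csv
import io
import re


def strip_non_ascii(value):
    """Same substitution as the fix above: drop every non-ASCII character."""
    return re.sub(r'[^\x00-\x7f]', '', value).strip()


def rows_to_csv_bytes(rows):
    """Serialize rows to CSV text, then encode once as UTF-8 with a BOM."""
    buffer = io.StringIO(newline='')
    writer = csv.writer(buffer, delimiter=';', quoting=csv.QUOTE_ALL)
    writer.writerows(rows)
    return buffer.getvalue().encode('utf-8-sig')


sample = u' caf\u00e9 '  # hypothetical cell value

# With the regex fix the accented letter is lost.
stripped = strip_non_ascii(sample)

# With a single UTF-8 encode at the end the accented letter is kept.
data = rows_to_csv_bytes([[sample.strip(), 3]])
```

The resulting bytes can be written with `open(output_path, 'wb')`. Whether the BOM is needed depends on the program that will open the file.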