Python: How to remove annoying data from a CSV file


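I want to remove some strings ("Description" and "This is a simulation") from a CSV file. I also want to remove the "=" characters in the data and the "," at the end of each data line. The file looks like this: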

"time","student","items"

="09:00:00","Tim","apple",

="09:00:10","Jason","orange",

"09:10:10","Emily","grape",

"09:22:10","Ivy","kiwi",

"Description"

"This is a simulation"

I have tried the following, but it does not work:

ff = []
import csv
with open('file.csv') as f:
    for row in csv.DictReader(f):
        row.replace(',','')
        ff.append(row)

This is what I want:

"time","student","items"

"09:00:00","Tim","apple"

"09:00:10","Jason","orange"

"09:10:10","Emily","grape"

"09:22:10","Ivy","kiwi"

You may want to read the file as plain text rather than as CSV, which makes it easier to apply string operations to it.

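Edit: I am assuming that tmp is the path to the CSV file and that the result you want is the list of dictionaries generated by csv.DictReader. You can then write convert(tmp) in two main steps. The first reformats the file and writes the result to a temporary file; the second reads the temporary file into a list of dictionaries with csv.DictReader. Once the data has been read, the temporary file is deleted using the os module: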

import csv
import os

def convert(tmp):
    new_lines = []
    temp_file = tmp + '.tmp'
    with open(tmp) as fd:
        for line in fd:
            # remove new line characters
            line = line.replace('\n', '').replace('\r', '')

            # delete string
            line = line.replace('=', '').replace('"Description"', '').replace('"This is a simulation"', '')

            # don't add empty string
            if line.strip() == '':
                continue

            # remove last line commas
            if line[-1] == ',':
                line = line[:-1]

            new_lines.append(line)

    # write formatted data to temporary csv file
    with open(temp_file, 'w') as fd:
        fd.write('\n'.join(new_lines))

    # get list data
    ff = None
    with open(temp_file) as f:
        ff = list(csv.DictReader(f))

    # delete temporary file
    os.remove(temp_file)

    return ff

print(convert('./file.csv'))
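
The temporary file is not strictly required, because csv.DictReader accepts any iterable of strings and not only file objects. The following is a minimal sketch of that variation, not the answer's original method. The names clean_lines and convert_in_memory are hypothetical, and the two unwanted strings are taken from the sample file in the question. Unlike convert above, it removes "=" only at the start of a line.

```python
import csv


def clean_lines(lines):
    """Yield cleaned CSV lines, skipping blanks and the two description lines."""
    # Assumption: these are the only non-data lines, as in the sample file.
    unwanted = ('"Description"', '"This is a simulation"')
    for line in lines:
        line = line.strip()
        if not line or line in unwanted:
            continue
        # Remove a leading '=' and a trailing ',' at the ends of the line only.
        yield line.lstrip('=').rstrip(',')


def convert_in_memory(path):
    """Return the rows of the cleaned file as a list of dictionaries."""
    with open(path, newline='') as f:
        # DictReader consumes the generator lazily; the first cleaned line
        # is used as the header row.
        return list(csv.DictReader(clean_lines(f)))
```

Calling convert_in_memory('./file.csv') should give the same kind of result as convert, without creating or deleting any extra file.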

This mainly uses built-in str methods and assumes that the first line is always a valid header row:

ff = []

with open('file.csv') as f:
    for row in f:
        # strip empty lines, and head/tail = ,
        line = row.strip().strip('=').strip(',')

        # skip empty lines
        if not line:
            continue

        # assume first row is always a valid header row
        # split by comma to see if it matches header row
        if not len(ff) or (len(line.split(',')) == len(ff[0].split(','))):
            ff.append(line)
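
The loop above leaves the cleaned lines in ff as plain strings. To get the quoted output shown in the question into a file, one option is to parse those strings with csv.reader and write them back with csv.writer. This is only a sketch under that assumption; save_cleaned and out_path are hypothetical names that do not appear in the answer.

```python
import csv


def save_cleaned(lines, out_path):
    """Parse already-cleaned CSV lines and write them with every field quoted."""
    # csv.reader accepts any iterable of strings, such as the list ff.
    rows = csv.reader(lines)
    with open(out_path, 'w', newline='') as out:
        # QUOTE_ALL reproduces the "value","value" style of the sample file.
        writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator='\n')
        writer.writerows(rows)
```

For example, save_cleaned(ff, 'cleaned.csv') would write one quoted line per entry in ff.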

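If the unwanted characters appear only at the start and/or end of a line, you can use the str.strip method to clean the line, then use the str.split method to split on the comma , and check whether each line yields the same number of elements as the header row (if it does not, drop it or leave it out).

Not a solution, but a (dirty) heuristic: the "good" rows after the header row all contain a colon. Add the line if ':' not in line: continue to skip lines without a colon. Also, although the csv module is powerful, it offers less flexibility than simply reading the file line by line as strings, especially when the cleanup requires string manipulation. So just loop with for line in f and do the rest there.

How can I use a function like def convert(tmp): return @9898?

I have updated the answer with a convert(tmp) function that returns a list of dictionaries, one per row.

Thanks for answering this question.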
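
The colon heuristic from the comments can be sketched as follows. It assumes, as the comment does, that every valid data row contains a time such as 09:00:00 and that the first non-empty line is the header. The function name keep_rows_with_colon is hypothetical.

```python
def keep_rows_with_colon(lines):
    """Return the header line plus every later line that contains a colon."""
    kept = []
    for raw in lines:
        # Same cleanup as in the answer: strip whitespace, then '=' and ','.
        line = raw.strip().strip('=').strip(',')
        if not line:
            continue
        # The first non-empty line is assumed to be the header and is kept
        # unconditionally; later lines must contain ':' to be kept.
        if kept and ':' not in line:
            continue
        kept.append(line)
    return kept
```

It can be used as keep_rows_with_colon(open('file.csv')), although a with block is preferable so that the file is closed. The heuristic breaks if a description line ever contains a colon, which is why the comment calls it dirty.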