Reformatting a CSV based on a specific field using Python

python csv

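I have a test.csv containing items I scraped from a website, but the "number" field (the last column) contains duplicates. I need to remove every row whose number has already appeared in an earlier row. This is just a sample file; in the real file, some numbers are repeated more than 50 times.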

http://example.com/item/all-atv-quad.html,David,"Punjab",+123456789123
http://example.com/item/70cc-2014.html,Qubee,"Capital",+987654321987
http://example.com/item/quad-bike-zenith.html,Zenith,"UP",+123456789123

I need to reformat the CSV so that each number appears only once. My code so far:

import csv

with open('test.csv', newline='') as csvfile:
    csvreader = csv.reader(csvfile, delimiter=',')

    for row in csvreader:
        # some logic here?

        if row[3] == "+123456789123":
            print(row[0])

            # or here?

Filtering out the duplicates should leave:

http://example.com/item/all-atv-quad.html,David,"Punjab",+123456789123
http://example.com/item/70cc-2014.html,Qubee,"Capital",+987654321987

This is the filtering step; combine it with the usual csv read/write logic:

seen = set()
for line in csvreader:
    if line[3] in seen:
        continue
    seen.add(line[3])
    # write line to output file
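
For completeness, here is a minimal sketch of how the filter above might be combined with the read/write logic. The function name dedupe_csv and its parameters dst_path and key_index are illustrative and do not come from the original post, and dropping rows that are too short to have a number is an assumption about how malformed lines should be handled.

```python
import csv


def dedupe_csv(src_path, dst_path, key_index=3):
    """Copy src_path to dst_path, keeping the first row seen for each key.

    dedupe_csv, dst_path and key_index are illustrative names, not part of
    the original post. Returns the number of rows written.
    """
    seen = set()
    written = 0
    with open(src_path, newline='') as src, \
            open(dst_path, 'w', newline='') as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            # Assumption: blank or short rows have no key, so they are dropped.
            if len(row) <= key_index:
                continue
            key = row[key_index]
            if key in seen:
                continue  # this number already appeared in an earlier row
            seen.add(key)
            writer.writerow(row)
            written += 1
    return written
```

It could be called as dedupe_csv('test.csv', 'deduped.csv'), where the output file name is a placeholder. Note that csv.writer quotes fields only when necessary by default, so a field written as "Punjab" in the input comes out as Punjab in the output.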

You can shorten it:

seen = set(line[3] for line in csvreader)

This assumes order doesn't matter.

@PawełKordowski The aim is to write only the rows in which a value in the last column occurs for the first time. Having just the seen set is not much use for that.
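
To illustrate the point made in these comments, the sketch below contrasts the shortened set expression with one possible way of keeping the first row for each number. It reads the question's sample rows from an in-memory string; SAMPLE, numbers, first_rows and unique_rows are illustrative names, and the dict-based variant is an alternative to the answer's loop, not something proposed in the thread. It relies on dicts preserving insertion order, which Python guarantees from version 3.7.

```python
import csv
import io

# Sample rows from the question, held in memory for the demonstration.
SAMPLE = (
    'http://example.com/item/all-atv-quad.html,David,"Punjab",+123456789123\n'
    'http://example.com/item/70cc-2014.html,Qubee,"Capital",+987654321987\n'
    'http://example.com/item/quad-bike-zenith.html,Zenith,"UP",+123456789123\n'
)

rows = list(csv.reader(io.StringIO(SAMPLE)))

# The shortened form yields only the distinct numbers; the rows are lost
# and a set has no defined order.
numbers = {row[3] for row in rows}

# Keep the first row per number: setdefault stores a row only if the key
# is new, so later rows with the same number are ignored.
first_rows = {}
for row in rows:
    first_rows.setdefault(row[3], row)

unique_rows = list(first_rows.values())
```

Here numbers holds the two distinct phone numbers, while unique_rows holds the David and Qubee rows in their original order, which is what the question asks for.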