How to make Python treat only commas with no space before or after them as delimiters

Tags: python, csv, delimiter, python-3.5

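I have a csv file that I am trying to read into Python, manipulate, and then write out to another csv file.

My current problem is that, although the file is comma-delimited, not all of the commas are delimiters.

Only commas that are not preceded and/or followed by a space should count as delimiters (only "," and not ", " or " ,").

Here is what my code looks like: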

import csv

#open file for reading
with open(mypath, 'r', encoding = 'utf_8') as csvfile:
    myfile = list(csv.reader(csvfile, dialect = 'excel', delimiter = ','))
    #specifying columns to be deleted
    BadCols = [29,28,27,25,21,20,19,18,16,15,14,13,12,11,8,7,4,3] 
    #Loop through column indices to be deleted
    for col in BadCols:        
        #Loop through each row to delete columns
        for i, row in enumerate(myfile):
            #Delete Column, which is basically a list item at that row
            myfile[i].pop(col)


#Open file for writing
with open(mypath2, "w", encoding = 'utf_8', newline='') as csvfile:
    csv_file = csv.writer(csvfile, dialect = 'excel', delimiter = ',')
    for i, row in enumerate(myfile):
        for j, col in enumerate(row):
            csvfile.write('%s, ' %col)
        csvfile.write('\n')
csvfile.close

My data looks like this:

Date,Name,City
May 30, 2016,Ryan,Boston

Here is what I expect to see when I open the file in Excel:

Date            Name    City
May 30, 2016    Ryan    Boston

Here is what I actually see in Excel:

Date     [Blank column name]    Name   City
May 30   2016                   Ryan   Boston

So the date is being read as two elements instead of one.

Any help is greatly appreciated.

A regular expression may be your best option:

import re

patt = re.compile(r"\b,\b")
with open("in.csv") as f:
    for row in map(patt.split, f):
        print(row)

This will give you:

['Date', 'Name', 'City\n']
['May 30, 2016', 'Ryan', 'Boston']
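
You will have to deal with the trailing whitespace (such as the newline on 'City\n'), but that should not be a big problem. Obviously, if you had something like "foo,bar" as a name you would also run into trouble; if that is not the case, the re approach will work fine.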
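
To get from the split rows to a file that Excel reads as intended, something along these lines might work. It is only a sketch: the sample data is copied from the question and held in an in-memory `io.StringIO` instead of a real file, and `split_rows` is a hypothetical helper name. Keep in mind that `\b,\b` does not split next to an empty field, or next to a field that starts or ends with punctuation.

```python
import csv
import io
import re

# Split only on commas that sit directly between two word characters.
PATT = re.compile(r"\b,\b")


def split_rows(lines):
    """Yield one list of fields per input line, without the line ending."""
    for line in lines:
        yield PATT.split(line.rstrip("\r\n"))


# Sample data from the question, held in memory instead of a real file.
source = io.StringIO("Date,Name,City\nMay 30, 2016,Ryan,Boston\n")
rows = list(split_rows(source))

# csv.writer quotes any field that contains the delimiter, so the date
# is written as one quoted field rather than two bare ones.
out = io.StringIO()
csv.writer(out).writerows(rows)
result = out.getvalue()
print(result)
```

With the default `QUOTE_MINIMAL` setting, `csv.writer` writes the second row as `"May 30, 2016",Ryan,Boston`, which Excel should open as a single date cell. For a real output file, open it with `newline=''` as the question already does.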

Another option might be to replace " ," and ", " with a single space:

import csv
import re

patt = re.compile(r"\s(,)|(,)\s")

with open("in.csv") as f:
    for line in csv.reader(map(lambda s: patt.sub(" ", s), f)):
        print(line)

So for:

Date,Name,City
May 30, 2016,Ryan,Boston
May 31 ,2016,foo,Narnia

You would get:

['Date', 'Name', 'City']
['May 30 2016', 'Ryan', 'Boston']
['May 31 2016', 'foo', 'Narnia']
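
If the commas inside the dates should be kept rather than replaced, a different technique is to split with lookarounds, which follows the rule from the question literally: a comma is a delimiter only when there is no space directly before or after it. This is a sketch and not part of the answer above; `GOOD_COMMA` and `split_line` are hypothetical names, and it assumes the only whitespace next to a "bad" comma is a plain space.

```python
import re

# A comma counts as a delimiter only if it is neither preceded
# nor followed by a space.
GOOD_COMMA = re.compile(r"(?<! ),(?! )")


def split_line(line):
    """Split one line on 'good' commas, dropping the line ending first."""
    return GOOD_COMMA.split(line.rstrip("\r\n"))


sample = [
    "Date,Name,City\n",
    "May 30, 2016,Ryan,Boston\n",
    "May 31 ,2016,foo,Narnia\n",
]
parsed = [split_line(line) for line in sample]
for fields in parsed:
    print(fields)
```

Unlike the `\b,\b` pattern, this also splits around empty fields, so `a,,b` yields three fields.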

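CSV with a field separator that is also used as content without "quoting" - shiver. As a fast hack, I would suggest first replacing all the "good" separators with an out-of-band character (say a pipe, |) that does not occur anywhere else in the file, then either splitting on that character or letting the csv module parse it with a special dialect or auto-detection, and you are done. But maybe it is too late in the evening here ;-)

Or parse from the right with a simple line.rsplit(',',2) or similar, if, starting from the right, there are always two commas that are "good". +1 for @padraic cunningham's answer.

What you have is not a proper CSV file. Fix the file…

For those facing the same problem, you can also try the Pandas library, especially if the solution Padraic suggested does not work for you. It is easy to use.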
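
The two quick hacks suggested in the comments could be sketched roughly as follows. The function names are hypothetical, the definition of a "good" comma (no space on either side) is an assumption taken from the question, and the sample lines are copied from it.

```python
import csv
import re

# Assumption: a "good" comma has no space on either side.
GOOD_COMMA = re.compile(r"(?<! ),(?! )")


def read_with_marker(lines, marker="|"):
    """Swap good commas for an out-of-band marker, then parse on the marker."""
    prepared = []
    for line in lines:
        if marker in line:
            # The hack only works if the marker never occurs in the data.
            raise ValueError("marker already occurs in the data: %r" % marker)
        prepared.append(GOOD_COMMA.sub(marker, line))
    return list(csv.reader(prepared, delimiter=marker))


def split_from_right(line, good_commas=2):
    """Treat only the last `good_commas` commas on the line as separators."""
    return line.rstrip("\r\n").rsplit(",", good_commas)


lines = ["Date,Name,City\n", "May 30, 2016,Ryan,Boston\n"]
piped = read_with_marker(lines)
from_right = [split_from_right(line) for line in lines]
print(piped)
print(from_right)
```

The `rsplit` variant only holds up when every row has the same number of trailing fields and none of those fields contains a comma itself.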