Strip zeros from a CSV with Python
Hi, I have a CSV file and I need to strip zeros from it with Python. Column 6 (row[5] in Python) defaults to 7 digits. With this input:
AFI12001,01,C-,201405,P,0000430,2,0.02125000,US,60.0000
AFI12001,01,S-,201404,C,0001550,2,0.03500000,US,30.0000
I need to remove the leading zeros, and then add one or more zeros back so that the field has 4 digits in total.
So I need it to look like this:
AFI12001,01,C-,201405,P,0430,2,0.02125000,US,60.0000
AFI12001,01,S-,201404,C,1550,2,0.03500000,US,30.0000
This code adds the zeros:
import csv

new_rows = []
with open('csvpatpos.csv', 'r') as f:
    csv_f = csv.reader(f)
    for row in csv_f:
        new_row = ""
        col = 0
        print(row)
        for x in row:
            col = col + 1
            if col == 6:
                # pad 3-digit values to 4 digits
                if len(x) == 3:
                    x = "0" + x
            new_row = new_row + x + ","
        print(new_row)
However, I am having trouble removing the leading zeros.

You can use the lstrip and zfill methods. Like this:
import csv

with open('input') as in_file:
    csv_reader = csv.reader(in_file)
    for row in csv_reader:
        stripped_data = row[5].lstrip('0')
        new_data = stripped_data.zfill(4)
        print(new_data)
This prints:
0430
1550
This line:
    stripped_data = row[5].lstrip('0')
strips all the zeros on the left. This line:
    new_data = stripped_data.zfill(4)
pads the front with zeros so that the total number of digits is 4.
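As a sketch, the same two steps can be wrapped in a small reusable function. The name normalize_field and the width parameter are assumptions made for illustration; they are not part of the answer above.

```python
def normalize_field(value, width=4):
    """Strip leading zeros, then left-pad with zeros to at least `width` digits."""
    stripped = value.lstrip('0')   # '0000430' -> '430'
    return stripped.zfill(width)   # '430' -> '0430'


for sample in ['0000430', '0001550', '0013300']:
    print(normalize_field(sample))
```

Note that zfill only pads, so a value with more than four significant digits keeps all of them, and an empty string comes back as '0000', which matters if the column can be blank.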
Hope this helps.

You could probably do this in a few steps with .lstrip(): strip the zeros, find the length of the resulting string, then prepend 4 - len zeros. However, I think it is easier with a regex:
import csv
import re

with open('infilename', 'r') as infile:
    reader = csv.reader(infile)
    for row in reader:
        stripped_value = re.sub(r'^0{3}', '', row[5])
        print(stripped_value)
This yields:
0430
1550
The regex call has the form re.sub(pattern, substitute, original). The pattern breaks down as:
'^' - match start of string
'0{3}' - match 3 zeros
You said all the strings in column 6 have 7 digits and you want 4, so this replaces the first 3 zeros with an empty string.
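A quick way to see what this pattern does and does not match is to run it over a few values. This is only an illustrative sketch; the helper name strip_three_zeros is an assumption.

```python
import re


def strip_three_zeros(value):
    """Remove exactly three zeros from the start of the string, if present."""
    return re.sub(r'^0{3}', '', value)


for value in ['0000430', '0001550', '0013300']:
    print(value, '->', strip_three_zeros(value))
```

The first two values become 0430 and 1550, but 0013300 is returned unchanged: it starts with only two zeros, so '^0{3}' does not match at all.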
Edit: if you want to replace the rows, I would just write them to a new file:
with open('infilename', 'r') as infile, open('outfilename', 'w') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    for row in reader:
        row[5] = re.sub(r'^0{3}', '', row[5])
        writer.writerow(row)
Edit2: based on your latest request, I would suggest the following:
with open('infilename', 'r') as infile, open('outfilename', 'w') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    for row in reader:
        # strip all 0's from the front
        stripped_value = re.sub(r'^0+', '', row[5])
        # pad zeros on the left to smaller numbers to make them 4 digits
        row[5] = '%04d' % int(stripped_value)
        writer.writerow(row)
Given the following numbers:
['0000430', '0001550', '0013300', '0012900', '0100000', '0001000']
this produces:
['0430', '1550', '13300', '12900', '100000', '1000']
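The comments on this thread mention blank fields, and int('') raises a ValueError. The following is a minimal Python 3 sketch of the same read, fix, write loop that leaves blank fields untouched. It swaps in lstrip/zfill for the int conversion; the function name rewrite_column, its parameters, and the sample rows (including the row with a blank field) are assumptions made up for illustration.

```python
import csv
import io


def rewrite_column(infile, outfile, index=5, width=4):
    """Copy CSV rows, normalizing one column; blank fields are left blank."""
    writer = csv.writer(outfile, lineterminator='\n')
    for row in csv.reader(infile):
        # Only touch rows that have the column and a non-empty value in it.
        if len(row) > index and row[index]:
            row[index] = row[index].lstrip('0').zfill(width)
        writer.writerow(row)


# In-memory stand-ins for the input and output files (hypothetical data).
source = io.StringIO(
    'AFI12001,01,C-,201405,P,0000430,2,0.02125000,US,60.0000\n'
    'AFI12001,01,S-,201404,C,0013300,2,0.15625000,US,21.0000\n'
    'AFI12001,01,S-,201404,C,,2,0.03500000,US,30.0000\n'
)
target = io.StringIO()
rewrite_column(source, target)
print(target.getvalue())
```

With real files in Python 3, the csv documentation recommends opening both with newline='' so the module controls line endings itself.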

Convert the column to an int, then back to a string in whatever format you want:
row[5] = "%04d" % int(row[5])

You can keep the last 4 characters:
columns[5] = columns[5][-4:]
Example:
data = '''AFI12001,01,C-,201405,P,0000430,2,0.02125000,US,60.0000
AFI12001,01,S-,201404,C,0001550,2,0.03500000,US,30.0000'''
for row in data.splitlines():
    columns = row.split(',')
    columns[5] = columns[5][-4:]
    print(','.join(columns))
Result:
AFI12001,01,C-,201405,P,0430,2,0.02125000,US,60.0000
AFI12001,01,S-,201404,C,1550,2,0.03500000,US,30.0000
Edit: the same code using the csv module, instead of data simulating a file:
import csv

with open('csvpatpos.csv', 'r') as f:
    csv_f = csv.reader(f)
    for row in csv_f:
        row[5] = row[5][-4:]
        print(row[5])          # print one element
        #print(','.join(row))  # print full row
        print(row)             # print full row
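Slicing and the int-formatting approach suggested earlier agree only while the value fits in four digits. A small comparison sketch follows; the sample values are taken from numbers quoted elsewhere in this thread, and the variable names are illustrative.

```python
values = ['0000430', '0001550', '0013300']

results = []
for value in values:
    sliced = value[-4:]               # keeps exactly the last 4 characters
    formatted = '%04d' % int(value)   # pads to at least 4 digits, never truncates
    results.append((value, sliced, formatted))
    print(value, sliced, formatted)
```

For 0013300 the slice gives 3300 while the formatted value is 13300, so slicing is only safe if every value is known to have at most four significant digits.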

Comments:

Are all the numbers at the same index? BTW: correct indentation makes code readable.

How would I add all the data plus this to another csv file? Would I use a write command? Also, this is adding zeros to blank fields; is there a way to fix that?

I get "columns is not defined", and after that there are a large number of rows with different data.

See the new example that uses the csv module and no data simulating a file.

This actually works. However, it adds zeros to blank columns. Do you know how to fix that?

Hi Bill, this might actually work. Do I also need to write to a file to replace the data?

It looks like it isn't actually working. For some reason it keeps the zeros. But it looks like it is going in the right direction.

Nick, are you still having problems with this?

It's not working yet. It seems to be adding zeros to numbers with more than 4 digits.
AFI12001,01,W-,201405,P,0560,2,0.01375000,US,55.0000
AFI12001,01,17201404,C,0013300,2,0.15625000,US,21.0000
AFI12001,01,17201404,C,0013400,2,0.06250000,US,30.0000
AFI12001,01,17201404,C,0013500,2,0.03125000,US,10.0000
AFI12001,01,01404,C,0013700,2,0.01563000,US,17201680000,2800,P,38.0000
AFI12001,01,17201404,P,0012900,2,0.10938000,US,5.000

Are you sure that is my solution? This re.sub call does not add any zeros. It only replaces 3 leading zeros in the string. Do you want it to do more? Specifically: 1) are all of your numbers 7 digits? 2) do you want to strip all leading zeros from the numbers (i.e. 0013300 -> 13300)?