Python: get a file's size and append it to a new column of a CSV file


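I'm using Python 2.4 and have a two-column CSV file (host and path).

For each row, I want to get the size of the file at that row's path and add the value to the CSV file as a new column, so that it becomes: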

 HOST, PATH, FILESIZE
 server1, /path/to/file1, 6546542
 server2, /path/to/file2, 46546343
 server3, /path/to/file3, 87523
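
I've tried two approaches, but neither has been very successful.

The code below runs fileSizeCmd (du -b) on each path and prints the file size correctly, but I haven't figured out how to add that value to the CSV file.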

 import datetime
 import csv
 import os, time
 from subprocess import Popen, PIPE, STDOUT

 now = datetime.datetime.now()
 fileSizeCmd = "du -b"
 SP = " "

 # Try to get disk size and append to another row after entry above
 #st = os.stat(row[3])
 #except IOError:
 #print "failed to get information about", file
 #else:
 #print "file size:", st[ST_SIZE]
 #print "file modified:", time.asctime(time.localtime(st[ST_MTIME]))

 incsv = open('my_list.csv', 'rb')
 try:
     reader = csv.reader(incsv)
     outcsv = open('results/results_' + now.strftime("%m-%d-%Y") + '.csv', 'wb')
     try:
         writer = csv.writer(outcsv)

         for row in reader:
             p = Popen(fileSizeCmd + SP + row[1], shell=True, stdin=PIPE, stdout=PIPE, stderr=PIPE)
             stdout, empty = p.communicate()

             print 'Command: %s\nOutput: %s\n' % (fileSizeCmd + SP + row[1], stdout)

             #  Results in bytes example
             #
             #  Output:
             #  8589935104      /path/to/file
             #

             #  Write 8589935104 to new column of csv FILE

     finally:
         outcsv.close()

 finally:
     incsv.close()
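
You can:

1) Read the CSV contents into a list of (server, filename) tuples

2) Collect the file size for each element of that list

3) Pack each result into another tuple (server, filename, filesize) and add it to another list ("results")

4) Write the results to a new file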
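
The four steps above could be sketched roughly as follows. This is only an illustration, not the answerer's code: it assumes Python 3 rather than the asker's 2.4, it assumes the input has no header row, and the function name `append_sizes` is made up for the example.

```python
import csv
import os


def append_sizes(in_path, out_path):
    """Hypothetical helper: copy a (server, path) CSV and add a size column."""
    # Step 1: read the CSV into a list of (server, path) tuples.
    with open(in_path, newline="") as src:
        entries = [(row[0].strip(), row[1].strip())
                   for row in csv.reader(src) if row]

    # Steps 2 and 3: look up each size and pack it into a new 3-tuple.
    results = [(server, path, os.path.getsize(path))
               for server, path in entries]

    # Step 4: write the results to a new file.
    with open(out_path, "w", newline="") as dst:
        csv.writer(dst).writerows(results)
    return results
```

Collecting everything in lists keeps the steps separate and easy to follow; for a very large CSV, writing each row as it is read would use less memory.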

First, there are much easier ways to get a file's size than using subprocess; the standard library can report it directly.
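
As a minimal sketch of getting the size without shelling out, `os.path.getsize` returns the size in bytes, the same value as `os.stat(path).st_size`. The wrapper name `file_size` is only for illustration. One caveat: for a directory, `du -b` adds up the contents, while this reports only the directory entry itself, so the two agree for regular files but not necessarily for directories.

```python
import os


def file_size(path):
    """Hypothetical wrapper: size of a regular file in bytes."""
    # Equivalent to os.stat(path).st_size; raises OSError if the path is missing.
    return os.path.getsize(path)
```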

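Second, having your writer object write to a separate file is the right approach; you only need to append a column to each list returned by the reader and then pass it to writerow on the writer. Something like this: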

>>> writerfp = open('out.csv', 'w')
>>> writer = csv.writer(writerfp)
>>> for row in csv.reader(open('in.csv', 'r')):
...     row.append('column')
...     writer.writerow(row)
...
>>> writerfp.close()

A sketch without error handling:

#!/usr/bin/env python

import csv
import os

filename = "sample.csv"
# localhost, 01.html.bak
# localhost, 01.htmlbak
# ...

def filesize(filename):
    # no need to shell out for filesize
    return os.stat(filename).st_size

with open(filename, 'rb') as handle:
    reader = csv.reader(handle)
    # result is written to sample.csv.updated.csv
    # (binary mode for csv in Python 2; the with block also closes the output)
    with open('%s.updated.csv' % filename, 'wb') as out:
        writer = csv.writer(out)
        for row in reader:
            # need to strip filename, just in case
            writer.writerow(row + [ filesize(row[1].strip()) ])

# result
# localhost, 01.html.bak,10021
# localhost, 01.htmlbak,218982
# ...

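I can't seem to get it to work with 2.4. I think I converted your statement correctly, but I'm still not having much luck.

@miku I got it working. Thanks! It fails if a file doesn't exist, but adding `if os.path.exists(filename):` seems to solve my problem, since the files may not exist.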
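
The missing-file case mentioned in the comment can be handled in one place. The sketch below is one possible way, assuming Python 3; the names `size_or_blank` and `append_size_column` are invented for the example. It catches `OSError` instead of checking `os.path.exists` first, so a file that disappears between the check and the lookup does not crash the run, and it leaves the size cell empty for such rows.

```python
import csv
import os


def size_or_blank(path):
    """Hypothetical helper: size in bytes, or '' if the path cannot be read."""
    try:
        return os.path.getsize(path)
    except OSError:
        return ""


def append_size_column(in_path, out_path):
    """Hypothetical helper: copy in_path to out_path with a size column added."""
    with open(in_path, newline="") as src, \
            open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            if len(row) < 2:
                # Pass blank or short rows through unchanged.
                writer.writerow(row)
                continue
            writer.writerow(row + [size_or_blank(row[1].strip())])
```

Whether to leave the cell empty, write a marker value, or skip the row entirely is a design choice that depends on how the output will be consumed.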