How do I add a new column to a CSV file in Python?

python csv python-3.x

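I have several files that look like this:

Input
Name Code
blackberry 1
wineberry 2
rasberry 1
blueberry 1
mulberry 2

I want to add a new column to all of the CSV files so that they look like this:

Output
Name Code Berry
blackberry 1 blackberry
wineberry 2 wineberry
rasberry 1 rasberry
blueberry 1 blueberry
mulberry 2 mulberry

The script I have so far is: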

import csv
with open('input.csv', 'r') as csvinput:
    with open('output.csv', 'w') as csvoutput:
        writer = csv.writer(csvoutput)
        for row in csv.reader(csvinput):
            writer.writerow(row+['Berry'])

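(Python 3.2)

But in the output, the script skips a line after every row, and the new column contains only Berry:

Output
Name Code Berry
blackberry 1 Berry
wineberry 2 Berry
rasberry 1 Berry
blueberry 1 Berry
mulberry 2 Berry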

Maybe this is what you intended?

Also, CSV stands for comma-separated values, so I think you need commas to separate your values:

Name,Code
blackberry,1
wineberry,2
rasberry,1
blueberry,1
mulberry,2
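If the input files really are separated by spaces or tabs rather than commas, they can be converted first. The following is only a sketch under that assumption; the sample text in `raw` is a hypothetical stand-in for one of the files, and it will not work for values that themselves contain spaces.

```python
import csv
import io

# Hypothetical whitespace-separated input, standing in for one of the files
raw = "Name Code\nblackberry 1\nwineberry 2\n"

out = io.StringIO()
writer = csv.writer(out, lineterminator='\n')
for line in raw.splitlines():
    fields = line.split()      # split on any run of spaces or tabs
    if fields:                 # skip blank lines
        writer.writerow(fields)

print(out.getvalue())
```

To work on real files, replace `raw.splitlines()` with a loop over an open input file and `out` with an output file opened with `newline=''`.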

This should give you an idea of what to do (the session below is Python 2):

>>> import csv
>>> v = open('C:/test/test.csv')
>>> r = csv.reader(v)
>>> row0 = r.next()
>>> row0.append('berry')
>>> print row0
['Name', 'Code', 'berry']
>>> for item in r:
...     item.append(item[0])
...     print item
...     
['blackberry', '1', 'blackberry']
['wineberry', '2', 'wineberry']
['rasberry', '1', 'rasberry']
['blueberry', '1', 'blueberry']
['mulberry', '2', 'mulberry']
>>>

Edit: note that in Python 3 (py3k) you must use next(r) instead of r.next().
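For reference, the same steps as the session above could be written for Python 3 roughly as follows. This is a sketch, not the answerer's code: the in-memory `sample` is a hypothetical stand-in for test.csv so that the snippet runs without a file on disk.

```python
import csv
import io

# Hypothetical in-memory stand-in for C:/test/test.csv
sample = io.StringIO("Name,Code\nblackberry,1\nwineberry,2\n")

reader = csv.reader(sample)
header = next(reader)          # Python 3: next(reader), not reader.next()
header.append('Berry')

rows = [header]
for row in reader:
    row.append(row[0])         # copy the Name value into the new column
    rows.append(row)

print(rows)
```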

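Thanks for accepting the answer. Here is your bonus (your working script):

Please note:

The lineterminator parameter of csv.writer. By default it is set to '\r\n', which is why you were getting double spacing.

The use of a list to collect all the rows and write them in one shot with writerows. If your file is very, very large this is probably not a good idea (RAM), but for normal files I think it is faster because there is less I/O.

As indicated in the comments to this post, instead of nesting the two with statements you can put them on the same line:
with open('C:/test/test.csv', 'r') as csvinput, open('C:/test/output.csv', 'w') as csvoutput: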
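The working script itself does not appear above. A sketch consistent with the three notes might look like the following; the function name `add_berry_column` and its parameters are made up for illustration, and opening both files with `newline=''` is an addition that the `csv` module documentation recommends, not something stated in the answer.

```python
import csv

def add_berry_column(input_path, output_path):
    """Copy input_path to output_path, adding a 'Berry' column that repeats the first field."""
    with open(input_path, 'r', newline='') as csvinput, \
         open(output_path, 'w', newline='') as csvoutput:
        reader = csv.reader(csvinput)
        # lineterminator='\n' replaces the '\r\n' default mentioned in the notes
        writer = csv.writer(csvoutput, lineterminator='\n')

        all_rows = []
        header = next(reader)          # first row is the header
        header.append('Berry')
        all_rows.append(header)

        for row in reader:
            row.append(row[0])         # new column repeats the Name value
            all_rows.append(row)

        writer.writerows(all_rows)     # write everything in one call
```

It would be called as `add_berry_column('C:/test/test.csv', 'C:/test/output.csv')`. Because every row is held in `all_rows` before writing, the memory caveat from the notes applies here too.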


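I can't see where you are adding the new column, but try this:
    import csv

    # Each line of newcolumn.csv holds the value for the matching input row
    with open('newcolumn.csv', 'r') as f:
        berry = [line.rstrip('\n') for line in f]

    with open('input.csv', 'r', newline='') as csvinput, \
            open('output.csv', 'w', newline='') as csvoutput:
        writer = csv.writer(csvoutput)
        for i, row in enumerate(csv.reader(csvinput)):
            # row is a list, so the new value is appended as a one-item list
            writer.writerow(row + [berry[i]])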

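I'm surprised no one has suggested pandas. Although pulling in a set of dependencies like pandas may seem heavier than necessary for such a simple task, it produces a very short script, and pandas is a great library for all sorts of CSV (and really all data types) manipulation. You can't argue with 4 lines of code: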
import pandas as pd
csv_input = pd.read_csv('input.csv')
csv_input['Berries'] = csv_input['Name']
csv_input.to_csv('output.csv', index=False)

See the pandas documentation for more information.
Contents of output.csv:
Name,Code,Berries
blackberry,1,blackberry
wineberry,2,wineberry
rasberry,1,rasberry
blueberry,1,blueberry
mulberry,2,mulberry

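I used pandas and it worked well.
In my case I had to open a file, add some arbitrary columns to it, and then save it back to the same file.
This code adds several columns; you can edit it as needed.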
import pandas as pd

csv_input = pd.read_csv('testcase.csv')         #reading my csv file
csv_input['Phone1'] = csv_input['Name']         #this would also copy the cell value 
csv_input['Phone2'] = csv_input['Name']
csv_input['Phone3'] = csv_input['Name']
csv_input['Phone4'] = csv_input['Name']
csv_input['Phone5'] = csv_input['Name']
csv_input['Country'] = csv_input['Name']
csv_input['Website'] = csv_input['Name']
csv_input.to_csv('testcase.csv', index=False)   #this writes back to your file

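If you do not want the cell values to be copied, first create an empty column manually in the CSV file, for example one named Hours.
Then you can add this line to the code above:
csv_input['New Value'] = csv_input['Hours']

Or, more simply, without adding a column manually:
csv_input['New Value'] = ''    #simple and easy

I hope it helps.

This code will do what you need; I have tested it on the sample code.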
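import csv

# in_path and out_path are the input and output file paths, defined by the caller
with open(in_path, 'r', newline='') as f_in, open(out_path, 'w', newline='') as f_out:
    csv_reader = csv.reader(f_in, delimiter=';')
    writer = csv.writer(f_out)

    for row in csv_reader:
        # Append a copy of the first field as the new last column
        writer.writerow(row + [row[0]])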

Yes, it's an old question, but this might help someone.
import csv
import uuid

# read and write csv files
with open('in_file','r') as r_csvfile:
    with open('out_file','w',newline='') as w_csvfile:

        dict_reader = csv.DictReader(r_csvfile,delimiter='|')
        #add new column with existing
        fieldnames = dict_reader.fieldnames + ['ADDITIONAL_COLUMN']
        writer_csv = csv.DictWriter(w_csvfile,fieldnames,delimiter='|')
        writer_csv.writeheader()


        for row in dict_reader:
            row['ADDITIONAL_COLUMN'] = str(uuid.uuid4().int >> 64) [0:6]
            writer_csv.writerow(row)
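The DictReader/DictWriter pattern above fills the new column with arbitrary uuid-derived digits. To fill it from an existing column instead, as the question asks, the same pattern could be adapted as in this sketch. The helper name `add_derived_column` and the in-memory sample data are hypothetical.

```python
import csv
import io

def add_derived_column(src, dst, new_field, derive, delimiter=','):
    """Stream rows from src to dst, adding new_field computed by derive(row)."""
    reader = csv.DictReader(src, delimiter=delimiter)
    # Keep the existing columns in order and put the new one last
    fieldnames = list(reader.fieldnames or []) + [new_field]
    writer = csv.DictWriter(dst, fieldnames=fieldnames, delimiter=delimiter,
                            lineterminator='\n')
    writer.writeheader()
    for row in reader:
        row[new_field] = derive(row)   # row is a dict keyed by column name
        writer.writerow(row)

# Hypothetical pipe-delimited sample, matching the delimiter used above
src = io.StringIO("Name|Code\nblackberry|1\nmulberry|2\n")
dst = io.StringIO()
add_derived_column(src, dst, 'Berry', lambda row: row['Name'], delimiter='|')
print(dst.getvalue())
```

Because rows are written as they are read, only one row is held in memory at a time.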

Append a new column to an existing CSV file without a header name, using Python:
import csv

default_text = 'Some Text'
# Open the input file in read mode and the output file in write mode
with open('problem-one-answer.csv', 'r', newline='') as read_obj, \
        open('output_1.csv', 'w', newline='') as write_obj:
    # Create a csv.reader object from the input file object
    csv_reader = csv.reader(read_obj)
    # Create a csv.writer object from the output file object
    csv_writer = csv.writer(write_obj)
    # Read each row of the input CSV file as a list
    for row in csv_reader:
        # Append the default text to the row
        row.append(default_text)
        # Write the updated row to the output file
        csv_writer.writerow(row)

Thanks
For large files, you can use pandas.read_csv with the chunksize parameter, which reads the dataset one chunk at a time:
import pandas as pd

INPUT_CSV = "input.csv"
OUTPUT_CSV = "output.csv"
CHUNKSIZE = 1_000 # Maximum number of rows in memory

header = True
mode = "w"
for chunk_df in pd.read_csv(INPUT_CSV, chunksize=CHUNKSIZE):
    chunk_df["Berry"] = chunk_df["Name"]
    # You can apply any other transformation to the chunk here
    # ...
    chunk_df.to_csv(OUTPUT_CSV, header=header, mode=mode, index=False)
    header = False # Do not save the header for the other chunks
    mode = "a" # 'a' stands for append mode, all the other chunks will be appended

If you want to update the file in place, you can write to a temporary file and replace the original with it at the end:
import os
import pandas as pd

INPUT_CSV = "input.csv"
TMP_CSV = "tmp.csv"
CHUNKSIZE = 1_000 # Maximum number of rows in memory

header = True
mode = "w"
for chunk_df in pd.read_csv(INPUT_CSV, chunksize=CHUNKSIZE):
    chunk_df["Berry"] = chunk_df["Name"]
    # You can apply any other transformation to the chunk here
    # ...
    chunk_df.to_csv(TMP_CSV, header=header, mode=mode, index=False)
    header = False # Do not save the header for the other chunks
    mode = "a" # 'a' stands for append mode, all the other chunks will be appended

# Move the finished temporary file over the original
os.replace(TMP_CSV, INPUT_CSV)
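The same write-then-replace idea also works without pandas. The sketch below uses only the standard library and processes one row at a time; the function name `add_column_in_place` and its `derive` callback are hypothetical, and creating the temporary file in the target's own directory is a design choice made here so that `os.replace` never has to cross filesystems.

```python
import csv
import os
import tempfile

def add_column_in_place(path, header_name, derive):
    """Rewrite the CSV at path with one extra column, one row at a time."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file lives next to the target so os.replace stays on one filesystem
    fd, tmp_path = tempfile.mkstemp(suffix='.csv', dir=directory)
    try:
        with os.fdopen(fd, 'w', newline='') as dst, \
             open(path, 'r', newline='') as src:
            reader = csv.reader(src)
            writer = csv.writer(dst)
            header = next(reader, None)        # None if the file is empty
            if header is not None:
                writer.writerow(header + [header_name])
            for row in reader:
                writer.writerow(row + [derive(row)])
        os.replace(tmp_path, path)             # swap in the finished file
    except BaseException:
        os.remove(tmp_path)                    # do not leave a partial file behind
        raise
```

For the file in the question, `add_column_in_place('input.csv', 'Berry', lambda row: row[0])` would add a Berry column that repeats the Name value.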

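To add a new column to an existing CSV file (with a header), when the column to add has a small enough number of values, here is a convenient function (somewhat similar to @joaquin's solution). The function takes:
the existing CSV file name,
the output CSV file name (which will have the updated content), and
a list with the header name followed by the column values.
For example: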
new_list1 = ['test_hdr',4,4,5,5,9,9,9]
add_col_to_csv('exists.csv','new-output.csv',new_list1)

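Comments:

Possible duplicate.

Is it possible that, because you are only writing "Berry" to the file, you only get "Berry" in your last column (row+['Berry'])? What did you want to write?

@Dhara: I want Berry as the header, and each row's Name value as the value in the Berry column. See above.

You can also use a pandas DataFrame, as suggested in this post.

Thanks for the note. I tried, but it gives me AttributeError: '_csv.reader' object has no attribute 'next'. Do you have any idea?

I see you are on py3k. Then you must use next(r) instead of r.next().

Thanks for the bonus!!

Note: instead of nesting with statements, you can separate them with commas on the same line, e.g.: with open(input_filename) as input_file, open(output_filename, 'w') as output_file

@Caumons You are right, that would be the way to do it now. Note that my answer tries to keep the OP's code structure in order to focus on solving his problem.

Create a new question on Stack Overflow.

This should be the accepted answer, since it does not put all the input rows into memory at once.

What about the use of uuid? It just adds some random data to the column, with no explanation!!!

How do I update or add the new column in the same CSV, input.csv?

@AnkitMaheshwari, change the name output.csv in this example to input.csv. It will do the same thing but write to input.csv.

@AnkitMaheshwari Yes, that is the expected behavior. You want to replace the old content (containing Name and Code) with the new content, which has the same two columns plus the new Berries column, as the OP asked.

A word of warning: pandas is great for reasonably sized files. This answer loads all the data into memory, which can be troublesome for large files.

@pedrostrusso But unless you are loading a 4-16 GB file, RAM should be fine. Unless you are using a potato.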

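The add_col_to_csv function described above:

import csv

def add_col_to_csv(csvfile, fileout, new_list):
    # new_list holds the header name followed by one value per data row
    with open(csvfile, 'r', newline='') as read_f, \
            open(fileout, 'w', newline='') as write_f:
        csv_reader = csv.reader(read_f)
        csv_writer = csv.writer(write_f)
        for i, row in enumerate(csv_reader):
            # Raises IndexError if the file has more rows than new_list
            row.append(new_list[i])
            csv_writer.writerow(row)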