Python: I need to read a CSV with an unknown number of columns and write the data to a CSV with a fixed number of columns
I have a file that looks like this:
name,number,email,job1,job2,job3,job4
I need to convert it to look like this:
name,number,email,job1
name,number,email,job2
name,number,email,job3
name,number,email,job4
How can I do this in Python?
As mentioned in a comment, you can use pandas to read, write, and manipulate CSV files. Here is an example of how to solve this problem with pandas:
import pandas as pd
# df = pd.read_csv("filename.csv")  # read the CSV file from disk
# Comment out the line below when reading from disk
df = pd.DataFrame([['ss','0152','ss@','student','others']],columns=['name','number','email','job1','job2'])
print(df)
This prints:
name number email job1 job2
0 ss 0152 ss@ student others
Now we need to know how many columns there are:
x = len(df.columns)
print(x)
This stores the number of columns in x:
5
Now let's create an empty DataFrame with the columns [name, number, email, job]:
c = pd.DataFrame(columns=['name','number','email','job'])
print(c)
Output:
Empty DataFrame
Columns: [name, number, email, job]
Index: []
Now we loop from column index 3 to the last column and concatenate each slice onto the (initially empty) DataFrame:
for i in range(3, x):
    df1 = df.iloc[:, 0:3].copy()  # take the first 3 columns
    df1['job'] = df.iloc[:, i]    # add the i-th column as 'job'
    c = pd.concat([df1, c])       # put df1 in front of the rows collected so far
print(c)
Output:
name number email job
0 ss 0152 ss@ others
0 ss 0152 ss@ student
The DataFrame c now holds the output you want. You can write it to a file with:
c.to_csv('output.csv', index=False)
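The column-by-column loop above can also be expressed with `DataFrame.melt`, which does not need to know how many job columns there are. The following is only a sketch under some assumptions: the first three columns are always name, number, and email, every remaining column is a job, and the names `id_cols`, `job_cols`, `job_col`, and `long_df` are made up for illustration.

```python
import pandas as pd

# Sample data from the answer; in practice this would come from pd.read_csv(...)
df = pd.DataFrame(
    [["ss", "0152", "ss@", "student", "others"]],
    columns=["name", "number", "email", "job1", "job2"],
)

id_cols = ["name", "number", "email"]  # assumed fixed leading columns
job_cols = [col for col in df.columns if col not in id_cols]  # however many exist

# melt produces one row per (person, job column) pair;
# the helper column holding the old column name is dropped afterwards
long_df = (
    df.melt(id_vars=id_cols, value_vars=job_cols,
            var_name="job_col", value_name="job")
    .drop(columns="job_col")
    .dropna(subset=["job"])      # discard rows where a job cell was empty
    .reset_index(drop=True)
)
print(long_df)
```

Unlike the loop, this lists the jobs in their original column order (job1 first).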
Example input and output:
input.csv
header row
name,number,email,job1,job2,job3,job4
name1,number1,email1,job11,job21,job31,job41
output.csv
name,number,email,job1
name,number,email,job2
name,number,email,job3
name,number,email,job4
name1,number1,email1,job11
name1,number1,email1,job21
name1,number1,email1,job31
name1,number1,email1,job41
Suppose this is the DataFrame:
import pandas as pd
# DataFrame.append was removed in pandas 2.0, so the row is passed to the constructor instead
df = pd.DataFrame([{'name':'jon', 'number':123, 'email':'smth@smth.smth', 'job1':'a','job2':'b','job3':'c','job4':'d'}], columns=['name','number','email','job1','job2','job3','job4'])
We define a new DataFrame:
new_df = pd.DataFrame(columns=['name','number','email','job'])
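Now we loop over the old DataFrame and split each row by job. I am assuming you have 4 jobs to split out:
rows = []  # collect the new rows first; DataFrame.append was removed in pandas 2.0
for i, row in df.iterrows():
    for job in range(1, 5):
        job_col = "job" + str(job)
        rows.append({'name':row['name'], 'number':row['number'], 'email':row['email'], 'job':row[job_col]})
new_df = pd.concat([new_df, pd.DataFrame(rows)], ignore_index=True)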
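Since the number of jobs is not known in advance, the hard-coded `range(1, 5)` could be replaced by looking up the job columns from the DataFrame itself. This is a rough sketch, assuming every job column's name starts with "job"; `job_cols` and `rows` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    [{"name": "jon", "number": 123, "email": "smth@smth.smth",
      "job1": "a", "job2": "b", "job3": "c", "job4": "d"}]
)

# Assumption: job columns are exactly those whose name starts with "job"
job_cols = [col for col in df.columns if col.startswith("job")]

rows = []
for _, row in df.iterrows():
    for job_col in job_cols:
        if pd.notna(row[job_col]):  # skip missing job cells
            rows.append({"name": row["name"], "number": row["number"],
                         "email": row["email"], "job": row[job_col]})

# Build the result once, instead of growing a DataFrame row by row
new_df = pd.DataFrame(rows, columns=["name", "number", "email", "job"])
print(new_df)
```

Collecting plain dictionaries in a list and constructing the DataFrame at the end avoids repeatedly copying the frame inside the loop.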
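You can use the csv module and Python's unpacking syntax to read the data from the input file and write it to the output file.
import csv

with open('input.csv', newline='') as infile, open('output.csv', 'w', newline='') as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    # Skip the header row if necessary
    next(reader)
    # Use sequence unpacking to get the fixed fields
    # and any number of "jobs".
    for name, number, email, *jobs in reader:
        for job in jobs:
            writer.writerow([name, number, email, job])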
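If the number of fixed leading columns differs from three, or some rows are padded with empty job cells, slicing each row may be more forgiving than unpacking. The sketch below is one possible variation, not part of the original answer: `split_jobs`, `n_fixed`, and `skip_header` are hypothetical names, and the in-memory sample is a shortened, made-up file used so the example runs without touching the disk.

```python
import csv
import io

def split_jobs(infile, outfile, n_fixed=3, skip_header=True):
    """Write one output row per non-empty job field (hypothetical helper)."""
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    if skip_header:
        next(reader, None)  # the None default avoids StopIteration on an empty file
    for row in reader:
        fixed, jobs = row[:n_fixed], row[n_fixed:]
        for job in jobs:
            if job.strip():  # ignore blank padding cells
                writer.writerow(fixed + [job])

# Made-up sample with four fixed columns and a ragged number of jobs
sample = io.StringIO(
    "forename,lastname,email,mobilenumber,jobtitle,jobtitle2\n"
    "bob,ainsworth,email@email.com,7891234567,Ashtanga,\n"
    "alice,frobisher,email2@email.com,7891234000,Barre 45,Pointe\n"
)
result = io.StringIO()
split_jobs(sample, result, n_fixed=4)
print(result.getvalue())
```

With real files, the two `io.StringIO` objects would be replaced by files opened with `newline=''`, as in the answer above.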
Check out the pandas library. You can create CSV files with pandas. You can get the column names and then edit them as needed.
Hi, thank you so much! That works well. How do I make it skip the header row? And how would I make it loop until it has gone through all the rows (an unknown number)?
Please share a real example of "input.csv" so we can discuss it. (Just share a few lines.)
The code has been modified.
forename,lastname,email,mobilenumber,jobtitle,jobtitle2,jobtitle3,jobtitle4,jobtitle5,jobtitle6
bob,ainsworth,email@email.com,7891234567,Ashtanga,Yoga,Power Yoga,Rolling with my yogi,S&M,Vinyasa Flow Yoga
alice,frobisher,email2@email.com,7891234000,Aunt Liv,Barre 45,Pointe,Callback,Dancehall,Floorwork
@BenNewton Check whether the latest code works for your example.
Thanks, this is really useful. Unfortunately, it needs to work for an unknown number of jobs.