Rearranging .csv data using Python


I have a csv file in which the data for each day is split into a separate column:

'Time', 'Sun 01', 'Mon 02', 'Tue 03', 'Wed 04', 'Thu 05', 'Fri 06', 'Sat 07', 'Sun 08', 'Mon 09', 'Tue 10', 'Wed 11', 'Thu 12', 'Fri 13', 'Sat 14', 'Sun 15', 'Mon 16', 'Tue 17', 'Wed 18', 'Thu 19', 'Fri 20', 'Sat 21', 'Sun 22', 'Mon 23', 'Tue 24', 'Wed 25', 'Thu 26', 'Fri 27', 'Sat 28', 'Sun 29', 'Mon 30'
'00:00-00:05', '0.30', '0.30', '0.30', '0.30', '0.30', '0.40', '0.10', '0.20', '0.20', '0.20', '0.10', '0.20', '0.20', '0.30', '0.30', '0.10', '0.20', '0.20', '0.10', '0.10', '0.00', '0.00', '0.00', '0.00', '0.00', '0.00', '0.00', '0.20', '0.10', '0.10'
'00:05-00:10', '0.30', '0.30', '0.30', '0.30', '0.30', '0.50', '0.20', '0.10', '0.10', '0.20', '0.10', '0.30', '0.10', '0.20', '0.30', '0.10', '0.20', '0.10', '0.20', '0.20', '0.00', '0.00', '0.00', '0.00', '0.00', '0.00', '0.00', '0.10', '0.10', '0.10'
'00:10-00:15', '0.30', '0.30', '0.30', '0.30', '0.30', '0.40', '0.20', '0.20', '0.20', '0.20', '0.20', '0.30', '0.10', '0.30', '0.30', '0.20', '0.10', '0.20', '0.10', '0.10', '0.00', '0.00', '0.00', '0.00', '0.00', '0.00', '0.00', '0.20', '0.20', '0.10'
'00:15-00:20', '0.30', '0.30', '0.30', '0.30', '0.40', '0.50', '0.10', '0.10', '0.10', '0.20', '0.10', '0.30', '0.20', '0.30', '0.30', '0.10', '0.20', '0.20', '0.20', '0.20', '0.00', '0.00', '0.00', '0.00', '0.00', '0.00', '0.00', '0.10', '0.10', '0.00'
'00:20-00:25', '0.30', '0.30', '0.40', '0.40', '0.30', '0.40', '0.20', '0.20', '0.20', '0.20', '0.10', '0.30', '0.10', '0.30', '0.30', '0.10', '0.20', '0.10', '0.20', '0.10', '0.00', '0.00', '0.00', '0.00', '0.00', '0.00', '0.00', '0.20', '0.10', '0.20'
Using Python, is there a way to rearrange the data so that each day's data is appended to the end of the previous days' data, giving one long column?

For example:

Date, Time, Value,
01-01-2000, 00:00, 0.01
01-01-2000, 00:00, 0.01
01-01-2000, 00:05, 0.01
01-01-2000, 00:10, 0.01
02-01-2000, 00:00, 0.01
02-01-2000, 00:05, 0.01
02-01-2000, 00:10, 0.01
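I am stuck trying to recurse through the data. If I assign the data from the csv to a variable, I lose the separate lists, and I am not sure how to separate the data again so that I can append each day to the bottom of the new csv. Is there a way to store the csv data in a variable that keeps a separate list for each row?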

So far I have:

import csv
month_year = "01-2000"
filename = 'test.csv'

converted_data = "converted_" + filename
cols = ['Time', 'Date(dd-mm-yyyy', 'kWh']

interval_count = 0
day = 1

with open(converted_data, 'w') as csvfile:
    csvwriter = csv.writer(csvfile)

    csvwriter.writerow(cols)

    with open(filename, 'r') as csvfile:
        data = csv.reader(csvfile)
        next(data)

        for line in data:
            total_count = len(line[1:]) * 288       # 288 = amount of 5 min intervals in 24 hours

            time_full = line[0]
            time_clean = (time_full[:5])
            if day <= 9:
                date = "0{0}{1}".format(day, month_year)
            else:
                date = "{0}{1}".format(day, month_year)
            # print(line)
            row = [time_clean, date, line[day]]
            print(row)
            csvwriter.writerow(row)
            interval_count += 1
            if interval_count % 288 == 0:
                day += 1
                interval_count = 0
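I commented inside my code. Basically, you use zip() to get a column-wise view of your data. Then you apply some logic to add the times to each day's output. Finally, you write all the data to the output file.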
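A minimal sketch of that column-wise view, using a tiny made-up sample rather than the real file (the `rows` data here is hypothetical): `zip(*rows)` groups the first items of every row, then the second items, and so on, which turns a list of rows into a list of columns.

```python
# Hypothetical sample: a header row plus two time slots for two days.
rows = [
    ["Time", "Sun 01", "Mon 02"],
    ["00:00-00:05", "0.30", "0.40"],
    ["00:05-00:10", "0.10", "0.20"],
]

# zip(*rows) yields tuples holding the i-th item of every row, i.e. the columns.
columns = [list(col) for col in zip(*rows)]

# The first column holds the times, the remaining ones hold one day each.
times, *days = columns

print(times)
print(days)
```

Each entry of `days` starts with the day label, followed by that day's values in time order, so one day can be written out after another.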



Then we read the file back in and process it:

import csv

# we append each row into a list - we get lists of rows:
with open("d.txt","r",newline='') as r:
    reader = csv.reader(r, delimiter = ',', quotechar = "'", skipinitialspace = True)
    data = []
    for row in reader:
        data.append(row)

# we transpose these lists of rows into lists of columns; we separate out the
# time column, as we will need it multiple times - once for each day
time, *dataX = list(map(list,zip(*data)))
print(time)   # see (shortened) debug 
print(dataX)  # output below

# now we open a new csv, with the same settings as your old one:
with open("mod.txt","w",newline='') as w:
    writer = csv.writer(w,delimiter=',', quotechar="'",skipinitialspace=True, quoting=csv.QUOTE_ALL)
    # write a custom header
    writer.writerow(["date","time","value"])
    # for each day's column of data we need to create new output rows
    for r in dataX:
        # that we construct using the times we split out earlier
        for i,t in enumerate(time):
            if i==0: # this is just the text "'Time'" - don't need it
                continue
            # here we take the day ('Sun 01', 'Mon 02', ...), add the time t and index into the data
            writer.writerow([r[0],t,r[i]])


# read created file back in and print line-wise:
with open("mod.txt","r") as r:
    for row in r:
        print(row, end="")
Output:

# the time we split off
['Time', '00:00-00:05', '00:05-00:10', '00:10-00:15', '00:15-00:20', '00:20-00:25']

# the rest of the data
[['Sun 01', '0.30', '0.30', '0.30', '0.30', '0.30'], 
 ['Mon 02', '0.30', '0.30', '0.30', '0.30', '0.30'], 
 ['Tue 03', '0.30', '0.30', '0.30', '0.30', '0.40'], 
        **snipp - you get the gist of it **
 ['Sun 29', '0.10', '0.10', '0.20', '0.10', '0.10'], 
 ['Mon 30', '0.10', '0.10', '0.10', '0.00', '0.20']]

# the created file 
'date','time','value'
'Sun 01','00:00-00:05','0.30'
'Sun 01','00:05-00:10','0.30'
'Sun 01','00:10-00:15','0.30'
'Sun 01','00:15-00:20','0.30'
'Sun 01','00:20-00:25','0.30'
'Mon 02','00:00-00:05','0.30'
'Mon 02','00:05-00:10','0.30'
'Mon 02','00:10-00:15','0.30'
'Mon 02','00:15-00:20','0.30'
'Mon 02','00:20-00:25','0.30'
'Tue 03','00:00-00:05','0.30'
'Tue 03','00:05-00:10','0.30'
'Tue 03','00:10-00:15','0.30'
'Tue 03','00:15-00:20','0.30'
'Tue 03','00:20-00:25','0.40'
 **snipp - you get the gist of it **
'Sun 29','00:00-00:05','0.10'
'Sun 29','00:05-00:10','0.10'
'Sun 29','00:10-00:15','0.20'
'Sun 29','00:15-00:20','0.10'
'Sun 29','00:20-00:25','0.10'
'Mon 30','00:00-00:05','0.10'
'Mon 30','00:05-00:10','0.10'
'Mon 30','00:10-00:15','0.10'
'Mon 30','00:15-00:20','0.00'
'Mon 30','00:20-00:25','0.20'
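If, for each row, you write something like
r[0].split()[-1] + "-01-2000"
instead of
r[0]
you will get closer to your desired output. If you really want other quoting options, read up on the quoting settings of the csv module.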
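A rough sketch of that suggestion, wrapped in two small helpers. The function names `to_date` and `start_time` are illustrative assumptions, and the default `month_year` value is borrowed from the question's code; none of this is part of the answer's original code.

```python
def to_date(day_label, month_year="01-2000"):
    """Turn a column label such as 'Sun 01' into 'dd-mm-yyyy' text."""
    day = day_label.split()[-1]          # 'Sun 01' -> '01'
    return "{0}-{1}".format(day.zfill(2), month_year)


def start_time(interval):
    """Keep only the start of an interval such as '00:00-00:05'."""
    return interval.split("-")[0]


print(to_date("Sun 01"))           # 01-01-2000
print(start_time("00:05-00:10"))   # 00:05
```

Inside the writing loop, `writer.writerow([to_date(r[0]), start_time(t), r[i]])` would then give rows closer to the Date, Time, Value layout from the question.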


HTH

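Why are the items in your csv file quoted?
The pandas package can do this. If pandas is OK for you, you don't need to reinvent the wheel. Also, the code doesn't need recursion, so I removed that tag.
Thanks, I'll read up on pandas next time. I managed to find a more convoluted solution, so I will modify my code to use your approach. Thanks Patrick!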