List optimization and CSV reading in Python

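I am trying to optimize a Python script that runs through 459*458*23 loop iterations.

The script currently takes about 2 days to finish.

Here is the script: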

    for i in range(0, len(file_names)):
        for q in range(0, len(original_tuples)):
            for j in range(0, len(original_tuples)):
                cur_freq = int(original_tuples[j][0])
                cur_clamp = int(original_tuples[j][1])
                freq_num = int(raw_tuples[j][0])
                clamp_num = int(raw_tuples[j][1])
                perf  = str(cur_freq) + "/" + str(cur_freq)+ "-" + str(cur_clamp) + "/" + file_names[i] + "-perf.csv"
                power = str(cur_freq) + "/" + str(cur_freq)+ "-" + str(cur_clamp) + "/" + file_names[i] + "-power.csv"
                dataset = r_script.parse_files(perf,power)
                a, b,c,d,e,f,g,h,i =  r_script.avg(dataset)
                s = freq_logs[freq_num][0]%(a,b,c,d,h,e,g,f)
                index = s.find('=')+1
                predicted = float(eval(s[index:]))
                switching_power[i][q][freq_num][clamp_num].append(float(predicted))
                real_power[i][freq_num][clamp_num].append(float(i))

                for k in range(0, len(possible_frequency)):
                    if int(possible_frequency[k]) != int(cur_freq):
                        temp_freq  = int(possible_frequency[k])
                        temp_clamp = clamp_num
                        temp_freq_num  = possible_frequency.index(possible_frequency[k])
                        perf1  = str(temp_freq) + "/" + str(temp_freq)+ "-" + str(temp_clamp) + "/" + file_names[i] + "-perf.csv"
                        power1 = str(temp_freq) + "/" + str(temp_freq)+ "-" + str(temp_clamp) + "/" + file_names[i] + "-power.csv"
                        dataset1 = r_script.parse_files(perf1,power1)
                        a1, b1,c1,d1,e1,f1,g1,h1,i1 =  r_script.avg(dataset1)
                        s = freq_logs[temp_freq_num][0]%(a,b,c,d,h,e,g,f)
                        index = s.find('=')+1
                        predicted = float(eval(s[index:]))
                        switching_power[i][q][temp_freq_num][temp_clamp].append(float(predicted))

                for l in range(0, len(possible_frequency)):
                    for m in range(0, len(clamp_range)):
                        if int(clamp_range[m]) != int(cur_clamp):
                            cl_temp_freq  = int(possible_frequency[l])
                            cl_temp_clamp = int(clamp_range[m])
                            cl_temp_freq_num  = int(possible_frequency.index(possible_frequency[l]))
                            cl_temp_clamp_num = int(clamp_range.index(clamp_range[m]))
                            if (cl_temp_clamp_num != cl_temp_clamp):
                                sys.exit("buggy...clamp # not matching")

                            perf2  = str(cl_temp_freq) + "/" + str(cl_temp_freq)+ "-" + str(cl_temp_clamp) + "/" + file_names[i] + "-perf.csv"
                            power2 = str(cl_temp_freq) + "/" + str(cl_temp_freq)+ "-" + str(cl_temp_clamp) + "/" + file_names[i] + "-power.csv"
                            dataset2 = r_script.parse_files(perf2,power2)
                            a2, b2,c2,d2,e2,f2,g2,h2,i2 =  r_script.avg(dataset2)
                            previous_predicted_power = switching_power[i][q][cl_temp_freq_num][temp_clamp][0]
                            clamper = float(temp_clamp)/float(cl_temp_clamp_num)
                            s = clamp_logs[temp_freq_num][0]%(previous_predicted_power, clamper)
                            index = s.find('=')+1
                            predicted = float(eval(s[index:]))
                            switching_power[i][q][temp_freq_num][temp_clamp].append(float(predicted))

    for n in range(0, len(file_names)):
        for fo in range(0, len(original_tuples)):
            for o in range(0, len(original_tuples)):
                freq_num = int(raw_tuples[o][0])
                clamp_num = int(raw_tuples[o][1])
                diff_power[n][fo][freq_num][clamp_num] = float(float(real_power[n][freq_num][clamp_num][0])-float(switching_power[n][fo][freq_num][clamp_num][0]))

Here are the lists:

possible_clamp_levels = int(len(clamp_range)*len(possible_frequency))
original_tuples = []
raw_tuples = []
switching_power = [[[[[] for d in range(0, len(clamp_range))] for c in range(0, len(possible_frequency))] for b in range(0, possible_clamp_levels)] for a in range(0, len(file_names))]
diff_power = [[[[[] for d in range(0, len(clamp_range))] for c in range(0, len(possible_frequency))] for b in range(0, possible_clamp_levels)] for a in range(0, len(file_names))]
real_power = [[[[] for d in range(0, len(clamp_range))]for c in range(0, len(possible_frequency))] for a in range(0, len(file_names))]

for a in range(0, len(possible_frequency)):
    for b in range(0, len(clamp_range)):
        test = (possible_frequency[a], clamp_range[b])
        test1 = (a,b)
        original_tuples.append(test)
        raw_tuples.append(test1)
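
If you need any more details about the script itself in order to help me optimize it, please let me know. freq_logs and clamp_logs are basically linear equations that the values get substituted into. r_script is another script that reads these CSV files; parsing takes less than 10 milliseconds.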

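Turn index-based iteration into real iteration over the iterable

Change

for i in range(len(lst)):
    item = lst[i]
    print(item)  # do something useful here

into

for itm in lst:
    print(itm)  # do something useful here

If you really need to know the index of the item being processed, use enumerate:

for i, itm in enumerate(lst):
    print(itm)  # do something useful here

Your code will look more Pythonic, and you may gain a little speed.

Divide and conquer: multiprocessing

If you can redesign your solution, you could, for example, process the data in several groups and merge the results at the end. This assumes that:

- the job can be split into parts and the results merged afterwards
- I/O is not the limiting factor; if you are limited by reading from a single disk, a map-reduce approach will not speed things up.

Profile your code to see where it spends most of its time. But before you profile it, refactor it. One of my favorite refactoring rules of thumb is the SRP (Single Responsibility Principle): every function should do one thing, and one thing only. Another rule of thumb is that functions in an expressive language such as Python should be shorter than 10 lines of code.

This question appears to be off-topic because it is about improving working code. It is better suited to codereview.stackexchange.com

It may not make the script any faster, but if you pay particular attention to line length and use clearer variable names, others will find your code easier to understand and so be better able to help. Also, drop the default 0 argument to range, or better, refactor to iterate over the containers directly.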
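
The advice above to profile the code can be tried with the standard library's cProfile and pstats modules. Below is a minimal sketch, not the asker's actual script: parse_file and main_loop are hypothetical stand-ins for r_script.parse_files and the nested loops in the question, and profile_call is a helper name invented for this illustration.

```python
import cProfile
import io
import pstats


def parse_file(n):
    """Hypothetical stand-in for an expensive call such as r_script.parse_files."""
    return sum(i * i for i in range(n))


def main_loop(repeats):
    """Hypothetical stand-in for the nested loops in the question."""
    total = 0
    for _ in range(repeats):
        total += parse_file(10000)
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # Sort by cumulative time so the most expensive call chains come first,
    # and limit the report to the top five entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buf.getvalue()


if __name__ == "__main__":
    result, report = profile_call(main_loop, 50)
    print(report)
```

If a call such as the file parsing turns up near the top of the report with a very large call count, that would suggest the same files are being re-read inside the inner loops, which is where refactoring is likely to pay off most.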