Python: combining multiple functions with a loop and a parameter

python csv

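Update: my question has been fully answered, and I have applied jarmod's answer to my program. Although the code looks much cleaner, it has not improved the speed. I use matplotlib to plot this data, and the graph takes about 30 seconds to appear. I know this part of the code is what slows things down, but I am a bit confused about why my program runs slowly and how to speed it up. I have shown my real code in the second code block. The speed also depends heavily on the range I set: with a short range it is very fast.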

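Here is some sample code that shows the calculations I need in order to make forecasts and extract values. I use a for loop to run through a specific range of CSV files, which I have labeled 1-100. I return a number for each month (1-12) to get the average accuracy of forecasts made a given number of months ahead.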

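My full code includes 12 functions, one for each month of a full year of forecasts, but I feel it is inefficient: the functions are nearly identical except for one number, and reading the CSV files so many times slows the program down.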

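Is there a way to combine these functions, perhaps by adding another parameter? My biggest concern is that it would be hard to return the separate numbers and keep them categorized. In other words, I would ideally like a single function that covers the accuracy forecasts for all 12 months. I can see that this would need another parameter and another loop, but I don't know how to go about it or whether it is even possible. Essentially, I want to store all the values of onemonthaccuracy (which goes to the file one before the current file and compares the forecast for the date associated with the current file), then store all the values of twomonthaccuracy, and so on, so that I can later use these variables for plotting and other purposes.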

import matplotlib.pyplot as plt  # needed for the plt.scatter / plt.show calls below
import pandas as pd

def onemonthaccuracy(basefilenumber):
    basefileread = pd.read_csv(str(basefilenumber)+'.csv', encoding='latin-1')
    basefilevalue = basefileread.loc[basefileread['Customer'].str.contains('Customer A', na=False), 'Jun-16\nQty'] 

    onemonthread = pd.read_csv(str(basefilenumber-1)+'.csv', encoding='latin-1')
    onemonthvalue = onemonthread.loc[onemonthread['Customer'].str.contains('Customer A', na=False),'Jun-16\nQty']   

    onetotal = int(onemonthvalue)/int(basefilevalue)   

    return onetotal


def twomonthaccuracy(basefilenumber):
    basefileread = pd.read_csv(str(basefilenumber)+'.csv', encoding='Latin-1')
    basefilevalue = basefileread.loc[basefileread['Customer'].str.contains('Customer A', na=False), 'Jun-16\nQty']

    twomonthread = pd.read_csv(str(basefilenumber-2)+'.csv', encoding = 'Latin-1')
    twomonthvalue = twomonthread.loc[twomonthread['Customer'].str.contains('Customer A', na=False), 'Jun-16\nQty']

    twototal = int(twomonthvalue)/int(basefilevalue)    

    return twototal


onetotal = 0
twototal = 0
onetotallist = []
twototallist = []


for basefilenumber in range(24,36):
    onetotal += onemonthaccuracy(basefilenumber)
    twototal +=twomonthaccuracy(basefilenumber)
    onetotallist.append(onemonthaccuracy(basefilenumber))  # was `i`, which is undefined here
    twototallist.append(twomonthaccuracy(basefilenumber))

onetotalpermonth = onetotal/12
twototalpermonth = twototal/12   
x = [1,2]
y = [onetotalpermonth, twototalpermonth]
z = [1,2]
w = [(onetotallist),(twototallist)]

for ze, we in zip(z, w):
    plt.scatter([ze] * len(we), we, marker='D', s=5)

plt.scatter(x,y)
plt.show()

Here is the real code block I use in my program; perhaps something I am not aware of is slowing it down:

#other parts of code 
#StartRange = yearvalue+Value
#EndRange = endValue + endyearvalue
#Range = EndRange - StartRange
# Department
#more code.... 

def nmonthaccuracy(basefilenumber, n):
    basefileread = pd.read_csv(str(basefilenumber)+'.csv', encoding='Latin-1')
    baseheader = getfileheader(basefilenumber)
    basefilevalue = basefileread.loc[basefileread['Customer'].str.contains(Department, na=False), baseheader]

    nmonthread = pd.read_csv(str(basefilenumber-n)+'.csv', encoding = 'Latin-1')
    nmonthvalue = nmonthread.loc[nmonthread['Customer'].str.contains(Department, na=False), baseheader]

    return (1-(int(basefilevalue)/int(nmonthvalue))+1) if int(nmonthvalue) > int(basefilevalue) else int(nmonthvalue)/int(basefilevalue)  

N = 13
total = [0] * N
total_by_month_list  = [[] for _ in range(N)]
for basefilenumber in range(int(StartRange),int(EndRange)):
    for n in range(N):
        total[n] += nmonthaccuracy(basefilenumber, n)
        total_by_month_list[n].append(nmonthaccuracy(basefilenumber,n))

onetotal=total[1]/ Range
twototal=total[2]/ Range
threetotal=total[3]/ Range
fourtotal=total[4]/ Range
fivetotal=total[5]/ Range #... all the way to 12 

onetotallist=total_by_month_list[1]
twototallist=total_by_month_list[2]
threetotallist=total_by_month_list[3]
fourtotallist=total_by_month_list[4]
fivetotallist=total_by_month_list[5] #... all the way to 12 
# a lot more code after this

Something like this:

def nmonthaccuracy(basefilenumber, n):
    basefileread = pd.read_csv(str(basefilenumber)+'.csv', encoding='Latin-1')
    basefilevalue = basefileread.loc[basefileread['Customer'].str.contains('Lam DepT', na=False), 'Jun-16\nQty']

    nmonthread = pd.read_csv(str(basefilenumber-n)+'.csv', encoding = 'Latin-1')
    nmonthvalue = nmonthread.loc[nmonthread['Customer'].str.contains('Lam DepT', na=False), 'Jun-16\nQty']

    return int(nmonthvalue)/int(basefilevalue)    

N = 2
total_by_month = [0] * N
total_aggregate = 0

for basefilenumber in range(20,30):
    for n in range(N):
        a = nmonthaccuracy(basefilenumber, n)
        total_by_month[n] += a
        total_aggregate += a
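The loop above keeps only running sums. If the individual values per offset are also needed later (for example for a scatter plot), one option is to store a list per offset and derive the averages from those lists. The following is a minimal sketch under stated assumptions: `values_by_month` and `average_by_month` are illustrative names, and `nmonthaccuracy` is replaced by a stand-in that returns a made-up ratio so the example runs without any CSV files.

```python
def nmonthaccuracy(basefilenumber, n):
    """Hypothetical stand-in: returns a made-up ratio instead of reading CSVs."""
    return (basefilenumber - n) / basefilenumber

N = 3  # offsets 0, 1 and 2; set to 13 to cover offsets 0-12
values_by_month = [[] for _ in range(N)]  # one independent list per offset

for basefilenumber in range(20, 30):
    for n in range(N):
        # Call the function once per (file, offset) pair and keep the result.
        values_by_month[n].append(nmonthaccuracy(basefilenumber, n))

# Averages are computed from the stored values, so no separate totals are needed.
average_by_month = [sum(vals) / len(vals) for vals in values_by_month]

# The per-offset names used in the question map onto list indices.
onetotallist = values_by_month[1]
onetotal = average_by_month[1]
```

Indexing by offset means `values_by_month[2]` plays the role of `twototallist`, and so on, without a separate variable per month.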

In case you are wondering what the following code does:

N = 2
total_by_month = [0] * N

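It sets N to the number of months desired (2 here, but it could be 12 or some other value), then creates a total_by_month array that can hold N results, one per month. total_by_month is initialized to all zeros (N zeros), so each month's total starts at zero.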
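A small illustration of the list-repetition idiom may help here. It also shows why a list of lists (as needed when storing individual values per month) is usually built with a comprehension rather than with `*`. The variable names are illustrative only.

```python
N = 3

totals = [0] * N  # N zeros; safe because integers are immutable
totals[1] += 5

shared = [[]] * N  # pitfall: N references to the SAME inner list
shared[0].append("x")

separate = [[] for _ in range(N)]  # N distinct inner lists
separate[0].append("x")

print(totals)    # [0, 5, 0]
print(shared)    # [['x'], ['x'], ['x']]
print(separate)  # [['x'], [], []]
```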

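The only difference is the -1/-2 offset, right? Why not pass that value as a parameter?

@jarmod Yes, that is the difference, but I still want to store everything for -1 in one variable and everything for -2 in another, so I am not sure how to get the values into the right variables.

Thank you, this looks good. I have updated my question to show that I need to use the returned values (for specific months). That is the main reason this problem confuses me: I am not sure I can use the variables that specifically. I believe this approach aggregates all the data rather than categorizing it by month, unless I am misreading total[n] as the total for month n.

Oh great, this does work, thank you very much! I have updated my question to show that I also need to append the specific values. Does this code structure allow that as well? (I can work that part out myself; I just want to check that I am heading in the right direction.) Alternatively, if you could explain the N = 2 and total = [0] * N parts of the code, I would appreciate it.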
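On the speed issue raised in the update to the question: one plausible cause, assuming the CSV parsing dominates the run time, is that every call to nmonthaccuracy reads two files, so the same files are parsed again and again as the range grows. A minimal sketch of caching each file's value with `functools.lru_cache` follows. `slow_read_value` and `load_value` are hypothetical names, the returned quantities are made up, and the ratio is the simplified one from the answer rather than the branching formula in the real code.

```python
from functools import lru_cache

reads = []  # records every simulated file read, to show the effect of caching

def slow_read_value(filenumber):
    """Hypothetical stand-in for pd.read_csv(...) plus the .loc lookup."""
    reads.append(filenumber)
    return 100 + filenumber  # made-up quantity, for illustration only

@lru_cache(maxsize=None)
def load_value(filenumber):
    """Return the value for one file, reading it at most once."""
    return slow_read_value(filenumber)

def nmonthaccuracy(basefilenumber, n):
    # Both lookups go through the cache, so repeated files cost nothing extra.
    return load_value(basefilenumber - n) / load_value(basefilenumber)

N = 13
total = [0] * N
for basefilenumber in range(24, 36):
    for n in range(N):
        total[n] += nmonthaccuracy(basefilenumber, n)

print(len(reads))  # 24 distinct files read, instead of 2 reads per call
```

In the real program the cached function would wrap the `pd.read_csv` call and the value lookup. Whether this removes most of the 30 seconds depends on where the time is actually spent, which a profiler such as `cProfile` can confirm.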