Python: How to access all data points in a list of lists

python machine-learning cluster-computing

Below is the code that clusters the dataset:

def agglomerate(labels, grid):
    clusters = labels
    while len(clusters) > 1000:  # needs a real stopping rule; this threshold was set manually
        # find 2 closest clusters
        #print clusters
        distances = []
        for i,row in enumerate(grid[:5]):
            distances += [(i, i+j+1, c) for j,c in enumerate(row[i+1:])]
        i,j,_ = max(distances, key=lambda x:x[2])
        clusters[i]=[clusters[i],clusters[j]]
        clusters.pop(j)
        grid = add(grid, i, j)
    return clusters

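I now have the clusters as a list of lists, and I want to access all the points in each cluster in order to summarize the data. The add helper called by agglomerate is: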

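def add(grid, lefti, righti):
    # merge column righti into column lefti, keeping the larger value in each row
    for r in grid:
        r[lefti] = max(r[lefti], r.pop(righti))
    # merge row righti into row lefti the same way; list() keeps the row a list on Python 3
    grid[lefti] = list(map(max, zip(grid[lefti], grid.pop(righti))))
    return grid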

First, please use meaningful variable names; single-letter identifiers make the code harder to understand.

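I think you are confused about how you manage your list. You pass l into the routine and modify it there, then send it back as the return value... and then ignore that return value.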
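One consistent way to handle the list is to have the routine build and return a new list, and to have every caller actually use that return value. The following is only a sketch under assumptions: the name collect_points and the sample data are invented for illustration, and it assumes each cluster is a plain list whose leaves are non-list values such as labels. It is still recursive, so it does not by itself avoid the recursion limit on very deep nesting.

```python
def collect_points(cluster):
    """Return a new flat list of the leaf points in a nested cluster."""
    if not isinstance(cluster, list):
        return [cluster]  # a leaf: a single data point
    points = []
    for child in cluster:
        # the recursive call returns a list, and that return value is used here
        points.extend(collect_points(child))
    return points


nested = [["a", "b"], ["c", ["d", "e"]]]  # made-up sample cluster
flat = collect_points(nested)
print(flat)
```

The alternative is to mutate a list passed in by the caller and return nothing; either style works as long as it is applied consistently.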

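The first suspect for a recursion depth error is infinite recursion in the routine. Where is the output of your logic trace? For any problem like this, one of your first tasks is to determine the sequence of calls that leads to the failure: trace it with a debugger. A low-tech alternative is to put tracing print statements in all of the routines, reporting the inputs, the outputs, and the values passed along.

Note that you have 10^15 data points, and you seem to be trying to put them all in a single list. Do you have enough memory? Since you did not include the whole error message, I cannot tell where in the call stack the error occurs.

You have not described your data structure. If it is a relatively balanced binary tree (as the code suggests), then the tree is at least 51 levels deep, perhaps a little more depending on how well it is balanced. The second suspect is that the tree is unbalanced, with one branch deeper than the default recursion limit. A logic trace will catch this.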
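To check the second suspicion directly, a rough sketch like the one below measures how deeply a nested list is actually nested and compares that with Python's recursion limit. The function name nesting_depth and the sample tree are invented for illustration, and the sketch assumes clusters are plain lists whose leaves are non-list values. It uses an explicit stack so the measurement itself cannot hit the limit.

```python
import sys


def nesting_depth(tree):
    """Return the maximum nesting depth of a nested list, without recursion."""
    deepest = 0
    stack = [(tree, 0)]  # pairs of (node, depth of that node)
    while stack:
        node, depth = stack.pop()
        deepest = max(deepest, depth)
        if isinstance(node, list):
            for child in node:
                stack.append((child, depth + 1))
    return deepest


# A deliberately unbalanced sample: each merge wraps the previous
# cluster in a new pair, so one branch grows by a level per merge.
tree = "p0"
for k in range(1, 2001):
    tree = [tree, "p%d" % k]

depth = nesting_depth(tree)
print("nesting depth:", depth)
print("recursion limit:", sys.getrecursionlimit())
```

If the reported depth is close to or above the recursion limit, a recursive traversal of that structure will fail even though the recursion is not infinite.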

def getPoints(a,l):
    c = a[:]
    if (isinstance(c, list)and len(l)!=1):
        getPoints(c[0],l)
        getPoints(c[1],l)
    else:
        l.append(c)
    return l

I get a recursion error when I call getPoints(a,l):

----> 5         getPoints(c[0],l)
      6         getPoints(c[1],l)
      7     else:

RuntimeError: maximum recursion depth exceeded

since the dataset consists of 10^15 data points.
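One possible way to sidestep the recursion limit is to walk each nested cluster with an explicit stack instead of recursive calls. This is a sketch rather than a drop-in replacement for getPoints: the name iter_points and the sample clusters are hypothetical, and it assumes leaves are non-list values. Because it is a generator, it yields points one at a time instead of building one huge list.

```python
def iter_points(cluster):
    """Yield the leaf points of a nested list, left to right, without recursion."""
    stack = [cluster]
    while stack:
        node = stack.pop()
        if isinstance(node, list):
            # push children reversed so the leftmost child is popped first
            stack.extend(reversed(node))
        else:
            yield node


# Made-up sample: three clusters with different nesting shapes
clusters = [[["a", "b"], "c"], "d", ["e", ["f", "g"]]]
for index, cluster in enumerate(clusters):
    print(index, list(iter_points(cluster)))
```

The nesting depth only affects how large the stack list grows, so a badly unbalanced cluster no longer triggers a recursion error. Memory for the data itself is a separate concern.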