Python: an efficient way to convert a nested list to a numpy array


I have a nested list whose elements have different sizes and types.

import numpy as np
import ROOT

def read(f,tree,objects):
    Event=[]
    for o in objects:
        #find different features of one class
        temp=[i.GetName() for i in tree.GetListOfBranches() if i.GetName().startswith(o)]
        tempList=[] #contains one class of objects
        for t in temp:
            #print t
            tempList.append(t)
            #store the branch name followed by its values as an array
            comp=np.asarray(getattr(tree,t))
            tempList.append(comp)
        Event.append(tempList)
    return Event



def main():
    path="path/to/file"
    objects= ['TauJet', 'Jet', 'Electron', 'Muon', 'Photon', 'Tracks', 'ETmis', 'CaloTower']

    f=ROOT.TFile(path)
    tree=f.Get("RecoTree")
    tree.GetEntry(100)
    event=read(f,tree,objects)

For example, event[0] is:

['TauJet', array(1), 'TauJet_E', array([ 31.24074173]), 'TauJet_Px', array([-28.27997971]), 'TauJet_Py', array([-13.18042469]), 'TauJet_Pz', array([-1.08304048]), 'TauJet_Eta', array([-0.03470514]), 'TauJet_Phi', array([-2.70545626]), 'TauJet_PT', array([ 31.20065498]), 'TauJet_Charge', array([ 1.]), 'TauJet_NTracks', array([3]), 'TauJet_EHoverEE', array([ 1745.89221191]), 'TauJet_size', array(1)]
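How can I convert this into a numpy array?

Note 1: np.asarray(event, "object") is slow. I am looking for a better way. Also, np.fromiter() does not apply, because I don't have a fixed type.

Note 2: I don't know the length of my event in advance.

Note 3: If it makes things easier, I can also handle the names separately.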
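As a baseline for Note 1, here is a minimal sketch of building the object array explicitly instead of letting np.asarray inspect the nested rows. The `event` list below is a made-up stand-in for the output of read(), and whether this is actually faster for real data would need to be measured.

```python
import numpy as np

# Made-up stand-in for the `event` list built by read()
event = [
    ["TauJet", np.array(1), "TauJet_E", np.array([31.24074173])],
    ["Jet", np.array(2), "Jet_E", np.array([391.62017822, 31.24074173]),
     "Jet_size", np.array(2)],
]

# Preallocate a 1-D object array and fill it row by row, so numpy never
# has to work out a common shape for the ragged rows.
packed = np.empty(len(event), dtype=object)
for i, row in enumerate(event):
    packed[i] = row  # each element stays an ordinary Python list
```

Each element of `packed` is still the original Python list, so this only changes the container, not the layout of the data inside it.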

You could try something like this, though I'm not sure how fast it will be. It creates a numpy record array for the first row.

data = event[0]
keys = data[0::2]
vals = data[1::2]
#there are some zero-rank arrays in there, so convert every entry to a plain
#Python scalar; .item() only works if each array holds exactly one value
temp = [v.item() for v in vals]
#you could also just create a np array from the line above with np.array(temp)
#"formats" must be a list with one entry per name; ("f4")*len(vals) would be a single string
dtype={"names":keys, "formats":["f4"]*len(vals)}
myArr = np.rec.fromarrays(temp, dtype=dtype)

#test it out
In [53]: myArr["TauJet_Pz"]
Out[53]: array(-1.0830404758453369, dtype=float32)


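#alternatively, you could try something like this, which just creates a 2d numpy array
#(this assumes every row has the same fields, each holding a single value)
vals = np.array([[v.item() for v in row[1::2]] for row in event])
#now create a record array from that using the dtype above; fromarrays expects
#one array per field, so pass the columns (vals.T) rather than the rows
myRecordArray = np.rec.fromarrays(vals.T, dtype=dtype)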
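Since the question mentions fields of different sizes, here is a hedged sketch of one way to handle a row whose arrays hold several values (one per particle). The helper name `row_to_records` and the sample row are made up for illustration, and the sketch assumes all 1-d arrays within a row have the same length.

```python
import numpy as np

def row_to_records(row):
    """Split [name, array, name, array, ...] into per-event scalars and a
    structured array with one record per particle (hypothetical helper)."""
    names = row[0::2]
    values = [np.asarray(v) for v in row[1::2]]
    # 0-d arrays hold per-event counts such as 'Jet' or 'Jet_size'
    scalars = {n: v.item() for n, v in zip(names, values) if v.ndim == 0}
    # 1-d arrays hold one value per particle
    columns = [(n, v) for n, v in zip(names, values) if v.ndim == 1]
    if not columns:
        return scalars, None
    dtype = [(n, v.dtype) for n, v in columns]
    records = np.empty(len(columns[0][1]), dtype=dtype)
    for n, v in columns:
        records[n] = v  # raises if a column has a different length
    return scalars, records

# Made-up row describing two jets
row = ["Jet", np.array(2),
       "Jet_E", np.array([391.62017822, 31.24074173]),
       "Jet_NTracks", np.array([5, 3]),
       "Jet_size", np.array(2)]
scalars, jets = row_to_records(row)
```

With this layout, `jets["Jet_E"]` gives the energies of all jets in the event and `jets[0]` gives every field of the first jet, while the counts stay in the `scalars` dict.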

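I suggest you take a look at pandas. I remember (hopefully correctly) reading somewhere that it has some support for columns of different lengths. Besides that, it also supports some numpy arithmetic.

I could not get this to work. As I said, I have lists of different lengths. So when an entry has a different length from the one I showed here, I get: temp2=[np.float(v) for v in vals] TypeError: only length-1 arrays can be converted to Python scalars. Even for this length, I get: myArr=np.rec.fromarrays(temp2,dtype=dtype) File "/usr/lib/pymodules/python2.7/numpy/core/records.py", line 537, in fromarrays descr=sb.dtype(dtype) TypeError: data type not understood

It sounds like there are arrays in the row that contain more than one element, i.e. something like ["TauJet_e", array([31.56, 45.14])]. That would reproduce the error you are seeing.

Why would that happen? My code is exactly the same as the code I posted in the question! As I said, when I do (comp=np.asarray(getattr(tree,t))), the length of the stored vector can vary from object to object. For example, I might have: "Jet_E", array([391.62017822, 31.24074173]). I fixed that with v.astype(float), but I still have the problem with TypeError: data type not understood in the last part.

So what do the separate values mean when a particle has more than one value? For example, Electron_E with the values [23,32,23].

It means we have three electrons, with energies 23, 32 and 23.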
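One possible cause of the "data type not understood" error reported above is writing the formats as ("f4")*len(vals): parentheses without a trailing comma do not make a tuple, so the expression repeats the string instead of producing one format per name. The sketch below shows the difference, using three field names and values from the question; it is an illustration of that one pitfall, not a confirmed diagnosis.

```python
import numpy as np

keys = ["TauJet_E", "TauJet_Px", "TauJet_Py"]

# Parentheses alone do not make a tuple, so this is string repetition
bad_formats = ("f4") * len(keys)     # the single string 'f4f4f4'
good_formats = ["f4"] * len(keys)    # one format string per field name

dtype = np.dtype({"names": keys, "formats": good_formats})

# One length-1 array per field, as in event[0]
arrays = [np.array([31.24074173]),
          np.array([-28.27997971]),
          np.array([-13.18042469])]
rec = np.rec.fromarrays(arrays, dtype=dtype)
```

Here `rec` has shape (1,), and fields can be read by name, for example `rec["TauJet_E"]`.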