Python: filtering a numpy array of tuples


python numpy filter scikit-learn


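The scikit-learn library provides an excellent example of data clustering. It works well for US stocks. But when you add stocks from other markets, numpy raises an error saying the arrays should have the same size. That is true: German stocks, for example, follow a different trading calendar.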

OK, so after downloading the quotes I added a step that prepares the shared dates:

quotes = [quotes_historical_yahoo_ochl(symbol, d1, d2, asobject=True)
          for symbol in symbols]

def intersect(list1, list2):
    return list(set(list1) & set(list2))

dates_all = quotes[0].date
for q in quotes:
    dates_symbol = q.date
    dates_all = intersect(dates_all, dates_symbol)
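
The same preparation step can be sketched without downloading anything. This is only an illustration under assumptions: common_dates is a hypothetical helper name, and the two short date lists stand in for the date fields of a US and a German quote array.

```python
from datetime import date

def common_dates(date_lists):
    """Return, in sorted order, the dates that occur in every list."""
    sets = [set(dates) for dates in date_lists]
    # set.intersection accepts any number of sets, so no explicit loop is needed.
    return sorted(set.intersection(*sets))

# Made-up trading calendars (assumption): each market misses a different day.
us = [date(2015, 1, 2), date(2015, 1, 5), date(2015, 1, 6)]
de = [date(2015, 1, 2), date(2015, 1, 6), date(2015, 1, 7)]

shared = common_dates([us, de])
print(shared)
```

Sorting the result keeps the shared dates in calendar order, which a plain list(set(...)) does not guarantee.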

Then I had to filter the numpy arrays of tuples. Here are some attempts:

#for index, q in enumerate(quotes):
#    filtered = [i for i in q if i.date in dates_all]
#    quotes[index] = np.rec.array(filtered, dtype=q.dtype)
#    quotes[index] = np.asanyarray(filtered, dtype=q.dtype)
#
#    quotes[index] = np.where(a.date in dates_all for a in q)
#
#    quotes[index] = np.where(q[0].date in dates_all)

So how do I apply a filter to a numpy array, or how do I actually convert the list of records (after filtering) back into a numpy recarray?

quotes[0].dtype:

'(numpy.record, [('date', 'O'), ('year', '<i2'), ('month', 'i1'), ('day', 'i1'), ('d', '<f8'), ('open', '<f8'), ('close', '<f8'), ('high', '<f8'), ('low', '<f8'), ('volume', '<f8'), ('aclose', '<f8')])'

So quotes is a list of recarrays, and dates_all collects the intersection of all the values in the date field.
I can recreate one such array with:
In [286]: dt=np.dtype([('date', 'O'), ('year', '<i2'), ('month', 'i1'), ('day', 'i1'),
     ...:              ('d', '<f8'), ('open', '<f8'), ('close', '<f8'), ('high', '<f8'),
     ...:              ('low', '<f8'), ('volume', '<f8'), ('aclose', '<f8')])
In [287]: arr=np.ones((2,), dtype=dt)  # 2 element structured array
In [288]: arr
Out[288]: 
array([(1, 1, 1, 1,  1.,  1.,  1.,  1.,  1.,  1.,  1.),
       (1, 1, 1, 1,  1.,  1.,  1.,  1.,  1.,  1.,  1.)], 
      dtype=[('date', 'O'), ('year', '<i2'), ('month', 'i1'), ('day', 'i1'), ... ('aclose', '<f8')])
In [289]: type(arr[0])
Out[289]: numpy.void

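In any case, the date field is an object array, probably holding datetime objects.

q is one of these arrays; i is an element of it, and i.date is that element's date field: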
 [i for i in q if i.date in dates_all]

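So filtered is a list of recarray elements. np.stack does a better job of reassembling them into an array (this works for recarrays too):
np.stack([i for i in arr if i['date'] in alist])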
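
To see the round trip end to end, here is a small sketch. It assumes a cut-down two-field dtype and made-up dates instead of the full quote dtype, and alist is just the example name used in this answer for the list of shared dates.

```python
import numpy as np
from datetime import date

# Hypothetical two-field dtype (assumption); the real quote dtype has 11 fields.
dt = np.dtype([('date', 'O'), ('close', '<f8')])
arr = np.array([(date(2015, 1, 2), 10.0),
                (date(2015, 1, 5), 11.0),
                (date(2015, 1, 6), 12.0)], dtype=dt)
alist = [date(2015, 1, 2), date(2015, 1, 6)]

# Each i is a single record; keep the ones whose date is in alist.
filtered = [i for i in arr if i['date'] in alist]

# Reassemble the list of records into one 1-d structured array.
rebuilt = np.stack(filtered)
print(rebuilt['date'], rebuilt['close'])
```

The rebuilt array keeps the field names of arr, so rebuilt['date'] and rebuilt['close'] can be used exactly like the fields of the original.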
Alternatively, you can collect the indices of the matching records and use them to index the quote array:
In [319]: [i for i,v in enumerate(arr) if v['date'] in alist]
Out[319]: [0, 1]
In [320]: arr[_]

Or pull out the date field first:
In [321]: [i for i,v in enumerate(arr['date']) if v in alist]
Out[321]: [0, 1]

in1d can also be used for the search:
In [322]: np.in1d(arr['date'],alist)
Out[322]: array([ True,  True], dtype=bool)
In [323]: np.where(np.in1d(arr['date'],alist))
Out[323]: (array([0, 1], dtype=int32),)
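
Putting the pieces together for the original problem, a sketch along these lines trims every array in quotes to the shared dates so that they all end up the same length. It is only an illustration: the two-field dtype and the sample rows are made up, and it uses np.isin, the newer spelling of the same membership test as in1d.

```python
import numpy as np
from datetime import date

# Hypothetical two-field dtype and sample data (assumptions for illustration).
dt = np.dtype([('date', 'O'), ('close', '<f8')])
us = np.array([(date(2015, 1, 2), 10.0),
               (date(2015, 1, 5), 11.0),
               (date(2015, 1, 6), 12.0)], dtype=dt)
de = np.array([(date(2015, 1, 2), 20.0),
               (date(2015, 1, 6), 21.0),
               (date(2015, 1, 7), 22.0)], dtype=dt)
quotes = [us, de]

# Intersect the date fields of all arrays.
common = set(quotes[0]['date'])
for q in quotes[1:]:
    common &= set(q['date'])
common = sorted(common)

# A boolean mask per array keeps only rows whose date is shared by all.
aligned = [q[np.isin(q['date'], common)] for q in quotes]
print([len(a) for a in aligned])
```

Boolean-mask indexing returns a structured array with the same dtype, so there is no need to rebuild anything from a list of records.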

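By "array of tuples" I guess you mean a structured array (or recarray). If so, we'd like to know the array's shape and dtype.
Thanks for the comment. Added!
Thanks for the detailed reply and the great tips: i['date'], in1d and the others!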