Python Cassandra DataStax driver times out on a query that is too large [python, pandas, datastax, cassandra-2.0]

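This query works fine, but if I change the time range so that more data comes back, I get the error below at somewhat under 200,000 rows. I'm not sure whether the problem is in the query or in the fill/resample operations. There is only one machine in the cluster.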

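import numpy as np
import pandas as pd

# `session` is an already-connected cassandra.cluster.Session.
rsltES = session.execute("""SELECT * FROM tickdata.timeseries
    WHERE
    curve = 0
    AND symbol = 1000
    AND time > '2016-05-23T08:00:00-0400'
    AND time < '2016-05-25T19:00:00-0400'
    order by time
    allow filtering;""")

# Assumed step (not shown in the original post): load the rows into a DataFrame.
dfes = pd.DataFrame(list(rsltES))
dfes = dfes.set_index(['time'])
# tz_localize returns a new index, so assign it back.
dfes.index = dfes.index.tz_localize('US/Eastern')
df_ohlcES = dfes.resample('5Min').ohlc()
df_ohlcES = df_ohlcES.ffill()
df_ohlcES['DateTime'] = np.arange(len(df_ohlcES))

# Move the DateTime column to the front
colsES = df_ohlcES.columns
colsES = colsES[-1:].append(colsES[:-1])
df_ohlcES = df_ohlcES[colsES]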

If the query returns too much data, it times out. Is there a way to increase the timeout?

Traceback (most recent call last):
  File "pandascas.py", line 36, in <module>
    allow filtering;""")
  File "cassandra/cluster.py", line 1647, in cassandra.cluster.Session.execute (cassandra/cluster.c:28041)
  File "cassandra/cluster.py", line 3243, in cassandra.cluster.ResponseFuture.result (cassandra/cluster.c:61954)
cassandra.ReadTimeout: code=1200 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'received_responses': 0, 'required_responses': 1, 'consistency': 'LOCAL_ONE'}

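This is the server-side read timeout, which is set in cassandra.yaml. Changing it requires editing the server settings and restarting the node.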
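For orientation, the relevant settings in cassandra.yaml look roughly like the fragment below. The two keys exist in Cassandra 2.0; the values are illustrative assumptions only, not recommendations, and which of the two applies depends on whether the query reads a single partition or scans a range.

```yaml
# cassandra.yaml (server side; restart the node after editing)
# Values are placeholders chosen for illustration.
read_request_timeout_in_ms: 20000    # reads of a single partition
range_request_timeout_in_ms: 40000   # range scans across partitions
```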

If you really are getting too many rows, you can also try lowering the fetch size so that each requested page is smaller.
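A minimal sketch of requesting smaller pages with the DataStax Python driver follows. `fetch_size` on `SimpleStatement` and `timeout` on `Session.execute` are driver parameters; the helper names `fetch_paged` and `iter_chunks` and the default values are assumptions made for illustration. Note that `timeout` here is the client-side wait and does not change the server-side read timeout.

```python
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield lists of at most chunk_size items from any iterable of rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def fetch_paged(session, cql, fetch_size=1000, timeout=60.0):
    """Execute cql so the server returns pages of at most fetch_size rows.

    Hypothetical helper: the driver is imported lazily so that iter_chunks
    can be used without cassandra-driver installed.
    """
    from cassandra.query import SimpleStatement

    statement = SimpleStatement(cql, fetch_size=fetch_size)
    # Iterating the returned result set fetches further pages on demand.
    return session.execute(statement, timeout=timeout)
```

Iterating the result of `fetch_paged` pulls one page at a time, so something like `for chunk in iter_chunks(fetch_paged(session, cql), 10000)` lets each chunk be turned into a DataFrame separately instead of materialising every row at once.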

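You may also want to consider whether your workload overwrites data frequently. That can leave many tombstones behind, which makes reads slow. One empirical check is to raise the timeout and turn on tracing to see how long each step takes.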
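One way to run that check from Python is sketched below, assuming a 3.x DataStax driver where `Session.execute` accepts `trace=True` and the result exposes `get_query_trace()`. The helper name `trace_query` is hypothetical, and matching on the word "tombstone" in event descriptions is an assumption about how the server words its trace messages.

```python
def trace_query(session, cql, timeout=120.0):
    """Run cql with tracing on and summarise where the time went.

    Returns (total duration, all trace steps, steps mentioning tombstones).
    """
    result = session.execute(cql, timeout=timeout, trace=True)
    query_trace = result.get_query_trace()
    # Each step is (elapsed time on the source node, event description).
    steps = [(event.source_elapsed, event.description)
             for event in query_trace.events]
    # Assumption: tombstone reads show up with "tombstone" in the description.
    tombstone_steps = [step for step in steps if "tombstone" in step[1].lower()]
    return query_trace.duration, steps, tombstone_steps
```

If the tombstone steps account for most of the elapsed time, the slowness is more likely caused by overwrites or deletes than by the number of live rows.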

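By default, the database timeout is 2 seconds. Instead of increasing it, you can set fetchSize and fetch the results in chunks. Keep in mind that ALLOW FILTERING is bad practice: it is essentially a full table scan that hits every node in the cluster, and that may be what causes the timeout even if you are not querying millions of rows.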
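Another way to fetch in chunks is to split the time range into smaller windows and issue one query per window. The sketch below rests on a loud assumption that is not confirmed by the question: that `(curve, symbol)` is the partition key and `time` is a clustering column, in which case the query needs no ALLOW FILTERING. The names `time_windows`, `fetch_in_windows` and `WINDOW_CQL` are hypothetical.

```python
from datetime import datetime, timedelta


def time_windows(start, end, step):
    """Yield (lo, hi) pairs covering [start, end) in steps of at most `step`."""
    lo = start
    while lo < end:
        hi = min(lo + step, end)
        yield lo, hi
        lo = hi


# Assumption: (curve, symbol) is the partition key and `time` a clustering
# column. Windows are half-open (time >= lo AND time < hi) so that adjacent
# windows neither overlap nor leave gaps; unlike the original query, the
# first window therefore includes the start instant itself.
WINDOW_CQL = (
    "SELECT * FROM tickdata.timeseries "
    "WHERE curve = %s AND symbol = %s AND time >= %s AND time < %s"
)


def fetch_in_windows(session, curve, symbol, start, end, step=timedelta(hours=6)):
    """Run one smaller query per window and yield rows window by window."""
    for lo, hi in time_windows(start, end, step):
        for row in session.execute(WINDOW_CQL, (curve, symbol, lo, hi)):
            yield row
```

Each individual query then touches less data, which makes it less likely that any single request runs into the server-side timeout.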