Why do I get Dask warnings when running pandas operations?
I have a notebook that contains both pandas and Dask operations. As long as I have not started a client, everything behaves as expected. But once I start a dask.distributed client, I get warnings in the cells that run pandas operations, for example:
pd.read_parquet("my_file")
When the operation starts, the number of warnings I get matches the exact number of nannies.
Example warnings:
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.26s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.38s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.38s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.38s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.37s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Scheduler for 1.37s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.36s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
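I would like to know what causes these warnings and how to make them stop.

This warning means that the Dask worker processes were unresponsive for a long time. That is bad, because an unresponsive worker cannot serve data to other workers, communicate with the scheduler, and so on. It is not normal even while computations are running, because those computations run in a separate thread. There are two main causes of this problem: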
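1. Some code is holding the GIL for a long time, which is the cause the warning message itself names.
2. A bug in distributed that has since been fixed. You may need to upgrade.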
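You can also raise the delay that is tolerated before this warning is emitted in your Dask configuration file:
tick-maximum-delay: 10 s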
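To make concrete what a warning like this measures, here is a minimal sketch of an event-loop lag monitor written with plain asyncio. It is not the code distributed uses; `monitor_lag`, `demo`, and the `threshold` parameter are hypothetical names chosen for illustration, with `threshold` playing a role comparable to a maximum tick delay. The monitor asks to be woken at a fixed interval and reports how late each wake-up was. A blocking call on the loop makes it late.

```python
import asyncio
import time


async def monitor_lag(interval, threshold, lags):
    """Wake up every `interval` seconds and record how late each wake-up was."""
    loop = asyncio.get_running_loop()
    while True:
        start = loop.time()
        await asyncio.sleep(interval)
        lag = loop.time() - start - interval
        lags.append(lag)
        if lag > threshold:
            print(f"Event loop was unresponsive for {lag:.2f}s")


async def demo(block_for=0.3, threshold=0.1):
    lags = []
    task = asyncio.create_task(monitor_lag(0.02, threshold, lags))
    await asyncio.sleep(0.05)  # let the monitor tick normally a few times
    time.sleep(block_for)      # blocking call: nothing else on the loop can run
    await asyncio.sleep(0.05)  # give the monitor a chance to notice the delay
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return max(lags)


worst = asyncio.run(demo())
```

Raising the threshold only hides the message; the loop is still blocked for the same amount of time.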
Regarding point 2: I do get "'dask.distributed' has no attribute '__version__'". Which version of dask has this bug?

import distributed; print(distributed.__version__)
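As a variation on the check above, the installed versions can also be read from package metadata without importing anything from Dask. This is only a sketch: `installed_version` is a hypothetical helper, and it assumes Python 3.8 or later (for `importlib.metadata`) and that the packages were installed under their usual distribution names, `dask` and `distributed`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# dask and distributed are separate distributions with separate versions.
for name in ("dask", "distributed"):
    print(name, installed_version(name))
```

A result of `None` for `distributed` would mean the package is not installed in the environment the notebook kernel is using.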
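I see, distributed is also available as a standalone package. I was confused because there is also dask.distributed.

Do you know a good way to find out which function is not releasing the GIL? If this were synchronous code, raising an exception instead of logging a message would do the trick, but since it is asynchronous, I don't know who is taking up the time.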
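One generic option, offered as a sketch rather than a Dask-specific answer, is the standard library's `faulthandler` module: `faulthandler.dump_traceback_later` writes the stack of every Python thread once a timeout expires, which shows what each thread was executing at that moment. That is a hint about where the time goes, not proof of which function holds the GIL. `suspicious_step` is a hypothetical stand-in for the slow operation, and the 0.2 s timeout is an arbitrary choice.

```python
import faulthandler
import tempfile
import time


def suspicious_step():
    """Stand-in for a slow operation: busy-loops for about half a second."""
    deadline = time.monotonic() + 0.5
    while time.monotonic() < deadline:
        pass


with tempfile.TemporaryFile(mode="w+") as f:
    # Dump the stacks of all threads into `f` if we are still running after 0.2 s.
    faulthandler.dump_traceback_later(0.2, file=f)
    suspicious_step()
    faulthandler.cancel_dump_traceback_later()
    f.seek(0)
    report = f.read()

print(report)
```

In a notebook, the call under suspicion (for example the pandas read) would take the place of `suspicious_step()`, with the timeout set a little below the delay reported in the warning.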