Why do I get dask warnings when running pandas operations?

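I have a notebook that contains both pandas and dask operations.

Before I start the client, everything behaves as expected. But as soon as I start the dask.distributed client, I get warnings in cells that run pandas operations, for example:
pd.read_parquet("my_file")

The number of Nanny warnings I get matches exactly the number of workers I started.

Example warnings: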

distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.26s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.38s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.38s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.38s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.37s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Scheduler for 1.37s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
distributed.core - WARNING - Event loop was unresponsive in Nanny for 1.36s.  This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.

I would like to know what causes these warnings and how to make them stop.

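This warning means that the Dask worker processes have been unresponsive for a long time. That is bad, because an unresponsive worker cannot serve data to other workers, communicate with the scheduler, and so on. It is not normal even while computations are running, because those computations run in a separate thread.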
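To make the mechanism concrete, here is a minimal sketch using only the standard library. It is not how distributed implements its check; it only assumes the general idea of a periodic "tick" on an event loop whose lateness is measured. All function names are hypothetical. Blocking the loop thread directly with time.sleep stands in for a call that holds the GIL, while the same call handed to a separate thread leaves the loop responsive.

```python
import asyncio
import time


async def measure_max_tick_delay(interval, stop):
    """Wake up every `interval` seconds and record the worst lateness seen."""
    worst = 0.0
    while not stop.is_set():
        start = time.monotonic()
        await asyncio.sleep(interval)
        lateness = time.monotonic() - start - interval
        worst = max(worst, lateness)
    return worst


async def run_with_monitor(work, interval=0.02):
    """Run the coroutine function `work` while the tick monitor is active."""
    stop = asyncio.Event()
    monitor = asyncio.ensure_future(measure_max_tick_delay(interval, stop))
    await asyncio.sleep(interval * 2)  # let the monitor start ticking
    await work()
    await asyncio.sleep(interval * 2)  # give the monitor a chance to catch up
    stop.set()
    return await monitor


async def blocking_on_loop():
    # Stand-in for a GIL-holding call: the loop thread cannot run anything else.
    time.sleep(0.3)


async def blocking_in_thread():
    # Same call, but in a worker thread; time.sleep releases the GIL,
    # so the event loop keeps ticking in the meantime.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, time.sleep, 0.3)


def demo():
    blocked = asyncio.run(run_with_monitor(blocking_on_loop))
    offloaded = asyncio.run(run_with_monitor(blocking_in_thread))
    return blocked, offloaded


if __name__ == "__main__":
    blocked, offloaded = demo()
    print(f"worst tick delay, loop blocked:   {blocked:.3f}s")
    print(f"worst tick delay, work in thread: {offloaded:.3f}s")
```

In the first case the monitor wakes up late by roughly the length of the blocking call, which is the kind of delay the warning reports. In the second case the delay should stay close to zero.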

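There are two main causes of this problem, followed by a way to silence the warning:

  • Your tasks run functions that do not release the GIL. This is rare these days (most of pandas releases the GIL), but it can happen. I believe all of read_parquet releases the GIL.
  • If this happens only once, and only at startup, then it is a bug that has
    since been fixed in distributed. You may want to upgrade.
  • You can also silence the warning by increasing the maximum allowed tick time in your ~/.dask/config.yaml file: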

    tick-maximum-delay: 10 s
    

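    Regarding point 2: I do get the error 'dask.distributed' has no attribute '__version__'. Which version of dask has this bug?

    import distributed; print(distributed.__version__)

    I see, distributed is also available as a standalone package. I was confused because there is also dask.distributed.

    Do you know a good way to find out which function is not releasing the GIL? If this were synchronous code, raising an exception instead of logging a message would be enough, but since it is asynchronous I cannot tell what is taking up the time.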