What does "free" mean in Apache Spark's MemoryStore tryToPut?


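A standalone Spark cluster running multiple jobs is running out of memory. While investigating, we found the following messages and began to suspect that too little memory was available: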

16/09/23 12:30:38 INFO MemoryStore: Block broadcast_50802_piece0 stored as bytes in memory (estimated size 5.1 KB, free 233.5 KB)
16/09/23 12:30:38 INFO TorrentBroadcast: Reading broadcast variable 50802 took 9 ms
16/09/23 12:30:38 INFO MemoryStore: Block broadcast_50802 stored as values in memory (estimated size 11.3 KB, free 244.9 KB)
On another cluster we usually see free reported as 500 MB or more, and many log traces on Stack Overflow show free in GBs.

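After analysing the code, the message appears to be misleading. The value reported as free is actually the memory used by blocks (blocksMemoryUsed).

The documentation comment says it is memory that is used, not free: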

/**
 * Amount of storage memory, in bytes, used for caching blocks.
 * This does not include memory used for unrolling.
 */
private def blocksMemoryUsed: Long = memoryManager.synchronized {
  memoryUsed - currentUnrollMemory
}

The question is: if this is really used memory, why is it called free? Or is my interpretation wrong?

This appears to be a bug that was fixed in Spark 2.0 (in a PR that also contained many other changes).


In fact, the reported figure is simply wrong: it shows the occupied memory rather than the free memory.
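The log excerpt in the question is consistent with this reading: after a block of 11.3 KB is stored, the "free" figure rises from 233.5 KB to 244.9 KB, which is what a used-memory counter would do. The following is a minimal sketch of the distinction, assuming a simplified model of the accounting. `StorageAccounting`, `blocksMemoryFree`, `store`, and all the numbers are hypothetical names and values chosen for illustration; this is not Spark's actual MemoryStore code.

```scala
// Hypothetical, simplified model of storage-memory accounting.
// It only illustrates "used" versus "free"; it is not Spark's MemoryStore.
final case class StorageAccounting(
    maxMemory: Long,          // total storage capacity in bytes (assumed value)
    memoryUsed: Long,         // bytes used, including unroll memory
    currentUnrollMemory: Long // bytes currently used for unrolling
) {
  // Same arithmetic as the definition quoted in the question.
  def blocksMemoryUsed: Long = memoryUsed - currentUnrollMemory

  // What a reader would expect "free" to mean: capacity minus block usage.
  def blocksMemoryFree: Long = maxMemory - blocksMemoryUsed

  // Account for a newly stored block of `size` bytes.
  def store(size: Long): StorageAccounting = copy(memoryUsed = memoryUsed + size)
}

object FreeVersusUsed {
  // Formats a message shaped like the one in the logs, with a chosen "free" value.
  def describe(blockId: String, size: Long, reported: Long): String =
    s"Block $blockId stored in memory (estimated size $size B, free $reported B)"

  def main(args: Array[String]): Unit = {
    val before = StorageAccounting(maxMemory = 1000000L, memoryUsed = 200000L, currentUnrollMemory = 0L)
    val after = before.store(10000L)

    // Reporting the used counter as "free": the figure grows as blocks are added.
    println(describe("example_block", 10000L, after.blocksMemoryUsed))
    // Reporting capacity minus usage: the figure shrinks as blocks are added.
    println(describe("example_block", 10000L, after.blocksMemoryFree))
  }
}
```

In this model, a "free" value that increases every time a block is stored is a sign that the log is printing the used counter.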

Thanks, the PR confirms my concern.