Google Cloud Dataflow UserCodeException: java.lang.OutOfMemoryError: Java heap space when autoscaling a streaming pipeline

Tags: google-cloud-dataflow, apache-beam

I have two streaming pipelines running in production with no issues so far (both on n1-standard-4 workers). However, when I decided to try autoscaling, I got the error in the title. I have tried both regular autoscaling and Streaming Engine (with n2-highmem and n1-highmem machine types), but neither worked.
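For reference, streaming autoscaling and Streaming Engine are normally switched on through the Dataflow pipeline options. A minimal sketch of the kind of launch flags involved; the worker count and machine type below are placeholders, not the settings actually used here:

    import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;

    public class LaunchOptionsSketch {
      public static void main(String[] args) {
        // Illustrative flags only; values are placeholders.
        String[] flags = {
            "--runner=DataflowRunner",
            "--streaming=true",
            "--autoscalingAlgorithm=THROUGHPUT_BASED", // streaming autoscaling
            "--maxNumWorkers=10",
            "--workerMachineType=n1-highmem-2",
            "--enableStreamingEngine=true"             // run on Streaming Engine
        };
        DataflowPipelineOptions options = PipelineOptionsFactory
            .fromArgs(flags)
            .as(DataflowPipelineOptions.class);
        System.out.println(options.getWorkerMachineType());
      }
    }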

Here is my pipeline structure; it reads from Pub/Sub and writes to BigQuery.
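The structure itself was shown as a job graph in the original post. In outline it is a Pub/Sub read, a ParDo that turns each message into a BigQuery TableRow (the PubsubToAnalyticsTableRowFN that appears in the stack traces below), and a BigQuery write. A rough, hedged sketch of that shape; the topic/subscription and table names are invented, and the placeholder DoFn stands in for the real conversion logic:

    import com.google.api.services.bigquery.model.TableRow;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
    import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.transforms.DoFn;
    import org.apache.beam.sdk.transforms.ParDo;

    public class PubsubToBigQuerySketch {

      // Placeholder for the asker's PubsubToAnalyticsTableRowFN; the real
      // conversion logic is not shown in the question.
      static class PubsubToAnalyticsTableRowFN extends DoFn<String, TableRow> {
        @ProcessElement
        public void processElement(@Element String message, OutputReceiver<TableRow> out) {
          out.output(new TableRow().set("raw", message)); // illustrative only
        }
      }

      public static void main(String[] args) {
        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

        p.apply("ReadFromPubsub",
                PubsubIO.readStrings().fromSubscription("projects/my-project/subscriptions/analytics"))
         .apply("ToTableRow", ParDo.of(new PubsubToAnalyticsTableRowFN()))
         .apply("WriteToBigQuery",
                BigQueryIO.writeTableRows()
                    .to("my-project:analytics.events")
                    .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
                    .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER));

        p.run();
      }
    }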


Based on the logs you pasted, one of your ParDo or MapElements functions may be consuming more memory than expected; take a look at PubsubToAnalyticsTableRowFN.
This Beam documentation may help you tune your pipeline.

Apparently, the problem turned out to be in setup, where I was initializing a timezone lookup library that resolves a tz_string from a (lat, lng) pair. Its memory consumption was too high, and I solved this by loading that class at build time instead.
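In other words, TimeZoneEngine.initialize() was being called from the DoFn's @Setup (visible in the stack trace at AnalyticsTableRowFN.setUp), so every DoFn copy set up on a worker re-read and unpacked the bundled timezone data. A hedged sketch of one way to read "loading the class at build time": hold the engine in a static field so it is built once per worker JVM when the class is loaded. Class and field names here are illustrative, not the asker's actual code:

    import com.google.api.services.bigquery.model.TableRow;
    import java.time.ZoneId;
    import java.util.Optional;
    import net.iakovlev.timeshape.TimeZoneEngine;
    import org.apache.beam.sdk.transforms.DoFn;

    // Sketch: the TimeZoneEngine lives in a static field, so the expensive
    // initialize() runs once per worker JVM instead of once per DoFn copy
    // in @Setup. (Assumes lookups on the shared engine are thread-safe.)
    public class AnalyticsTableRowFN extends DoFn<String, TableRow> {

      private static final TimeZoneEngine TZ_ENGINE = TimeZoneEngine.initialize();

      @ProcessElement
      public void processElement(@Element String message, OutputReceiver<TableRow> out) {
        // Hypothetical use: resolve a tz_string from a (lat, lng) pair.
        double lat = 35.68, lng = 139.69;              // placeholder coordinates
        Optional<ZoneId> zone = TZ_ENGINE.query(lat, lng);
        out.output(new TableRow()
            .set("raw", message)
            .set("tz_string", zone.map(ZoneId::getId).orElse(null)));
      }
    }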

Hi KalanyuZ, could you share more about the pipeline and post the exception call stack? @MikhailGryzykhin Yes, I got the following errors. They are long, but I will post as much as I can. Could you also add more details about the pipeline / the DoFns running in it?
java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: java.lang.OutOfMemoryError: Java heap space
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:194)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:165)
org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:63)
org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:50)
org.apache.beam.runners.dataflow.worker.graph.Networks.replaceDirectedNetworkNodes(Networks.java:87)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.create(IntrinsicMapTaskExecutorFactory.java:125)
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1203)
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149)
org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1024)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.beam.sdk.util.UserCodeException: java.lang.OutOfMemoryError: Java heap space
org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:34)
drivemode.com.dataflow.reader.PubsubToAnalyticsTableRowFN$DoFnInvoker.invokeSetup(Unknown Source)
org.apache.beam.runners.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.deserializeCopy(DoFnInstanceManagers.java:80)
org.apache.beam.runners.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.peek(DoFnInstanceManagers.java:62)
org.apache.beam.runners.dataflow.worker.UserParDoFnFactory.create(UserParDoFnFactory.java:95)
org.apache.beam.runners.dataflow.worker.DefaultParDoFnFactory.create(DefaultParDoFnFactory.java:75)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.createParDoOperation(IntrinsicMapTaskExecutorFactory.java:264)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.access$000(IntrinsicMapTaskExecutorFactory.java:86)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:183)
... 11 more
Caused by: java.lang.OutOfMemoryError: Java heap space
org.tukaani.xz.lz.LZDecoder.<init>(Unknown Source)
org.tukaani.xz.LZMA2InputStream.<init>(Unknown Source)
org.tukaani.xz.LZMA2InputStream.<init>(Unknown Source)
org.apache.commons.compress.archivers.sevenz.LZMA2Decoder.decode(LZMA2Decoder.java:39)
org.apache.commons.compress.archivers.sevenz.Coders.addDecoder(Coders.java:76)
org.apache.commons.compress.archivers.sevenz.SevenZFile.buildDecoderStack(SevenZFile.java:933)
org.apache.commons.compress.archivers.sevenz.SevenZFile.buildDecodingStream(SevenZFile.java:909)
org.apache.commons.compress.archivers.sevenz.SevenZFile.getNextEntry(SevenZFile.java:222)
net.iakovlev.timeshape.TimeZoneEngine.lambda$initialize$0(TimeZoneEngine.java:111)
net.iakovlev.timeshape.TimeZoneEngine$$Lambda$76/1740730188.apply(Unknown Source)
java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
net.iakovlev.timeshape.Index.build(Index.java:73)
net.iakovlev.timeshape.TimeZoneEngine.initialize(TimeZoneEngine.java:126)
net.iakovlev.timeshape.TimeZoneEngine.initialize(TimeZoneEngine.java:94)
drivemode.com.dataflow.reader.AnalyticsTableRowFN.setUp(AnalyticsTableRowFN.java:60)
drivemode.com.dataflow.reader.PubsubToAnalyticsTableRowFN$DoFnInvoker.invokeSetup(Unknown Source)
org.apache.beam.runners.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.deserializeCopy(DoFnInstanceManagers.java:80)
org.apache.beam.runners.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.peek(DoFnInstanceManagers.java:62)
org.apache.beam.runners.dataflow.worker.UserParDoFnFactory.create(UserParDoFnFactory.java:95)
org.apache.beam.runners.dataflow.worker.DefaultParDoFnFactory.create(DefaultParDoFnFactory.java:75)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.createParDoOperation(IntrinsicMapTaskExecutorFactory.java:264)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.access$000(IntrinsicMapTaskExecutorFactory.java:86)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:183)
org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:165)
org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:63)

org.apache.beam.vendor.guava.v20_0.com.google.common.util.concurrent.ExecutionError: java.lang.NoClassDefFoundError: Could not initialize class drivemode.com.dataflow.reader.AnalyticsTableRowFN
 org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2212)
 org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache.get(LocalCache.java:4053)
 org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4899)
 org.apache.beam.runners.dataflow.worker.UserParDoFnFactory.create(UserParDoFnFactory.java:91)
 org.apache.beam.runners.dataflow.worker.DefaultParDoFnFactory.create(DefaultParDoFnFactory.java:75)
 org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.createParDoOperation(IntrinsicMapTaskExecutorFactory.java:264)
 org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.access$000(IntrinsicMapTaskExecutorFactory.java:86)
 org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:183)
 org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory$1.typedApply(IntrinsicMapTaskExecutorFactory.java:165)
 org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:63)
 org.apache.beam.runners.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:50)
 org.apache.beam.runners.dataflow.worker.graph.Networks.replaceDirectedNetworkNodes(Networks.java:87)
 org.apache.beam.runners.dataflow.worker.IntrinsicMapTaskExecutorFactory.create(IntrinsicMapTaskExecutorFactory.java:125)
 org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1203)
 org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1000(StreamingDataflowWorker.java:149)
 org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$6.run(StreamingDataflowWorker.java:1024)
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class drivemode.com.dataflow.reader.AnalyticsTableRowFN
java.io.ObjectStreamClass.hasStaticInitializer(Native Method)
java.io.ObjectStreamClass.computeDefaultSUID(ObjectStreamClass.java:1787)
java.io.ObjectStreamClass.access$100(ObjectStreamClass.java:72)
java.io.ObjectStreamClass$1.run(ObjectStreamClass.java:253)
java.io.ObjectStreamClass$1.run(ObjectStreamClass.java:251)
 java.security.AccessController.doPrivileged(Native Method)
java.io.ObjectStreamClass.getSerialVersionUID(ObjectStreamClass.java:250)
java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:611)
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1630)
java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1521)
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1781)
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
org.apache.beam.sdk.util.SerializableUtils.deserializeFromByteArray(SerializableUtils.java:71)
org.apache.beam.runners.dataflow.worker.UserParDoFnFactory$UserDoFnExtractor.getDoFnInfo(UserParDoFnFactory.java:62)
org.apache.beam.runners.dataflow.worker.UserParDoFnFactory.lambda$create$0(UserParDoFnFactory.java:93)
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4904)
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3628)
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2336)
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2295)
org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2208)
... 18 more