Apache Flink: How to deserialize an external checkpoint manifest in Flink?


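I'm using Flink 1.4.2 with RocksDB incremental checkpoints, and I save the checkpoints to an S3 bucket. A checkpoint consists of a manifest file that points to a number of files containing the state. When I open the manifest file in a text editor, I see some unreadable blocks and some S3 URLs.

How can I deserialize this manifest file to get the list of S3 URLs?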

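The SavepointStore class from the Apache Flink runtime library contains methods for storing and loading savepoints.

For my current scenario only, I created this snippet to retrieve the files related to a checkpoint: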

import org.apache.flink.runtime.checkpoint.savepoint.Savepoint;
import org.apache.flink.runtime.checkpoint.savepoint.SavepointStore;
import org.apache.flink.runtime.state.IncrementalKeyedStateHandle;
import org.apache.flink.runtime.state.KeyGroupsStateHandle;
import org.apache.flink.runtime.state.StreamStateHandle;
import org.apache.flink.runtime.state.filesystem.FileStateHandle;
import org.apache.flink.runtime.state.memory.ByteStreamStateHandle;

import java.io.IOException;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CheckpointFileLocator {

    public static void main(String[] args) throws IOException {
        System.out.println(new CheckpointFileLocator()
                .getS3Locations("/Users/ezequiel/Downloads/chk-3-checkpoint_metadata-f350e54becb2"));
    }

    // Loads the checkpoint metadata (manifest) file and returns the locations it references.
    public Set<String> getS3Locations(String manifestPath) throws IOException {
        Savepoint savepoint = SavepointStore.loadSavepoint(manifestPath, this.getClass().getClassLoader());

        // Raw keyed state: the cast assumes every handle is a KeyGroupsStateHandle.
        Stream<String> rawStream = savepoint.getOperatorStates().stream()
                .flatMap(operatorState -> operatorState.getSubtaskStates().values().stream())
                .flatMap(operatorSubtaskState -> operatorSubtaskState.getRawKeyedState().stream())
                .map(keyedStateHandle -> (KeyGroupsStateHandle) keyedStateHandle)
                .map(KeyGroupsStateHandle::getDelegateStateHandle)
                .map(this::getPath);

        // Managed keyed state: the cast assumes incremental (RocksDB) checkpoints, and only
        // the meta state handle of each IncrementalKeyedStateHandle is collected.
        Stream<String> metadataStream = savepoint.getOperatorStates().stream()
                .flatMap(operatorState -> operatorState.getSubtaskStates().values().stream())
                .flatMap(operatorSubtaskState -> operatorSubtaskState.getManagedKeyedState().stream())
                .map(keyedStateHandle -> (IncrementalKeyedStateHandle) keyedStateHandle)
                .map(IncrementalKeyedStateHandle::getMetaStateHandle)
                .map(this::getPath);

        return Stream.concat(rawStream, metadataStream).collect(Collectors.toSet());
    }

    // Returns the file path or in-memory handle name; null for any other handle type.
    private String getPath(StreamStateHandle streamStateHandle) {
        if (streamStateHandle instanceof FileStateHandle) {
            return ((FileStateHandle) streamStateHandle).getFilePath().toString();
        } else if (streamStateHandle instanceof ByteStreamStateHandle) {
            return ((ByteStreamStateHandle) streamStateHandle).getHandleName();
        }
        return null;
    }

}
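
The set returned by getS3Locations above can also hold ByteStreamStateHandle names and a null entry, because getPath returns those for handles that are not files. A minimal sketch of a post-filter that keeps only entries that look like S3 URIs follows. The class name S3LocationFilter and the method onlyS3 are hypothetical and not part of Flink, and the accepted schemes (s3, s3a, s3n) are an assumption about how the bucket is addressed.

```java
import java.util.Collection;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Flink.
final class S3LocationFilter {

    private S3LocationFilter() {
    }

    // Keeps only non-null entries that start with an S3-style scheme, sorted for stable output.
    // Assumption: the checkpoint files are addressed with s3://, s3a:// or s3n://.
    static List<String> onlyS3(Collection<String> locations) {
        return locations.stream()
                .filter(Objects::nonNull)
                .filter(l -> l.startsWith("s3://") || l.startsWith("s3a://") || l.startsWith("s3n://"))
                .sorted()
                .collect(Collectors.toList());
    }
}
```

It could be applied as S3LocationFilter.onlyS3(new CheckpointFileLocator().getS3Locations(manifestPath)).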
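
Two projects (links missing from this copy) both include connectors that can read and write savepoints/checkpoints. You may find one or both of them useful, either to use directly or as examples. See also the Jira ticket (link likewise missing) that tracks ongoing work on better tooling for working with Flink snapshots.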
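
Since the question notes that the S3 URLs are visible when the manifest is opened in a text editor, a quick check that needs no Flink classes is to scan the raw bytes for URL-like strings. The sketch below is a heuristic and not a real deserializer. It assumes the URLs are stored as plain ASCII and are followed by a non-printable byte; if a printable byte happens to follow a URL, the match will run too long. ManifestUrlScanner, scan and scanFile are hypothetical names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flink: a crude "strings"-style scan of a binary file.
final class ManifestUrlScanner {

    // Assumption: a URL is an s3/s3a/s3n scheme followed by printable, non-space ASCII characters.
    private static final Pattern S3_URL = Pattern.compile("s3[an]?://[\\x21-\\x7E]+");

    private ManifestUrlScanner() {
    }

    // Returns the distinct URL-like strings in the order they first appear.
    static Set<String> scan(byte[] content) {
        // ISO-8859-1 maps each byte to exactly one char, so no bytes are dropped or merged.
        String text = new String(content, StandardCharsets.ISO_8859_1);
        Set<String> urls = new LinkedHashSet<>();
        Matcher matcher = S3_URL.matcher(text);
        while (matcher.find()) {
            urls.add(matcher.group());
        }
        return urls;
    }

    // Convenience wrapper that reads a local copy of the manifest file.
    static Set<String> scanFile(String path) throws IOException {
        return scan(Files.readAllBytes(Paths.get(path)));
    }
}
```

Loading the manifest through Flink's own classes, as in the snippet above, remains the more reliable route; this scan is only meant for a quick look.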