Multiprocessing in Spring Integration

spring-integration

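I have a Spring Integration service that needs to read hundreds of XML files from an S3 bucket, then process each file and generate an output. I am using S3InboundFileSynchronizingMessageSource together with a custom implementation of AbstractInboundFileSynchronizer: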

@Bean(name = "s3FileSource")
@InboundChannelAdapter(value = "s3Channel" , poller = @Poller(fixedRate = "3000"))
public S3InboundFileSynchronizingMessageSource s3InboundFileSynchronizingMessageSource() {
    S3InboundFileSynchronizingMessageSource messageSource =
            new S3InboundFileSynchronizingMessageSource(s3InboundFileSynchronizer());
    messageSource.setAutoCreateLocalDirectory(true);
    messageSource.setLocalDirectory(new File(inboundDir));
    messageSource.setLocalFilter(new AcceptOnceFileListFilter<File>());
    return messageSource;
}

@Bean
public CustomAbstractInboundFileSynchronizer s3InboundFileSynchronizer() {

    CustomAbstractInboundFileSynchronizer synchronizer = new CustomAbstractInboundFileSynchronizer(new S3SessionFactory(amazonS3));
    synchronizer.setDeleteRemoteFiles(true);
    synchronizer.setPreserveTimestamp(true);
    synchronizer.setRemoteDirectory(s3BucketName.concat("/").concat(s3InboundFolder));
    synchronizer.setFilter(new S3RegexPatternFileListFilter(".*\\.xml\\.{0,1}\\d{0,2}"));
    Expression expression = PARSER.parseExpression("#this.contains('/') ? #this.substring(#this.lastIndexOf('/') + 1) : #this");
    synchronizer.setLocalFilenameGeneratorExpression(expression);
    return synchronizer;
}
@Bean
public DirectChannel s3Channel() {
    return new DirectChannel();
}

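I tried using the splitter, split(), but it did not seem to help. I also tried using a ThreadPoolTaskExecutor, but whenever I use it I intermittently get an error saying the S3 key is invalid, and files are left in the inbound directory with the .writing extension: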

Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: 5A96A2596B6504D6), S3 Extended Request ID: a757llOA1GbivlaRu1Pf41Sz8XLL52WrgbWCa1PnAzanhyEMCwwR3Zx1H/uytjrEJsrh0Yj8M80=
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1378)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:924)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:702)
at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:454)
at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:416)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:365)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3995)
at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1291)
at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1166)
at org.springframework.integration.aws.support.S3Session.read(S3Session.java:153)
at com.application.service.sample.CustomAbstractInboundFileSynchronizer.copyFileToLocalDirectory(CustomAbstractInboundFileSynchronizer.java:124)

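One more thing: after processing, each XML file is moved to a different "folder" in S3. How can I process multiple XML files concurrently without any errors?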

Adding the relevant part of CustomAbstractInboundFileSynchronizer:

@Override
protected void copyFileToLocalDirectory(String remoteDirectoryPath, S3ObjectSummary remoteFile, File localDirectory,
                                        Session<S3ObjectSummary> session) throws IOException {
    String remoteFileName = this.getFilename(remoteFile);
    //String localFileName = this.generateLocalFileName(remoteFileName);
    String localFileName = remoteFileName;
    String remoteFilePath = remoteDirectoryPath != null
            ? (remoteDirectoryPath + remoteFileName)
            : remoteFileName;
    if (!this.isFile(remoteFile)) {
        if (this.logger.isDebugEnabled()) {
            this.logger.debug("cannot copy, not a file: " + remoteFilePath);
        }
        return;
    }

    File localFile = new File(localDirectory, localFileName);
    if (!localFile.exists()) {
        String tempFileName = localFile.getAbsolutePath() + this.temporaryFileSuffix;
        File tempFile = new File(tempFileName);
        OutputStream outputStream = new BufferedOutputStream(new FileOutputStream(tempFile));
        try {
            session.read(remoteFilePath, outputStream);
        }
        catch (Exception e) {
            if (e instanceof RuntimeException) {
                throw (RuntimeException) e;
            }
            else {
                throw new MessagingException("Failure occurred while copying from remote to local directory", e);
            }
        }
        finally {
            try {
                outputStream.close();
            }
            catch (Exception ignored2) {
            }
        }

        if (tempFile.renameTo(localFile)) {
            if (this.deleteRemoteFiles) {
                session.remove(remoteFilePath);
                if (this.logger.isDebugEnabled()) {
                    this.logger.debug("deleted " + remoteFilePath);
                }
            }
        }
        if (this.preserveTimestamp) {
            localFile.setLastModified(getModified(remoteFile));
        }
    }
}

To process messages in parallel, you must use an ExecutorChannel.

To split file content line by line, you must use .split(Files.splitter()). See its JavaDocs:

/**
 * The {@link AbstractMessageSplitter} implementation to split the {@link File} Message
 * payload to lines.
 * <p>
 * With {@code iterator = true} (defaults to {@code true}) this class produces an
 * {@link Iterator} to process file lines on demand from {@link Iterator#next}. Otherwise
 * a {@link List} of all lines is returned to the to further
 * {@link AbstractMessageSplitter#handleRequestMessage} process.
 * <p>
 * Can accept {@link String} as file path, {@link File}, {@link Reader} or
 * {@link InputStream} as payload type. All other types are ignored and returned to the
 * {@link AbstractMessageSplitter} as is.
 * <p>
 * If {@link #setFirstLineAsHeader(String)} is specified, the first line of the content is
 * treated as a header and carried as a header with the provided name in the messages
 * emitted for the remaining lines. In this case, if markers are enabled, the line count
 * in the END marker does not include the header line and, if
 * {@link #setApplySequence(boolean) applySequence} is true, the header is not included in
 * the sequence.
 *
 * @author Artem Bilan
 * @author Gary Russell
 * @author Ruslan Stelmachenko
 *
 * @since 4.1.2
 */
public class FileSplitter extends AbstractMessageSplitter {
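
One way to process messages in parallel is to replace the DirectChannel from the question with an ExecutorChannel backed by a thread pool. The following is only a minimal configuration sketch, assuming spring-integration-core is on the classpath; the class name ParallelChannelConfig, the bean name fileProcessingExecutor, and the pool sizes are illustrative assumptions, not values from the original post.

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.channel.ExecutorChannel;
import org.springframework.messaging.MessageChannel;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
public class ParallelChannelConfig {

    // Hypothetical pool; the sizes below are assumptions and should be tuned.
    @Bean
    public ThreadPoolTaskExecutor fileProcessingExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(4);
        executor.setMaxPoolSize(4);
        executor.setQueueCapacity(100);
        executor.setThreadNamePrefix("xml-worker-");
        return executor;
    }

    // Same channel name as in the question, but each message sent to it is
    // handed to the executor, so subscribers run on pool threads instead of
    // the poller thread.
    @Bean
    public MessageChannel s3Channel() {
        return new ExecutorChannel(fileProcessingExecutor());
    }
}
```

With this arrangement the polling and S3 synchronization stay on a single poller thread, and only the downstream processing of each file is spread across the pool.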

All your other concerns deserve separate questions.

It is hard to hold a discussion here, and there are limits on what comments and answers can contain.

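You are mixing a lot of unrelated things in one thread and asking too many questions. Could you please explain why you need CustomAbstractInboundFileSynchronizer? And why do you need s3Channel to be a QueueChannel? The @InboundChannelAdapter does produce one message per polling task. You can adjust maxMessagesPerPoll on the @Poller to suit your needs.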
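
As a rough sketch of the maxMessagesPerPoll setting mentioned above, the poller on the inbound channel adapter can be allowed to emit several messages per polling cycle. This assumes integration annotation processing is enabled (for example via @EnableIntegration or Spring Boot auto-configuration); the value "10", the class name PollerConfig, and the stand-in demoSource bean are assumptions for illustration. In the question, the annotated bean would be the S3InboundFileSynchronizingMessageSource.

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.annotation.InboundChannelAdapter;
import org.springframework.integration.annotation.Poller;
import org.springframework.integration.core.MessageSource;
import org.springframework.messaging.support.GenericMessage;

@Configuration
public class PollerConfig {

    // Stand-in message source for illustration only.
    // Every 3000 ms the poller calls receive() up to 10 times,
    // sending each non-null result to s3Channel as its own message.
    @Bean
    @InboundChannelAdapter(value = "s3Channel",
            poller = @Poller(fixedRate = "3000", maxMessagesPerPoll = "10"))
    public MessageSource<String> demoSource() {
        return () -> new GenericMessage<>("demo payload");
    }
}
```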

The .writing suffix belongs to the local temporary file, which is actually created before the connection to AWS S3 is made.
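
In the copyFileToLocalDirectory method shown in the question, the temporary file is opened before session.read(...) is called and nothing removes it when the read throws, which is consistent with .writing files being left behind after a NoSuchKey error. The following plain-JDK sketch illustrates the write-to-temp-then-rename pattern with cleanup on failure. It is not the Spring Integration implementation; the class TempFileCopy and the method copyViaTempFile are hypothetical names introduced for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class TempFileCopy {

    // Hypothetical helper: streams 'source' into '<target><tempSuffix>' and
    // renames it to 'target' only after the copy has completed.
    static Path copyViaTempFile(InputStream source, Path target, String tempSuffix)
            throws IOException {
        Path temp = target.resolveSibling(target.getFileName() + tempSuffix);
        try {
            Files.copy(source, temp, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException | RuntimeException e) {
            // Delete the partial file so a failed read leaves no '*.writing' residue.
            Files.deleteIfExists(temp);
            throw e;
        }
        return Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("inbound");
        InputStream xml = new ByteArrayInputStream("<order/>".getBytes(StandardCharsets.UTF_8));
        Path result = copyViaTempFile(xml, dir.resolve("order.xml"), ".writing");
        System.out.println(Files.readString(result));
    }
}
```

Consumers that filter on the final file name never see a half-written file, and a failed download does not block a later retry of the same name.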
