MongoDB Atlas Data Lake: loading CSV files


I am trying to load a set of CSV or TSV files stored in an S3 bucket. The Data Lake configuration seems fine, but every query fails with an internal error:

> use s3-logs
switched to db s3-logs
> show collections
bucket1
> db.bucket1.find()
Error: error: { "ok" : 0, "errmsg" : "an internal error occurred", "code" : 1 }

Here is my configuration:

> db.runCommand( { "storageGetConfig" : 1 } )
{
    "ok" : 1,
    "storage" : {
        "stores" : [
            {
                "s3" : {
                    "name" : "s3-logs",
                    "region" : "us-east-1",
                    "bucket" : "my-bucket",
                    "delimiter" : "/",
                    "prefix" : "/"
                }
            }
        ],
        "databases" : {
            "s3-logs" : {
                "bucket1" : [
                    {
                        "store" : "s3-logs",
                        "definition" : "/{filename string}"
                    }
                ]
            }
        }
    }
}
The S3 bucket is full of files (S3 access logs).

Sample file contents (note: there is no header row):

53deb06d07d2d3404c3c9face2eae419ba989a5efe0a07bff7f148c6433488ab anton-iot-demo [25/Apr/2019:22:11:42 +0000] 24.246.45.35 arn:aws:iam::824967973088:user/antonum 148770891A83B6F4 REST.GET.ENCRYPTION - "GET /anton-iot-demo?encryption= HTTP/1.1" 404 ServerSideEncryptionConfigurationNotFoundError 357 - 3 - "-" "S3Console/0.4, aws-internal/3 aws-sdk-java/1.11.526 Linux/4.9.152-0.1.ac.221.79.329.metal1.x86_64 OpenJDK_64-Bit_Server_VM/25.202-b08 java/1.8.0_202 vendor/Oracle_Corporation" - 0EH+FcDKZvG3EJaLOg7D8CvgSncCp5DWiaZOg1tWR/sAtTCLrsmUnI+s8/FA2LOETrNZUNSiHhI= SigV4 ECDHE-RSA-AES128-SHA AuthHeader s3.amazonaws.com TLSv1.2
53deb06d07d2d3404c3c9face2eae419ba989a5efe0a07bff7f148c6433488ab anton-iot-demo [25/Apr/2019:22:11:42 +0000] 24.246.45.35 arn:aws:iam::824967973088:user/antonum 3970F31AEE6A9434 REST.GET.TAGGING - "GET /anton-iot-demo?tagging= HTTP/1.1" 404 NoSuchTagSet 294 - 82 - "-" "S3Console/0.4, aws-internal/3 aws-sdk-java/1.11.526 Linux/4.9.152-0.1.ac.221.79.329.metal1.x86_64 OpenJDK_64-Bit_Server_VM/25.202-b08 java/1.8.0_202 vendor/Oracle_Corporation" - 78c0hM+56hRGipoSUcBOeHHRZ9sfUmzrGtPOozqe+KkGkfFGqGyRstZQhI52os8XcR+5GPEUnJU= SigV4 ECDHE-RSA-AES128-SHA AuthHeader s3.amazonaws.com TLSv1.2                    

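What exactly am I doing wrong? I would like to be able to specify the file header and document details such as the delimiter, but I cannot find anything about that in the configuration.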

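This is a file format problem. I don't think the file is CSV or TSV: the fields appear to be separated by spaces rather than commas or tabs. We also require a header row of field names (that is how you write queries against specific fields). You can also tell us what the file format is through the file extension and/or an explicit defaultFormat.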

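Oh! You are right, the file is indeed space-delimited. Even if it were comma- or tab-delimited, it would not work without a header record. I need to figure out how to convert it into a format Mongo can read first.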
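For anyone facing the same conversion, here is a minimal Python sketch that turns space-delimited S3 access log lines into CSV with a header row. It is not an official tool. The column names in FIELD_NAMES are my own labels, assumed from the field order visible in the sample lines above, and the function names are hypothetical. The tokenizer treats a [bracketed] timestamp or a "quoted" string as a single field and does not handle escaped quotes inside a quoted field.

```python
import csv
import re

# One token is a [bracketed timestamp], a "quoted string",
# or a run of non-whitespace characters.
TOKEN_RE = re.compile(r'\[([^\]]*)\]|"([^"]*)"|(\S+)')

# Assumed column labels (not taken from the thread), in the order
# the fields appear in the sample log lines.
FIELD_NAMES = [
    "bucket_owner", "bucket", "time", "remote_ip", "requester", "request_id",
    "operation", "key", "request_uri", "http_status", "error_code",
    "bytes_sent", "object_size", "total_time", "turn_around_time",
    "referer", "user_agent", "version_id", "host_id", "signature_version",
    "cipher_suite", "authentication_type", "host_header", "tls_version",
]


def split_log_line(line):
    """Split one access log line into fields, dropping [] and "" wrappers."""
    tokens = []
    for match in TOKEN_RE.finditer(line):
        bracketed, quoted, bare = match.groups()
        if bracketed is not None:
            tokens.append(bracketed)
        elif quoted is not None:
            tokens.append(quoted)
        else:
            tokens.append(bare)
    return tokens


def logs_to_csv(lines, out):
    """Write a header row, then one CSV row per non-empty log line."""
    writer = csv.writer(out)
    writer.writerow(FIELD_NAMES)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        tokens = split_log_line(line)
        # Pad short lines and truncate long ones so every row
        # has exactly as many columns as the header.
        row = (tokens + [""] * len(FIELD_NAMES))[:len(FIELD_NAMES)]
        writer.writerow(row)
```

Calling logs_to_csv with an open log file and an output file opened with newline="" should produce a file whose first line is the header. Saving it with a .csv extension follows the advice in the answer above about using the file extension to signal the format. I have not verified the result against Data Lake itself.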