Amazon S3 and Athena: grabbing the latest file in a bucket


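I am new to both Athena and S3. We have set up Athena to access S3 buckets that are attached to databases, and each bucket holds a table of the same data for every day. For example: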

database-name - "sales"
tables: ["19.02.2019", "18.02.2019",..."01.02.2019"]

To query the tables, I need to run something like the following:

SELECT 
a.creation_date,
a.number,
pa.customer_number,
a.customer_type,
a.name,
a.city,
a.country,
a.type,
a.business,
b.industry,
cu.group,
cu.closing_date,
cu.interest_flag
FROM 
    (SELECT a.creation_date,
     a.type,
     a.number,
     a.customer_type,
     a.business,
     a.id,
     b.industry,
     customer.id,
     concat (p.first_name, ' ' ,p.last_name) AS name, p.address, p.country
    FROM "accounts"."2019_02_19_01_32_18" AS a
    LEFT JOIN "customers"."2019_02_19_02_31_03" AS c
        ON a.id=c.id
    LEFT JOIN "people"."2019_02_19_06_05_10" AS p
        ON c.person_id=p.id
    LEFT JOIN "strategic_partners"."2019_02_18_05_57_59" AS par
        ON par.uid=p.strapartner_uid
    WHERE a.number is NOT null  and a.customer_type = (1)

    UNION

    SELECT a.creation_date,
    a.type,
    a.number,
    a.customer_type,
    a.business_name,
    a.id,
    b.industry,
    customer.id,
    concat (p.first_name, ' ',p.last_name) AS name, p.address, p.country
    FROM "accounts"."2019_02_19_01_32_18" AS a
    LEFT JOIN "customers"."2019_02_19_02_31_03" AS c
        ON a.id=c.id
    LEFT JOIN "people"."2019_02_19_06_05_10" AS p
        ON c.person_id=p.id
    LEFT JOIN "strategic_partners"."2019_02_18_05_57_59" AS par
        ON par.uid=p.strapartner_uid
    WHERE a.number is NOT null and a.customer_type IN (4,8)
    ) AS a

    LEFT JOIN "progressive_accounts"."2019_02_18_18_15_28" AS pa
     ON pa.credit_number = a.credit_number
    LEFT JOIN "progressive_customer"."2019_02_18_18_15_01" AS cu
     ON pa.prog_number=cu.prog_number
     WHERE a.creation_date>='2018-10-01' AND a.creation_date<='2018-12-31'
     ORDER BY a.creation_date desc, a.business_name asc

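I also don't understand why Athena tries to create the view in the database selected in the dropdown menu on the right-hand side of the Athena console, rather than in a "public" database (as PostgreSQL or similar databases would).

Any guidance would be great.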

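You cannot use a subquery to return a table name to be used in a query.

Instead, each day you could use CREATE OR REPLACE VIEW to create a view that points to the "latest" table. Then simply query the view.

You presumably have some daily job that creates each of these tables, so have it update the view as well.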
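As a rough illustration of what that daily job might do, here is a minimal Python sketch. It assumes the table names are timestamps in the form seen in the question (for example "2019_02_19_01_32_18"). The function names and the view name "latest" are hypothetical. The sketch only builds the SQL text; submitting it to Athena (for instance with boto3's `start_query_execution`) is left out.

```python
from datetime import datetime

# Assumption: snapshot tables are named like "2019_02_19_01_32_18".
TABLE_NAME_FORMAT = "%Y_%m_%d_%H_%M_%S"


def latest_table(table_names):
    """Return the table name carrying the most recent timestamp."""
    dated = []
    for name in table_names:
        try:
            dated.append((datetime.strptime(name, TABLE_NAME_FORMAT), name))
        except ValueError:
            continue  # skip tables that are not timestamped snapshots
    if not dated:
        raise ValueError("no timestamped tables found")
    return max(dated)[1]


def build_latest_view_sql(database, table_names, view_name="latest"):
    """Build a CREATE OR REPLACE VIEW statement pointing at the newest table."""
    table = latest_table(table_names)
    # The view name is qualified with the database so it does not depend on
    # whichever database happens to be selected in the console.
    return (
        f'CREATE OR REPLACE VIEW "{database}"."{view_name}" AS '
        f'SELECT * FROM "{database}"."{table}"'
    )


if __name__ == "__main__":
    tables = ["2019_02_17_01_30_02", "2019_02_19_01_32_18", "2019_02_18_01_31_44"]
    print(build_latest_view_sql("accounts", tables))
```

Queries can then refer to "accounts"."latest" instead of a hard-coded timestamped table, and the job would repeat this for each database that receives a daily snapshot.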

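When I try to create a view using the standard syntax, I get the following error: Access denied when writing to location: s3://paht/path/path/path/.txt This query ran against the "database" database, unless qualified by the query. Please post the error message on our forum or contact customer support... I have not found any reference to this error, and I am not sure which service I need to reconfigure to add these permissions to my role. Any ideas?

If you remove the CREATE VIEW part and run it as a SELECT statement, does it work? Can you edit your question to show the failing statement?

You have a very, very complex query that you are trying to turn into a view. That would put a strain on most databases! I suggest you start simple and build up. First create a simple view that works (for example, just SELECT * from one table). Then try it on one half of the UNION, and so on. You could also try pulling some of the joins out of the inner SELECT and putting them in the outer SELECT.

For ETL, creating a new table from a simple query, rather than running the complex query every time, looks like a good option. This is very common in data warehousing.

Thanks for the information. I actually already had a view for a simple query in mind. I wanted to create that one first, as you suggest, but I cannot even create a simple view. I get the same error when trying to create a view with just a simple SELECT statement on a single table.

In Amazon Athena, if you click the Settings link, you can define where query results are stored. Most likely it is currently configured to a bucket that does not exist, or the user/Athena does not have permission to write to the given bucket.

@JohnRotenstein - I have already defined the bucket location. So I can save the results of SELECT-style queries but cannot create a view, which is really strange.

Is the error still the same as the one above?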

Access denied when writing to location: s3://dp-jupyterlabXXXXXXXXXXXXXX/notebooks/<username>/athena/Unsaved/2019/02/25/<unique reference id>.txt

This query ran against the "database name" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: xxxxxx-xxxx-xxxxx-xxxx-xxxxxxxxxxx.

- Glue access to the bucket

- all Glue policies to the user
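
The error above is about writing to the query results location in S3, and the two points above concern Glue. A minimal Python sketch of an IAM policy document covering both could look like the following. The bucket name, prefix, function name and the exact list of actions are assumptions for illustration, not a verified minimal set, and the Glue statement is deliberately broad.

```python
import json


def athena_view_policy(results_bucket, results_prefix=""):
    """Build an IAM policy document (as a dict) for writing Athena query
    results to S3 and managing table/view entries in the Glue catalog."""
    bucket_arn = f"arn:aws:s3:::{results_bucket}"
    prefix = results_prefix.strip("/")
    objects_arn = f"{bucket_arn}/{prefix}/*" if prefix else f"{bucket_arn}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level permissions apply to the bucket ARN itself.
                "Sid": "ListResultsBucket",
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
                "Resource": bucket_arn,
            },
            {
                # Object-level permissions apply to keys under the prefix.
                "Sid": "ReadWriteQueryResults",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": objects_arn,
            },
            {
                # Assumption: "*" is used for brevity; narrow it in practice.
                "Sid": "ManageGlueCatalogEntries",
                "Effect": "Allow",
                "Action": [
                    "glue:GetDatabase",
                    "glue:GetTable",
                    "glue:CreateTable",
                    "glue:UpdateTable",
                ],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    # "example-results-bucket" and "athena/" are placeholder values.
    print(json.dumps(athena_view_policy("example-results-bucket", "athena/"), indent=2))
```

The printed JSON would be attached to the role or user that runs the Athena queries, with the bucket and prefix matching the query result location configured under Settings.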