How do I run a Python script to generate/update a JSON file in an S3 bucket?

javascript python html json amazon-s3

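My question: how do I run a Python script that generates/updates a JSON file in an S3 bucket? Also, should the script run every time a file is added to my bucket, or every time someone visits my web page?

Read on for the details…

I have a website hosted on AWS, with my static pages in a public S3 bucket. Its purpose is to share music I have written with music students. I have a Python script that scans all the objects (sheet music PDFs) in a folder in that same S3 bucket. The script then creates a JSON file containing the name of every S3 object along with each object's URL.

Here is the format of the JSON file: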

{
  "bass": [{
      "bass-song1": "http://www.website.com/bass-song1.pdf"
    }, {
      "bass-song2": "http://www.website.com/bass-song2.pdf"
    }],
  "drum": [{
      "drum-song1": "http://www.website.com/drum-song1.pdf"
    }, {
      "drum-song2": "http://www.website.com/drum-song2.pdf"
    }],
  "guitar": [{
      "guitar-song1": "http://www.website.com/guitar-song1.pdf"
    }, {
      "guitar-song2": "http://www.website.com/guitar-song2.pdf"
    }]
}

My working Python script, for reference:

import boto3
import json
import pprint

# This program creates a json file
# with temporary URLs for bucket objects, organized by folder
# For use with javascript that generates links to bucket objects for website

# create a session to retrieve credentials from ~/.aws/credentials
session = boto3.Session(profile_name='<MY PROFILE>')

# use your credentials to create a low-level client with the s3 service
s3 = session.client('s3')

# Store dictionary of objects from bucket that starts with the prefix '__'
response = s3.list_objects_v2(Bucket='<MY BUCKET>', Prefix='<FOLDER IN BUCKET>')

folder_list = []
url_json = {}

# (the value of 'Contents' is a list of dictionaries)
# For all the dictionaries in the 'Contents' list,
# IF they don't end with '/' (meaning, if its not a directory)...
for i in response['Contents']:
    if i['Key'].endswith('/') != True:

        # get the directory of the current Key, save as string in 'dir'
        full_path = i['Key']
        # this retrieves the folder after '__'
        dir = full_path.split("/")[1]
        # capitalize the directory name, update variable
        dir = dir.capitalize()
        # this retrieves the file name
        filename = full_path.split("/")[2]

        # if the name of the directory ('dir') is not in folder_list:
        # add it to folder_list,
        # and add an item to dictionary 'url_json' where key is current 'dir' and value is empty list
        if dir not in folder_list:
            folder_list.append(dir)
            url_json[folder_list[-1]] = []

        # generate a temporary URL for the current bucket object
        url = s3.generate_presigned_url('get_object', Params={'Bucket':'<MY BUCKET>', 'Key':i['Key']}, ExpiresIn=3600)

        # create a dictionary for each bucket object
        # store the object's name (Key) and URL (value)
        object_dict = {filename:url}

        # Append the newly created URL to a list in the 'url_json' dictionary,
        # whose key is the last directory in 'folder_list'
        url_json[folder_list[-1]].append(object_dict)

# Dump the contents of the 'url_json' dictionary to the 'url_list.json' file
# if it already exists, overwrite it
with open('url_list.json', mode='w') as outfile:
    json.dump(url_json, outfile)
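
A side note on the listing step in the script above: a single `list_objects_v2` call returns at most 1,000 keys, and the response has no `'Contents'` entry at all when nothing matches the prefix. The following is a minimal sketch of a listing helper that handles both cases with boto3's paginator. The helper name `iter_object_keys` is made up for illustration and is not part of the original script.

```python
def iter_object_keys(s3_client, bucket, prefix):
    """Yield every object key under `prefix`, following pagination.

    `s3_client` is expected to be a boto3 S3 client, for example
    boto3.Session(profile_name='<MY PROFILE>').client('s3').
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # 'Contents' is missing from a page that has no matching objects,
        # so fall back to an empty list instead of raising KeyError.
        for obj in page.get("Contents", []):
            yield obj["Key"]


# Hypothetical usage, with the same placeholders as the script above:
#   for key in iter_object_keys(s3, '<MY BUCKET>', '<FOLDER IN BUCKET>'):
#       print(key)
```

The loop `for i in response['Contents']` could then iterate over this generator instead, with `i['Key']` replaced by the yielded key.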

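In addition, in the body of my web page I wrote some inline JavaScript that loads and reads the JSON file and uses the URLs and their associated text to create links, so that people can access these files from my web page.

My working JavaScript, for reference: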

<script type="text/javascript">

async function getData(url) {
  const response = await fetch(url);
  return response.json()
}

async function main() {
  const data = await getData('<JSON FILE URL>');

  // 'instrument' example: 'Bass'
  for (var instrument in data) {

    var h = document.createElement("H3"); // Create the H3 heading element
    var t = document.createTextNode(instrument); // Create a text node with the instrument name
    h.appendChild(t); // Append the text node to the H3 element
    document.body.appendChild(h); // Append the H3 element to the document body

    // store the list of songs for the current instrument
    var song_list = data[instrument]
    // for each song (list element),
    for (var song_object of song_list) {
      // for every song name in the object (in python, dictionary)
      for (var song_name in song_object) {
        // create a var with the name and URL of the PDF song file
        var link_str = song_name;
        var link_url = song_object[song_name];

        // Create link to appear on website
        const a = document.createElement("a");
        var lineBreak = document.createElement("br");
        a.href = link_url;
        a.innerText = link_str;
        a.style.backgroundColor="rgba(255, 255, 255, 0.7)"
        document.body.appendChild(a);
        document.body.append(lineBreak);
      }
    }
  }
}
main();

</script>


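So my end goal is: I upload a PDF file to the S3 bucket, the Python script runs and updates the JSON file in the S3 bucket, and when someone visits my web page, the JavaScript parses the JSON file to create links to those PDF files. Should I have the Python script create the JSON when I upload a file to the bucket, or when someone visits the site? The problem is that these object URLs have ... but I also don't want the code to run every time someone loads the page (for $$ reasons).

Any help would be greatly appreciated. Let me know if you need more information from me. Thank you very much.

Sebastian.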

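Why are you using pre-signed URLs? Don't you want the files to always be public? Will users authenticate to get the JSON file? Ensuring privacy leads to a different design than public access, so it is worth deciding up front.

If you can avoid pre-signed URLs, you can use an Amazon S3 event to trigger a Python AWS Lambda function whenever a new file is added. The code can then list the objects and create the JSON file, so it only runs when the files change.

Thanks for your input. I will drop the pre-signed URLs in favor of the S3 event trigger option, but how do I get a new object's public URL programmatically? I can't find a call for it.

The URL is:

BUCKET-NAME.s3.amazonaws.com/OBJECT-KEY

, if your domain name is the same as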
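
The suggestion in the comments (an S3 event triggers a Lambda function that lists the objects and rewrites the JSON file) could look roughly like the sketch below. It is not the asker's final code. The helper names `public_url` and `build_index`, the prefix `sheet-music/`, and the output key `url_list.json` are assumptions chosen for illustration, and the URL follows the `BUCKET-NAME.s3.amazonaws.com/OBJECT-KEY` pattern quoted above. It also assumes the objects are publicly readable and that keys look like `<prefix>/<folder>/<file>`, as in the original script.

```python
import json
from urllib.parse import quote


def public_url(bucket, key):
    """Build the public URL of an object from its bucket name and key."""
    # quote() percent-encodes spaces and other unsafe characters but keeps '/'.
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


def build_index(keys, bucket):
    """Group keys shaped like '<prefix>/<folder>/<file>' by capitalized folder."""
    index = {}
    for key in keys:
        parts = key.split("/")
        # Skip folder placeholders and keys that are too shallow to split.
        if key.endswith("/") or len(parts) < 3:
            continue
        folder, filename = parts[1].capitalize(), parts[2]
        index.setdefault(folder, []).append({filename: public_url(bucket, key)})
    return index


def lambda_handler(event, context):
    """Rebuild the JSON index whenever S3 reports a newly created object."""
    import boto3  # imported here so the helpers above run without AWS libraries

    s3 = boto3.client("s3")
    bucket = event["Records"][0]["s3"]["bucket"]["name"]
    prefix = "sheet-music/"      # hypothetical folder that holds the PDFs
    index_key = "url_list.json"  # hypothetical key, kept outside the prefix

    # Collect every key under the prefix; a single call returns at most 1,000.
    keys = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))

    # Overwrite the index file in the bucket with the fresh listing.
    s3.put_object(
        Bucket=bucket,
        Key=index_key,
        Body=json.dumps(build_index(keys, bucket)).encode("utf-8"),
        ContentType="application/json",
    )
    return {"indexed": len(keys)}
```

If this is wired up, the S3 event notification should probably be filtered to the PDF folder (by prefix or by a `.pdf` suffix), so that writing `url_list.json` back to the same bucket does not trigger the function again. The function's execution role would also need permission to list the bucket and to put the index object.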