Google Cloud Platform: Downloading a folder from a Google Cloud Storage bucket


google-cloud-platform google-cloud-storage


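I am new to Google Cloud Platform. I trained my model on Datalab and saved the model folder to Cloud Storage in my bucket. I can download existing files from the bucket to my local machine by right-clicking a file --> Save link as. But when I try to download a folder the same way, I get an image of it rather than the folder itself. Is there a way to download the whole folder together with its contents? Is there a gsutil command that copies a folder from Cloud Storage to a local directory?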

You can find the documentation for the gsutil tool. To answer your question more specifically, the command to use is:

gsutil cp -r gs://bucket/folder .

Prerequisite: the Google Cloud SDK is installed and initialized ($ gcloud init)

Command:

gsutil -m cp -r gs://bucket-name .

This definitely works.

If you are downloading data from Google Cloud Storage with Python and want to keep the same folder structure, follow this code I wrote in Python.

Option 1

import logging
import os

from google.cloud import storage


def download_from_bucket(bucket_name, blob_path, local_path):
    """Download every object under blob_path into local_path, keeping the folder structure."""
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)

    for blob in bucket.list_blobs(prefix=blob_path):
        # Skip the zero-byte "folder" placeholder objects
        if blob.name.endswith('/'):
            continue
        # Path of the object relative to blob_path (slice the prefix off, do not replace())
        relative_path = blob.name[len(blob_path):].lstrip('/')
        download_path = os.path.join(local_path, *relative_path.split('/'))
        # Create the local folder and any missing sub-folders
        os.makedirs(os.path.dirname(download_path) or '.', exist_ok=True)
        logging.info(download_path)
        blob.download_to_filename(download_path)

    logging.info('Blob {} downloaded to {}.'.format(blob_path, local_path))


bucket_name = 'google-cloud-storage-bucket-name'  # do not use gs://
blob_path = 'training/data'  # blob path in bucket where data is stored
local_dir = 'local-folder name'  # trainingData folder in local
download_from_bucket(bucket_name, blob_path, local_dir)

Option 2: using the gsutil SDK. Here is another way to do this from a Python program.

import os


def download_bucket_objects(bucket_name, blob_path, local_path):
    # blob_path is the folder name inside the bucket
    command = "gsutil cp -r gs://{bucketname}/{blobpath} {localpath}".format(bucketname = bucket_name, blobpath = blob_path, localpath = local_path)
    # Runs gsutil through the shell, so the Google Cloud SDK must be installed
    os.system(command)
    return command
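A variant of Option 2, sketched with `subprocess.run` instead of `os.system`: passing the arguments as a list avoids shell quoting problems when a path contains spaces, and `check=True` turns a gsutil failure into a Python exception. The helper names `build_gsutil_command` and `run_gsutil_download` are hypothetical, and the sketch assumes `gsutil` is installed and on the PATH.

```python
import subprocess


def build_gsutil_command(bucket_name, blob_path, local_path):
    """Build the argument list for `gsutil -m cp -r` (hypothetical helper)."""
    source = "gs://{}/{}".format(bucket_name, blob_path)
    return ["gsutil", "-m", "cp", "-r", source, local_path]


def run_gsutil_download(bucket_name, blob_path, local_path):
    """Run the copy; raises CalledProcessError if gsutil exits with an error."""
    command = build_gsutil_command(bucket_name, blob_path, local_path)
    subprocess.run(command, check=True)
    return command
```

On Windows the gsutil launcher is a `.cmd` script, so this may need the full launcher path or `shell=True`.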

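Option 3: no Python, use the terminal and the Google SDK directly. Prerequisite: the Google Cloud SDK is installed and initialized ($ gcloud init). For the command, see the following link: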

This is the code I wrote. It downloads the complete directory structure to the VM/local storage.

from google.cloud import storage
import os

bucket_name = "ar-data"

storage_client = storage.Client()
bucket = storage_client.get_bucket(bucket_name)

dirName = 'Data_03_09/'  # ***folder in bucket whose content you want to download
blobs = bucket.list_blobs(prefix=dirName)
destpath = r'/home/jupyter/DATA_test/'  # ***path on your vm/local where you want to download the bucket directory
for blob in blobs:
    # Slice the folder prefix off; str.lstrip() removes a set of characters, not a prefix
    relpath = blob.name[len(dirName):]
    if not relpath or relpath.endswith('/'):
        continue  # skip the folder placeholder objects
    target = os.path.join(destpath, relpath)
    # Create the intermediate directories if they do not exist yet
    os.makedirs(os.path.dirname(target), exist_ok=True)
    print("downloading ... ", relpath)
    blob.download_to_filename(target)
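One detail worth checking when removing the folder prefix from an object name: `str.lstrip()` takes a set of characters, not a prefix, so it can also remove the start of the file name. The sketch below shows the difference and maps an object name to a local path by slicing instead. The helper `local_target` and the file name `D0.csv` are hypothetical, chosen only for illustration.

```python
import os


def local_target(blob_name, prefix, destpath):
    """Map an object name to a local file path, dropping the folder prefix."""
    if not blob_name.startswith(prefix):
        raise ValueError("object name does not start with the prefix")
    relative = blob_name[len(prefix):]
    return os.path.join(destpath, *relative.split('/'))


# lstrip() strips every leading character found in 'Data_03_09/',
# which includes the 'D' and '0' of the file name itself.
print('Data_03_09/D0.csv'.lstrip('Data_03_09/'))   # prints .csv
print(local_target('Data_03_09/D0.csv', 'Data_03_09/', 'DATA_test'))
```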

Or simply use this in the terminal:

gsutil -m cp -r gs://{bucketname}/{folderPath} {localpath}

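This is how to download a folder from a Google Cloud Storage bucket.

Run the following command to download it from bucket storage to a local path in your Google Cloud console:

gsutil -m cp -r gs://{bucketname}/{folderPath} {localpath}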

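After running that command, run the

ls

command to list the files and directories on the local path and confirm that the folder is there.

Now run the command below to zip the folder: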

zip -r foldername.zp yourfolder/*

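When the compression has finished, click the "More" dropdown menu on the right side of the Google Cloud console.

Then select the "Download file" option. You will be prompted for the name of the file to download; enter the zip file name, "foldername.zp".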
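If you would rather do the compression step from Python than with the `zip` command, a minimal sketch using the standard library's `shutil.make_archive` could look like this. The function name `zip_folder` is hypothetical. Note that `make_archive` appends the `.zip` extension itself, so the archive name differs from the one used in the command above.

```python
import os
import shutil


def zip_folder(folder, archive_base):
    """Zip `folder` (kept as the top-level entry) into archive_base + '.zip'."""
    folder = os.path.abspath(folder)
    parent, name = os.path.split(folder)
    # root_dir/base_dir keep paths inside the archive relative to the parent folder
    return shutil.make_archive(archive_base, 'zip', root_dir=parent, base_dir=name)
```

For example, `zip_folder('yourfolder', 'foldername')` would write `foldername.zip` and return its path.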

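Not the right place for this question.

That is not really what I want. I have already used the gsutil command to copy my folder from Google Cloud Datalab to Cloud Storage. My question is whether there is a way to download the folder to my local machine so that I can use it offline.

That command does exactly this when it is run on your local command line. The two arguments after the -r flag specify: 1. the GCS path of the folder you want to download; 2. the folder to download into (with "." this is the current folder of your command-line session), but it can also be something like C:/Users/username/Documents or /home/username/.

Whenever I give the path of a local directory as the destination, for example C:/Users/username/Documents, I get this error: "CommandException: Destination URL must name a directory, bucket, or bucket subdirectory for the multiple source form of the cp command."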

gsutil cp -r gs://api-project-921234036675cancer-data-7617/cancer_model7617 C:/Users/sanghamitra.rc

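I got the same error as @JSnow and fixed it in my case. The cause was that the destination folder did not exist. I expected the command to create it, but it raises that error instead. Simply creating the directory first fixed it for me. I hope this helps anyone looking for the same answer.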
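When the download is driven from a script, the fix described in the last comment can be sketched as follows: create the destination directory first, then pass the same path to `gsutil cp -r`. The helper name `ensure_destination` is hypothetical.

```python
import os


def ensure_destination(local_path):
    """Create the destination directory if it is missing and return its absolute path."""
    # exist_ok=True makes this a no-op when the directory is already there
    os.makedirs(local_path, exist_ok=True)
    return os.path.abspath(local_path)
```

Call it with the intended destination before running the copy command, so that gsutil finds an existing directory.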