NumPy: How to load a .npz file into Google Compute Engine


numpy google-cloud-platform google-compute-engine


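I have a CNN model that I want to run in a Jupyter interface connected to a VM instance on Google Compute Engine. I cannot figure out how to read my data from Jupyter: the photo image data has been converted into a .npz file and saved in a Google Cloud Storage bucket.

This is what I have tried so far: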

def load_dataset():

    # load dataset

    data = load('gs://bucket/data.npz')
    X, y = data['arr_0'], data['arr_1']

    # separate into train and test datasets
    trainX, testX, trainY, testY = train_test_split(X, y, test_size=0.3, random_state=1)
    print(trainX.shape, trainY.shape, testX.shape, testY.shape)
    return trainX, trainY, testX, testY

I thought I could use gsutil to alias the path to the bucket and file, but I get an error saying that no such file exists.

Here is the full traceback:

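We can suggest two different approaches.

One is to have Python invoke the gsutil command on the machine to copy the file, and then process the downloaded data.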
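A minimal sketch of the gsutil route, assuming gsutil is installed and authenticated on the VM and that copying the object into the working directory is acceptable. The helper names build_gsutil_cp_command and download_with_gsutil are hypothetical, introduced only for illustration; the bucket and file names are the placeholders from the question.

```python
import subprocess


def build_gsutil_cp_command(gcs_uri, local_path):
    """Return the argument list for `gsutil cp <gcs_uri> <local_path>`."""
    return ["gsutil", "cp", gcs_uri, local_path]


def download_with_gsutil(gcs_uri, local_path):
    """Copy a Cloud Storage object to a local file by shelling out to gsutil.

    Assumes gsutil is on PATH and has credentials for the bucket.
    Raises subprocess.CalledProcessError if the copy fails.
    """
    subprocess.run(build_gsutil_cp_command(gcs_uri, local_path), check=True)
    return local_path


# Example usage (placeholder names, not run here):
# local_file = download_with_gsutil("gs://bucket/data.npz", "data.npz")
# data = numpy.load(local_file)
```

Passing the command as a list avoids shell quoting problems, and check=True surfaces a failed copy as an exception instead of a later "no such file" error.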

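Alternatively, you can use what is arguably a better approach: the Cloud Storage client library for Python.

For the latter, make sure you have already run the following on the machine where your code runs:

pip install google-cloud-storage

Assuming the load function you are using takes a file in the current working directory as input, add the following snippet to your source code: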
from google.cloud import storage

client = storage.Client()

def download_gcs_object(name_bucket, name_blob):
    bucket = client.bucket(name_bucket)
    blob = bucket.blob(name_blob)
    blob.download_to_filename(blob.name)
    print("Downloaded into current working directory a file with name ", blob.name)


After that, you can edit the code you posted as follows:
def load_dataset():

    #download a Cloud Storage object
    BUCKET="bucket" #TODO edit
    BLOB="data.npz" #TODO edit
    #or #BUCKET, BLOB = 'gs://bucket/data.npz'.split('/')[-2:] #if you prefer, have to edit accordingly again
    download_gcs_object(BUCKET, BLOB)

    # load dataset from filename with the blob name 

    data = load(BLOB)
    #The rest of the code is as it was...
    X, y = data['arr_0'], data['arr_1']

    # separate into train and test datasets
    trainX, testX, trainY, testY = train_test_split(X, y, test_size=0.3, random_state=1)
    print(trainX.shape, trainY.shape, testX.shape, testY.shape)
    return trainX, trainY, testX, testY
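If you would rather not write the file to disk, a variant is to read the object into memory and hand the bytes to NumPy. This is only a sketch under two assumptions: that load in the question is numpy.load, and that the archive fits in memory. The helper names split_gs_uri, load_npz_from_bytes and load_npz_from_gcs are hypothetical.

```python
import io
from urllib.parse import urlparse

import numpy as np


def split_gs_uri(uri):
    """Split 'gs://bucket/path/to/blob' into (bucket, blob_name)."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError("expected a gs://bucket/object URI, got %r" % uri)
    return parsed.netloc, parsed.path.lstrip("/")


def load_npz_from_bytes(raw):
    """Load every array of an .npz archive held in memory into a dict."""
    with np.load(io.BytesIO(raw)) as archive:
        return {name: archive[name] for name in archive.files}


def load_npz_from_gcs(uri):
    """Download an .npz object from Cloud Storage and load it in memory."""
    # Imported here so the helpers above work without the client library.
    from google.cloud import storage

    bucket_name, blob_name = split_gs_uri(uri)
    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    return load_npz_from_bytes(blob.download_as_bytes())


# Example usage (placeholder URI, not run here):
# data = load_npz_from_gcs("gs://bucket/data.npz")
# X, y = data["arr_0"], data["arr_1"]
```

Unlike the split('/')[-2:] shortcut, split_gs_uri keeps the full object name when the blob sits under a prefix such as gs://bucket/dir/data.npz.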

Let us know whether this works; if it does not, please explain what the load function is.
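If the error occurs in the load statement (ideally show the full message and traceback), then the rest of the code is irrelevant. You need to focus on getting the path right and/or downloading the file from the source. Please give us some background on the load function you are using.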