How to read a blob (pickle) file from GCS in a Google Cloud Dataflow job? (Python, Google Cloud Storage, Google Cloud Dataflow, Apache Beam) - Fatal编程技术网


python google-cloud-storage google-cloud-dataflow


I am trying to run a Dataflow pipeline remotely, and it needs to use a pickle file. Locally, I can load the file with the code below:

with open(known_args.file_path, 'rb') as fp:
    file = pickle.load(fp)

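However, this does not work when the path points to Cloud Storage (gs://…):

IOError: [Errno 2] No such file or directory: 'gs://.../.pkl'

I roughly understand why it fails, but I cannot find the right way to do it.

open() is a standard Python library function, and it does not understand Google Cloud Storage paths. Use Beam's FileSystems API (apache_beam.io.filesystems.FileSystems) instead; it understands gs:// paths as well as the other filesystems that Beam supports.

If the pickle files are in your GCS bucket, you can load them as blobs and process them further as in your code (using pickle.load()):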

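import apache_beam as beam


class ReadGcsBlobs(beam.DoFn):
    def process(self, element, *args, **kwargs):
        # Imported here so that the import runs on the Dataflow worker.
        from apache_beam.io.gcp import gcsio
        gcs = gcsio.GcsIO()
        # Emits (path, raw bytes); unpickle the bytes downstream with pickle.loads().
        yield (element, gcs.open(element).read())


# Usage example (p is an existing beam.Pipeline):
files = (
    p
    | "Initialize" >> beam.Create(["gs://your-bucket-name/pickle_file_path.pickle"])
    | "Read blobs" >> beam.ParDo(ReadGcsBlobs())
)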

