Google BigQuery: Transferring a large table from Google BigQuery to Google Cloud Storage - Fatal编程技术网

google-bigquery google-cloud-storage

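I need to transfer a large table in BigQuery (2B records) to Cloud Storage in CSV format. I am doing the transfer using the console.

Because of the size of the table, I have to specify a URI containing a * so that the export is sharded. I end up with 400 CSV files in Cloud Storage, and each one has a header row.

This makes combining the files time-consuming, since I need to download the CSV files to another machine, strip out the header rows, combine the files, and then re-upload. FYI, the combined CSV file is about 48 GB.

Is there a better approach?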

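Using the API, you can tell BigQuery not to print the header row during the table extract. This is done by setting the configuration.extract.printHeader option to false. See the documentation for more information. The command-line utility should also be able to do this.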
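As a rough illustration of the option described above, the following sketch builds the JSON body of an extract job with printHeader set to false. The project, dataset, table, and bucket names are hypothetical placeholders, and actually submitting the body (for example through the jobs.insert REST method with proper authentication) is left out.

```python
import json
from typing import Any, Dict


def build_extract_job(project_id: str, dataset_id: str, table_id: str,
                      destination_uri: str) -> Dict[str, Any]:
    """Build the request body for a BigQuery extract job that omits CSV headers."""
    return {
        "configuration": {
            "extract": {
                "sourceTable": {
                    "projectId": project_id,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
                # The * wildcard lets BigQuery shard a large export into many files.
                "destinationUris": [destination_uri],
                "destinationFormat": "CSV",
                # The option named in the answer: no header row in any shard.
                "printHeader": False,
            }
        }
    }


# Hypothetical names, for illustration only.
job = build_extract_job(
    "my-project", "my_dataset", "big_table", "gs://my-bucket/export/part-*.csv"
)
print(json.dumps(job, indent=2))
```

With the headers suppressed at export time, every shard contains only data rows, so the shards can be concatenated as they are.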

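Once this is done, concatenating the files is much easier. On a Linux/Mac computer it would be a single cat command. However, you could also try to concatenate directly within Cloud Storage by using the compose operation. Compose can be performed either from the API or from the command-line utility.

Since a compose operation is limited to 32 components, you will have to compose the files in batches of up to 32. That comes to roughly 13 compose operations for 400 files. Note that I have never tried the compose operation, so I am only guessing on this part.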
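The batching arithmetic can be checked with a small planning sketch. It assumes an "append" pattern in which the merged object is reused as the first source of each later compose call, so the first call takes 32 shards and each subsequent call adds up to 31 more. The shard and target names are hypothetical, and the sketch only plans the steps; it does not call Cloud Storage.

```python
from typing import List, Tuple

MAX_COMPONENTS = 32  # source objects allowed in a single compose request


def plan_compose(sources: List[str], target: str) -> List[Tuple[List[str], str]]:
    """Return (source_names, destination_name) steps that merge sources in order."""
    if not sources:
        return []
    # First step: up to 32 shards are composed into the target object.
    steps = [(sources[:MAX_COMPONENTS], target)]
    remaining = sources[MAX_COMPONENTS:]
    # Later steps: the target occupies one slot, leaving room for 31 new shards.
    while remaining:
        batch = remaining[:MAX_COMPONENTS - 1]
        remaining = remaining[MAX_COMPONENTS - 1:]
        steps.append(([target] + batch, target))
    return steps


# Hypothetical shard names for a 400-file export.
shards = [f"export/part-{i:012d}.csv" for i in range(400)]
plan = plan_compose(shards, "export/merged.csv")
print(len(plan))
```

For 400 shards this plan has 13 steps, in line with the estimate above. Composing 13 independent groups of up to 32 files would instead need a 14th call to join the intermediate objects.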
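From the console, use the bq utility to strip the headers:

bq --skip_leading_rows 1

Note that --skip_leading_rows is an option of bq load (it skips rows when importing a file); for bq extract, the header row is suppressed with --print_header=false.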
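If the shards have already been exported with header rows, the headers can also be dropped while merging locally. The sketch below is one way to do that, assuming each header occupies exactly one physical line and each shard ends with a newline; the function name and its parameters are made up for this illustration.

```python
import shutil
from pathlib import Path
from typing import Iterable


def merge_csv_shards(shards: Iterable[Path], output: Path,
                     keep_first_header: bool = True) -> None:
    """Concatenate CSV shards into output, skipping each shard's leading header line."""
    with output.open("wb") as out:
        for index, shard in enumerate(shards):
            with shard.open("rb") as src:
                # Read (and thereby skip) the first line of the shard.
                header = src.readline()
                if index == 0 and keep_first_header:
                    out.write(header)
                # Stream the rest in chunks so large shards never sit fully in memory.
                shutil.copyfileobj(src, out)
```

Passing keep_first_header=False produces a file with no header at all, which matches what an export with the header disabled would yield.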

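What do you do with these CSV files after merging them? Why do they need to be merged before uploading (why can't they be uploaded separately)? Do you really need the header, or can you assume the column order in your code?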
