Uploading to BigQuery from Python


I have a Python script that downloads data from Firebase, manipulates it, and then dumps it to a JSON file. I can upload that file to BigQuery via the command line, but now I want to put some code into the Python script so it all happens in one go.

This is the code I have so far:

import json
from firebase import firebase

firebase = firebase.FirebaseApplication('<redacted>')
result = firebase.get('/connection_info', None)
id_keys = map(str, result.keys())

# with open('result.json', 'r') as w:
#     connection = json.load(w)

with open("w.json", "w") as outfile:
  for id in id_keys:
    json.dump(result[id], outfile, indent=None)
    outfile.write("\n")

To load a JSON file with the google-cloud-bigquery Python library, use the method shown in the following code samples.

Edit: the way you upload to a table has changed since version 0.28.0 of the Python library. The following is the way to do it in 0.27 and earlier.

To load a JSON file with the google-cloud-bigquery Python library, use the Table.upload_from_file() method:


from google.cloud import bigquery

source_file_name = 'w.json'  # e.g. the newline-delimited file written above

bigquery_client = bigquery.Client()
dataset = bigquery_client.dataset('mydataset')
table = dataset.table('mytable')

# Reload the table to get the schema.
table.reload()

with open(source_file_name, 'rb') as source_file:
    # This example uses JSON, but you can use other formats.
    # See https://cloud.google.com/bigquery/loading-data
    job = table.upload_from_file(
        source_file, source_format='NEWLINE_DELIMITED_JSON')
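
upload_from_file() starts the load job but does not wait for it to finish. The pre-0.28 samples typically polled the job with a small helper along these lines (a sketch: wait_for_job is a hypothetical helper name, while job.reload(), job.state and job.error_result are from the job API of those library versions):

import time

def wait_for_job(job):
    # Hypothetical helper: poll until BigQuery reports the job as done.
    while True:
        job.reload()  # refresh the job state with an API call
        if job.state == 'DONE':
            if job.error_result:
                raise RuntimeError(job.errors)
            return
        time.sleep(1)

wait_for_job(job)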

Update, November 2019

I found an updated way of uploading JSON to Google BigQuery with Python. This is my working solution:

from google.cloud import bigquery
from google.oauth2 import service_account
from dotenv import load_dotenv

load_dotenv()  # loads e.g. GOOGLE_APPLICATION_CREDENTIALS from a .env file

client = bigquery.Client()
filename = '/path/to/file/in/nd-format.json'
dataset_id = 'DatasetName'
table_id = 'TableName'

dataset_ref = client.dataset(dataset_id)
table_ref = dataset_ref.table(table_id)
job_config = bigquery.LoadJobConfig()
job_config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
job_config.autodetect = True

with open(filename, "rb") as source_file:
    job = client.load_table_from_file(
        source_file,
        table_ref,
        location="europe-west1",  # Must match the destination dataset location.
        job_config=job_config,
    )  # API request

job.result()  # Waits for table load to complete.

print("Loaded {} rows into {}:{}.".format(job.output_rows, dataset_id, table_id))

Also, when I upload the file to BigQuery manually from the command line, I first have to upload the JSON to my Google Cloud Storage. Is there a way to automate that? (See the sketch below.)

Just change the format to JSON.

Thanks for the report. I've updated the link to the latest samples and used a commit hash to prevent future 404s.
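
To answer the Cloud Storage question: the upload can be scripted with the google-cloud-storage library, and BigQuery can then load straight from the bucket. A sketch, with the bucket and object names made up for illustration:

from google.cloud import bigquery, storage

# Upload the local NDJSON file to a bucket (names are hypothetical).
storage_client = storage.Client()
bucket = storage_client.bucket('my-bucket')
blob = bucket.blob('exports/w.json')
blob.upload_from_filename('w.json')

# Load the table directly from the Cloud Storage URI.
bigquery_client = bigquery.Client()
job_config = bigquery.LoadJobConfig()
job_config.source_format = bigquery.SourceFormat.NEWLINE_DELIMITED_JSON
job_config.autodetect = True

load_job = bigquery_client.load_table_from_uri(
    'gs://my-bucket/exports/w.json',
    bigquery_client.dataset('DatasetName').table('TableName'),
    job_config=job_config,
)
load_job.result()  # Wait for the load to finish.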