Python 3.x: How do I automate BigQuery schema generation using Python?
I have a MySQL database on Google Cloud, and I want to generate the schema automatically so that I can insert the data into BigQuery. I need to create the following lines automatically:
schema = [bigquery.SchemaField('EmployeeID', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('LastName', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('FirstName', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('Title', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('TitleOfCourtesy', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('BirthDate', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('HireDate', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('Address', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('City', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('Region', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('PostalCode', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('Country', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('HomePhone', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('Extension', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('Photo', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('Notes', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('ReportsTo', 'STRING', mode='NULLABLE'),
          bigquery.SchemaField('PhotoPath', 'STRING', mode='NULLABLE')]
To achieve this, I tried the following.
First, I use a function to get the column names. This is its output:
print(table_schema_name_column)
['EmployeeID', 'LastName', 'FirstName', 'Title', 'TitleOfCourtesy', 'BirthDate', 'HireDate', 'Address', 'City', 'Region', 'PostalCode', 'Country', 'HomePhone', 'Extension', 'Photo', 'Notes', 'ReportsTo', 'PhotoPath']
Then I tried:
schema2 = []
for element in table_schema_name_column:
    base2 = "bigquery.SchemaField(" + '\'' + element + "\', \'STRING\', mode=\'NULLABLE\')"
    tmp = base2
    #print(base2)
    schema2.append(base2)
print(schema2)
This is the corresponding output:
["bigquery.SchemaField('EmployeeID', 'STRING', mode='NULLABLE')",
"bigquery.SchemaField('LastName', 'STRING', mode='NULLABLE')",
"bigquery.SchemaField('FirstName', 'STRING', mode='NULLABLE')", "bigquery.SchemaField('Title', 'STRING', mode='NULLABLE')",
"bigquery.SchemaField('TitleOfCourtesy', 'STRING', mode='NULLABLE')", "bigquery.SchemaField('BirthDate', 'STRING', mode='NULLABLE')",
"bigquery.SchemaField('HireDate', 'STRING', mode='NULLABLE')", "bigquery.SchemaField('Address', 'STRING', mode='NULLABLE')",
"bigquery.SchemaField('City', 'STRING', mode='NULLABLE')", "bigquery.SchemaField('Region', 'STRING', mode='NULLABLE')",
"bigquery.SchemaField('PostalCode', 'STRING', mode='NULLABLE')", "bigquery.SchemaField('Country', 'STRING', mode='NULLABLE')",
"bigquery.SchemaField('HomePhone', 'STRING', mode='NULLABLE')", "bigquery.SchemaField('Extension', 'STRING', mode='NULLABLE')",
"bigquery.SchemaField('Photo', 'STRING', mode='NULLABLE')", "bigquery.SchemaField('Notes', 'STRING', mode='NULLABLE')",
"bigquery.SchemaField('ReportsTo', 'STRING', mode='NULLABLE')", "bigquery.SchemaField('PhotoPath', 'STRING', mode='NULLABLE')"]
The problem with this schema2 is that when I try to use it to create the table, I get the following error:
table_ref = dataset_ref.table("my_table_aut")
table = bigquery.Table(table_ref, schema=schema2)
table = client.create_table(table) # API request
assert table.table_id == "my_table_aut"
Error output:
ValueError Traceback (most recent call last)
<ipython-input-13-ce1fc2c637fe> in <module>
4 ]
5 table_ref = dataset_ref.table("my_table_aut")
----> 6 table = bigquery.Table(table_ref, schema=schema2)
7 table = client.create_table(table) # API request
8
~/.local/lib/python3.6/site-packages/google/cloud/bigquery/table.py in __init__(self, table_ref, schema)
371 # Let the @property do validation.
372 if schema is not None:
--> 373 self.schema = schema
374
375 @property
~/.local/lib/python3.6/site-packages/google/cloud/bigquery/table.py in schema(self, value)
420 self._properties["schema"] = None
421 elif not all(isinstance(field, SchemaField) for field in value):
--> 422 raise ValueError("Schema items must be fields")
423 else:
424 self._properties["schema"] = {"fields": _build_schema_resource(value)}
ValueError: Schema items must be fields
So I would appreciate any help in accomplishing this task.
This should work:
schema2=[]
for element in table_schema_name_column:
    schema2.append(bigquery.SchemaField(element, 'STRING', mode='NULLABLE'))
table_ref = dataset_ref.table("my_table_aut")
table = bigquery.Table(table_ref, schema=schema2)
table = client.create_table(table)
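The loop above types every column as STRING. If the MySQL column types are also available, the same idea can carry them over. The following is only a sketch under assumptions: the MYSQL_TO_BIGQUERY mapping, the build_schema_dicts helper, and the sample column types are hypothetical and not taken from the question. It builds plain dicts in the name/type/mode layout that BigQuery uses for schema fields, so it runs without the client library.

```python
import json

# Hypothetical, incomplete mapping from MySQL base types to BigQuery types.
MYSQL_TO_BIGQUERY = {
    "int": "INTEGER",
    "bigint": "INTEGER",
    "varchar": "STRING",
    "text": "STRING",
    "date": "DATE",
    "datetime": "DATETIME",
    "float": "FLOAT",
    "double": "FLOAT",
}


def build_schema_dicts(columns, default_type="STRING"):
    """Turn (column_name, mysql_type) pairs into BigQuery schema dicts."""
    schema = []
    for name, mysql_type in columns:
        # Drop any length suffix, e.g. "varchar(20)" becomes "varchar".
        base_type = mysql_type.split("(")[0].strip().lower()
        schema.append({
            "name": name,
            # Unknown types fall back to default_type.
            "type": MYSQL_TO_BIGQUERY.get(base_type, default_type),
            "mode": "NULLABLE",
        })
    return schema


# Assumed MySQL types, for illustration only.
columns = [
    ("EmployeeID", "int(11)"),
    ("LastName", "varchar(20)"),
    ("BirthDate", "datetime"),
]
schema_dicts = build_schema_dicts(columns)
print(json.dumps(schema_dicts, indent=2))
```

With google-cloud-bigquery installed, each dict can then be turned into a real field object, for example [bigquery.SchemaField.from_api_repr(d) for d in schema_dicts], and the resulting list passed as schema= to bigquery.Table.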
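I don't understand why the SchemaField objects have quotes around them. It looks like you are building an array of strings...
That is exactly the crux of the problem: I could not convert the array of strings into the corresponding objects.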
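To make the distinction concrete, here is a small illustration of why text that looks like a constructor call fails the validation shown in the traceback. The SchemaField class below is a stand-in written for this example, not the real bigquery.SchemaField, and all_fields is a hypothetical helper that mirrors the isinstance check at line 421 of table.py.

```python
class SchemaField:
    """Stand-in for bigquery.SchemaField, used only for this illustration."""

    def __init__(self, name, field_type, mode="NULLABLE"):
        self.name = name
        self.field_type = field_type
        self.mode = mode


def all_fields(schema):
    # Same shape as the check in the traceback: every item must be a SchemaField.
    return all(isinstance(field, SchemaField) for field in schema)


names = ["EmployeeID", "LastName"]

# Strings that merely contain the source text of a constructor call.
as_strings = ["SchemaField('" + n + "', 'STRING', mode='NULLABLE')" for n in names]

# Actual objects, created by calling the constructor.
as_objects = [SchemaField(n, "STRING", mode="NULLABLE") for n in names]

print(all_fields(as_strings))  # False
print(all_fields(as_objects))  # True
```

Calling the constructor inside the loop, as in the answer, avoids any need to convert strings back into objects afterwards.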