Google Cloud Platform: setting up BigQuery datasets with Terraform
I am new to GCP and Terraform. I am developing Terraform scripts to provision about 50 BigQuery datasets, each with at least 10 tables. Not all of the tables have the same schema.
I have written scripts that create the datasets and tables, but I am struggling to add schemas to the tables and need help. I am using Terraform variables to build the scripts.
Here is my code, split across variables.tf, terraform.tfvars and main.tf. I need to integrate logic that creates the schemas for the tables.
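I tried every approach I could think of to create schemas for the tables in the datasets, but none of them worked.
Presumably all tables are supposed to have the same schema? I would try it this way. In the
resource "google_bigquery_table" "table"
block, for example after the labels, you can add:
schema = file("${path.root}/subdirectories-path/table_schema.json")
where
- ${path.root} is the location of the Terraform files (the root module directory)
- subdirectories-path is zero or more subdirectories
- table_schema.json is the JSON file containing the schema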
main.tf
variable "project_id" {
  description = "The target project"
  type        = string
  default     = "ishim-sample"
}

variable "region" {
  description = "The region where resources are created => europe-west2"
  type        = string
  default     = "europe-west2"
}

variable "zone" {
  description = "The zone in the europe-west region for resources"
  type        = string
  default     = "europe-west2-b"
}

# ===========================
variable "test_bq_dataset" {
  type = list(object({
    id       = string
    location = string
  }))
}

variable "test_bq_table" {
  type = list(object({
    dataset_id = string
    table_id   = string
    schema_id  = string
  }))
}

provider "google" {
  project = var.project_id
  region  = var.region
  zone    = var.zone
}

# One dataset per entry in var.test_bq_dataset.
resource "google_bigquery_dataset" "test_dataset_set" {
  project    = var.project_id
  count      = length(var.test_bq_dataset)
  dataset_id = var.test_bq_dataset[count.index]["id"]
  location   = var.test_bq_dataset[count.index]["location"]

  labels = {
    "environment" = "development"
  }
}

# One table per entry in var.test_bq_table; schema_id names a JSON file under bq-schema/.
resource "google_bigquery_table" "test_table_set" {
  project    = var.project_id
  count      = length(var.test_bq_table)
  dataset_id = var.test_bq_table[count.index]["dataset_id"]
  table_id   = var.test_bq_table[count.index]["table_id"]
  schema     = file("${path.root}/bq-schema/${var.test_bq_table[count.index]["schema_id"]}")

  labels = {
    "environment" = "development"
  }

  # The datasets must exist before any table is created in them.
  depends_on = [
    google_bigquery_dataset.test_dataset_set,
  ]
}
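As a possible variation on the table resource above (a sketch, not part of the original answer), the same list variable can drive `for_each` instead of `count`. With `count`, removing an entry from the middle of the list shifts the index of every later table; keying each table by a "dataset.table" string keeps resource addresses stable. The key format is an assumption chosen for illustration, and the block reuses `var.project_id` and `var.test_bq_table` from main.tf. It would replace the `count`-based table resource rather than sit beside it.

```hcl
resource "google_bigquery_table" "test_table_set" {
  # Build a map keyed by "dataset.table" (illustrative key format) from the list variable.
  for_each = {
    for t in var.test_bq_table : "${t.dataset_id}.${t.table_id}" => t
  }

  project    = var.project_id
  dataset_id = each.value.dataset_id
  table_id   = each.value.table_id

  # Each table still picks its own schema file from the bq-schema subdirectory.
  schema = file("${path.root}/bq-schema/${each.value.schema_id}")

  labels = {
    "environment" = "development"
  }

  depends_on = [
    google_bigquery_dataset.test_dataset_set,
  ]
}
```

Individual tables are then addressed as, for example, `google_bigquery_table.test_table_set["ds1.table1"]`.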
terraform.tfvars
test_bq_dataset = [
  {
    id       = "ds1"
    location = "EU"
  },
  {
    id       = "ds2"
    location = "EU"
  }
]

test_bq_table = [
  {
    dataset_id = "ds1"
    table_id   = "table1"
    schema_id  = "table-schema-01.json"
  },
  {
    dataset_id = "ds2"
    table_id   = "table2"
    schema_id  = "table-schema-02.json"
  },
  {
    dataset_id = "ds1"
    table_id   = "table3"
    schema_id  = "table-schema-03.json"
  },
  {
    dataset_id = "ds2"
    table_id   = "table4"
    schema_id  = "table-schema-04.json"
  }
]
Example JSON schema file: table-schema-01.json
[
  {
    "name": "table_column_01",
    "mode": "REQUIRED",
    "type": "STRING",
    "description": ""
  },
  {
    "name": "_gcs_file_path",
    "mode": "REQUIRED",
    "type": "STRING",
    "description": "The GCS path to the file for loading."
  },
  {
    "name": "_src_file_ts",
    "mode": "REQUIRED",
    "type": "TIMESTAMP",
    "description": "The source file modification timestamp."
  },
  {
    "name": "_src_file_name",
    "mode": "REQUIRED",
    "type": "STRING",
    "description": "The file name of the source file."
  },
  {
    "name": "_firestore_doc_id",
    "mode": "REQUIRED",
    "type": "STRING",
    "description": "The hash code (based on the file name and its content, so each file has a unique hash) used as a Firestore document id."
  },
  {
    "name": "_ingested_ts",
    "mode": "REQUIRED",
    "type": "TIMESTAMP",
    "description": "The timestamp when this record was processed during ingestion into the BigQuery table."
  }
]
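If the underscore-prefixed metadata columns in the schema file above turn out to be common to many tables, one option is to keep only the table-specific columns in each JSON file and append the shared ones in Terraform. The following is only a sketch under that assumption: the local name `common_columns`, the resource name, and the hard-coded dataset and table IDs are hypothetical, and the JSON file is assumed not to already contain the shared column (otherwise it would be duplicated).

```hcl
locals {
  # Hypothetical: columns shared by every table, declared once.
  common_columns = [
    {
      name        = "_ingested_ts"
      mode        = "REQUIRED"
      type        = "TIMESTAMP"
      description = "The timestamp when this record was processed during ingestion into the BigQuery table."
    }
  ]
}

resource "google_bigquery_table" "with_common_columns" {
  project    = var.project_id
  dataset_id = "ds1"
  table_id   = "table1"

  # Decode the per-table JSON, append the shared columns, and re-encode to a JSON string.
  schema = jsonencode(concat(
    jsondecode(file("${path.root}/bq-schema/table-schema-01.json")),
    local.common_columns
  ))
}
```

`jsondecode`, `concat` and `jsonencode` are built-in Terraform functions; the rest of the names are illustrative.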
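Project directory structure (screenshot)
Keep in mind the subdirectory name, "bq-schema", because it is used in the "schema" attribute of the "google_bigquery_table" resource in the "main.tf" file.
BigQuery console (screenshot)
Result of the "terraform apply" command.
The Terraform google_bigquery_table resource has an optional schema argument that expects a JSON string. The documentation shared at the previous link has an example: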
resource "google_bigquery_table" "default" {
  dataset_id = google_bigquery_dataset.default.dataset_id
  table_id   = "bar"

  time_partitioning {
    type = "DAY"
  }

  labels = {
    env = "default"
  }

  schema = <<EOF
[
  {
    "name": "permalink",
    "type": "STRING",
    "mode": "NULLABLE",
    "description": "The Permalink"
  },
  {
    "name": "state",
    "type": "STRING",
    "mode": "NULLABLE",
    "description": "State where the head office is located"
  }
]
EOF
}
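Because the schema argument only needs a JSON string, the heredoc above could also be written with Terraform's built-in `jsonencode` function, which lets Terraform check the structure when it parses the configuration. This is a sketch of an equivalent form, not taken from the linked documentation; like the example above, it assumes a `google_bigquery_dataset.default` resource is defined elsewhere.

```hcl
resource "google_bigquery_table" "default" {
  dataset_id = google_bigquery_dataset.default.dataset_id
  table_id   = "bar"

  # The same two columns as the heredoc example, expressed as HCL objects.
  schema = jsonencode([
    {
      name        = "permalink"
      type        = "STRING"
      mode        = "NULLABLE"
      description = "The Permalink"
    },
    {
      name        = "state"
      type        = "STRING"
      mode        = "NULLABLE"
      description = "State where the head office is located"
    }
  ])
}
```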
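Comments:
I should have added this to the question: the tables do not all have the same schema.
You could probably define a Terraform variable, a map of "table name => schema file name", or use a list of schema file names, so that the same count loop selects the right file instead of the constant "table_schema.json".
Could you please share an example that uses a map? I have 50 BQ datasets with 10 tables per dataset, and I would rather not hard-code values. I am looking for a way to create the schemas from variables, the same way I create the tables and datasets.
I see! I was not aware that the schemas differ. That definitely needs another approach. I believe @al-dann's answer provides a better one.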
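The map suggested in the comments could look roughly like the sketch below. The variable name `table_schema_files`, the resource name `tables_by_map`, and the assumption that every table in the map belongs to the single dataset "ds1" are all hypothetical; `var.project_id` and the bq-schema subdirectory come from the main.tf example.

```hcl
# Hypothetical map: table name => schema file name.
variable "table_schema_files" {
  type = map(string)
  default = {
    table1 = "table-schema-01.json"
    table3 = "table-schema-03.json"
  }
}

resource "google_bigquery_table" "tables_by_map" {
  # One table per map entry: each.key is the table name, each.value the schema file name.
  for_each = var.table_schema_files

  project    = var.project_id
  dataset_id = "ds1" # assumption: all tables in this map live in one dataset
  table_id   = each.key
  schema     = file("${path.root}/bq-schema/${each.value}")
}
```

For tables spread over many datasets, the list-of-objects variable in main.tf carries the dataset ID alongside the schema file name, which a flat map of strings cannot do.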