Google Cloud Dataflow: how do I make a simple HTTP request against the Dataflow API on gcloud with Node?


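I want to monitor my Dataflow jobs from an application. The application I am building is a Node.js app, and ideally there would be a package like @google-cloud/bigquery, but for Dataflow. I am fully aware that I probably cannot launch a job unless it is a templated job, but there should be a simple way to list jobs or get information about a job.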

Update:

I found this spec, $discovery/rest?version=v1b3, but I don't understand what location means for the list operation. This page links to the spec:

I found a solution myself. There is a repo that contains basically all of the gcloud APIs:

Once I had found it, I could easily do what I wanted:

'use strict';

var google = require('googleapis');
var dataflow = google.dataflow('v1b3');

google.auth.getApplicationDefault(function (err, authClient, projectId) {
    if (err) {
        throw err;
    }

    // The createScopedRequired method returns true when running on GAE or a local developer
    // machine. In that case, the desired scopes must be passed in manually. When the code is
    // running in GCE or a Managed VM, the scopes are pulled from the GCE metadata server.
    // See https://cloud.google.com/compute/docs/authentication for more information.
    if (authClient.createScopedRequired && authClient.createScopedRequired()) {
        // Scopes can be specified either as an array or as a single, space-delimited string.
        authClient = authClient.createScoped([
            'https://www.googleapis.com/auth/compute'
        ]);
    }

    // List the Dataflow jobs in the project resolved from the default credentials.
    dataflow.projects.jobs.list({
        projectId: projectId,
        auth: authClient
    }, function (err, result) {
        console.log(err, result);
    });
});
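
On the location question from the update: in the v1b3 REST API, location is the regional endpoint that owns the job (for example us-central1), and the list call exists both per project and per region. Here is a minimal sketch of the two URL shapes. The helper name jobsListUrl and the sample project ID are made up for illustration and are not part of any library.

```javascript
// Hypothetical helper: builds the Dataflow v1b3 "list jobs" URL.
// Without a location it targets projects.jobs.list; with one it targets
// projects.locations.jobs.list for that regional endpoint.
function jobsListUrl(projectId, location) {
  const base = 'https://dataflow.googleapis.com/v1b3/projects/' +
    encodeURIComponent(projectId);
  return location
    ? base + '/locations/' + encodeURIComponent(location) + '/jobs'
    : base + '/jobs';
}

console.log(jobsListUrl('my-project'));
console.log(jobsListUrl('my-project', 'us-central1'));
```

Since the googleapis client is generated from the same discovery document, the regional variant should be reachable there as well, but check the generated client for the exact method path before relying on it.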


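For posterity: there is a way to do this without the client library, but it requires generating a JWT from the service account credentials and exchanging that JWT for an access token in order to launch a Dataflow template. This example uses the Cloud_Bigtable_to_GCS_Avro template: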

import axios from "axios";
import jwt from "jsonwebtoken";
import mem from "mem";

const loadCredentials = mem(function() {
  // Here the variable is expected to hold the service account JSON itself
  // (conventionally GOOGLE_APPLICATION_CREDENTIALS holds a path to the key file).
  const serviceAccountJson = process.env.GOOGLE_APPLICATION_CREDENTIALS;
  if (!serviceAccountJson) {
    throw new Error("Missing GCP Credentials");
  }

  const credentials = JSON.parse(serviceAccountJson.replace(/\n/g, "\\n").replace(/\r/g, "\\r").replace(/\t/g, "\\t"));

  return {
    projectId: credentials.project_id,
    privateKeyId: credentials.private_key_id,
    privateKey: credentials.private_key,
    clientEmail: credentials.client_email,
  };
});

interface ProjectCredentials {
  projectId: string;
  privateKeyId: string;
  privateKey: string;
  clientEmail: string;
}

function generateJWT(params: ProjectCredentials) {
  const scope = "https://www.googleapis.com/auth/cloud-platform";
  const authUrl = "https://www.googleapis.com/oauth2/v4/token";
  // JWT timestamps are whole seconds since the epoch, so drop the fraction.
  const issued = Math.floor(Date.now() / 1000);
  const expires = issued + 60;

  const payload = {
    iss: params.clientEmail,
    sub: params.clientEmail,
    aud: authUrl,
    iat: issued,
    exp: expires,
    scope: scope,
  };

  const options = {
    keyid: params.privateKeyId,
    algorithm: "RS256",
  };

  return jwt.sign(payload, params.privateKey, options);
}

async function getAccessToken(credentials: ProjectCredentials): Promise<string> {
  const jwt = generateJWT(credentials);
  const authUrl = "https://www.googleapis.com/oauth2/v4/token";
  const params = {
    grant_type: "urn:ietf:params:oauth:grant-type:jwt-bearer",
    assertion: jwt,
  };
  try {
    const response = await axios.post(authUrl, params);
    return response.data.access_token;
  } catch (error) {
    console.error("Failed to get access token", error);
    throw error;
  }
}

function buildTemplateParams(projectId: string, table: string) {
  return {
    jobName: `[job-name]`,
    parameters: {
      bigtableProjectId: projectId,
      bigtableInstanceId: "[table-instance]",
      bigtableTableId: table,
      outputDirectory: `[gs://your-instance]`,
      filenamePrefix: `${table}-`,
    },
    environment: {
      zone: "us-west1-a", // omit or define your own
      tempLocation: `[gs://your-instance/temp]`,
    },
  };
}

async function backupTable(table: string) {
  console.info(`Executing backup template for table=${table}`);
  const credentials = loadCredentials();
  const { projectId } = credentials;
  const accessToken = await getAccessToken(credentials);
  const baseUrl = "https://dataflow.googleapis.com/v1b3/projects";
  const templatePath = "gs://dataflow-templates/latest/Cloud_Bigtable_to_GCS_Avro";
  const url = `${baseUrl}/${projectId}/templates:launch?gcsPath=${templatePath}`;
  const template = buildTemplateParams(projectId, table);
  try {
    const response = await axios.post(url, template, {
      headers: { Authorization: `Bearer ${accessToken}` },
    });
    console.log("GCP Response", response.data);
  } catch (error) {
    console.error(`Failed to execute template for ${table}`, error.message);
  }
}

async function run() {
  await backupTable("my-table");
}

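// run() is async, so failures surface as a rejected promise;
// a surrounding try/catch would never see them.
run().catch((err) => {
  console.error(err);
  process.exit(1);
});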
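
The same access token also covers the monitoring use case from the question. Below is a sketch that assumes the v1b3 projects.locations.jobs.get endpoint. buildGetJobRequest and the sample values are hypothetical names chosen for illustration, not part of any library.

```javascript
// Hypothetical helper: describes a GET request for a single Dataflow job.
// The returned object can be handed to any HTTP client, for example
// axios.get(req.url, { headers: req.headers }).
function buildGetJobRequest(projectId, location, jobId, accessToken) {
  const url = 'https://dataflow.googleapis.com/v1b3/projects/' +
    encodeURIComponent(projectId) +
    '/locations/' + encodeURIComponent(location) +
    '/jobs/' + encodeURIComponent(jobId);
  return { url: url, headers: { Authorization: 'Bearer ' + accessToken } };
}

const req = buildGetJobRequest('my-project', 'us-west1', 'job-123', 'TOKEN');
console.log(req.url);
```

The response body is a Job resource. Its currentState field (for example JOB_STATE_RUNNING) is the value to poll when monitoring a job.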
