Python 3.x: Passing a Datastore key as an input parameter to Google Dataflow
Tags: python-3.x, google-cloud-datastore, google-cloud-dataflow, value-provider

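I am trying to create a Google Dataflow template that reads a JSON file and loads it into Google Datastore. Below is my code.

I can load the data successfully, but I would like to pass the Datastore key/kind as an input parameter from the template and use that same parameter when creating the entities. Can someone help me with the code?

Below is the code snippet that takes input at runtime. It includes a datastore_key argument.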

from apache_beam.options.pipeline_options import PipelineOptions


class MyOptions(PipelineOptions):
    @classmethod
    def _add_argparse_args(cls, parser):
        # Value provider arguments are resolved when the template is run,
        # not when it is built.
        parser.add_value_provider_argument(
            '--json_input',
            dest='json_input',
            type=str,
            required=False,
            help='Input file to read. This can be a local file or a file in a Google Storage Bucket.')

        parser.add_value_provider_argument(
            '--project_id',
            dest='project_id',
            type=str,
            required=False,
            help='Input Project ID.')

        parser.add_value_provider_argument(
            '--datastore_key',
            dest='datastore_key',
            type=str,
            required=False,
            help='The Key name')

Below is the code snippet where I assign datastore_key when creating the entity.

I am creating the pipeline as follows:

p = beam.Pipeline(options=options)

lines_text  = p | "Read Json From GCS" >> beam.io.ReadFromText(json_input)
lines_json = lines_text | "Convert To Json" >> beam.ParDo(ConvertToJson()) 
lines_json | "Create Entities From Json" >> beam.ParDo(CreateHbaseRow(project_id))
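
If I pass the Datastore key as a runtime parameter, the key is not created. It works if I hard-code it like this:

key = self.client.key('customer', element['customerNumber'])

I want something like this:

key = self.client.key(runtime_datastore_key, runtime_datastore_id)

Can someone help me pass the Datastore key/kind as a runtime parameter?

Thanks,
GS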

It looks like you are not passing the datastore_key value provider to CreateHbaseRow.

Try using:

import logging

import apache_beam as beam
from google.cloud import datastore


class CreateHbaseRow(beam.DoFn):
    def __init__(self, project_id, datastore_key):
        # Keep the value providers themselves; their values are only known at runtime.
        self.project_id = project_id
        self.datastore_key = datastore_key

    def start_bundle(self):
        self.client = datastore.Client()

    def process(self, element):
        try:
            # Resolve the kind here, while the job is running.
            key = self.client.key(self.datastore_key.get(), element['customerNumber'])
            entity = datastore.Entity(key=key)
            entity.update(element)
            self.client.put(entity)
        except Exception:
            logging.error("Failed with input: %s", element)

Note that I left project_id in because you seem to want it, but this code does not use it.
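
To illustrate why the value provider is stored in __init__ and only resolved with .get() inside process, here is a minimal sketch in plain Python. DeferredValue and KeyBuilder are hypothetical stand-ins written for this illustration; they are not the Beam ValueProvider or DoFn classes, and the sketch assumes only that a runtime value becomes available after the template has been built.

```python
class DeferredValue:
    """Hypothetical stand-in for a runtime value provider (not the Beam API)."""

    def __init__(self, name):
        self.name = name
        self._value = None
        self._resolved = False

    def resolve(self, value):
        # Simulates the template parameter being supplied at job launch.
        self._value = value
        self._resolved = True

    def get(self):
        if not self._resolved:
            raise RuntimeError(f"{self.name} is not available until runtime")
        return self._value


class KeyBuilder:
    """Hypothetical stand-in for a DoFn that builds Datastore key paths."""

    def __init__(self, kind_provider):
        # Store the provider itself; calling .get() here would be too early.
        self.kind_provider = kind_provider

    def process(self, element):
        # Resolve the kind per element, once the value has been supplied.
        return (self.kind_provider.get(), element["customerNumber"])


kind = DeferredValue("datastore_key")
builder = KeyBuilder(kind)  # template construction: value still unknown

try:
    kind.get()  # resolving at construction time fails
    resolved_early = True
except RuntimeError:
    resolved_early = False

kind.resolve("customer")  # job launch: parameter supplied
key_path = builder.process({"customerNumber": 42})
print(resolved_early, key_path)
```

The early call fails while the call inside process succeeds, which is the reason for deferring .get() until the job is running.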


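You also need to make sure you pass the relevant value providers from the options instance to the DoFn. The pipeline creation code therefore becomes: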

p = beam.Pipeline(options=options)

lines_text  = p | "Read Json From GCS" >> beam.io.ReadFromText(json_input)
lines_json = lines_text | "Convert To Json" >> beam.ParDo(ConvertToJson()) 
lines_json | "Create Entities From Json" >> beam.ParDo(CreateHbaseRow(options.project_id, options.datastore_key))
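
The question also mentions a runtime ID next to the kind (runtime_datastore_id). One possible reading, assumed here, is that a second parameter names the JSON field whose value becomes the key's identifier. The sketch below is plain Python: FixedValue and build_key_path are hypothetical helpers for illustration and are not part of Beam or the Datastore client.

```python
class FixedValue:
    """Hypothetical stand-in for a value provider that is already resolved."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value


def build_key_path(kind_provider, id_field_provider, element):
    """Return (kind, identifier) for one parsed JSON record."""
    kind = kind_provider.get()
    id_field = id_field_provider.get()
    if id_field not in element:
        raise KeyError(f"record has no field {id_field!r} to use as key ID")
    return kind, element[id_field]


record = {"customerNumber": 1001, "name": "Ada"}
path = build_key_path(FixedValue("customer"), FixedValue("customerNumber"), record)
print(path)
```

Inside process, the resulting pair could then be handed to the key call, for example self.client.key(*path), with both providers passed into the DoFn constructor in the same way as datastore_key.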

Thanks for the help as usual, Lucas. I got it working the way you described. Thanks again for your help!