Google Cloud Dataflow: how to specify the kind name in DatastoreIO.v1.write
I am trying to read entities of one kind from Google Datastore, apply some transformations, and write them back as a different kind. I am using Google Dataflow to do this. When reading from Datastore we can specify the kind, but we cannot specify it when writing. How can this be achieved?
Edit: Oops, I just noticed you asked for Java. DatastoreIO.v1.write looks to me like the Java equivalent of WriteToDatastore, in which case you have to set up the entity (including its kind) in the preceding step. Have a look at CreateEntityFn in this example.
Original:
This is how I do it:
import datetime
import logging
import traceback

import apache_beam
from apache_beam.io.gcp.datastore.v1.datastoreio import WriteToDatastore
from google.cloud.proto.datastore.v1 import entity_pb2
from googledatastore import helper


class MakeEntity(object):
    def __init__(self, project):
        self._project = project

    def make(self, element):
        try:
            entity = entity_pb2.Entity()
            # The kind is set here, as part of the key path, not in WriteToDatastore.
            helper.add_key_path(entity.key, 'EntityKind', element['id'])
            helper.add_properties(entity, {
                "created": datetime.datetime.now(),
                "email": unicode(element['email']),  # Python 2 text type
                "count": int(element['count']),
                "amount": float(element['amount']),
            })
            return entity
        except Exception:
            # Log the full traceback, then let the failure propagate.
            logging.error(traceback.format_exc())
            raise


def build_pipeline(project, pipeline_options):
    p = apache_beam.Pipeline(options=pipeline_options)
    _ = (p
         # other transforms
         | 'create entity' >> apache_beam.Map(MakeEntity(project=project).make)
         | 'write to datastore' >> WriteToDatastore(project=project))
    return p
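To make explicit where the kind comes from in the snippet above: the write transform takes no kind argument, so the kind travels inside each entity's key path. The following is a plain-Python sketch of that idea, not the Datastore client API. `build_key_path` and `kind_of` are hypothetical stand-ins that mimic the alternating kind/identifier arguments passed to `helper.add_key_path`, with dicts in place of protobuf messages.

```python
def build_key_path(*path_elements):
    """Turn alternating (kind, id_or_name) arguments into a list of path elements.

    Hypothetical helper for illustration only; dicts stand in for protobufs.
    """
    if not path_elements:
        raise ValueError("at least one kind is required")
    path = []
    for i in range(0, len(path_elements), 2):
        kind = path_elements[i]
        if not isinstance(kind, str):
            raise TypeError("kind must be a string, got %r" % (kind,))
        element = {"kind": kind}
        if i + 1 < len(path_elements):
            identifier = path_elements[i + 1]
            # Strings are stored as names, integers as numeric ids.
            if isinstance(identifier, str):
                element["name"] = identifier
            elif isinstance(identifier, int):
                element["id"] = identifier
            else:
                raise TypeError("expected str or int, got %r" % (identifier,))
        path.append(element)
    return path


def kind_of(path):
    """The kind of an entity is the kind of the last element in its key path."""
    return path[-1]["kind"]


simple_path = build_key_path("EntityKind", 42)
nested_path = build_key_path("Parent", "p1", "EntityKind", 7)
```

Here `kind_of(nested_path)` is `"EntityKind"`: whatever kind is placed last in the key path is the kind the entity is written under.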
Edit #2: I adapted your code so that it is closer to the example I linked. Hopefully this works.
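import static com.google.datastore.v1.client.DatastoreHelper.makeKey;
import static com.google.datastore.v1.client.DatastoreHelper.makeValue;
import com.google.datastore.v1.Entity;
import com.google.datastore.v1.Key;
import org.apache.beam.sdk.transforms.DoFn;

// NEW_KIND and NEW_NAMESPACE are String constants assumed to be defined elsewhere.
public class ModifyEntityKindFn extends DoFn<Entity, Entity> {
  @ProcessElement
  public void processElement(ProcessContext context) {
    Entity inputEntity = context.element();
    Key.Builder keyBuilder = makeKey(NEW_KIND, inputEntity.getKey());
    keyBuilder.getPartitionIdBuilder().setNamespaceId(NEW_NAMESPACE);
    Entity.Builder entityBuilder = Entity.newBuilder().setKey(keyBuilder.build());
    entityBuilder.getMutableProperties().put("content", makeValue(inputEntity).build());
    context.output(entityBuilder.build());
  }
}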
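Since the goal in the question is to copy entities from one kind to another, the re-keying step can also be sketched in plain Python. This is an illustrative model built on dicts, not Beam or Datastore code: `rekey_entity` and the dict layout (`key`, `path`, `namespace`, `properties`) are assumptions made for the example. Unlike the Java snippet above, it copies every property unchanged and only replaces the kind of the last path element and, optionally, the namespace.

```python
import copy


def rekey_entity(entity, new_kind, new_namespace=None):
    """Return a copy of `entity` whose key uses `new_kind`.

    Hypothetical helper: `entity` is a dict with a "key" (holding "path" and
    "namespace") and a "properties" mapping. The input is left untouched.
    """
    new_entity = copy.deepcopy(entity)
    key = new_entity["key"]
    # Only the last path element determines the entity's own kind.
    key["path"][-1]["kind"] = new_kind
    if new_namespace is not None:
        key["namespace"] = new_namespace
    return new_entity


source = {
    "key": {"namespace": "old-ns", "path": [{"kind": "OldKind", "id": 1}]},
    "properties": {"email": "a@example.com", "count": 3},
}
copied = rekey_entity(source, "NewKind", "new-ns")
```

In a real pipeline the same logic would sit in the step just before the write, so that every entity reaching the write transform already carries the target kind in its key.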
Thanks Alex. I will try your suggestion. I am using the following code to copy data from one kind to another in Datastore:
    public class ModifyEntityKindFn extends DoFn<Entity, Entity> {
      @ProcessElement
      public void processElement(ProcessContext context) {
        Entity inputEntity = context.element();
        Key.Builder keyBuilder = makeKey(NEW_KIND, inputEntity.getKey());
        keyBuilder.getPartitionIdBuilder().setNamespaceId(NEW_NAMESPACE);
        Entity.Builder entityBuilder = Entity.newBuilder().setKey(keyBuilder.build());
        entityBuilder.putAllProperties(inputEntity.getPropertiesMap());
        context.output(entityBuilder.build());
      }
    }
    Pipeline p = Pipeline.create(options);
    PCollection<Entity> entities = p.apply(DatastoreIO.v1().read().withProjectId(options.getDataset()).withQuery(query).withNamespace(options.getNamespace()));
    apply(ParDo.of(new GetContentFn()));
    entities.apply(DatastoreIO.v1().write().withProjectId(options.getDataset()));
    p.run();
Compilation fails because the method inputEntity.getPropertiesMap() is not recognized.
Unfortunately I work in Python, so I am not very familiar with the pieces here, but please check "Edit #2" in my answer.