Bulk upsert JavaScript stored procedure consistently exceeds the 5-second execution limit and times out
I am currently running a script with the Python SDK that programmatically bulk-adds 1.5 million documents to a collection in Azure Cosmos DB. I have been using the bulk import stored procedure from the samples provided in the GitHub repo; the only change is that I swapped collection.createDocument for collection.upsertDocument. My stored procedure is included in full below.
The stored procedure does run successfully: it upserts documents consistently and relatively quickly. However, once progress reaches about 30%, this error appears:
CosmosHttpResponseError: (RequestTimeout) Message: {"Errors":["The requested operation exceeded maximum alloted time. Learn more: https://aka.ms/cosmosdb-tsg-service-request-timeout"]}
ActivityId: 9f2357c6-918c-4b67-ba20-569034bfde6f, Request URI: /apps/4a997bdb-7123-485a-9808-f952db2b7e52/services/a7c137c6-96b8-4b53-a20c-b9577981b353/partitions/305a8287-11d1-43f8-be1f-983bd4c4a63d/replicas/132488328092882514p/, RequestStats:
RequestStartTime: 2020-11-03T23:43:59.9158203Z, RequestEndTime: 2020-11-03T23:44:05.3858559Z, Number of regions attempted:1
ResponseTime: 2020-11-03T23:44:05.3858559Z, StoreResult: StorePhysicalAddress: rntbd://cdb-ms-prod-centralus1-fd22.documents.azure.com:14354/apps/4a997bdb-7123-485a-9808-f952db2b7e52/services/a7c137c6-96b8-4b53-a20c-b9577981b353/partitions/305a8287-11d1-43f8-be1f-983bd4c4a63d/replicas/132488328092882514p/, LSN: -1, GlobalCommittedLsn: -1, PartitionKeyRangeId: , IsValid: False, StatusCode: 408, SubStatusCode: 0, RequestCharge: 0, ItemLSN: -1, SessionToken: , UsingLocalLSN: False, TransportException: null, ResourceType: StoredProcedure, OperationType: ExecuteJavaScript, SDK: Microsoft.Azure.Documents.Common/2.11.0
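Is there a way to add some retry logic, or to extend the timeout for the bulk upsert? I believe the `if (!isAccepted) getContext().getResponse().setBody(count);` part of the stored procedure below is supposed to help in this scenario, but it does not seem to work in my case.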
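The bulk upsert stored procedure, in JavaScript, is reproduced at the end of this post.
I think the problem is in the stored procedure rather than in the Python script, but I can provide the Python script if that turns out not to be the case. Any help would be greatly appreciated; this has been puzzling me for days.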
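Additional info:
Throughput = 10000, and the upsert size per partition is consistently ~1.9 MB.
Stored procedures have a bounded execution time of 5 seconds. However, you can write a stored procedure to handle bounded execution by checking the Boolean return value of each write call, and then using the count of items inserted in each invocation of the stored procedure to track and resume progress through the batch. There is an example.
In case anyone else runs into this, the workaround I used was to temporarily raise the throughput from 10000 to 100000 while the bulk upsert is running. When the bulk upsert stored procedure is combined with a high enough throughput, the error does not occur. I think the timeouts became frequent once the bulk upsert had gone through about 30% of the 1.5 million records, possibly because the throughput was not distributed sufficiently across partitions, which created a bottleneck. Once this is in real use I may have to give my container a higher throughput again, or I may be able to lower it to save costs. Either way, the code is very simple; just use the following: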
new_throughput = 10000
container.replace_throughput(new_throughput)
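If it helps, the temporary increase described above could be wrapped so that the lower rate is restored even when the bulk job fails partway. This is only a sketch: `temporary_throughput`, `bulk_ru` and `normal_ru` are names made up for illustration, and the only SDK call it relies on is the `replace_throughput` method already used in the snippet above.

```python
from contextlib import contextmanager


@contextmanager
def temporary_throughput(container, bulk_ru, normal_ru):
    """Raise provisioned throughput for a bulk job, then put it back.

    `container` is assumed to expose replace_throughput(value), as in the
    snippet above. `bulk_ru` and `normal_ru` are illustrative RU/s values
    chosen by the caller, not values read from the service.
    """
    container.replace_throughput(bulk_ru)
    try:
        yield container
    finally:
        # Runs even if the bulk job raises, so the higher rate is not left on.
        container.replace_throughput(normal_ru)


# Hypothetical usage (run_bulk_upsert is a placeholder for your own loop):
#
# with temporary_throughput(container, bulk_ru=100000, normal_ru=10000):
#     run_bulk_upsert(container)
```

The two RU/s values are passed in explicitly rather than read back from the container, which keeps the sketch independent of how a given SDK version reports the current throughput.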
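Thanks for the reply, Mark. I believe my current stored procedure already follows the practices laid out in the example you linked; its code is included in this post. In fact, apart from using collection.upsertDocument() instead of collection.createDocument(), the code is essentially identical. For some reason it still does not resolve the bounded-execution error. Maybe I am doing something wrong?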
function bulkUpsert(docs) {
    var collection = getContext().getCollection();
    var collectionLink = collection.getSelfLink();

    // The count of upserted docs, also used as the current doc index.
    var count = 0;

    // Validate input.
    if (!docs) throw new Error("The array is undefined or null.");

    var docsLength = docs.length;
    if (docsLength === 0) {
        getContext().getResponse().setBody(0);
        return;
    }

    // Call the CRUD API to upsert the first document.
    tryCreate(docs[count], callback);

    // Note that there are 2 exit conditions:
    // 1) The upsertDocument request was not accepted.
    //    In this case the callback will not be called; we just call setBody and we are done.
    // 2) The callback was called docs.length times.
    //    In this case all documents were upserted and we don't need to call tryCreate anymore.
    //    Just call setBody and we are done.
    function tryCreate(doc, callback) {
        var isAccepted = collection.upsertDocument(collectionLink, doc, callback);

        // If the request was accepted, callback will be called.
        // Otherwise report the current count back to the client,
        // which will call the script again with the remaining set of docs.
        // This condition happens when this stored procedure has been running too long
        // and is about to get cancelled by the server. It allows the calling client
        // to resume this batch from the point reached before isAccepted became false.
        if (!isAccepted) {
            getContext().getResponse().setBody(count);
        }
    }

    // This is called when collection.upsertDocument is done and the document has been persisted.
    function callback(err, doc, options) {
        if (err) throw err;

        // One more document has been upserted; increment the count.
        count++;

        if (count >= docsLength) {
            // All documents have been upserted, so we are done. Just set the response.
            getContext().getResponse().setBody(count);
        } else {
            // Upsert the next document.
            tryCreate(docs[count], callback);
        }
    }
}
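The stored procedure above only reports how far it got; the client still has to call it again with whatever is left. A minimal sketch of that client-side loop in Python follows. All names here (`bulk_upsert_with_resume`, `execute_sproc`, `chunk_size`, `max_stalled_calls`) are hypothetical, and `execute_sproc` stands in for whatever function actually runs the stored procedure against one partition and returns the count it put in the response body. Capping each call at `chunk_size` documents is my own assumption, on the theory that smaller payloads are less likely to hit the 5-second limit; it is not something the original sample does.

```python
from typing import Any, Callable, List, Sequence


def bulk_upsert_with_resume(
    execute_sproc: Callable[[List[Any]], int],
    docs: Sequence[Any],
    chunk_size: int = 500,
    max_stalled_calls: int = 3,
) -> int:
    """Call the stored procedure repeatedly until every document is upserted.

    execute_sproc(batch) is a hypothetical callable: it runs bulkUpsert with
    `batch` as its single argument and returns the count from the response.
    With the azure-cosmos SDK it would typically wrap something like
    container.scripts.execute_stored_procedure(sproc=..., params=[batch],
    partition_key=...); treat that as an assumption to check against the
    SDK version in use.
    """
    done = 0
    stalled = 0
    while done < len(docs):
        # Send only documents that have not been acknowledged yet.
        batch = list(docs[done:done + chunk_size])
        accepted = int(execute_sproc(batch))
        if accepted == 0:
            # Nothing was accepted; give up after a few attempts in a row
            # instead of looping forever.
            stalled += 1
            if stalled >= max_stalled_calls:
                raise RuntimeError(
                    f"no progress after {stalled} calls at offset {done}"
                )
        else:
            stalled = 0
        done += accepted
    return done
```

Note that this loop only helps when the stored procedure returns a partial count. It does not catch the 408 shown in the question, where the call fails outright, so a retry around `execute_sproc` or a smaller `chunk_size` may still be needed.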