VBA performance: parallel or sequential analysis

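I am creating a tool to help analyze revised documents. Many checks are run on hundreds of documents, averaging about 10k characters each.

I have one class per analysis, and most of the checks work sentence by sentence.

What I would like to know is which approach is better: send each whole document to every analysis routine, or loop through the sentences or paragraphs and send each one to every analysis routine.

So the options are:

1 - Take a document, take a sentence, pass it to every analyzer, then move to the next sentence; when the sentences run out, move to the next document.

2 - Take a document, send it to every analyzer, then move to the next document.

I can't tell which performance factor weighs more:

  • use of the Paragraphs or Sentences collections versus string operations on the whole document
  • whether there is another way to walk through the document in one pass, other than enumerating paragraphs or sentences
  • other performance factors I am not aware of

I hope someone can give me at least a clue.

If not, I will performance-test all the available options, since this task takes a long time to run (we may be talking about hours), so every improvement counts. This strategy is surely one of the main performance bottlenecks. In that case, I will post my results.

Thanks.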

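Put the analysis into an exe file. You can run several EXE files to multitask. Since your host is an Office application (?), that means multiple instances of whichever application it is.

Remember Lord of the Rings: one process to rule them all.

So your main script sends instructions to the other processes in some other way (the script sets a timer [does the work], then responds OK, then calls back at the end with a PASS event).

Most performance problems in Word come from resolving Ranges and Selections. Paragraphs, Sentences and Words are worse, because Word has to pre-parse the content to know where they start and end.

This benchmark should give you a good idea. The "analysis" is just a busy-work function that processes a Range of text, and the document is made up of 20 paragraphs of Lorem Ipsum.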

Benchmark code:

Private Const iterations As Long = 1000

' Times the same busy-work routine three ways: on the whole document,
' paragraph by paragraph, and sentence by sentence.
Public Sub Benchmarks()

    Dim index As Long
    Dim starting As Double
    Dim doc As Document

    Set doc = ActiveDocument

    ' Whole document: one Range per iteration.
    starting = Timer
    For index = 1 To iterations
        Process doc.Content
    Next index
    Debug.Print "Process all: " & Timer - starting & " seconds"

    ' One Range per paragraph.
    Dim para As Paragraph
    starting = Timer
    For index = 1 To iterations
        For Each para In doc.Paragraphs
            Process para.Range
        Next para
    Next index
    Debug.Print "Process by paragraph: " & Timer - starting & " seconds"

    ' One Range per sentence (Sentences yields Range objects directly).
    Dim sent As Range
    starting = Timer
    For index = 1 To iterations
        For Each sent In doc.Sentences
            Process sent
        Next sent
    Next index
    Debug.Print "Process by sentence: " & Timer - starting & " seconds"

End Sub

"Analysis" code:

' Busy work standing in for a real analysis: splits the text on spaces
' and looks for the first "a" in every word. The result is discarded.
Public Sub Process(target As Range)

    Dim buffer() As String
    Dim pos As Long
    Dim proxy As Long

    buffer = Split(target.Text)
    For pos = LBound(buffer) To UBound(buffer)
        proxy = proxy + InStr(1, buffer(pos), "a")
    Next pos

End Sub

Results:

Process all: 0.75 seconds
Process by paragraph: 2.125 seconds
Process by sentence: 9.078125 seconds

I also tried to benchmark processing word by word, but it kept sending the application into a "Not Responding" state.
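Going by the timings above, the work on the text itself is cheap and the cost comes from how many Ranges Word has to resolve. One way to act on that, sketched below, is to read the text once and split it into paragraphs in memory. `ProcessText` and `ProcessByParagraphText` are hypothetical names that do not appear in the benchmark, and I have not timed this variant, so treat it as something to measure rather than a result.

```vba
Option Explicit

' Hypothetical string-based variant of the busy-work routine: same work as
' Process, but on a String, so no Range has to be resolved per call.
Public Function ProcessText(ByVal text As String) As Long
    Dim buffer() As String
    Dim pos As Long
    Dim proxy As Long

    buffer = Split(text)
    For pos = LBound(buffer) To UBound(buffer)
        proxy = proxy + InStr(1, buffer(pos), "a")
    Next pos
    ProcessText = proxy
End Function

' Splits the text on paragraph marks (vbCr) in memory and processes each
' non-empty piece. Returns the summed result so the work can be checked.
Public Function ProcessByParagraphText(ByVal text As String) As Long
    Dim paras() As String
    Dim i As Long
    Dim total As Long

    paras = Split(text, vbCr)
    For i = LBound(paras) To UBound(paras)
        If Len(paras(i)) > 0 Then total = total + ProcessText(paras(i))
    Next i
    ProcessByParagraphText = total
End Function
```

Inside Word this would be called as `ProcessByParagraphText(ActiveDocument.Content.Text)`, which touches the object model once per document instead of once per paragraph. It only helps analyses that need the text and nothing else; anything that has to read or change formatting still needs a Range.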

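VBA is single-threaded in every environment I know of. Do you mean VB.NET, or a multi-threaded VBA environment?

I mean parallel, not multi-threaded: send the whole document to each analysis routine, or send each paragraph to those routines. By the way, I have seen discussions about multi-threading-ish VBA using the Windows API, COM or VBScript... ugly stuff, and it still doesn't make VBA multi-threaded. Also, I think Excel formulas are multi-threaded, so if they call VBA functions, I think those run concurrently. I have seen this behaviour with slow formulas, but I'm not sure; calling multi-threaded .NET components from inside VBA through interop is feasible, but it doesn't make VBA multi-threaded. The same goes for calling ADO asynchronously.

Performance will most likely be bound by how the analysis routines are implemented and the order in which they are called. Without more detail about what you are doing and an outline of the proposed architecture, it isn't possible to address your question realistically.

This is great, thanks. If you were interested in processing sentences as a unit, how would you split the document? Without using Sentences or Paragraphs, that is.

@RSinohara - It depends on the analysis you are doing and the content of the documents, but a byte array scan for punctuation or a regular expression would be a good start. Basically, ask yourself how you would process it if it were a plain text document.

I will. It's good to know the impact of Paragraphs and Sentences (and it's big). I still need to deal with splitting Ranges, since I need to modify the Range of each sentence. I'll see what I can do, thanks for the insight.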
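To make the "scan for punctuation" suggestion concrete, here is a minimal sketch that walks the text character by character instead of using a byte array. `SentenceBounds` is a hypothetical helper, and its rules are my assumptions: a sentence ends at ".", "!" or "?" followed by whitespace or the end of the text, or at a paragraph mark. Abbreviations such as "e.g." will be split wrongly, so it is a starting point, not a replacement for Word's own sentence logic.

```vba
Option Explicit

' Hypothetical helper: returns a Collection of two-element arrays holding
' the zero-based start and the exclusive end offset of each sentence.
Public Function SentenceBounds(ByVal text As String) As Collection
    Dim result As New Collection
    Dim blanks As String
    Dim ch As String
    Dim pos As Long
    Dim first As Long       ' 1-based start of the open sentence, 0 = none open
    Dim total As Long
    Dim closes As Boolean

    blanks = " " & vbTab & vbCr & vbLf
    total = Len(text)

    For pos = 1 To total
        ch = Mid$(text, pos, 1)

        ' Skip whitespace between sentences; open one at the first other char.
        If first = 0 Then
            If InStr(1, blanks, ch, vbBinaryCompare) = 0 Then first = pos
        End If

        If first > 0 Then
            If ch = vbCr Then
                ' A paragraph mark closes the sentence and is left out of it.
                result.Add Array(first - 1, pos - 1)
                first = 0
            ElseIf InStr(1, ".!?", ch, vbBinaryCompare) > 0 Then
                ' Punctuation only closes the sentence before whitespace or
                ' at the end of the text, so "3.5" is not split.
                If pos = total Then
                    closes = True
                Else
                    closes = InStr(1, blanks, Mid$(text, pos + 1, 1), vbBinaryCompare) > 0
                End If
                If closes Then
                    result.Add Array(first - 1, pos)
                    first = 0
                End If
            End If
        End If
    Next pos

    ' Trailing text without closing punctuation still counts as a sentence.
    If first > 0 Then result.Add Array(first - 1, total)
    Set SentenceBounds = result
End Function
```

Given a pair `b` from the collection, `Mid$(text, b(0) + 1, b(1) - b(0))` gives the sentence as a string. If the sentence has to be modified in place, `ActiveDocument.Range(b(0), b(1))` is the obvious way back to a Range, but that assumes the offsets in `Content.Text` line up with Range positions, which may not hold in documents with fields, tables or hidden text. Check that on a real document before relying on it.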