
VB.Net: Searching a Word document by line


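I am trying to read a Word document (800+ pages) line by line and, if a line contains certain text, in this case "Section", simply print that line to the console.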

Public Sub doIt()
    SearchFile("theFilePath", "Section")
    Console.WriteLine("SHit")
End Sub

Public Sub SearchFile(ByVal strFilePath As String, ByVal strSearchTerm As String)
    Dim sr As StreamReader = New StreamReader(strFilePath)
    Dim strLine As String = String.Empty

    For Each line As String In sr.ReadLine
        If line.Contains(strSearchTerm) = True Then
            Console.WriteLine(line)
        End If
    Next

End Sub

It runs, but it doesn't print anything. I know the word "Section" appears in the document many times.

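As already mentioned in the comments, you can't search a Word document the way you are currently doing it. You need to create a Word.Application object, as mentioned there, and then load the document so that it can be searched.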

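Here is a short example I wrote for you. Note that you need to add a reference to Microsoft.Office.Interop.Word and then add the Imports statement to your class, for example Imports Microsoft.Office.Interop. The code grabs each paragraph and then uses its range to find the word you are searching for; if the word is found, the paragraph is added to a list.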

Note: tried and tested. I used this in a button click event, but put it wherever you need it.

Try
    Dim objWordApp As Word.Application = Nothing
    Dim objDoc As Word.Document = Nothing
    Dim TextToFind As String = "YOUR TEXT TO FIND"
    Dim TextRange As Word.Range = Nothing
    Dim StringLines As New List(Of String)

    'start a hidden Word instance
    objWordApp = CType(CreateObject("Word.Application"), Word.Application)

    If objWordApp IsNot Nothing Then
        objWordApp.Visible = False
        'FileName is the full path of the document to open
        objDoc = objWordApp.Documents.Open(FileName)
    End If

    If objDoc IsNot Nothing Then

        'loop through each paragraph in the document and get the range
        For Each p As Word.Paragraph In objDoc.Paragraphs
            TextRange = p.Range
            TextRange.Find.ClearFormatting()

            'Execute returns True when the text occurs in this paragraph
            If TextRange.Find.Execute(TextToFind) Then
                StringLines.Add(p.Range.Text)
            End If
        Next

        If StringLines.Count > 0 Then
            MessageBox.Show(String.Join(Environment.NewLine, StringLines.ToArray()))
        End If

        objDoc.Close()
        objWordApp.Quit()

    End If

Catch ex As Exception
    'publish your exception?
End Try

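Update using sentences: this goes through each paragraph and grabs each sentence, and then we can check whether the word is there. The benefit is that it is faster, because we get each paragraph and then search its sentences. We have to get the paragraphs in order to get the sentences.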

Try
    Dim objWordApp As Word.Application = Nothing
    Dim objDoc As Word.Document = Nothing
    Dim TextToFind As String = "YOUR TEXT TO FIND"
    Dim TextRange As Word.Range = Nothing
    Dim StringLines As New List(Of String)
    Dim SentenceCount As Integer = 0

    'start a hidden Word instance
    objWordApp = CType(CreateObject("Word.Application"), Word.Application)

    If objWordApp IsNot Nothing Then
        objWordApp.Visible = False
        'FileName is the full path of the document to open
        objDoc = objWordApp.Documents.Open(FileName)
    End If

    If objDoc IsNot Nothing Then

        For Each p As Word.Paragraph In objDoc.Paragraphs
            TextRange = p.Range
            TextRange.Find.ClearFormatting()
            SentenceCount = TextRange.Sentences.Count
            If SentenceCount > 0 Then
                'walk the sentences from last to first (the collection is 1-based)
                Do Until SentenceCount = 0
                    Dim sentence As String = TextRange.Sentences.Item(SentenceCount).Text
                    If sentence.Contains(TextToFind) Then
                        StringLines.Add(sentence.Trim())
                    End If

                    SentenceCount -= 1
                Loop
            End If
        Next

        If StringLines.Count > 0 Then
            MessageBox.Show(String.Join(Environment.NewLine, StringLines.ToArray()))
        End If

        objDoc.Close()
        objWordApp.Quit()

    End If

Catch ex As Exception
    'publish your exception?
End Try

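Here is a sub that prints every line in which the search string occurs, rather than every paragraph. It mimics the behavior in your example of reading and checking each line with a StreamReader: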

'Add a reference to Microsoft.Office.Interop.Word, then import Microsoft.Office.Interop and Microsoft.Office.Interop.Word
Public Sub SearchFile(ByVal strFilePath As String, ByVal strSearchTerm As String)
    Dim wordObject As Word.Application = New Word.Application
    wordObject.Visible = False
    Dim objWord As Word.Document = wordObject.Documents.Open(strFilePath)
    objWord.Characters(1).Select()

    Dim bolEOF As Boolean = False
    Do Until bolEOF
        wordObject.Selection.MoveEnd(WdUnits.wdLine, 1)
        If wordObject.Selection.Text.ToUpper.Contains(strSearchTerm.ToUpper) Then
            Console.WriteLine(wordObject.Selection.Text.Replace(vbCrLf, "").Replace(vbCr, "").Replace(vbLf, ""))
        End If
        wordObject.Selection.Collapse(WdCollapseDirection.wdCollapseEnd)
        If wordObject.Selection.Bookmarks.Exists("\EndOfDoc") Then
            bolEOF = True
        End If
    Loop

    objWord.Close()
    wordObject.Quit()
    objWord = Nothing
    wordObject = Nothing
    'Me.Close() 'only applies when this sub lives in a Form that should close afterwards
End Sub

This is an implementation slightly modified for VB.NET.

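No, you can't search a Word document this way. You need to create a Word application object and load the file you want to search.

You can't search a Word document the same way you search a text file. A Word "document" is actually a zip file that keeps most of the document's data in an XML file. You need to use a third-party DLL or interop to "read" the Word document and search its text.

@soohoonigan Can I still use a For Each over the lines? Or how would I go through the Word document line by line?

Sure, you can iterate, but I don't think you can do it "line by line" in the sense you mean. The Word object model is kind of weird: the document's content is broken up into paragraphs, sections, bookmarks, sentences, ranges and so on, so you can iterate over those. Lines aren't inherently set up that way, though, because lines aren't "created" until the document is open and the printer driver can be used to determine how much text fits on one line. That said, take a look at the answer here, which uses voodoo magic to split the text into lines. It's written in C#, like the other response, but carlosAG has a very good tool that you can use to translate it to vb.net.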
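
To illustrate the comment about a Word document being a zip file with its data in XML, here is a minimal sketch that searches a .docx without interop. It rests on several assumptions that are not from the thread: the file is a .docx (not a legacy .doc), the main text lives in the word/document.xml part, and the project can use System.IO.Compression (on .NET Framework this needs a reference to System.IO.Compression.FileSystem). The names DocxSearchSketch, FindParagraphs and SearchDocx are made up for this example. It returns matching paragraphs, not lines, because lines do not exist in the file itself.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports System.IO.Compression
Imports System.Linq
Imports System.Xml.Linq

Module DocxSearchSketch

    ' WordprocessingML namespace used by the elements in word/document.xml.
    Private ReadOnly W As XNamespace = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

    ' Returns the text of every paragraph (w:p) whose text contains searchTerm.
    ' Hypothetical helper for this sketch; the comparison ignores case.
    Public Function FindParagraphs(body As XDocument, searchTerm As String) As List(Of String)
        Dim matches As New List(Of String)
        For Each paragraph As XElement In body.Descendants(W + "p")
            ' A paragraph's text is split across runs, so join all of its w:t elements.
            Dim paragraphText As String =
                String.Concat(paragraph.Descendants(W + "t").Select(Function(t) t.Value))
            If paragraphText.IndexOf(searchTerm, StringComparison.OrdinalIgnoreCase) >= 0 Then
                matches.Add(paragraphText)
            End If
        Next
        Return matches
    End Function

    ' Opens the .docx as a zip archive and searches its main document part.
    Public Function SearchDocx(docxPath As String, searchTerm As String) As List(Of String)
        Using archive As ZipArchive = ZipFile.OpenRead(docxPath)
            Dim entry As ZipArchiveEntry = archive.GetEntry("word/document.xml")
            If entry Is Nothing Then
                Throw New InvalidDataException("word/document.xml not found; is this a .docx file?")
            End If
            Using xmlStream As Stream = entry.Open()
                Return FindParagraphs(XDocument.Load(xmlStream), searchTerm)
            End Using
        End Using
    End Function

    Sub Main(args As String())
        If args.Length < 2 Then
            Console.WriteLine("Usage: DocxSearchSketch <file.docx> <search term>")
            Return
        End If
        For Each hit As String In SearchDocx(args(0), args(1))
            Console.WriteLine(hit)
        Next
    End Sub

End Module
```

This only reads the main body part; text in headers, footers, footnotes or text boxes stored in other parts is not searched, and the interop answers above remain the way to go if you need Word's own notion of sentences or rendered lines.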