C#: a general-purpose Microsoft Word document parser that works without Word installed

c# wpf ms-word

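I need to parse Microsoft Word 97/2003 (.doc) and Microsoft Word 2007/2010 (.docx) files with C# and WPF on a machine where Word is not installed. Can anyone recommend a solid library for this?

Technically, I iterate over the ZipEntry elements like this: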

using System;
using System.IO;
using System.Linq;
using Ionic.Zip;

foreach (string file in _listPathFiles)
{
    using (ZipFile zip = ZipFile.Read(file))
    {
        try
        {
            zip.ToList().ForEach(entry =>
            {
                // Match Word documents regardless of the extension's case
                if (entry.FileName.EndsWith(".doc", StringComparison.OrdinalIgnoreCase) ||
                    entry.FileName.EndsWith(".docx", StringComparison.OrdinalIgnoreCase))
                {
                    // Drop the directory part of the entry name, then extract the file to disk
                    entry.FileName = Path.GetFileName(entry.FileName);
                    entry.Extract(baseStoragePath);

                    // Get data from file with Parser
                    string filePath = Path.Combine(baseStoragePath, entry.FileName);


                    // Remove the extracted file
                    if (File.Exists(filePath))
                    {
                        File.Delete(filePath);
                        Console.WriteLine("Delete : " + filePath);
                    }
                }
            });
        }
        catch (Exception e)
        {
            // Log the message as well as the stack trace
            Console.WriteLine("Fail to unzip Exception : " + e);
        }
    }
}

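I'm not sure whether I can read the document straight from the ZipEntry; perhaps I have to extract it to disk before parsing.

My goal is to get the data that follows paragraphs in the Microsoft Word "Heading 1" style, so the library must be able to read that kind of property.

Library suggestions and code samples are welcome.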
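On the ZipEntry part of the question: DotNetZip's ZipEntry.Extract has an overload that writes to a Stream, so an entry can be copied into a MemoryStream instead of a temporary file. The following is only a sketch. ZipEntryReader, ToMemoryStream and IsWordDocument are hypothetical helper names, and it only helps if the parser you end up choosing accepts a Stream.

```csharp
using System;
using System.IO;
using Ionic.Zip;

static class ZipEntryReader
{
    // True when the entry name ends with .doc or .docx (case-insensitive).
    public static bool IsWordDocument(string entryName)
    {
        return entryName.EndsWith(".doc", StringComparison.OrdinalIgnoreCase)
            || entryName.EndsWith(".docx", StringComparison.OrdinalIgnoreCase);
    }

    // Copies one archive entry into a seekable in-memory stream; nothing is written to disk.
    public static MemoryStream ToMemoryStream(ZipEntry entry)
    {
        var buffer = new MemoryStream();
        entry.Extract(buffer);   // DotNetZip overload that writes the entry's bytes to a stream
        buffer.Position = 0;     // rewind so the caller reads from the beginning
        return buffer;
    }
}
```

Inside the loop from the question, the call would look like `using (var data = ZipEntryReader.ToMemoryStream(entry)) { /* hand data to the parser */ }`. A MemoryStream holds the whole file in memory, which is usually fine for Word documents but worth keeping in mind for very large archives.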

Take a look at NPOI (the .NET port of the Apache POI API):

or

Download the SDK for reading Office documents such as MS Word.

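This GroupDocs library can be used to extract text from Word documents without MS Word installed. The text can be extracted line by line or all at once.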

// Extract all the text at once
WordsTextExtractor extractor = new WordsTextExtractor("sample.docx");
Console.Write(extractor.ExtractAll());

// OR

// Extract text line by line
string line = extractor.ExtractLine();

// If the line is null, then the end of the file is reached
while (line != null)
{
    // Print a line to the console
    Console.Write(line);
    // Extract another line
    line = extractor.ExtractLine();
}

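Disclosure: I work as a developer evangelist at GroupDocs.

What about NPOI? Judging by its documentation, NPOI doesn't seem to handle .doc files, and the Open XML SDK, which targets .docx, doesn't work with .doc either.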
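For the .docx half of the question only, here is a sketch of how the Open XML SDK (the DocumentFormat.OpenXml package) could collect the paragraphs that come after a "Heading 1" paragraph. Several things are assumptions: the style ID is taken to be "Heading1", which English-language Word typically writes but which can differ in localized or custom templates; HeadingReader and TextAfterHeading are hypothetical names; and only top-level body paragraphs are scanned, so text inside tables is skipped. It does nothing for .doc files.

```csharp
using System.Collections.Generic;
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class HeadingReader
{
    // Returns the text of every top-level paragraph that appears after the first
    // paragraph whose style ID equals headingStyleId. Paragraphs that carry that
    // style themselves are skipped.
    public static List<string> TextAfterHeading(Stream docx, string headingStyleId = "Heading1")
    {
        var result = new List<string>();
        using (var doc = WordprocessingDocument.Open(docx, false)) // false = read-only
        {
            Body body = doc.MainDocumentPart.Document.Body;
            bool headingSeen = false;

            foreach (Paragraph p in body.Elements<Paragraph>())
            {
                // The style ID is stored in the paragraph properties, if any are present
                string styleId = p.ParagraphProperties?.ParagraphStyleId?.Val?.Value;

                if (styleId == headingStyleId)
                {
                    headingSeen = true;
                    continue;
                }

                if (headingSeen)
                {
                    result.Add(p.InnerText);
                }
            }
        }
        return result;
    }
}
```

Because the method takes a Stream, it can be fed a stream read from a zip entry as well as a FileStream. If "after Heading 1" should stop at the next heading of any level, the loop would need an extra check on the other heading style IDs.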