In Java, is it possible to parse the contents of archive library (.a) and .so (shared object) files to retrieve human-readable text?


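For example, a Java Windows PE (Portable Executable) parser can parse Windows .exe and .dll files to retrieve the product name, version information, and copyright information. Basically, you pass it a file such as notepad.exe and it returns the following: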

CompanyName = Microsoft Corporation
FileDescription = Notepad
FileVersion = 6.1.7600.16385 (win7_rtm.090713-1255)
InternalName = Notepad
LegalCopyright = © Microsoft Corporation. All rights reserved.
OriginalFilename = NOTEPAD.EXE
ProductName = Microsoft® Windows® Operating System
ProductVersion = 6.1.7600.16385

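The tool basically uses several Java InputStream classes to access certain bytes in the file and returns the correct ASCII representation of the raw data it reads.

For my problem, I have tried the following method, which returns unreadable text even when I try to normalize it: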

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

public static void readContent(String file) {
    // Decodes the whole file as UTF-8 text and reads it line by line
    try (BufferedReader buff = new BufferedReader(
            new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
        String line;
        while ((line = buff.readLine()) != null) {
            // Decompose to NFD, then strip combining diacritical marks
            line = Normalizer.normalize(line, Normalizer.Form.NFD)
                    .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
            System.out.println(line);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

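I would appreciate it if someone could point me in the right direction, if what I want is achievable at all.

I have found a solution to my problem, and I thought I would post it in case anyone else is looking for an answer. For confidentiality reasons I will not post all of my code here, but it is simple: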

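// PeekableInputStream is not a JDK class and isStringChar is a helper; neither is shown here.
FileInputStream fis = new FileInputStream(file);
BufferedInputStream bis = new BufferedInputStream(fis);
PeekableInputStream pis = new PeekableInputStream(bis);
Set<String> lines = new HashSet<String>();
int currentByte;
StringBuilder currentLine = null;
// Read one byte at a time until end of file or until 10000 lines are collected
while (((currentByte = pis.read()) != -1) && lines.size() < 10000) {
    char ch = (char) currentByte;
    if (isStringChar(ch)) {
        if (currentLine == null) {
            currentLine = new StringBuilder(8);
        }
        // found a char, add it to the current line
        currentLine.append(ch);
    }
    // (handling of non-string bytes, which ends the current line, is not shown)
}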

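Now, if you look at the currentLine strings, you will find information such as license and copyright notices, provided the original author included them.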
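Since the snippet above leaves parts out, here is a minimal, self-contained sketch of the same idea: scan the file byte by byte and keep every run of printable characters, much like the Unix `strings` tool. The class name `PrintableStrings`, the method `extract`, the definition of a "string character" (printable ASCII plus tab), and the minimum run length of 4 are all assumptions made for illustration, not the original author's code.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class and method names; a sketch, not the original author's code.
class PrintableStrings {

    // Assumption: a "string character" is printable ASCII (space through '~') or a tab.
    static boolean isStringChar(int b) {
        return (b >= 0x20 && b <= 0x7E) || b == '\t';
    }

    // Collects every run of at least minLength printable bytes, in file order.
    static List<String> extract(InputStream in, int minLength) throws IOException {
        List<String> found = new ArrayList<>();
        StringBuilder run = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (isStringChar(b)) {
                run.append((char) b);
            } else {
                // A non-printable byte ends the current run.
                if (run.length() >= minLength) {
                    found.add(run.toString());
                }
                run.setLength(0);
            }
        }
        // Flush a run that reaches the end of the input.
        if (run.length() >= minLength) {
            found.add(run.toString());
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Expects the path of a .a, .o or .so file as the first argument.
        try (InputStream in = new BufferedInputStream(Files.newInputStream(Paths.get(args[0])))) {
            for (String s : extract(in, 4)) {
                System.out.println(s);
            }
        }
    }
}
```

Reading raw bytes avoids the problem in the question's attempt, where the whole binary file was decoded as UTF-8 text. The minimum length filters out short accidental matches; a higher value gives less noise at the cost of missing short strings.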

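As far as I know, this information is not part of .a/.so files, so naturally you cannot extract it.

It looks that way, but it would be better if I could convert the data being read into human-readable text for further investigation.

I just wanted to add something after a few days of investigation: .so and .a files do in fact contain copyright information, and I can retrieve it from the files by means other than Java code (for example, the Linux command-line tools strings and grep).

I solved this with a Java InputStream: I iterated over the bytes in the file, extracted the characters from them, and was able to find copyright and license information in .o, .so and .a files.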
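The `grep` step mentioned in the comments can also be done in Java once the printable strings have been extracted. The sketch below keeps only the strings that contain a licensing keyword. The class name `LicenseFilter`, the keyword list, and the sample strings in `main` are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper; the keyword list is an assumption, adjust it as needed.
class LicenseFilter {

    static final String[] KEYWORDS = {"copyright", "license", "licence", "(c)"};

    // Returns the candidates that contain any keyword, ignoring case.
    static List<String> filter(List<String> candidates) {
        List<String> hits = new ArrayList<>();
        for (String s : candidates) {
            String lower = s.toLowerCase(Locale.ROOT);
            for (String keyword : KEYWORDS) {
                if (lower.contains(keyword)) {
                    hits.add(s);
                    break; // one matching keyword is enough
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample strings of the kind a binary might contain.
        List<String> sample = Arrays.asList(
                ".text",
                "Copyright (C) 2009 Example Corp",
                "GNU General Public License",
                "libfoo.so.1");
        for (String hit : filter(sample)) {
            System.out.println(hit);
        }
    }
}
```

A plain substring match is deliberately simple. It will also match strings that only mention a keyword in passing, so the results still need a manual look.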