Reading a PDF file in Java without external libraries

java pdf file-io

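For the past few weeks I've been writing a simple Java server.

First, I wanted to display the file system based on where you start the server. For example, if you start the server in the src directory, open a browser, and go to localhost:5555, you'll see the files and directories contained in src. Each one is a link. I got that working fine.

If you click a directory, it shows its contents (as I mentioned). If you click a file, it reads the file and displays it as plain text. If you click an image, it serves that image. This all happens in the browser, and you can use the Back button to return to the directory listing or file you viewed before. This also works fine, and it uses no external libraries.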

For text files I read the content with a Reader. This is the code I use to serve images (an InputStream, not a Reader):

public byte[] getByteArray() throws IOException {
    // Read the whole requested file into memory as raw bytes.
    byte[] byteArray = new byte[(int) requestedFile.length()];
    String fileName = String.valueOf(requestedFile);
    try (InputStream inputStream = new BufferedInputStream(new FileInputStream(fileName))) {
        int bytesRead = 0;
        while (bytesRead < byteArray.length) {
            int bytesRemaining = byteArray.length - bytesRead;
            int read = inputStream.read(byteArray, bytesRead, bytesRemaining);
            if (read < 0) {
                break; // end of file reached earlier than expected
            }
            bytesRead += read;
        }
    }
    // Send the headers followed by the unmodified file bytes.
    FilterOutputStream binaryOutputStream = new FilterOutputStream(outputStream);
    byte[] binaryHeaders = headers.getBytes();
    byte[] fullBinaryResponse = new byte[binaryHeaders.length + byteArray.length];
    System.arraycopy(binaryHeaders, 0, fullBinaryResponse, 0, binaryHeaders.length);
    System.arraycopy(byteArray, 0, fullBinaryResponse, binaryHeaders.length, byteArray.length);
    try {
        binaryOutputStream.write(fullBinaryResponse);
        binaryOutputStream.flush();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return byteArray;
}

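What I'm trying to do now is serve PDFs. If I have a PDF in one of the directories and click it, it should open that PDF (in whatever default viewer the browser uses).

I've been googling this and trying things for a day or two, but I can't seem to get it. What I find odd is that when I click a PDF with my current code, the browser does seem to open the PDF, but no text is displayed. It's the standard in-browser PDF viewer we're all used to seeing when clicking a PDF link. But there's no content, just some blank pages.

Can anyone help? I don't intend to use external libraries. I just want to understand how to open a PDF file in Java.

Thanks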

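Don't parse it as text: that converts characters, may convert line endings, and may change things you don't want changed. Don't buffer the whole thing as a byte array either; write directly to the output stream so you don't run into memory problems. Instead, just serve the file like this: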

import java.io.FileInputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

public class FileServer extends HttpServlet
{

public void doGet(HttpServletRequest req, HttpServletResponse resp)
{
    OutputStream out=null;
    try {

        HttpSession session = req.getSession();

        out = resp.getOutputStream();
        resp.setContentType(-- specify content type here --);
        req.setCharacterEncoding("UTF-8");

        String pathInfo = req.getPathInfo();

        String fullPath = -- figure out the path to the file in question --;

        // Copy the file to the response in 2 KB chunks; the bytes are never decoded as text.
        // try-with-resources closes the file even if a write fails.
        try (FileInputStream fis = new FileInputStream(fullPath)) {
            byte[] buf = new byte[2048];

            int amtRead = fis.read(buf);
            while (amtRead > 0) {
                out.write(buf, 0, amtRead);
                amtRead = fis.read(buf);
            }
        }
        out.flush();
    }
    catch (Exception e) {
        // Report the failure to the browser as a small HTML page.
        try {
            resp.setContentType("text/html");
            if (out == null) {
                out = resp.getOutputStream();
            }
            Writer w = new OutputStreamWriter(out);
            w.write("<html><body><ul><li>Exception: ");
            w.write(e.toString());
            w.write("</ul></body></html>");
            w.flush();
        }
        catch (Exception eeeee) {
            //nothing we can do here...
        }
    }
}
}
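
Since the server in the question does not use the servlet API, the same streaming idea can be sketched with only java.io and java.nio. This is a minimal sketch under assumptions, not the asker's actual code: the class name BinaryFileResponder, the method writeFileResponse, and the exact set of headers are all made up for illustration. It writes the headers as ASCII bytes and then copies the file bytes unchanged to the output stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; the names are illustrative and not taken from the question.
public class BinaryFileResponder {

    /**
     * Writes a minimal HTTP/1.1 response for the given file to out.
     * Only the headers are treated as text; the body is copied byte for byte.
     */
    public static void writeFileResponse(Path file, String contentType, OutputStream out)
            throws IOException {
        String headers = "HTTP/1.1 200 OK\r\n"
                + "Content-Type: " + contentType + "\r\n"
                + "Content-Length: " + Files.size(file) + "\r\n"
                + "Connection: close\r\n"
                + "\r\n";
        out.write(headers.getBytes(StandardCharsets.US_ASCII));
        // Files.copy streams the file to out without loading it all into memory.
        Files.copy(file, out);
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a real PDF: a few bytes that include non-text values.
        Path pdf = Files.createTempFile("sample", ".pdf");
        Files.write(pdf, new byte[] {'%', 'P', 'D', 'F',
                (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3});

        // In a real server, out would be socket.getOutputStream().
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeFileResponse(pdf, "application/pdf", sink);
        System.out.println(sink.size() + " bytes written");
        Files.delete(pdf);
    }
}
```

For a PDF the content type would be application/pdf. If the browser shows a viewer with blank pages, comparing the served bytes against the file on disk is a quick way to check whether the body was altered on the way out.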

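Just open the file and stream it back; what's the problem? Without knowing what you're actually doing, we can't help.

Are you saying it can be read the same way I read text files (as described above)?

A PDF file is binary, not text, so you can't mess with its content (such as adding "\n") or read it line by line, because there is no concept of text lines here, nor anything like an ordinary character encoding. By using a ...Reader you implicitly assume there is. I'm really surprised you could serve images with your approach. Also, if the files are several GB in size, do you really want to read them completely, as your code does, before serving them?

No, that isn't the code I use to serve images. The code above is only for text files. I'm reading images with an InputStream.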
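
The point about Readers can be checked with a small experiment. The sketch below is illustrative only: the class name TextRoundTripDemo and the sample bytes are assumptions, and UTF-8 stands in for whatever charset a Reader would use. It decodes binary data to a String and encodes it back, which is roughly what a Reader/Writer pair does to a file.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of the original question or answer.
public class TextRoundTripDemo {

    /** Decodes the bytes as UTF-8 text and encodes the text back to bytes. */
    public static byte[] roundTripAsText(byte[] original) {
        String asText = new String(original, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "%PDF" followed by four bytes that do not form valid UTF-8 sequences.
        byte[] pdfStart = {'%', 'P', 'D', 'F',
                (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};

        byte[] viaText = roundTripAsText(pdfStart);

        System.out.println("original length:   " + pdfStart.length);
        System.out.println("round-trip length: " + viaText.length);
        System.out.println("identical: " + Arrays.equals(pdfStart, viaText));
    }
}
```

The invalid byte sequences are replaced during decoding, so the round-tripped data no longer matches the original. Plain ASCII survives the trip, which is why text files appear to work with a Reader while a binary file such as a PDF does not.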