What does a Java HTML formatter actually do?


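What does an HTML formatter do? I formatted an HTML file with the following HTML formatter:

But when I compared the two files, the only difference I could find was the indentation. For a project that needs to parse HTML files, I could not parse the unformatted HTML file with jsoup, yet after formatting the HTML, jsoup parses the file. The code used for parsing: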

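import java.io.File;
import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public Document parseHTML(String filePath) throws IOException {
    File inputFile = new File(filePath);
    // Passing null as the charset lets jsoup detect the encoding, falling back to UTF-8.
    return Jsoup.parse(inputFile, null);
}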

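The answer to your question

"What does an HTML formatter do?"

is in the first line of the link you shared:

"Formats an HTML string/file with the desired indentation level. The formatting rules are not configurable, but I think it gives the user the best possible output."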


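So it only re-indents the HTML string. Indentation by itself does not make markup valid, although a formatter that parses and re-serializes the document may normalize it along the way.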
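To make "it only indents" concrete, here is a minimal sketch of an indenting formatter written with the Java standard library only. It is not the formatter from the link; the class name `NaiveHtmlIndenter`, its methods, and the short list of void tags are assumptions made for illustration. It also assumes no attribute value contains `>` and it trims text runs, which a real formatter has to treat more carefully because whitespace between inline elements can matter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class, not part of jsoup or of any real formatter.
class NaiveHtmlIndenter {
    // Elements without a closing tag; they must not increase the nesting depth.
    private static final Set<String> VOID_TAGS =
            Set.of("br", "img", "hr", "input", "meta", "link");

    // One token is either a single tag (<...>) or a run of text between tags.
    private static final Pattern TOKEN = Pattern.compile("<[^>]+>|[^<]+");

    /** Splits markup into trimmed tags and text runs, skipping blank runs. */
    static List<String> tokens(String html) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(html);
        while (m.find()) {
            String t = m.group().trim();
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    /** Extracts the lower-cased element name, e.g. "<td align=left>" gives "td". */
    static String tagName(String tag) {
        String body = tag.substring(1, tag.length() - 1).trim();
        return body.split("\\s+")[0].toLowerCase();
    }

    /** Writes every token on its own line, indented by its nesting depth. */
    static String format(String html, int indent) {
        StringBuilder sb = new StringBuilder();
        int depth = 0;
        for (String t : tokens(html)) {
            boolean closing = t.startsWith("</");
            if (closing) {
                depth = Math.max(0, depth - 1);
            }
            sb.append(" ".repeat(depth * indent)).append(t).append('\n');
            boolean opening = t.startsWith("<") && !closing
                    && !t.startsWith("<!") && !t.endsWith("/>")
                    && !VOID_TAGS.contains(tagName(t));
            if (opening) {
                depth++;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String raw = "<table><tbody><tr><td><p>Moo Cow<br></p></td></tr></tbody></table>";
        String pretty = format(raw, 4);
        System.out.print(pretty);
        // The token sequence is unchanged; only the whitespace around tokens differs.
        System.out.println(tokens(raw).equals(tokens(pretty)));
    }
}
```

The final check in `main` compares the token lists of the raw and formatted strings. For a formatter that only adds indentation, the two lists are equal, which is why a tolerant parser should treat both files the same way.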

Some formatters make your code so beautiful that you feel ecstatic. For example, this:

<table border="0" cellpadding="0" cellspacing="0" width="450" style="border-spacing:0px;color:rgb(62,80,97);font-family:Lato,Arial,San-serif;font-size:14px;line-height:13.3333px;table-layout:fixed;background-color:transparent"><tbody><tr><td align="left" valign="top" width="81" style="padding:0px"><p style="margin:0px 10px 10px 0px;font-family:Helvetica,Arial,sans-serif;line-height:16px"><img src="https://ci4.googleusercontent.com/proxy/fHPFAzy43itIJmV5eI64vf04XagIdF6yGVf7vcMWQCfZ-stb0pWyWjbF_UvZA8mPyCnPwjWeb4ItHV4YWH3OGdTS4GZhV71czA09ggSoA-FbsWoTXdVr3Molo3RWymrznp1k=s0-d-e1-ft#https://htmlsigs.s3.amazonaws.com/logos/files/000/295/371/landscape/Monkey.jpg" alt="" border="0" height="80" width="71" style="border:0px;vertical-align:middle"></p></td><td align="left" valign="top" nowrap="" width="16" style="padding:0px;border-left-width:6px;border-left-style:solid;border-left-color:rgb(71,124,204)"><br></td><td align="left" valign="middle" nowrap="" width="363" style="padding:0px"><div><p style="text-align:left;margin:0px 0px 10px;line-height:16px;color:rgb(33,33,33)"><font face="arial, helvetica, sans-serif" size="4"><span style="font-weight:bold;background-color:transparent">Moo Cow</span><br></font></p><p style="margin:0px 0px 10px;line-height:16px;color:rgb(33,33,33)"><font face="arial, helvetica, sans-serif" size="4"><span style="display:inline"></span></font><span style="display:block"></span><font face="arial, helvetica, sans-serif" size="4"><span style="display:inline">(345) 977-4334</span></font></p><p style="margin:0px 0px 10px;font-family:Helvetica,Arial,sans-serif;line-height:16px"><span style="display:inline"><a href="https://www.linkedin.com/in/MooCow" style="color:rgb(33,33,33);font-family:'Times New Roman';font-size:medium;line-height:normal;background-color:transparent" target="_blank"><img 
src="https://ci6.googleusercontent.com/proxy/Z0nLY12pdK7xeUMB--M8IKlzAbulOc1ZXo0Am9thnzcdsfOddseTVUDrqV90uSBFs-5p-Bykh2kAQC1FkBxI34w17P9GGl7gYxR79w=s0-d-e1-ft#http://www.companysig.com/images/but_linkedin_logo.png" title="LinkedIn button" alt=":inkedIn button" style="border:none"></a><font color="#000000">&nbsp; </font><a href="mailto:MooCow@ucla.edu" style="color:rgb(33,33,33);font-family:'Times New Roman';font-size:medium;line-height:normal" target="_blank"><img src="https://ci6.googleusercontent.com/proxy/lLjmL_p2lcLcwNTeXCHq_OaY8C4EDuCt2ynWpTboZUNlS6jK7LZYVszDodowST9klsHP6bPzE6ph8f6jp2xUp5wgoWr9gTIJjv8=s0-d-e1-ft#http://www.companysig.com/images/but_email_black.png" title="email" alt="email" style="border:none"></a><span style="color:rgb(0,0,0);font-family:'Times New Roman';font-size:medium;line-height:normal">&nbsp;&nbsp;</span><a href="http://www.companysig.com/X/H/HI/show_vcard3.php/HIFKBH878391.vcf" style="color:rgb(33,33,33);font-family:'Times New Roman';font-size:medium;line-height:normal" target="_blank"><img src="https://ci3.googleusercontent.com/proxy/cMfYHWNyrPoDyeLFX39VL_WRboFaZUb9RxYuVaxUIpDwnEbiY4K6GLvJ_y2_iUggQtflbIcabZVjFOMjENgpSytdWgVi9yrC964=s0-d-e1-ft#http://www.companysig.com/images/but_vcard_black.png" title="Vcard Save Contact" alt="Vcard Save Contact" style="border:none"></a><br></span></p></div></td></tr></tbody></table>



gives me this:

<table border="0" cellpadding="0" cellspacing="0" style=
"border-spacing:0px;color:rgb(62,80,97);font-family:Lato,Arial,San-serif;font-size:14px;line-height:13.3333px;table-layout:fixed;background-color:transparent"
width="450">
    <tbody>
        <tr>
            <td align="left" style="padding:0px" valign="top" width="81">
                <p style=
                "margin:0px 10px 10px 0px;font-family:Helvetica,Arial,sans-serif;line-height:16px">
                <img alt="" border="0" height="80" src=
                "https://ci4.googleusercontent.com/proxy/fHPFAzy43itIJmV5eI64vf04XagIdF6yGVf7vcMWQCfZ-stb0pWyWjbF_UvZA8mPyCnPwjWeb4ItHV4YWH3OGdTS4GZhV71czA09ggSoA-FbsWoTXdVr3Molo3RWymrznp1k=s0-d-e1-ft#https://htmlsigs.s3.amazonaws.com/logos/files/000/295/371/landscape/Monkey.jpg"
                style="border:0px;vertical-align:middle" width="71">
                </p>
            </td>

            <td align="left" nowrap style=
            "padding:0px;border-left-width:6px;border-left-style:solid;border-left-color:rgb(71,124,204)"
            valign="top" width="16"><br>
            </td>

            <td align="left" nowrap style="padding:0px" valign="middle" width=
            "363">
                <div>
                    <p style=
                    "text-align:left;margin:0px 0px 10px;line-height:16px;color:rgb(33,33,33)">
                    <span style=
                    "font-weight:bold;background-color:transparent">Moo
                    Cow</span><br>
                    </p>


                    <p style=
                    "margin:0px 0px 10px;line-height:16px;color:rgb(33,33,33)">
                    <span style="display:inline"></span><span style=
                    "display:block"></span><span style="display:inline">(345)
                    977-4334</span>
                    </p>


                    <p style=
                    "margin:0px 0px 10px;font-family:Helvetica,Arial,sans-serif;line-height:16px">
                    <span style="display:inline"><a href=
                    "https://www.linkedin.com/in/MooCow" style=
                    "color:rgb(33,33,33);font-family:'Times New Roman';font-size:medium;line-height:normal;background-color:transparent"
                    target="_blank"><img alt=":inkedIn button" src=
                    "https://ci6.googleusercontent.com/proxy/Z0nLY12pdK7xeUMB--M8IKlzAbulOc1ZXo0Am9thnzcdsfOddseTVUDrqV90uSBFs-5p-Bykh2kAQC1FkBxI34w17P9GGl7gYxR79w=s0-d-e1-ft#http://www.companysig.com/images/but_linkedin_logo.png"
                    style="border:none" title="LinkedIn button"></a>&nbsp;
                    <a href="mailto:MooCow@ucla.edu" style=
                    "color:rgb(33,33,33);font-family:'Times New Roman';font-size:medium;line-height:normal"
                    target="_blank"><img alt="email" src=
                    "https://ci6.googleusercontent.com/proxy/lLjmL_p2lcLcwNTeXCHq_OaY8C4EDuCt2ynWpTboZUNlS6jK7LZYVszDodowST9klsHP6bPzE6ph8f6jp2xUp5wgoWr9gTIJjv8=s0-d-e1-ft#http://www.companysig.com/images/but_email_black.png"
                    style="border:none" title="email"></a><span style=
                    "color:rgb(0,0,0);font-family:'Times New Roman';font-size:medium;line-height:normal">&nbsp;&nbsp;</span><a href="http://www.companysig.com/X/H/HI/show_vcard3.php/HIFKBH878391.vcf"
                    style=
                    "color:rgb(33,33,33);font-family:'Times New Roman';font-size:medium;line-height:normal"
                    target="_blank"><img alt="Vcard Save Contact" src=
                    "https://ci3.googleusercontent.com/proxy/cMfYHWNyrPoDyeLFX39VL_WRboFaZUb9RxYuVaxUIpDwnEbiY4K6GLvJ_y2_iUggQtflbIcabZVjFOMjENgpSytdWgVi9yrC964=s0-d-e1-ft#http://www.companysig.com/images/but_vcard_black.png"
                    style="border:none" title=
                    "Vcard Save Contact"></a><br></span>
                    </p>
                </div>
            </td>
        </tr>
    </tbody>
</table>






Oh, yeah!

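As the name suggests, it basically formats HTML so that, in most cases, it is easier for a human to read. If jsoup really handles only the formatted HTML, and there is no difference other than whitespace and line breaks, then post a question about that (or rephrase this one, especially the title). Don't forget to include a simple example showing that the unformatted HTML does not work and the formatted version does.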
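Before posting, it may help to confirm that the two files really differ only in whitespace and line breaks. The following is a rough sketch using only the Java standard library; the class name `WhitespaceDiff` and its methods are hypothetical, and it assumes both files are UTF-8 encoded (if they are not, `Files.readString` fails with an exception, which is itself a useful hint about an encoding problem).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for checking "is whitespace the only difference?".
class WhitespaceDiff {
    /** Removes every whitespace character, including line breaks. */
    static String stripWhitespace(String s) {
        return s.replaceAll("\\s+", "");
    }

    /** Returns the index of the first differing character, or -1 if the strings are equal. */
    static int firstDifference(String a, String b) {
        int n = Math.min(a.length(), b.length());
        for (int i = 0; i < n; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        return a.length() == b.length() ? -1 : n;
    }

    /** Returns up to 40 characters starting at the given position, for display. */
    static String context(String s, int at) {
        return s.substring(Math.min(at, s.length()), Math.min(s.length(), at + 40));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java WhitespaceDiff.java <unformatted.html> <formatted.html>");
            return;
        }
        String a = stripWhitespace(Files.readString(Path.of(args[0]), StandardCharsets.UTF_8));
        String b = stripWhitespace(Files.readString(Path.of(args[1]), StandardCharsets.UTF_8));
        int at = firstDifference(a, b);
        if (at < 0) {
            System.out.println("Only whitespace differs.");
        } else {
            System.out.println("First non-whitespace difference at stripped index " + at);
            System.out.println("  file 1: " + context(a, at));
            System.out.println("  file 2: " + context(b, at));
        }
    }
}
```

If this reports a difference, the formatter changed more than indentation. In the table example from the other answer, for instance, the attributes come out in a different order and `nowrap=""` becomes `nowrap`, so the check would flag that pair. Such a difference is a better starting point for a minimal example than the indentation itself.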