Java: How to print UTF-8 encoded characters with JSoup


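I am using JSoup, version 1.8.1, and want to parse content from a web page. The page is UTF-8 encoded and contains Arabic characters. I have written the following code to print some of the content to the console.

Wherever the HTML contains special characters, I get a "?" character instead. I don't know how to solve this.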

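import java.io.IOException;
import java.net.URL;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// The class name is illustrative; the original snippet showed only main().
public class Main {
    public static void main(String[] args) {
        String url = "http://corpus.quran.com/wordbyword.jsp";
        System.out.printf("Fetching %s...\n", url);

        Document doc = null;
        try {
            // Decode the response explicitly as UTF-8.
            doc = Jsoup.parse(new URL(url).openStream(), "UTF-8", url);
            //doc = Jsoup.connect(url).get();
        } catch (IOException e) {
            e.printStackTrace();
            System.exit(1); // non-zero status, because the fetch failed
        }
        System.out.println("Fetching completed. Collecting Data...");

        // Print the tag name and text of every table cell.
        Elements columns = doc.select("td");
        for (Element link : columns) {
            System.out.printf(" * %s %s\n", link.tagName(), link.text());
        }
        System.out.println("------------------------------------------");
    }
}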
And the output in the console:

Fetching http://corpus.quran.com/wordbyword.jsp...
Fetching completed. Collecting Data...
 * td  
 * td Qur'an | Word by Word | Audio | Prayer Times | Android | New : beta.quran.com
 * td 
 * td __
 * td Sign In
 * td Search
 * td  
 * td 
 * td 
 * td __
 * td Verse (1:1) - Word by Word
 * td Word by Word Quran Dictionary English Translation Syntactic Treebank Ontology of Concepts Documentation Quranic Grammar Message Board Resources Feedback Java API
 * td Word by Word
 * td Quran Dictionary
 * td English Translation
 * td Syntactic Treebank
 * td Ontology of Concepts
 * td Documentation
 * td Quranic Grammar
 * td Message Board
 * td Resources
 * td Feedback
 * td Java API
 * td __
 * td Welcome to the Quranic Arabic Corpus, an annotated linguistic resource which shows the Arabic grammar, syntax and morphology for each word in the Holy Quran. Click on an Arabic word below to see details of the word's grammar, or to suggest a correction. Chapter (1) s?rat l-f?ti?ah (The Opening)Chapter (2) s?rat l-baqarah (The Cow)Chapter (3) s?rat ?l ?im'r?n (The Family of Imr?n)Chapter (4) s?rat l-nis?a (The Women)Chapter (5) s?rat l-m?idah (The Table spread with Food)Chapter (6) s?rat l-an??m (The Cattle)Chapter (7) s?rat l-a?r?f (The Heights)Chapter (8) s?rat l-anf?l (The Spoils of War)Chapter (9) s?rat l-tawbah (The Repentance)Chapter (10) s?rat y?nus (Jonah)Chapter (11) s?rat h?d (Hud)Chapter (12) s?rat y?suf (Joseph)Chapter (13) s?rat l-ra?d (The Thunder)Chapter (14) s?rat ib'r?h?m (Abraham)Chapter (15) s?rat l-?ij'r (The Rocky Tract)Chapter (16) s?rat l-na?l (The Bees)Chapter (17) s?rat l-isr? (The Night Journey)Chapter (18) s?rat l-kahf (The Cave)Chapter (19) s?rat maryam (Mary)Chapter (20) s?rat ?? h?Chapter (21) s?rat l-anbiy?a (The Prophets)Chapter (22) s?rat l-?aj (The Pilgrimage)Chapter (23) s?rat l-mu'min?n (The Believers)Chapter (24) s?rat l-n?r (The Light)Chapter (25) s?rat l-fur'q?n (The Criterion)Chapter (26) s?rat l-shu?ar? (The Poets)Chapter (27) s?rat l-naml (The Ants)Chapter (28) s?rat l-qa?a? (The Stories)Chapter (29) s?rat l-?ankab?t (The Spider)Chapter (30) s?rat l-r?m (The Romans)Chapter (31) s?rat luq'm?nChapter (32) s?rat l-sajdah (The Prostration)Chapter (33) s?rat l-a?z?b (The Combined Forces)Chapter (34) s?rat saba (Sheba)Chapter (35) s?rat f??ir (The Originator)Chapter (36) s?rat y? s?nChapter (37) s?rat l-??f?t (Those Ranges in Ranks)Chapter (38) s?rat ??dChapter (39) s?rat l-zumar (The Groups)Chapter (40) s?rat gh?fir (The Forgiver God)Chapter (41) s?rat fu??ilat (Explained in Detail)Chapter (42) s?rat l-sh?r? 
(Consultation)Chapter (43) s?rat l-zukh'ruf (The Gold Adornment)Chapter (44) s?rat l-dukh?n (The Smoke)Chapter (45) s?rat l-j?thiyah (Crouching)Chapter (46) s?rat l-a?q?f (The Curved Sand-hills)Chapter (47) s?rat mu?ammadChapter (48) s?rat l-fat? (The Victory)Chapter (49) s?rat l-?ujur?t (The Dwellings)Chapter (50) s?rat q?fChapter (51) s?rat l-dh?riy?t (The Wind that Scatter)Chapter (52) s?rat l-??r (The Mount)Chapter (53) s?rat l-najm (The Star)Chapter (54) s?rat l-qamar (The Moon)Chapter (55) s?rat l-ra?m?n (The Most Gracious)Chapter (56) s?rat l-w?qi?ah (The Event)Chapter (57) s?rat l-?ad?d (The Iron)Chapter (58) s?rat l-muj?dilah (She That Disputeth)Chapter (59) s?rat l-?ashr (The Gathering)Chapter (60) s?rat l-mum'ta?anah (The Woman to be examined)Chapter (61) s?rat l-?af (The Row)Chapter (62) s?rat l-jumu?ah (Friday)Chapter (63) s?rat l-mun?fiq?n (The Hypocrites)Chapter (64) s?rat l-tagh?bun (Mutual Loss & Gain)Chapter (65) s?rat l-?al?q (The Divorce)Chapter (66) s?rat l-ta?r?m (The Prohibition)Chapter (67) s?rat l-mulk (Dominion)Chapter (68) s?rat l-qalam (The Pen)Chapter (69) s?rat l-??qah (The Inevitable)Chapter (70) s?rat l-ma??rij (The Ways of Ascent)Chapter (71) s?rat n??Chapter (72) s?rat l-jin (The Jinn)Chapter (73) s?rat l-muzamil (The One wrapped in Garments)Chapter (74) s?rat l-mudathir (The One Enveloped)Chapter (75) s?rat l-qiy?mah (The Resurrection)Chapter (76) s?rat l-ins?n (Man)Chapter (77) s?rat l-mur'sal?t (Those sent forth)Chapter (78) s?rat l-naba (The Great News)Chapter (79) s?rat l-n?zi??t (Those who Pull Out)Chapter (80) s?rat ?abasa (He frowned)Chapter (81) s?rat l-takw?r (The Overthrowing)Chapter (82) s?rat l-infi??r (The Cleaving)Chapter (83) s?rat l-mu?afif?n (Those Who Deal in Fraud)Chapter (84) s?rat l-inshiq?q (The Splitting Asunder)Chapter (85) s?rat l-bur?j (The Big Stars)Chapter (86) s?rat l-??riq (The Night-Comer)Chapter (87) s?rat l-a?l? 
(The Most High)Chapter (88) s?rat l-gh?shiyah (The Overwhelming)Chapter (89) s?rat l-fajr (The Dawn)Chapter (90) s?rat l-balad (The City)Chapter (91) s?rat l-shams (The Sun)Chapter (92) s?rat l-layl (The Night)Chapter (93) s?rat l-?u?? (The Forenoon)Chapter (94) s?rat l-shar? (The Opening Forth)Chapter (95) s?rat l-t?n (The Fig)Chapter (96) s?rat l-?alaq (The Clot)Chapter (97) s?rat l-qadr (The Night of Decree)Chapter (98) s?rat l-bayinah (The Clear Evidence)Chapter (99) s?rat l-zalzalah (The Earthquake)Chapter (100) s?rat l-??diy?t (Those That Run)Chapter (101) s?rat l-q?ri?ah (The Striking Hour)Chapter (102) s?rat l-tak?thur (The piling Up)Chapter (103) s?rat l-?a?r (Time)Chapter (104) s?rat l-humazah (The Slanderer)Chapter (105) s?rat l-f?l (The Elephant)Chapter (106) s?rat qurayshChapter (107) s?rat l-m???n (Small Kindnesses)Chapter (108) s?rat l-kawthar (A River in Paradise)Chapter (109) s?rat l-k?fir?n (The Disbelievers)Chapter (110) s?rat l-na?r (The Help)Chapter (111) s?rat l-masad (The Palm Fibre)Chapter (112) s?rat l-ikhl?? (Sincerity)Chapter (113) s?rat l-falaq (The Daybreak)Chapter (114) s?rat l-n?s (Mankind) Verse (1:1)Verse (1:2)Verse (1:3)Verse (1:4)Verse (1:5)Verse (1:6)Verse (1:7) Go Chapter (1) s?rat l-f?ti?ah (The Opening) Translation Arabic word Syntax and morphology (1:1:1) bis'mi In (the) name P – prefixed preposition bi N – genitive masculine noun ??? ?????? (1:1:2) l-lahi (of) Allah, PN – genitive proper noun ? Allah ??? ??????? ????? (1:1:3) l-ra?m?ni the Most Gracious, ADJ – genitive masculine singular adjective ??? ?????? (1:1:4) l-ra??mi the Most Merciful. ADJ – genitive masculine singular adjective ??? ?????? (1:2:1) al-?amdu All praises and thanks N – nominative masculine noun ??? ????? (1:2:2) lillahi (be) to Allah, P – prefixed preposition l?m PN – genitive proper noun ? Allah ??? ?????? (1:2:3) rabbi the Lord N – genitive masculine noun ??? ????? (1:2:4) l-??lam?na of the universe N – genitive masculine plural noun ??? ????? 
(1:3:1) al-ra?m?ni The Most Gracious, ADJ – genitive masculine singular adjective ??? ?????? (1:3:2) l-ra??mi the Most Merciful. ADJ – genitive masculine singular adjective ??? ?????? (1:4:1) m?liki (The) Master N – genitive masculine active participle ??? ????? (1:4:2) yawmi (of the) Day N – genitive masculine noun ? Day of Resurrection ??? ????? (1:4:3) l-d?ni (of the) Judgment. N – genitive masculine noun ??? ????? (1:5:1) iyy?ka You Alone PRON – 2nd person masculine singular personal pronoun ? Allah ???? ????? (1:5:2) na?budu we worship, V – 1st person plural imperfect verb ??? ????? (1:5:3) wa-iyy?ka and You Alone CONJ – prefixed conjunction wa (and) PRON – 2nd person masculine singular personal pronoun ? Allah ????? ????? ???? ????? (1:5:4) nasta??nu we ask for help. V – 1st person plural (form X) imperfect verb ??? ????? (1:6:1) ih'din? Guide us V – 2nd person masculine singular imperative verb PRON – 1st person plural object pronoun PRON – implicit subject pronoun ? Allah ??? ??? ?«??» ???? ???? ?? ??? ??? ????? ?? (1:6:2) l-?ir??a (to) the path, N – accusative masculine noun ??? ????? (1:6:3) l-mus'taq?ma the straight. ADJ – accusative masculine (form X) active participle ??? ?????? Quran Recitation by Saad Al-Ghamadi Verse 1-6 | 7
 * td Chapter (1) s?rat l-f?ti?ah (The Opening)Chapter (2) s?rat l-baqarah (The Cow)Chapter (3) s?rat ?l ?im'r?n (The Family of Imr?n)Chapter (4) s?rat l-nis?a (The Women)Chapter (5) s?rat l-m?idah (The Table spread with Food)Chapter (6) s?rat l-an??m (The Cattle)Chapter (7) s?rat l-a?r?f (The Heights)Chapter (8) s?rat l-anf?l (The Spoils of War)Chapter (9) s?rat l-tawbah (The Repentance)Chapter (10) s?rat y?nus (Jonah)Chapter (11) s?rat h?d (Hud)Chapter (12) s?rat y?suf (Joseph)Chapter (13) s?rat l-ra?d (The Thunder)Chapter (14) s?rat ib'r?h?m (Abraham)Chapter (15) s?rat l-?ij'r (The Rocky Tract)Chapter (16) s?rat l-na?l (The Bees)Chapter (17) s?rat l-isr? (The Night Journey)Chapter (18) s?rat l-kahf (The Cave)Chapter (19) s?rat maryam (Mary)Chapter (20) s?rat ?? h?Chapter (21) s?rat l-anbiy?a (The Prophets)Chapter (22) s?rat l-?aj (The Pilgrimage)Chapter (23) s?rat l-mu'min?n (The Believers)Chapter (24) s?rat l-n?r (The Light)Chapter (25) s?rat l-fur'q?n (The Criterion)Chapter (26) s?rat l-shu?ar? (The Poets)Chapter (27) s?rat l-naml (The Ants)Chapter (28) s?rat l-qa?a? (The Stories)Chapter (29) s?rat l-?ankab?t (The Spider)Chapter (30) s?rat l-r?m (The Romans)Chapter (31) s?rat luq'm?nChapter (32) s?rat l-sajdah (The Prostration)Chapter (33) s?rat l-a?z?b (The Combined Forces)Chapter (34) s?rat saba (Sheba)Chapter (35) s?rat f??ir (The Originator)Chapter (36) s?rat y? s?nChapter (37) s?rat l-??f?t (Those Ranges in Ranks)Chapter (38) s?rat ??dChapter (39) s?rat l-zumar (The Groups)Chapter (40) s?rat gh?fir (The Forgiver God)Chapter (41) s?rat fu??ilat (Explained in Detail)Chapter (42) s?rat l-sh?r? (Consultation)Chapter (43) s?rat l-zukh'ruf (The Gold Adornment)Chapter (44) s?rat l-dukh?n (The Smoke)Chapter (45) s?rat l-j?thiyah (Crouching)Chapter (46) s?rat l-a?q?f (The Curved Sand-hills)Chapter (47) s?rat mu?ammadChapter (48) s?rat l-fat? 
(The Victory)Chapter (49) s?rat l-?ujur?t (The Dwellings)Chapter (50) s?rat q?fChapter (51) s?rat l-dh?riy?t (The Wind that Scatter)Chapter (52) s?rat l-??r (The Mount)Chapter (53) s?rat l-najm (The Star)Chapter (54) s?rat l-qamar (The Moon)Chapter (55) s?rat l-ra?m?n (The Most Gracious)Chapter (56) s?rat l-w?qi?ah (The Event)Chapter (57) s?rat l-?ad?d (The Iron)Chapter (58) s?rat l-muj?dilah (She That Disputeth)Chapter (59) s?rat l-?ashr (The Gathering)Chapter (60) s?rat l-mum'ta?anah (The Woman to be examined)Chapter (61) s?rat l-?af (The Row)Chapter (62) s?rat l-jumu?ah (Friday)Chapter (63) s?rat l-mun?fiq?n (The Hypocrites)Chapter (64) s?rat l-tagh?bun (Mutual Loss & Gain)Chapter (65) s?rat l-?al?q (The Divorce)Chapter (66) s?rat l-ta?r?m (The Prohibition)Chapter (67) s?rat l-mulk (Dominion)Chapter (68) s?rat l-qalam (The Pen)Chapter (69) s?rat l-??qah (The Inevitable)Chapter (70) s?rat l-ma??rij (The Ways of Ascent)Chapter (71) s?rat n??Chapter (72) s?rat l-jin (The Jinn)Chapter (73) s?rat l-muzamil (The One wrapped in Garments)Chapter (74) s?rat l-mudathir (The One Enveloped)Chapter (75) s?rat l-qiy?mah (The Resurrection)Chapter (76) s?rat l-ins?n (Man)Chapter (77) s?rat l-mur'sal?t (Those sent forth)Chapter (78) s?rat l-naba (The Great News)Chapter (79) s?rat l-n?zi??t (Those who Pull Out)Chapter (80) s?rat ?abasa (He frowned)Chapter (81) s?rat l-takw?r (The Overthrowing)Chapter (82) s?rat l-infi??r (The Cleaving)Chapter (83) s?rat l-mu?afif?n (Those Who Deal in Fraud)Chapter (84) s?rat l-inshiq?q (The Splitting Asunder)Chapter (85) s?rat l-bur?j (The Big Stars)Chapter (86) s?rat l-??riq (The Night-Comer)Chapter (87) s?rat l-a?l? (The Most High)Chapter (88) s?rat l-gh?shiyah (The Overwhelming)Chapter (89) s?rat l-fajr (The Dawn)Chapter (90) s?rat l-balad (The City)Chapter (91) s?rat l-shams (The Sun)Chapter (92) s?rat l-layl (The Night)Chapter (93) s?rat l-?u?? (The Forenoon)Chapter (94) s?rat l-shar? 
(The Opening Forth)Chapter (95) s?rat l-t?n (The Fig)Chapter (96) s?rat l-?alaq (The Clot)Chapter (97) s?rat l-qadr (The Night of Decree)Chapter (98) s?rat l-bayinah (The Clear Evidence)Chapter (99) s?rat l-zalzalah (The Earthquake)Chapter (100) s?rat l-??diy?t (Those That Run)Chapter (101) s?rat l-q?ri?ah (The Striking Hour)Chapter (102) s?rat l-tak?thur (The piling Up)Chapter (103) s?rat l-?a?r (Time)Chapter (104) s?rat l-humazah (The Slanderer)Chapter (105) s?rat l-f?l (The Elephant)Chapter (106) s?rat qurayshChapter (107) s?rat l-m???n (Small Kindnesses)Chapter (108) s?rat l-kawthar (A River in Paradise)Chapter (109) s?rat l-k?fir?n (The Disbelievers)Chapter (110) s?rat l-na?r (The Help)Chapter (111) s?rat l-masad (The Palm Fibre)Chapter (112) s?rat l-ikhl?? (Sincerity)Chapter (113) s?rat l-falaq (The Daybreak)Chapter (114) s?rat l-n?s (Mankind)
 * td Verse (1:1)Verse (1:2)Verse (1:3)Verse (1:4)Verse (1:5)Verse (1:6)Verse (1:7) Go
 * td Translation
 * td Arabic word
 * td Syntax and morphology
 * td (1:1:1) bis'mi In (the) name
 * td 
 * td P – prefixed preposition bi N – genitive masculine noun ??? ??????
 * td (1:1:2) l-lahi (of) Allah,
 * td 
 * td PN – genitive proper noun ? Allah ??? ??????? ?????
 * td (1:1:3) l-ra?m?ni the Most Gracious,
 * td 
 * td ADJ – genitive masculine singular adjective ??? ??????
 * td (1:1:4) l-ra??mi the Most Merciful.
 * td 
 * td ADJ – genitive masculine singular adjective ??? ??????
 * td (1:2:1) al-?amdu All praises and thanks
 * td 
 * td N – nominative masculine noun ??? ?????
 * td (1:2:2) lillahi (be) to Allah,
 * td 
 * td P – prefixed preposition l?m PN – genitive proper noun ? Allah ??? ??????
 * td (1:2:3) rabbi the Lord
 * td 
 * td N – genitive masculine noun ??? ?????
 * td (1:2:4) l-??lam?na of the universe
 * td 
 * td N – genitive masculine plural noun ??? ?????
 * td (1:3:1) al-ra?m?ni The Most Gracious,
 * td 
 * td ADJ – genitive masculine singular adjective ??? ??????
 * td (1:3:2) l-ra??mi the Most Merciful.
 * td 
 * td ADJ – genitive masculine singular adjective ??? ??????
 * td (1:4:1) m?liki (The) Master
 * td 
 * td N – genitive masculine active participle ??? ?????
 * td (1:4:2) yawmi (of the) Day
 * td 
 * td N – genitive masculine noun ? Day of Resurrection ??? ?????
 * td (1:4:3) l-d?ni (of the) Judgment.
 * td 
 * td N – genitive masculine noun ??? ?????
 * td (1:5:1) iyy?ka You Alone
 * td 
 * td PRON – 2nd person masculine singular personal pronoun ? Allah ???? ?????
 * td (1:5:2) na?budu we worship,
 * td 
 * td V – 1st person plural imperfect verb ??? ?????
 * td (1:5:3) wa-iyy?ka and You Alone
 * td 
 * td CONJ – prefixed conjunction wa (and) PRON – 2nd person masculine singular personal pronoun ? Allah ????? ????? ???? ?????
 * td (1:5:4) nasta??nu we ask for help.
 * td 
 * td V – 1st person plural (form X) imperfect verb ??? ?????
 * td (1:6:1) ih'din? Guide us
 * td 
 * td V – 2nd person masculine singular imperative verb PRON – 1st person plural object pronoun PRON – implicit subject pronoun ? Allah ??? ??? ?«??» ???? ???? ?? ??? ??? ????? ??
 * td (1:6:2) l-?ir??a (to) the path,
 * td 
 * td N – accusative masculine noun ??? ?????
 * td (1:6:3) l-mus'taq?ma the straight.
 * td 
 * td ADJ – accusative masculine (form X) active participle ??? ??????
 * td Language Research Group University of Leeds
 * td __
 * td Copyright © Kais Dukes, 2009-2011. E-mail: sckd@leeds.ac.uk. This is an open source project. The Quranic Arabic Corpus is available under the GNU public license with terms of use.
------------------------------------------

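Most consoles do not use UTF-8 as their default encoding, so when we print characters that the console's encoding cannot represent, they are replaced with "?". The page was parsed correctly; the characters are lost only when they are written out. You can always change the console's encoding. In Eclipse, for example, go to:

Run Configurations -> Common -> Encoding -> Other, and select UTF-8 from the drop-down list.

Run your program again, and you should now see the UTF-8 characters displayed correctly in the Eclipse console.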
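To see why the "?" characters appear, here is a minimal sketch that uses only the standard library. It prints the same text through a `PrintStream` with two different encodings. The class name `ConsoleEncodingDemo` and its helper methods are made up for this illustration, and the sample text is a stand-in for what the page returns. Wrapping `System.out` in a UTF-8 `PrintStream`, as `main` does at the end, is one possible way to force UTF-8 output from code, assuming the console itself interprets the bytes as UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question.
public class ConsoleEncodingDemo {

    // Wraps `target` in an auto-flushing PrintStream that encodes with `charsetName`.
    public static PrintStream printStreamFor(OutputStream target, String charsetName) {
        try {
            return new PrintStream(target, true, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Returns the raw bytes produced when `text` is printed using `charsetName`.
    public static byte[] encodeViaPrintStream(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = printStreamFor(sink, charsetName);
        out.print(text);
        out.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // "s\u016Brat" is "surat" with a macron on the u; the rest are three Arabic letters.
        String text = "s\u016Brat \u0628\u0633\u0645";

        // US-ASCII cannot represent these characters, so each one becomes '?'.
        byte[] ascii = encodeViaPrintStream(text, "US-ASCII");
        System.out.println(new String(ascii, StandardCharsets.US_ASCII)); // prints "s?rat ???"

        // Forcing UTF-8 on standard output; what you see still depends on
        // the console decoding these bytes as UTF-8.
        PrintStream utf8Out = printStreamFor(System.out, "UTF-8");
        utf8Out.println(text);
    }
}
```

If the console is still configured for another encoding, the UTF-8 bytes will be misread there, so this complements the Eclipse setting rather than replacing it.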

Are you using Eclipse, or running from a plain command prompt? Write the output to a file and check again.
- Juned, I am using Eclipse.
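The file check suggested in the comment could look roughly like the sketch below. It writes text to a file as UTF-8, bypassing the console, and reads it back to confirm nothing was lost. The class `FileOutputCheck`, its methods, and the file name `output.txt` are all hypothetical; in the real program you would pass in the text collected from `link.text()`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of the original question.
public class FileOutputCheck {

    // Writes `text` to `file` encoded as UTF-8, independent of the console encoding.
    public static void writeUtf8(Path file, String text) {
        try {
            Files.write(file, text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the whole file back, decoding it as UTF-8.
    public static String readUtf8(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes `text` to a temporary file and reports whether it reads back unchanged.
    public static boolean roundTrips(String text) {
        try {
            Path file = Files.createTempFile("utf8-check", ".txt");
            try {
                writeUtf8(file, text);
                return text.equals(readUtf8(file));
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "s\u016Brat \u0628\u0633\u0645"; // sample text standing in for the parsed page

        // Open output.txt in a UTF-8 aware editor to inspect the characters.
        writeUtf8(Paths.get("output.txt"), text);
        System.out.println("Round trip intact: " + roundTrips(text));
    }
}
```

If the file shows the Arabic text correctly, the parsing is fine and only the console encoding needs changing.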