Scrape <head> using Node.JS?


I want to scrape the <head> of a webpage with Node.JS, but I don't know how to do it. Thanks to cheerio, I can reach everything in the body like this:

var request = require('request');
var cheerio = require('cheerio');

//webUrl (the page to fetch) and websiteUrl (the site root) are assumed to be defined elsewhere
var urls = [];

request(webUrl, function(err, resp, body){
    if(!err && resp.statusCode == 200) {
        var $ = cheerio.load(body);

        //Getting all the links 'a' from the webpage
        $('a').each(function(){

            //Getting the href attribute from the 'a' link
            var url = $(this).attr('href');

            //We keep the link only if it is root-relative (in order to skip the 'undefined' links, the subdomains and the outside links such as social media links)
            if(url != undefined && url[0] == '/') {

                //We add the domain name to the url we got in order to have the full URL
                url = websiteUrl + url;
                urls.push(url);
            }

        });
        console.log(urls);
    }
});
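But it is impossible to get the head with this method. I tried the following, but it only gives me the scripts in the body and not the ones in the head: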

var wowo = [];

request(webUrl, function(err, resp, body){
    if(!err && resp.statusCode == 200) {
        var $ = cheerio.load(body);

        $('script').each(function(){

            //Getting the src attribute from the 'script' tag (inline scripts have none)
            var url = $(this).attr('src');
            console.log(url);

            if(url != undefined) {
                wowo.push(url);
            }

        });
        console.log(wowo);
    }
});
Could someone help me? :'(


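Hello, thank you for your answer. Unfortunately, even when I try parse5 or htmlparser2, I only get this as a result:
{ nodeName: '#document', mode: 'no-quirks', childNodes: [ { nodeName: '#documentType', name: 'html', publicId: null, systemId: null, parentNode: [Circular] }, { nodeName: 'html', tagName: 'html', attrs: [Object], namespaceURI: 'http://www.w3.org/1999/xhtml', childNodes: [Object], parentNode: [Circular] } ] }
I don't get the full website, only the top of it. That's weird.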
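The `[Object]` placeholders in that output look like the way `console.log` abbreviates nested objects past its default depth, so the parsed tree may well be complete and only its printed form cut short. A small sketch using Node's built-in `util.inspect` shows the difference. The `tree` object here is a hypothetical, simplified stand-in, not real parse5 output.

```javascript
const util = require('util');

// Hypothetical stand-in for a parsed tree; a real parse5 document has more fields.
const tree = {
    nodeName: '#document',
    childNodes: [
        {
            nodeName: 'html',
            childNodes: [
                { nodeName: 'head', childNodes: [{ nodeName: 'script', childNodes: [] }] },
                { nodeName: 'body', childNodes: [] }
            ]
        }
    ]
};

// The default depth is 2, so deeper levels are printed as [Object] or [Array].
const shallow = util.inspect(tree);

// depth: null removes the limit and prints the whole structure.
const full = util.inspect(tree, { depth: null });

console.log(shallow);
console.log(full);
```

With a real parser result, the same `{ depth: null }` option can be passed to `util.inspect` before logging. Circular `parentNode` links are still printed as `[Circular]` rather than followed.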