JavaScript: Cheerio returns incomplete HTML
I am trying to scrape a website with cheerio:
const rp = require('request');
const cheerio = require('cheerio');

rp('https://www.fideyo.com/list', (error, response, html) => {
  if (!error && response.statusCode == 200) {
    const $ = cheerio.load(html);
    console.log($.html());
  }
});
But it returns an incomplete HTML body, like this:
<body>
<div id="app"></div>
<script type="text/javascript" src="https://cdn.fideyo.com/static/main.js?v=13"></script>
<!-- Google Tag Manager (noscript) -->
<noscript>
<iframe src="https://www.googletagmanager.com/ns.html?id=GTM-KBGVCP3"
height="0" width="0" style="display:none;visibility:hidden">
</iframe>
</noscript>
<!-- End Google Tag Manager (noscript) -->
</body></html>
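How can I access the content inside the app element?

If I understand correctly, the content of the app element is probably created dynamically by JavaScript. cheerio does not run JavaScript; it only parses the HTML that the server actually sends.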
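A quick way to confirm this diagnosis is to check whether the mount element is empty in the raw response. The following is only a rough sketch: `isMountPointEmpty` is a hypothetical helper, the two HTML strings are made-up samples rather than real responses from the site, and the regex is a crude heuristic, not an HTML parser.

```javascript
// Hypothetical helper: returns true when <div id="..."> has no children
// in the given HTML string. The id is interpolated into a regex as-is,
// so it must not contain regex metacharacters.
function isMountPointEmpty(html, id) {
  const pattern = new RegExp(
    `<div[^>]*\\bid=["']${id}["'][^>]*>\\s*</div>`,
    'i'
  );
  return pattern.test(html);
}

// Made-up sample resembling what the server sends before any script runs.
const staticHtml =
  '<body><div id="app"></div><script src="main.js"></script></body>';

// Made-up sample resembling the DOM after client-side rendering.
const renderedHtml =
  '<body><div id="app"><a href="/item/1">Item</a></div></body>';

console.log(isMountPointEmpty(staticHtml, 'app'));   // true
console.log(isMountPointEmpty(renderedHtml, 'app')); // false
```

If the check returns true for the HTML you downloaded, the page is rendered on the client and a plain HTTP request will never contain the data.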
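You need something like Puppeteer, which loads the page in a headless browser so that its scripts actually run: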
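// Run as an ES module (e.g. a .mjs file) so that top-level await works.
import puppeteer from 'puppeteer';

const browser = await puppeteer.launch();

try {
  // Reuse the blank tab that the browser opens on launch.
  const [page] = await browser.pages();
  await page.goto('https://www.fideyo.com/list');
  // Wait until the app has rendered at least one link; adjust the
  // selector to an element that only exists after rendering.
  await page.waitForSelector('div#app div#page-wrapper a');
  const html = await page.content();
  console.log(html);
} catch (err) {
  console.error(err);
} finally {
  await browser.close();
}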
See the Puppeteer documentation for details on these calls.