Selecting the second table row of a table with Puppeteer

Tags: javascript, html, node.js, puppeteer


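I am working on a crawler using Node.js and Puppeteer. My goal is to get the data from two columns of a table (date and description). The code works fine until it reaches the block that reads the data from the columns.

The full code is below, including the URL of the page I am crawling: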

const fs = require('fs');
const puppeteer = require('puppeteer');

const urlConsulta = "http://www.tre-pr.jus.br/";
const numeroProcessoSeq = "000000889";
const numeroProcessoAno = "2014";
const numeroProcessoDigito = "6160047";

var wait = ms => new Promise((r, j)=> setTimeout(r, ms));

void (async () => {
    try {
        const browser = await puppeteer.launch({
            headless: false
        });
        const page = await browser.newPage();
        await page.goto(urlConsulta);
        await page.select('#acao', 'pesquisarNumUnico');
        await page.evaluate((numeroProcessoSeq, numeroProcessoAno, numeroProcessoDigito) => {
            document.getElementById('numUnicoSequencial').value = numeroProcessoSeq;
            document.getElementById('numUnicoAno').value = numeroProcessoAno;
            document.getElementById('numUnicoOrigem').value = numeroProcessoDigito;
        }, numeroProcessoSeq, numeroProcessoAno, numeroProcessoDigito);

        await page.$eval('form[action*="http://www.tre-pr.jus.br/@@processrequest"]', form => form.submit());

        await page.waitForNavigation();
        var frame = await page.frames().find(f => f.name() === 'ifr_servicos');
        await frame.click('a[href*="ExibirDadosProcesso"]');
        await page.frames().find(f => f.name() === 'ifr_servicos');
        await wait(10000);
        await frame.click('[name*="todos"]');
        await frame.$eval('[name*="ExibirPartesProcessoZona"]', form => form.submit());
        await wait(10000);
        let string = await buscaFases(frame);
        fs.writeFile("teste.txt", string, function(err) {
            if(err) {
                return console.log(err);
            }
            console.log("The file was saved!");
        }); 
        console.log(string);
        await wait(10000);
        await browser.close();
    } catch (error) {
        console.log(error);
    }
})();

async function buscaFases(frame) {
    return await frame.evaluate(() => {
        let div = document.querySelector('div[id*="conteudo"]');
        let rowns = Array.from(div.children[4].children[0].children);
        let movimentosInfo = rowns.map(row => {
          let data = row.querySelector("tr td:first-child").textContent;
          let descricao = row.querySelector("tr td:first-child + td").textContent;
          return { data, descricao };
        });
        return JSON.stringify(movimentosInfo);
    });
};

The specific lines that fetch the data:

let data = row.querySelector("tr td:first-child").textContent;
let descricao = row.querySelector("tr td:first-child + td").textContent;

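The problem is that not all of the tr elements have the child elements you expect. This is probably because of a td tag with a colspan attribute. For such a row, querySelector returns null for the second cell, and reading textContent from null throws an error. You should therefore filter the array first to sort out those rows.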
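
The failure can be reproduced without a browser. The following is a minimal sketch, assuming a stand-in object in place of a real table row: `fakeRow` and `readSecondCell` are hypothetical names used only for illustration, and the stand-in implements just enough of `querySelector` to mimic a row whose single cell spans all columns.

```javascript
// Stand-in for a <tr> with a single colspan cell (hypothetical, not a real DOM node).
const fakeRow = {
  querySelector(selector) {
    // Only the first cell exists, so the sibling selector matches nothing,
    // just as Element.querySelector returns null when there is no match.
    return selector.includes('+') ? null : { textContent: 'Section header' };
  },
};

// Reads the second cell the same way the question's code does,
// but returns the error instead of letting it propagate.
function readSecondCell(row) {
  try {
    return row.querySelector('tr td:first-child + td').textContent;
  } catch (err) {
    return err;
  }
}

const result = readSecondCell(fakeRow);
```

Here `result` is a TypeError, because `textContent` is read from `null`. Inside `frame.evaluate` the same error rejects the promise and ends up in the `catch` block of the main script.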

Code

Change the lines containing the map function, starting at let movimentosInfo = ..., to:

let movimentosInfo = rowns.filter(row => {
  return row.querySelector("tr td:first-child") && row.querySelector("tr td:first-child + td");
}).map(row => {
  let data = row.querySelector("tr td:first-child").textContent;
  let descricao = row.querySelector("tr td:first-child + td").textContent;
  return { data, descricao };
});

This adds a filter function that tests whether the required elements actually exist before mapping their contents.
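
As a variation on the same idea, the filter and the map can be wrapped in one helper that queries each cell only once. This is a sketch, not part of the original answer: the name `extraiMovimentos` is hypothetical, the shorter selectors and the `trim()` call are additions, and the helper only assumes that each row exposes `querySelector`, so it could be defined and called inside `frame.evaluate`.

```javascript
// Hypothetical helper: returns { data, descricao } for every row that has
// both a first and a second cell, and skips all other rows.
function extraiMovimentos(rows) {
  return rows
    // Look up both cells once per row.
    .map(row => ({
      data: row.querySelector('td:first-child'),
      descricao: row.querySelector('td:first-child + td'),
    }))
    // Drop rows where either cell is missing (e.g. colspan rows).
    .filter(cells => cells.data && cells.descricao)
    // Only now is it safe to read textContent.
    .map(cells => ({
      data: cells.data.textContent.trim(),
      descricao: cells.descricao.textContent.trim(),
    }));
}
```

Inside the question's `buscaFases`, the call would be `extraiMovimentos(rowns)` in place of the `rowns.map(...)` expression.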