Stuck scraping data from the third frame

Tags: javascript, node.js, puppeteer

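I'm not a professional; I just want to scrape data from a website. Some people here helped me select the first "frame", but I need to scrape data from the third frame and combine the data from frame 1 + frame 2 + frame 3 into one result. This is the site. This is what I have: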

const puppeteer = require('puppeteer');

let scrape = async() => {
    const browser = await puppeteer.launch({
        headless: false,
        slowMo: 250
    });
    const page = await browser.newPage();
    await page.goto('', {
        waituntil: "networkidle0"
    });
    const frame = await page.frames().find(f => f.name() === 'stanga');
    const button = await frame.$('body > form > font > select > option:nth-child(12)');
    button.click();
    await page.waitFor(1000);
    const frame1 = await page.frames().find(a => a.name() ==='centru');
    const select = await frame1.$('body > form > font > select > option:nth-child(1)');
    await page.waitFor(500);
    select.click();
    await page.waitFor(500);

    const result = await page.$$eval("body > font", (options) => {
        const timpi = options.map(option => option.innerText);

        return timpi

    });

    await browser.close();
    return result;
};
scrape().then((value) => {
    console.log(value);
});

Thanks for your help.

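You have to improve your scrape function so that it not only clicks on the select, but also extracts the value of the selected item from the select element: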

  const frame = await page.frames().find(f => f.name() === "stanga");
  const select1 = await frame.$(
    "body > form > font > select > option:nth-child(12)"
  );

  const select1Value = await frame.evaluate(
    select1 => select1.textContent,
    select1
  );

select1Value will hold the text of the selected item in the select box. You have to do the same for select2 in the next frame.
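
The same extraction could be wrapped in a small helper and reused for both frames. This is only a sketch: `readOptionText` is a hypothetical name, not part of Puppeteer, and the selector in the usage comment is the one from the question, which may need adjusting for the real page.

```javascript
// Hypothetical helper: returns the trimmed text of the first element matched
// by `selector` inside the frame named `frameName`.
async function readOptionText(page, frameName, selector) {
  const frame = page.frames().find((f) => f.name() === frameName);
  if (!frame) {
    throw new Error(`Frame "${frameName}" not found`);
  }
  // $eval runs the callback inside the frame and returns its result.
  return frame.$eval(selector, (el) => el.textContent.trim());
}

// Possible usage inside scrape() (selector taken from the question):
// const select1Value = await readOptionText(
//   page, 'stanga', 'body > form > font > select > option:nth-child(12)');
// const select2Value = await readOptionText(
//   page, 'centru', 'body > form > font > select > option:nth-child(1)');
```

Throwing when the frame is missing makes a wrong frame name fail loudly instead of surfacing later as a call on `undefined`.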

In your code, frame 3 is never selected, so no data can be read from it.
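
A minimal sketch of what reading the third frame could look like. It assumes the frame is named `dreapta` (the name used in another answer on this page), and `readFrameLines` is a hypothetical helper. The key point is that `page.$$eval` in the question queries the top-level page, so the query has to run on the frame object itself.

```javascript
// Hypothetical helper: returns the visible text of a named frame as an
// array of trimmed, non-empty lines.
async function readFrameLines(page, frameName) {
  const frame = page.frames().find((f) => f.name() === frameName);
  if (!frame) {
    throw new Error(`Frame "${frameName}" not found`);
  }
  // Query the frame itself; page.$eval would only look at the main frame.
  const text = await frame.$eval('body', (body) => body.innerText);
  return text
    .split(/\n+/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Possible usage inside scrape(), after the selections in frames 1 and 2:
// const frame3Lines = await readFrameLines(page, 'dreapta');
```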

I have updated your code, and this is the result I was able to get with it:

$ node scrape.js
Frame1: AT_Miresei_1
Frame2:  [1]  E1
Frame3: Linia: E12019-07-25 22:29:13Sosire1: 22:55 Sosire2: 23:00

This is what I ended up with, although there is still a lot to improve (code quality and readability).


I have modified the script:

const puppeteer = require('puppeteer');

// Plain timer-based pause; page.waitFor is not available in newer Puppeteer releases.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

let scrape = async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();

  await page.goto('http://example.com/txt', { waitUntil: 'networkidle2' });
  const optionSelector = 'body > form > font > select > option';
  const frames = page.frames();
  const expectedFrames = ['stanga', 'centru'];
  const scrapedText = [];

  // Collect the text of every <option> in the given frame.
  const getOptions = (frame) =>
    frame.$$eval(optionSelector, (options) =>
      options.map((option) => option.innerText)
    );

  for (const frame of frames) {
    const name = frame.name();

    if (expectedFrames.includes(name)) {
      // Click the first matching option, then give the page a moment to update.
      await frame.click(optionSelector);
      await sleep(1000);
      const result = await getOptions(frame);

      scrapedText.push({ [name]: result });
    } else if (name === 'dreapta') {
      // The third frame is read as plain text, split into lines.
      const result = await frame.$eval('body', (elm) => elm.innerText);

      scrapedText.push({ [name]: result.split(/\n+/g) });
    }
  }

  await browser.close();

  return scrapedText;
};

scrape().then((value) => {
  console.log(value);
});

Output:

[{ 
   stanga: ['Mures','A Saguna', 'A.Guttenbrun_1', ... and more items]
 },
 {
   centru: ['[0] E3'] 
 },
 { 
   dreapta: ['Linia: E3','2019-07-25 23:19:40','Sosire1: 23:39','Sosire2: 23:41'] 
}]
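
Since the question asks for a single combined result, the per-frame objects in this output could be merged into one object and the `dreapta` lines turned into named fields. The sketch below assumes the output shape shown here and a `Label: value` line format; `mergeFrames` and `parseArrivals` are hypothetical names.

```javascript
// Merge [{ stanga: [...] }, { centru: [...] }, { dreapta: [...] }]
// into a single object keyed by frame name.
function mergeFrames(scrapedText) {
  return Object.assign({}, ...scrapedText);
}

// Turn lines such as 'Sosire1: 23:39' into { Sosire1: '23:39' }.
// Lines that do not start with a 'Label:' prefix (for example the
// timestamp) are collected in `other` unchanged.
function parseArrivals(lines) {
  const fields = {};
  const other = [];
  for (const line of lines) {
    const match = /^([A-Za-z]\w*):\s*(.*)$/.exec(line);
    if (match) {
      fields[match[1]] = match[2];
    } else {
      other.push(line);
    }
  }
  return { fields, other };
}

// Possible usage with the value returned by scrape():
// const merged = mergeFrames(value);
// const arrivals = parseArrivals(merged.dreapta);
```

Keeping unparsed lines in `other` avoids silently dropping data if the page layout differs from the sample output.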