JavaScript Puppeteer: how to iterate with map instead of for...of?

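I want to scrape a website that has a list of products, where each product has its own page with more data. I want to do it with an async map plus Promise.all instead of for...of, but I can't get it to work.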

Working sample, using for...of:

const puppeteer = require("puppeteer");

const SELECTOR_ITEMS_LINKS =
  ".sg-col-4-of-12.s-result-item.sg-col-4-of-16.sg-col.sg-col-4-of-20  .a-link-normal.s-no-outline";

const removeEmptyLines = (txt) => txt.replace(/\n\n/g, "");

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto("https://www.amazon.com/s?k=gaming+chair");

  const links = await page.$$eval(SELECTOR_ITEMS_LINKS, (links) =>
    links.map((link) => link.href)
  );

  for (const link of links) {
    await page.goto(link);
    const rawTitle = await page.$eval("#productTitle", (el) => el.textContent);
    const title = removeEmptyLines(rawTitle);

    console.log({ link, title });
  }

  await browser.close();
})();
Result:

{
  link: 'https://www.amazon.com/AJS-Clearance-Computer-Armrests-Adjustment/dp/B08QHZX2M9/ref=sr_1_26?dchild=1&keywords=gaming+chair&qid=1616435089&sr=8-26',        
  title: 'AJS Office Chairs Clearance, Cheap Gaming Chair for Teens, Fabric Computer Desk Chair with Padded Armrests and Height Adjustment (Red)\n'
}
{
  link: 'https://www.amazon.com/Swivel-Gaming-Support-Adjustable-Lounger/dp/B089D2DDNT/ref=sr_1_27?dchild=1&keywords=gaming+chair&qid=1616435089&sr=8-27',
  title: 'Swivel Gaming Floor Chair with Arms Back Support Adjustable Floor Sofa for Adults Teens Lazy Sofa Lounger Video Game Chair, Black and Blue\n'
}
{
  link: 'https://www.amazon.com/Nokaxus-Retractible-adjustment-Thickening-YK-6008-BLACK/dp/B07DZKG7SN/ref=sr_1_28?dchild=1&keywords=gaming+chair&qid=1616435089&sr=8-28',
  title: 'Nokaxus Gaming Chair Large Size High-back Ergonomic Racing Seat with Massager Lumbar Support and Retractible Footrest PU Leather 90-180 degree adjustment of backrest Thickening sponges (YK-6008-BLACK)\n'
}
Now I want the same code, but using map instead of for...of. Sample:

const puppeteer = require("puppeteer");

const SELECTOR_ITEMS_LINKS =
  ".sg-col-4-of-12.s-result-item.sg-col-4-of-16.sg-col.sg-col-4-of-20  .a-link-normal.s-no-outline";

const removeEmptyLines = (txt) => txt.replace(/\n\n/g, "");

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto("https://www.amazon.com/s?k=gaming+chair");

  const links = await page.$$eval(SELECTOR_ITEMS_LINKS, (links) =>
    links.map((link) => link.href)
  );

  const resolver = async (link) => {
    await page.goto(link);
    const rawTitle = await page.$eval("#productTitle", (el) => el.textContent);
    const title = removeEmptyLines(rawTitle);

    return { link, title };
  };

  const promises = await links.map((link) => resolver(link));
  const result = await Promise.all(promises);

  console.log(result);

  browser.close();
})();
What I get is the same data for every link, as if it ignored the other links. Result:

  {
    link: 'https://www.amazon.com/OSP-Furniture-Ergonomic-Adjustable-Accents/dp/B08PDS88PZ/ref=sr_1_58?dchild=1&keywords=gaming+chair&qid=1616435001&sr=8-58',      
    title: 'Soontrans Rocking Gaming Chair,Ergonomic PC Computer Chair,Home Office Chair,Racing Chair with Adjustable Recliner and Armrest with Headrest Lumbar Pillow Support (Green)\n'
  },
  {
    link: 'https://www.amazon.com/gp/slredirect/picassoRedirect.html/ref=pa_sp_btf_aps_sr_pg1_1?ie=UTF8&adId=A05234183UPC2EOCKELXB&url=%2FNOUHAUS-Palette-Ergonomic-Comfortable-Computer%2Fdp%2FB083SN6BVS%2Fref%3Dsr_1_59_sspa%3Fdchild%3D1%26keywords%3Dgaming%2Bchair%26qid%3D1616435001%26sr%3D8-59-spons%26psc%3D1%26smid%3DA1DPRB9NBV0XDD&qualifier=1616435001&id=7849373560319144&widgetName=sp_btf',
    title: 'Soontrans Rocking Gaming Chair,Ergonomic PC Computer Chair,Home Office Chair,Racing Chair with Adjustable Recliner and Armrest with Headrest Lumbar Pillow Support (Green)\n'
  },
  {
    link: 'https://www.amazon.com/gp/slredirect/picassoRedirect.html/ref=pa_sp_btf_aps_sr_pg1_1?ie=UTF8&adId=A034517018WLNV9ZSL6AD&url=%2FSoontrans-Ergonomic-Computer-Adjustable-Recliner%2Fdp%2FB08HWPJZP2%2Fref%3Dsr_1_60_sspa%3Fdchild%3D1%26keywords%3Dgaming%2Bchair%26qid%3D1616435001%26sr%3D8-60-spons%26psc%3D1&qualifier=1616435001&id=7849373560319144&widgetName=sp_btf',
    title: 'Soontrans Rocking Gaming Chair,Ergonomic PC Computer Chair,Home Office Chair,Racing Chair with Adjustable Recliner and Armrest with Headrest Lumbar Pillow Support (Green)\n'
  }

Do you know how to get the same result using map?

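The problem is that your code runs (pseudo-)in parallel: every resolver call navigates the same page, so the calls interfere with each other. You can fix this by creating a new page in each call: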

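const resolver = async (link) => {
  // Each call gets its own tab, so concurrent navigations cannot collide.
  const newPage = await browser.newPage();
  await newPage.goto(link);
  // Read the title from the new tab, not from the shared `page`.
  const rawTitle = await newPage.$eval("#productTitle", (el) => el.textContent);
  const title = removeEmptyLines(rawTitle);
  await newPage.close();
  return { link, title };
};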
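
To see why a shared page yields the same title for every link, here is a minimal simulation that runs without a browser. FakePage, its title method, and the /a, /b, /c links are hypothetical stand-ins, not Puppeteer's API; the only assumption is that a tab holds one current URL at a time, so in this model the last navigation wins.

```javascript
// Hypothetical stand-in for a browser tab: it holds ONE current URL.
// This is a model for illustration, not Puppeteer's API.
class FakePage {
  constructor() {
    this.url = null;
  }
  async goto(url) {
    this.url = url; // a new navigation replaces the previous one
    await new Promise((resolve) => setTimeout(resolve, 10)); // simulated load time
  }
  async title() {
    return `Title of ${this.url}`;
  }
}

const links = ["/a", "/b", "/c"];

// Shared tab: map starts all callbacks at once, so every call
// navigates the same object before any of them reads the title.
async function scrapeShared() {
  const page = new FakePage();
  return Promise.all(
    links.map(async (link) => {
      await page.goto(link);
      return { link, title: await page.title() };
    })
  );
}

// One tab per link: the calls no longer overwrite each other.
async function scrapeIsolated() {
  return Promise.all(
    links.map(async (link) => {
      const page = new FakePage();
      await page.goto(link);
      return { link, title: await page.title() };
    })
  );
}

scrapeShared().then((result) => console.log("shared:", result));
scrapeIsolated().then((result) => console.log("isolated:", result));
```

In the shared version all three entries report the title of /c, which mirrors the repeated title in the question's output. A real browser is less deterministic about which navigation wins, but the cause is the same.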

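Given the simple question in the title, is all of this code needed to answer it?

I think so, it shows my attempts at solving the problem.

Thanks, I believe this is the right approach, but my computer isn't powerful enough to run the scripts in parallel, so it ended up slower than the for...of solution. Still, thanks for the solution, it is very good for learning.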
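
If opening one tab per link at the same time overloads the machine, one possible middle ground is to cap how many calls run at once. The sketch below is plain JavaScript and does not depend on Puppeteer; mapWithLimit is a hypothetical helper name, and the right limit for a given machine is an assumption to tune by experiment.

```javascript
// Hypothetical helper: run `worker` over `items`, at most `limit` at a time,
// and return the results in the same order as `items`.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner keeps claiming the next index until none are left.
  const runner = async () => {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  };

  // Start at least one runner and never more than there are items.
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, runner));
  return results;
}
```

With the resolver from the answer, usage would look like `const result = await mapWithLimit(links, 3, resolver);`. A limit of 1 processes the links one after another, like the for...of loop, and larger limits trade memory and CPU for speed.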