Node.js: passing arguments to Puppeteer's page.evaluate function for dynamic web scraping

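I want to pass arguments to the page.evaluate() function for dynamic scraping, but I can't get it to work. Can anyone help me? I am trying to wrap page.evaluate in a function that takes parameters so that I can scrape a large number of sites, starting with FarmaVida. I want to pass each site's home URL as a parameter, extract every section from the page, and then extract the data from each section, but I can't pass the parameters to the function that contains page.evaluate. The goal is for the scraping of each page's sections to be dynamic. I also tried declaring a variable with let outside page.evaluate and passing the selector class of the sections' parent element to querySelectorAll(), but it says the variable is not defined. When I hard-code the selector as a string instead of passing it as a parameter everything works fine, but the whole idea is for the scraping to be dynamic.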
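As a point of reference, here is a minimal sketch of how an argument usually reaches the callback of page.evaluate. The callback is serialized and run inside the page, so variables from the surrounding Node.js scope are not visible there; values have to be passed as extra arguments after the function, and they must be serializable. The helper name `collectSectionLinks` and its `selector` parameter are hypothetical, and `page` is assumed to be a Puppeteer Page that has already navigated.

```javascript
// Hypothetical helper: `page` is assumed to be a Puppeteer Page that has
// already navigated to the target URL.
async function collectSectionLinks(page, selector) {
  // Everything after the function is serialized and handed to the callback.
  return page.evaluate((sel) => {
    // This callback runs inside the page, so only `sel` (the argument) is
    // visible here; variables from the Node.js scope are not.
    const links = Array.from(document.querySelectorAll(sel))
    // Return plain strings: DOM nodes cannot be serialized back to Node.js.
    return links.map((link) => link.getAttribute('href'))
  }, selector)
}
```

With the configuration from this question, a call might look like `await collectSectionLinks(page, site.fatherSectionClass)`, where `site` is one entry of the `sites` array.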

For example, this is the kind of call I am aiming for:

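const data = await page.evaluate(function (params) {
  // Runs in the browser; `params` is the second argument passed below
  const myData = document.querySelectorAll(params.firstElementClass)
  return {
    // DOM nodes cannot be returned to Node.js, so return their text instead
    data: Array.from(myData, (element) => element.textContent)
  }
}, params)
console.warn(data) // good data returned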
But nothing I have tried works for me. I want to build dynamic web scraping for multiple pages and sections:

const FarmaVidaHome = 'https://drogueriasfarmavida.com'
const FarmaTodoHome = 'https://www.farmatodo.com.ve'
const CruzVerde = 'https://www.cruzverde.com.co'
const LaBotica = 'https://www.tudrogueriavirtual.com/?v=9293'

module.exports = {
sites:[
    {homeUrl:FarmaVidaHome, navigationType:'navbar',
    fatherSectionClass:'.nav-top-link',
    ///////////////////////////////////////
    data:{
        productCardClass:'.product-type-simple',
        paginationClass:'.woocommerce-pagination',
        idClass:'.image-fade_in_back a',
        product_nameClass:'.product-title',
        imageClass:'.attachment-woocommerce_thumbnail',
        categoryClass:'.product-cat',
        priceClass:'.woocommerce-Price-amount'
        }
    }

  ]    
}


const puppeteer = require('puppeteer')
const { sites } = require('./sites')
const { exploringPages } = require('./src/navigation/index')

const startScraping = async (datas) => {
  console.warn('THIS IS THE SITES-->', datas)
  const dataAgruped = []
  // Scrape the sites one at a time and collect the result of each one
  for (const pageItem of datas) {
    const response = await exploringPages(pageItem)
    dataAgruped.push(response)
  }
  return dataAgruped
}

// Catch rejections so a failure such as a navigation timeout is reported
startScraping(sites).catch((error) => console.error(error))






const puppeteer = require('puppeteer')

const exploringPages = async (thePage) => {
  console.warn('QUE VIENE AQUIII-->', thePage)
  const browser = await puppeteer.launch()
  try {
    const page = await browser.newPage()
    // await page.type('#selector', 'lo que quieres buscaar')
    await page.goto(thePage.homeUrl)
    // Runs in the browser; `thisItem` arrives as the serialized second argument
    const dataNavigation = await page.evaluate((thisItem) => {
      // This log goes to the page's console, not to the Node.js terminal
      console.warn('PAGE thisItem EN IVALUATE-->', thisItem)
      const $sections = document.querySelectorAll(thisItem.fatherSectionClass)
      const data = []
      $sections.forEach(($section) => {
        data.push({
          path: $section.getAttribute('href')
          // data: thisItem.data
        })
      })
      return { sections: data }
    }, thePage)
    console.warn('this is the sections--->', dataNavigation)
    // await exploringSections(dataNavigation.sections)
    return dataNavigation
  } finally {
    // Close the browser even if navigation fails, so the process can exit
    await browser.close()
  }
}

module.exports = {
    exploringPages
}
(node:22759) UnhandledPromiseRejectionWarning: TimeoutError: Navigation timeout of 30000 ms exceeded
    at /Users/devios/Downloads/work/tests/node_modules/puppeteer/lib/cjs/puppeteer/common/LifecycleWatcher.js:106:111
(Use `node --trace-warnings ...` to show where the warning was created)
(node:22759) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:22759) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
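The log above appears to come from a navigation that timed out (the only navigation in the code is page.goto) and from the rejection not being caught anywhere, rather than from the arguments of page.evaluate. A minimal sketch of one way to handle that case follows. The helper name `tryGoto`, the 60000 ms default and the choice of `'domcontentloaded'` are assumptions made for illustration; `timeout` and `waitUntil` are options of page.goto, and the check on `error.name` relies on the `TimeoutError` name shown in the log.

```javascript
// Hypothetical helper: navigates and reports a timeout instead of leaving
// the promise rejection unhandled. `page` is assumed to be a Puppeteer Page.
async function tryGoto(page, url, timeoutMs = 60000) {
  try {
    // The option values are guesses; tune them for the site being scraped.
    await page.goto(url, { timeout: timeoutMs, waitUntil: 'domcontentloaded' })
    return true
  } catch (error) {
    if (error.name === 'TimeoutError') {
      console.warn(`Navigation to ${url} timed out after ${timeoutMs} ms`)
      return false
    }
    // Anything else is unexpected, so let the caller see it.
    throw error
  }
}
```

A caller could skip a site when `tryGoto` returns `false` and continue with the next entry of `sites`, so one slow site does not stop the whole run.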