Javascript: Copy the inner text of elements that share the same class, in order (Puppeteer)
There is a wrapper that holds several containers. Each container has a title, a description, and other information, but all the titles, descriptions, and other details share the same class names. I tried the querySelectorAll approach I found here, but it fails to get the elements. I need to read them container by container so that they stay in order. This is what one of the containers looks like:
<div class="cms-article-list__content--container">
  <div class="cms-article-list__title-number-panel">
    <span class="cms-article-list__title-number">1</span>
  </div>
  <div class="cms-article-list__content--group">
    <div class="cms-article-list__content--group-title">BARBELL BENCH PRESS (WARM-UP SETS)</div>
    <div class="cms-article-list__content--group-description">Use light weight and perform 2 sets of 5-10 reps, stopping each set short of failure.</div>
    <div class="cms-article-list__content">
      <div class="cms-article-list__content--container-left">
        <div class="cms-article-workout__exercise--info">
          <a target="_blank"
             href="//www.bodybuilding.com/exercises/detail/view/name/barbell-bench-press-medium-grip"
             class="cms-article-workout__exercise--title">Barbell Bench Press - Medium Grip</a>
          <div class="cms-article-workout__exercise--description"></div>
        </div>
        <div class="cms-article-workout__sets--definition">
          <span>2 sets, 5-10 reps (rest 1 min.)</span>
        </div>
      </div>
      <a target="_blank"
         href="//www.bodybuilding.com/exercises/detail/view/name/barbell-bench-press-medium-grip"
         class="cms-article-list__content--container-right">
        <img class="cms-article-workout__exercise--img cms-article-workout__exercise--img-left"
             src="https://www.bodybuilding.com/images/2020/xdb/cropped/xdb-81e-bench-press-m1-square-130x130.jpg"
             srcset="https://www.bodybuilding.com/images/2020/xdb/cropped/xdb-81e-bench-press-m1-square-130x130.jpg 1x, https://www.bodybuilding.com/images/2020/xdb/cropped/xdb-81e-bench-press-m1-square-200x200.jpg 2x">
        <img class="cms-article-workout__exercise--img cms-article-workout__exercise--img-right"
             src="https://www.bodybuilding.com/images/2020/xdb/cropped/xdb-81e-bench-press-m2-square-130x130.jpg"
             srcset="https://www.bodybuilding.com/images/2020/xdb/cropped/xdb-81e-bench-press-m2-square-130x130.jpg 1x, https://www.bodybuilding.com/images/2020/xdb/cropped/xdb-81e-bench-press-m2-square-200x200.jpg 2x">
      </a>
    </div>
  </div>
</div>
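querySelectorAll() exists only in the browser (document) context, so you can use it only inside an evaluate() function. In the Puppeteer (Node.js) context, you can try page.$$() together with elementHandle.$eval() instead. So your code can be rewritten in one of the following variants: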
// Node.js context: get a handle for every container, then query inside each one.
const nodes = await page.$$('.cms-article-list__content--container');

for (const node of nodes) {
  const ExerciseGroupTitle = await node.$eval(
    '.cms-article-list__content--group-title',
    (el) => el.innerText
  );
  console.log(ExerciseGroupTitle);

  const ExerciseGroupDescription = await node.$eval(
    '.cms-article-list__content--group-description',
    (el) => el.innerText
  );
  console.log(ExerciseGroupDescription);

  const ExerciseName = await node.$eval(
    '.cms-article-workout__exercise--info',
    (el) => el.innerText
  );
  console.log(ExerciseName);

  const ExerciseSets = await node.$eval(
    '.cms-article-workout__sets--definition > span',
    (el) => el.innerText
  );
  console.log(ExerciseSets);
}
Or:

// Browser context: collect everything in a single evaluate() call.
const innerTexts = await page.evaluate(() => {
  const nodes = [...document.querySelectorAll('.cms-article-list__content--container')];
  // One inner array per container, so the fields stay grouped and in page order.
  const texts = nodes.map((node) => [
    node.querySelector('.cms-article-list__content--group-title').innerText,
    node.querySelector('.cms-article-list__content--group-description').innerText,
    node.querySelector('.cms-article-workout__exercise--info').innerText,
    node.querySelector('.cms-article-workout__sets--definition > span').innerText,
  ]);
  return texts;
});
console.log(innerTexts);
Test:

import puppeteer from 'puppeteer';

const browser = await puppeteer.launch({ headless: false, defaultViewport: null });

try {
  const [page] = await browser.pages();

  await page.goto('https://www.bodybuilding.com/profile/login');
  // Login-form selectors are best-effort; check them against the live page.
  await page.type('input#login-username-input', '***');
  await page.type('input#login-password-input', '***');
  // Start waiting for the navigation before clicking, so the event is not missed.
  await Promise.all([
    page.waitForNavigation(),
    page.click('button#login-submit-button'),
  ]);

  await page.goto('https://www.bodybuilding.com/workout-plans/jim-stoppanis-12-week-shortcut-to-size/day/1');
  await page.waitForSelector('.cms-article-list__content--container');

  const innerTexts = await page.evaluate(() => {
    const nodes = [...document.querySelectorAll('.cms-article-list__content--container')];
    // ?. avoids a TypeError when a container lacks one of the child elements.
    const texts = nodes.map((node) => [
      node.querySelector('.cms-article-list__content--group-title')?.innerText,
      node.querySelector('.cms-article-list__content--group-description')?.innerText,
      node.querySelector('.cms-article-workout__exercise--info')?.innerText,
      node.querySelector('.cms-article-workout__sets--definition > span')?.innerText,
    ]);
    return texts;
  });
  console.log(innerTexts);
} catch (err) {
  console.error(err);
} finally {
  await browser.close();
}
Output:
[
[
'BARBELL BENCH PRESS (WARM-UP SETS)',
'Use light weight and perform 2 sets of 5-10 reps, stopping each set short of failure.',
'Barbell Bench Press - Medium Grip',
'2 sets, 5-10 reps (rest 1 min.)'
],
[
null,
null,
'Barbell Bench Press - Medium Grip\n' +
'Perform a rest-pause after the final set. See Training Guidelines for details.',
'4 sets, 12-15 reps (rest 2 min.)'
],
[
null,
null,
'Barbell Incline Bench Press Medium-Grip\n' +
'Perform a rest-pause after the final set. See Training Guidelines for details.',
'3 sets, 12-15 reps (rest 2 min.)'
],
[
null,
null,
'Incline Dumbbell Flyes\n' +
'Perform a rest-pause after the final set. See Training Guidelines for details.',
'3 sets, 12-15 reps (rest 1 min.)'
],
[
null,
null,
'Cable Crossover\n' +
'Perform a rest-pause after the final set. See Training Guidelines for details.',
'3 sets, 12-15 reps (rest 1 min. )'
],
[
null,
null,
'Triceps Pushdown\n' +
'Perform a rest-pause after the final set. See Training Guidelines for details.',
'4 sets, 12-15 reps (rest 1 min.)'
],
[
null,
null,
'Dumbbell skullcrusher\n' +
'Perform a rest-pause after the final set. See Training Guidelines for details.',
'3 sets, 12-15 reps (rest 1 min.)'
],
[
null,
null,
'Low cable overhead triceps extension\n' +
'Perform a rest-pause after the final set. See Training Guidelines for details.',
'3 sets, 12-15 reps (rest 1 min. )'
],
[
null,
null,
'Standing Dumbbell Calf Raise\n' +
'Perform a rest-pause after the final set. See Training Guidelines for details.',
'3 sets, 25-30 reps (rest 1 min. )'
],
[
null,
null,
'Seated Calf Raise\n' +
'Perform a rest-pause after the final set. See Training Guidelines for details.',
'3 sets, 25-30 reps (rest 1 min. )'
]
]
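Since each row in the output above is a positional array, a small post-processing step can make the result easier to work with. The following is a minimal sketch, assuming the row layout `[groupTitle, groupDescription, info, sets]` produced by the test code; the helper name `toExercise` and the object field names are hypothetical and not part of the answer. It also assumes that the exercise info may hold a name and a note separated by a line break, as seen in the output.

```javascript
// Hypothetical helper: turns one [groupTitle, groupDescription, info, sets] row
// into a labeled object. The field names are illustrative only.
function toExercise(row) {
  const [groupTitle, groupDescription, info, sets] = row;
  // The info text may be "name\nnote"; the first line is treated as the name.
  const [name, ...noteLines] = (info ?? '').split('\n');
  return {
    groupTitle: groupTitle ?? null,
    groupDescription: groupDescription ?? null,
    name: name.trim(),
    note: noteLines.join('\n').trim() || null,
    sets: sets ?? null,
  };
}

// Two rows copied from the output above.
const rows = [
  [
    'BARBELL BENCH PRESS (WARM-UP SETS)',
    'Use light weight and perform 2 sets of 5-10 reps, stopping each set short of failure.',
    'Barbell Bench Press - Medium Grip',
    '2 sets, 5-10 reps (rest 1 min.)',
  ],
  [
    null,
    null,
    'Barbell Bench Press - Medium Grip\n' +
      'Perform a rest-pause after the final set. See Training Guidelines for details.',
    '4 sets, 12-15 reps (rest 2 min.)',
  ],
];

const exercises = rows.map(toExercise);
console.log(exercises);
```

Because `map` preserves array order, the objects stay in the same order as the containers on the page.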
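The first code gives me this error: Error: Error: failed to find element matching selector ".cms-article-list__content--group-title", and the second code gives me this error: Error: Evaluation failed: TypeError: Cannot read property 'innerText' of null

I have added test code to the answer, based on the HTML snippet you provided, and it works. It looks like the wrapper elements across the whole page may differ somewhat in structure. So just check whether a child element exists before reading its text. For example, you can use ?.innerText to get undefined for a missing element instead of an error.

OK, I changed the wrapper and it returns the result as an array, but it does not loop on to the next item in the table, and it gives me no error either. I have updated my question with the array it returns. Since the page requires login details, is there a way to send you a message? Sorry for the inconvenience.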
[
[
'BARBELL BENCH PRESS (WARM-UP SETS)',
'Use light weight and perform 2 sets of 5-10 reps, stopping each set short of failure.',
'Barbell Bench Press - Medium Grip',
'2 sets, 5-10 reps (rest 1 min.)'
]
]
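The optional-chaining advice from the comments can be tried outside a browser. The following is a minimal sketch, assuming the class names from the HTML snippet in the question; `extractTexts`, `SELECTORS`, and `fakeNode` are hypothetical names introduced only for illustration, and `fakeNode` is a stand-in for a real DOM element so that the logic can run in plain Node.js.

```javascript
// Selectors taken from the HTML snippet in the question.
const SELECTORS = {
  groupTitle: '.cms-article-list__content--group-title',
  groupDescription: '.cms-article-list__content--group-description',
  info: '.cms-article-workout__exercise--info',
  sets: '.cms-article-workout__sets--definition > span',
};

// Hypothetical helper: reads each field from one container and
// yields null instead of throwing when a child element is missing.
function extractTexts(node) {
  return Object.fromEntries(
    Object.entries(SELECTORS).map(([key, selector]) => [
      key,
      node.querySelector(selector)?.innerText ?? null,
    ])
  );
}

// Hypothetical stand-in for a DOM element: returns null for unknown selectors,
// just as querySelector() does when nothing matches.
function fakeNode(texts) {
  return {
    querySelector: (selector) =>
      selector in texts ? { innerText: texts[selector] } : null,
  };
}

const full = fakeNode({
  [SELECTORS.groupTitle]: 'BARBELL BENCH PRESS (WARM-UP SETS)',
  [SELECTORS.groupDescription]: 'Use light weight and perform 2 sets of 5-10 reps, stopping each set short of failure.',
  [SELECTORS.info]: 'Barbell Bench Press - Medium Grip',
  [SELECTORS.sets]: '2 sets, 5-10 reps (rest 1 min.)',
});

// A container without a group title or description, like the later ones on the page.
const partial = fakeNode({
  [SELECTORS.info]: 'Cable Crossover',
  [SELECTORS.sets]: '3 sets, 12-15 reps (rest 1 min. )',
});

console.log(extractTexts(full));
console.log(extractTexts(partial));
```

In a real script, the body of `extractTexts` would have to live inside the `page.evaluate()` callback, because functions defined in the Node.js context are not visible in the browser context.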