R CSS selector to read user ratings

Website:

Goal: extract the individual user ratings.

When I inspect a user rating, I see this:

<span class="staticStars notranslate" title="did not like it">

If I can extract the title, I can map it to a numeric rating:

rate_map = {'did not like it': 1,
'it was ok': 2,
'liked it': 3,
'really liked it': 4,
'it was amazing': 5}

url = 'https://www.goodreads.com/book/show/27841061-nevernight'
gr_list <- read_html(url)
gr_list %>%  html_node('.staticStars .notranslate') %>%  
  html_attr('title')
The result of this code is "NA".

Can anyone tell me what I am doing wrong?
Thanks.

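The CSS selector '.staticStars .notranslate' (with a space) means you are looking for a node with class notranslate nested inside a node with class staticStars. That is, it would match something like this: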

<span class="staticStars"><span class="notranslate">foo</span></span>
If you want to match a single node that has both classes, make sure there is no space between the class selectors. You can do:

url <- 'https://www.goodreads.com/book/show/27841061-nevernight'
gr_list <- read_html(url)
gr_list %>%  html_nodes('.staticStars.notranslate') %>% 
  html_attr('title')

#  [1] NA                NA                "did not like it"
#  [4] "did not like it" "it was amazing"  "it was amazing" 
#  [7] "it was amazing"  "it was amazing"  "it was amazing" 
# [10] "did not like it" "it was amazing"  "really liked it"
# [13] "did not like it" "it was amazing"  "it was amazing" 
# [16] "it was amazing"  "did not like it" "it was amazing" 
# [19] "it was amazing"  "it was amazing"  "it was amazing" 
# [22] "it was amazing"  "it was amazing"  "it was amazing" 
# [25] "it was amazing"  "it was amazing"  "it was amazing" 
# [28] "it was amazing"  "it was amazing"  "liked it" 
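To make the difference between the two selectors concrete, here is a minimal sketch that runs without touching the network. The inline HTML is a hypothetical stand-in, not the real Goodreads markup, and the names nested and both are only illustrative.

```r
library(rvest)

# Hypothetical markup (an assumption for illustration, not the real page):
# one span carries both classes, the other pair is nested.
html <- '<div>
  <span class="staticStars notranslate" title="liked it">both classes</span>
  <span class="staticStars"><span class="notranslate" title="nested">nested</span></span>
</div>'
page <- read_html(html)

# Descendant selector (with a space): a .notranslate node inside a .staticStars node
nested <- page %>% html_nodes('.staticStars .notranslate') %>% html_attr('title')

# Compound selector (no space): a single node that has both classes
both <- page %>% html_nodes('.staticStars.notranslate') %>% html_attr('title')

print(nested)
print(both)
```

With this markup the spaced selector should pick up only the inner nested span, while the selector without a space should pick up only the span that has both classes.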

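Possible duplicate:

That was a mistake on my side. The output is still NA.

Well, the first node has no title. If you change html_node to html_nodes, you will get all the nodes, and you will see that most of them have a title.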
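The last comment can be illustrated with a small offline sketch. It assumes hypothetical inline HTML in which the first matching node has no title, mirroring the behaviour described above. The rate_map named vector is an R rendering of the mapping from the question. The variable names are illustrative.

```r
library(rvest)

# Hypothetical markup (an assumption): the first matching node has no title attribute.
html <- '<div>
  <span class="staticStars notranslate"></span>
  <span class="staticStars notranslate" title="did not like it"></span>
  <span class="staticStars notranslate" title="it was amazing"></span>
</div>'
page <- read_html(html)

# html_node returns only the first match, which has no title here, so the result is NA
first_title <- page %>% html_node('.staticStars.notranslate') %>% html_attr('title')

# html_nodes returns every match
all_titles <- page %>% html_nodes('.staticStars.notranslate') %>% html_attr('title')

# The question's mapping, written as an R named vector
rate_map <- c('did not like it' = 1, 'it was ok' = 2, 'liked it' = 3,
              'really liked it' = 4, 'it was amazing' = 5)

# Drop nodes without a title, then look up the numeric rating by name
titles <- all_titles[!is.na(all_titles)]
ratings <- unname(rate_map[titles])
print(ratings)
```

Dropping the NA entries before the lookup keeps the result free of missing ratings. Whether untitled nodes on the real page should be discarded or handled differently is a judgment call that depends on what those nodes represent.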