Python BeautifulSoup AttributeError

I am trying to extract some image URLs from HTML content using Python and BeautifulSoup.

My HTML content:

<div id="photos" class="tab rel-photos multiple-photos">
   <span id="watch-this" class="classified-detail-buttons">
   <span id="c_id_10832265:c_type_202:watch_this">
   <a href="/watchlist/classified/baby-items/10832265/1/" id="watch_this_logged" data-require-auth="favoriteAd" data-tr-event-name="dpv-add-to-favourites">
   <i class="fa fa-fw fa-star-o"></i></a></span>
   </span>
   <span id="thumb1" class=" image">
      <a href="https://images.dubizzle.com/v1/files/eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJmbiI6ImYzYWdrZm8xcDBlai1EVUJJWlpMRSIsInciOlt7ImZuIjoiNWpldWk3cWZ6aWU2MS1EVUJJWlpMRSIsInMiOjUwLCJwIjoiY2VudGVyLGNlbnRlciIsImEiOjgwfV19.s1GmifnZr0_Bx4HG8RTR4puYcxN0asqAmnBvSpIExEI/image;p=main"
         id="a-photo-modal-view:263986810"
         rel="photos-modal"
         target="_new"
         onClick="return dbzglobal_event_adapter(this);">
         <div style="background-image:url(https://images.dubizzle.com/v1/files/eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJmbiI6ImYzYWdrZm8xcDBlai1EVUJJWlpMRSIsInciOlt7ImZuIjoiNWpldWk3cWZ6aWU2MS1EVUJJWlpMRSIsInMiOjUwLCJwIjoiY2VudGVyLGNlbnRlciIsImEiOjgwfV19.s1GmifnZr0_Bx4HG8RTR4puYcxN0asqAmnBvSpIExEI/image;p=main);"></div>
      </a>
   </span>
   <ul id="thumbs-list">
      <li>
         <span id="thumb2" class="image2">
            <a href="https://images.dubizzle.com/v1/files/eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJmbiI6Imtmc3cxMWgzNTB2cTMtRFVCSVpaTEUiLCJ3IjpbeyJmbiI6IjVqZXVpN3FmemllNjEtRFVCSVpaTEUiLCJzIjo1MCwicCI6ImNlbnRlcixjZW50ZXIiLCJhIjo4MH1dfQ.Wo2YqPdWav8shtmyVO2AdisHmLX-ZLDAiskLPAmTSPU/image;p=main" id="a-photo-modal-view:263986811" rel="photos-modal" target="_new" onClick="return dbzglobal_event_adapter(this);" >
               <div style="background-image:url(https://images.dubizzle.com/v1/files/eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJmbiI6Imtmc3cxMWgzNTB2cTMtRFVCSVpaTEUiLCJ3IjpbeyJmbiI6IjVqZXVpN3FmemllNjEtRFVCSVpaTEUiLCJzIjo1MCwicCI6ImNlbnRlcixjZW50ZXIiLCJhIjo4MH1dfQ.Wo2YqPdWav8shtmyVO2AdisHmLX-ZLDAiskLPAmTSPU/image;p=thumb_retina);"></div>
            </a>
         </span>
      </li>
      <li id="thumbnails-info">
         4 Photos
      </li>
   </ul>
   <div id="photo-count">
      4 Photos - Click to enlarge
   </div>
</div>
But I get an error:

Traceback (most recent call last):
  File "/Users/evilslab/Documents/Websites/www.futurepoint.dev.cc/dobuyme/SCRAPE/boats.py", line 47, in <module>
    images = soup.find("div", {"id": ["photos"]}).find_all("a")
AttributeError: 'NoneType' object has no attribute 'find_all'

How can I get only the URLs from the href attributes?

Your code more or less works for me, assuming your HTML is stored in html_doc.
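That check can be reproduced with a sketch like the one below. Here html_doc is a shortened stand-in for the HTML in the question, and the images.example.com URLs are placeholders rather than the real image links.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the HTML in the question.
# The images.example.com URLs are placeholders (assumption), not the real links.
html_doc = """
<div id="photos" class="tab rel-photos multiple-photos">
  <span id="watch-this"><a href="/watchlist/classified/baby-items/10832265/1/" id="watch_this_logged"></a></span>
  <span id="thumb1" class="image"><a href="https://images.example.com/one/image;p=main" rel="photos-modal"></a></span>
  <ul id="thumbs-list">
    <li><span id="thumb2" class="image2"><a href="https://images.example.com/two/image;p=main" rel="photos-modal"></a></span></li>
  </ul>
</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# Same query as in the question: it succeeds because the div is present.
images = soup.find("div", {"id": ["photos"]}).find_all("a")
hrefs = [a["href"] for a in images]
print(hrefs)
```

Since the query succeeds on this markup, the AttributeError suggests that the page actually downloaded does not contain the photos div at all.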

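However, your problem is that the text returned by requests for the URL does not match the HTML sample you gave. Even though you try to supply a random user agent, the server returns: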

<li>You\'re a power user moving through this website with super-human speed.</li>\n                        <li>You\'ve disabled JavaScript in your web browser.</li>\n                        <li>A third-party browser plugin, such as Ghostery or NoScript, is preventing JavaScript from running. Additional information is available in this <a title=\'Third party browser plugins that block javascript\' href=\'http://ds.tl/help-third-party-plugins\' target=\'_blank\'>support article</a>.</li>\n                    </ul>\n                </div>\n                <p class="we-could-be-wrong" >\n                    We could be wrong, and sorry about that! Please complete the CAPTCHA below and we’ll get you back on dubizzle right away.
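Since the CAPTCHA is there to prevent scraping, I would suggest respecting the administrators' wishes and not scraping the site. Maybe there is an API?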
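Independently of whether the site should be scraped, the AttributeError itself comes from chaining find_all onto a find call that returned None. A minimal sketch of a guard is shown below; the function name extract_photo_links and the blocked_page string are made up for illustration.

```python
from bs4 import BeautifulSoup


def extract_photo_links(html):
    """Return the hrefs inside <div id="photos">, or [] if that div is absent."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", id="photos")
    if container is None:
        # find() returns None when nothing matches, for example when the
        # server sent a CAPTCHA page instead of the expected listing.
        return []
    return [a["href"] for a in container.find_all("a", href=True)]


# Hypothetical stand-in for a blocked response without the photos div.
blocked_page = '<div><p class="we-could-be-wrong">Please complete the CAPTCHA below</p></div>'
print(extract_photo_links(blocked_page))
```

With this guard, a blocked response produces an empty list instead of a traceback, which makes it easier to tell a parsing bug from a page that was never delivered.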

Try this:

    for item in soup.find_all('span'):
        try:
            link = item.find_all('a', href=True)[0].attrs.get('href', None)
        except IndexError:
            continue
        else:
            print(link)
    
Output:

    /watchlist/classified/baby-items/10832265/1/
    /watchlist/classified/baby-items/10832265/1/
    https://images.dubizzle.com/v1/files/eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJmbiI6ImYzYWdrZm8xcDBlai1EVUJJWlpMRSIsInciOlt7ImZuIjoiNWpldWk3cWZ6aWU2MS1EVUJJWlpMRSIsInMiOjUwLCJwIjoiY2VudGVyLGNlbnRlciIsImEiOjgwfV19.s1GmifnZr0_Bx4HG8RTR4puYcxN0asqAmnBvSpIExEI/image;p=main
    https://images.dubizzle.com/v1/files/eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJmbiI6Imtmc3cxMWgzNTB2cTMtRFVCSVpaTEUiLCJ3IjpbeyJmbiI6IjVqZXVpN3FmemllNjEtRFVCSVpaTEUiLCJzIjo1MCwicCI6ImNlbnRlcixjZW50ZXIiLCJhIjo4MH1dfQ.Wo2YqPdWav8shtmyVO2AdisHmLX-ZLDAiskLPAmTSPU/image;p=main
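The output above repeats the watchlist link, because the nested spans both contain it, and it mixes that link with the image links. If only the image URLs are wanted, one possible variation is a CSS selector limited to absolute links inside the photos div. This is a sketch under assumptions: sample_html is a shortened stand-in for the question's HTML, and the images.example.com URLs are placeholders.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the question's HTML; the images.example.com URLs
# are placeholders (assumption), not the real image links.
sample_html = """
<div id="photos">
  <span id="watch-this"><span><a href="/watchlist/classified/baby-items/10832265/1/"></a></span></span>
  <span id="thumb1" class="image"><a href="https://images.example.com/one/image;p=main"></a></span>
  <ul id="thumbs-list">
    <li><span id="thumb2" class="image2"><a href="https://images.example.com/two/image;p=main"></a></span></li>
  </ul>
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# Select anchors inside #photos whose href starts with "https://",
# which skips the relative watchlist link.
image_urls = [a["href"] for a in soup.select('#photos a[href^="https://"]')]

# dict.fromkeys drops duplicates while keeping the original order.
image_urls = list(dict.fromkeys(image_urls))
print(image_urls)
```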
    
    

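page = requests.get(url, headers={'user-agent': user_agent.random}) soup = BeautifulSoup(page.text, 'html.parser') url = ""

So that means there is no way to do this? What do you think about faking the IP?

Have they blacklisted my IP?

Same error. Traceback (most recent call last): File "/Users/evilslab/Documents/Websites/www.futurepoint.dev.cc/dobuyme/SCRAP/boats.py", line 48, in soup.find("div", {"id": ["photos"]}).find_all("a"): AttributeError: 'NoneType' object has no attribute 'find_all'

I changed my answer, try it. Otherwise please send the URL, because with the HTML in the question I cannot reproduce your error.

url = ""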