Error when crawling the web with Python
When I try to run the code below, the following error is returned. I would appreciate it if someone could point out where I went wrong. Thanks in advance.
Traceback (most recent call last):
  File "web_crawler.py", line 26, in <module>
    links = get_all_links(page)
  File "web_crawler.py", line 14, in get_all_links
    url, endpos = get_next_target(page)
  File "web_crawler.py", line 2, in get_next_target
    start_link = page.find("<a href=")
TypeError: a bytes-like object is required, not 'str'
def get_next_target(page):
    start_link = page.find("<a href=")
    if start_link == -1:
        return None, 0
    start_quote = page.find('"', start_link)
    end_quote = page.find('"', start_quote + 1)
    url = page[start_quote + 1:end_quote]
    print(url)
    return url, end_quote

def get_all_links(page):
    links = []
    while True:
        url, endpos = get_next_target(page)
        if url:
            links.append(url)
            page = page[endpos:]
        else:
            break
    return links

import requests

url = 'https://en.wikipedia.org/wiki/Moon'
r = requests.get(url)
page = r.content
links = get_all_links(page)
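The failure can be reproduced without any network access: `page` holds bytes (from `r.content`), while `find` is given a `str` pattern. A minimal sketch (not part of the original post, using a made-up sample string):

```python
page = b'<a href="https://example.com">Moon</a>'  # bytes, like r.content

# bytes.find() does not accept a str pattern, which raises the TypeError:
try:
    page.find("<a href=")
    raised = False
except TypeError:
    raised = True
assert raised

# Searching with a bytes pattern, or decoding to str first, both work:
assert page.find(b"<a href=") == 0
assert page.decode("utf-8").find("<a href=") == 0
```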
response.content is the raw content of the response. It is not decoded; it is just raw bytes. What you want to use is the response.text attribute, which contains the decoded content as a string. (You will probably also want to use an HTML parsing library rather than your current page.find approach.) For the difference between r.content and r.text (where r is the Response object returned by requests.get), see the requests documentation.
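A sketch of the fix along these lines: the string-based helpers stay unchanged, and the crawl feeds them a decoded str (r.text) rather than raw bytes (r.content). Here the network call is replaced by a hard-coded sample page so the behaviour is easy to check:

```python
def get_next_target(page):
    # page is a str here, so the str pattern below matches without a TypeError.
    start_link = page.find("<a href=")
    if start_link == -1:
        return None, 0
    start_quote = page.find('"', start_link)
    end_quote = page.find('"', start_quote + 1)
    return page[start_quote + 1:end_quote], end_quote

def get_all_links(page):
    links = []
    while True:
        url, endpos = get_next_target(page)
        if url:
            links.append(url)
            page = page[endpos:]
        else:
            break
    return links

# In the real crawler this would be requests.get(url).text (a decoded str),
# not requests.get(url).content (raw bytes).
page = '<a href="https://example.com/a">A</a> <a href="https://example.com/b">B</a>'
print(get_all_links(page))  # ['https://example.com/a', 'https://example.com/b']
```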