Python: scraping a website's links into a list_Python_Beautifulsoup_Screen Scraping_Jupyter Notebook - Fatal编程技术网

Python: scraping a website's links into a list

python jupyter-notebook


I am trying to scrape the links from a website and append them to an empty list.

Here is my code:

from bs4 import BeautifulSoup
import requests

l = []

r = requests.get("http://www.betexplorer.com/soccer/england/premier-league-2016-2017/results/")
c=r.content
soup=BeautifulSoup(c,"html.parser")
for link in soup.find_all("a",{"class":"in-match"}):
    href=link.get('href')
    l.append(href)
    print(l[0])

This is the output I get when I try to print the first link from the site:

/soccer/england/premier-league-2016-2017/arsenal-everton/SGPa5fvr/
/soccer/england/premier-league-2016-2017/arsenal-everton/SGPa5fvr/
/soccer/england/premier-league-2016-2017/arsenal-everton/SGPa5fvr/
/soccer/england/premier-league-2016-2017/arsenal-everton/SGPa5fvr/
.................

The problem is that when I try to print one specific link, it is printed many times, when it should be printed only once.

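The line print(l[0]) runs on every iteration of the for loop, and it always prints the first element of the list.

After the for loop finishes, your list will contain all the links you want to print. At that point you can iterate over the list and print each element.

Fix the indentation of your code.

print(l[0]) is inside the for loop, which is why it is executed repeatedly.

You made a simple logic error. Your print statement is currently inside the loop. Moving it out of the loop body will solve your problem.

Fixed version: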
for link in soup.find_all("a",{"class":"in-match"}): 
    href=link.get('href')
    l.append(href)              
print(l[0])

After the loop has run, the list l will be filled with the links.
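The difference between the two placements can be reproduced without any network access. The sketch below is only an illustration: the function names are made up, the second and third hrefs are hypothetical placeholders, and a list named printed stands in for what print would write to the screen.

```python
# First href is taken from the question's output; the other two are
# hypothetical placeholders used only for this illustration.
HREFS = [
    "/soccer/england/premier-league-2016-2017/arsenal-everton/SGPa5fvr/",
    "/match-2/",
    "/match-3/",
]


def print_inside_loop(hrefs):
    """Mimic the question's code: the 'print' runs once per iteration."""
    l = []
    printed = []  # stands in for the lines print() would emit
    for href in hrefs:
        l.append(href)
        printed.append(l[0])  # always the first element, every iteration
    return printed


def print_after_loop(hrefs):
    """Mimic the fixed code: the 'print' runs once, after the loop."""
    l = []
    printed = []
    for href in hrefs:
        l.append(href)
    if l:  # guard: l[0] would raise IndexError on an empty list
        printed.append(l[0])
    return printed


print(len(print_inside_loop(HREFS)))  # 3: one line per link found
print(len(print_after_loop(HREFS)))   # 1: a single line
```

With N matching links, the first variant emits the first href N times, which matches the repeated output shown in the question.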
Yes, print(l[0]) is inside the loop body :)
for link in soup.find_all("a",{"class":"in-match"}): 
    href=link.get('href')
    l.append(href)              
print(l[0])
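For trying the fix offline, here is a minimal sketch that runs the same find_all call against an inline HTML string instead of the live page. The HTML snippet, the collect_links helper, and the second href are assumptions made up for this example; only the "in-match" class and the first href come from the question.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for the downloaded page (assumption).
SAMPLE_HTML = """
<table>
  <tr><td><a class="in-match" href="/soccer/england/premier-league-2016-2017/arsenal-everton/SGPa5fvr/">Arsenal - Everton</a></td></tr>
  <tr><td><a class="in-match" href="/hypothetical-match/">Hypothetical match</a></td></tr>
  <tr><td><a class="other" href="/ignored/">Not a match link</a></td></tr>
</table>
"""


def collect_links(html):
    """Return the href of every <a class="in-match"> tag, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for link in soup.find_all("a", {"class": "in-match"}):
        links.append(link.get("href"))
    return links


links = collect_links(SAMPLE_HTML)

# Both print steps sit after the loop that fills the list.
print(links[0])     # the first link, printed exactly once
for href in links:  # every collected link, one per line
    print(href)
```

To run it against the real site, the string passed to collect_links could be replaced with r.content from the question's requests.get call.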




                                                                Copyright © 2024. All Rights Reserved by  - Fatal编程技术网