Extracting text from an a href on a website using Python and BeautifulSoup

python web-scraping

I am trying to scrape data from a website, and I need the title text of each link:

[<a href="http://www.thegolfcourses.net/golfcourses/TX/38468.htm" rel="bookmark">Feather Bay  Golf  Course and Resort</a>]
[<a href="http://www.thegolfcourses.net/golfcourses/AZ/174830.htm" rel="bookmark">Paradise Valley Country Club</a>]
[<a href="http://www.thegolfcourses.net/golfcourses/IL/129935.htm" rel="bookmark">The Golf Club at Waters Edge</a>]
[<a href="http://www.thegolfcourses.net/golfcourses/NY/10630.htm" rel="bookmark">1000 Acres Ranch Resort</a>]
[<a href="http://www.thegolfcourses.net/golfcourses/VA/995731.htm" rel="bookmark">1757 Golf Club, 1757 Golf Club Front 9 Golf Course</a>]
[<a href="http://www.thegolfcourses.net/golfcourses/WI/320815.htm" rel="bookmark">27 Pines Golf Course</a>]
[<a href="http://www.thegolfcourses.net/golfcourses/WY/823145.htm" rel="bookmark">3 Creek Ranch Golf Club</a>]
[<a href="http://www.thegolfcourses.net/golfcourses/CA/18431.htm" rel="bookmark">3 Par At Four Points</a>]
[<a href="http://www.thegolfcourses.net/golfcourses/AZ/470720.htm" rel="bookmark">3 Parks Fairways</a>]
[<a href="http://www.thegolfcourses.net/golfcourses/IA/074920.htm" rel="bookmark">3-30 Golf &amp; Country Club</a>]

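If I understand correctly, this might work.

Basically, you can try the ".text" attribute.

Edit: Sorry, I couldn't format this properly. Try this instead. It isn't pretty, but it works: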

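x = "<a> text </a>"
y = x.split(">")[1]   # keep everything after the opening tag: " text </a"
z = y.split("<")[0]   # keep everything before the closing tag: " text "
print(z)              # prints " text " (call z.strip() to drop the surrounding spaces)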
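
For comparison, here is a minimal sketch of the ".text" suggestion applied to two of the links from the question. It assumes the bs4 package is installed and uses Python's built-in "html.parser" backend; the helper name link_titles is only illustrative and does not come from the original answer.

```python
from bs4 import BeautifulSoup

# Two of the links listed in the question, used here as sample input.
html = """
<a href="http://www.thegolfcourses.net/golfcourses/AZ/174830.htm" rel="bookmark">Paradise Valley Country Club</a>
<a href="http://www.thegolfcourses.net/golfcourses/IA/074920.htm" rel="bookmark">3-30 Golf &amp; Country Club</a>
"""

soup = BeautifulSoup(html, "html.parser")


def link_titles(soup):
    """Return the visible text of every <a> tag that has an href attribute."""
    # get_text(strip=True) is the method form of .text with whitespace trimmed.
    return [a.get_text(strip=True) for a in soup.find_all("a", href=True)]


titles = link_titles(soup)
for title in titles:
    print(title)
```

Unlike the split-based approach, this also decodes entities such as &amp; and does not depend on where the angle brackets fall in the string.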

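Use the string attribute:

name = item.contents[5].find_all("a")[0].string

Keep in mind that find_all returns a list (a ResultSet object), so if you know there is only one match, you can take index 0 of that list. Alternatively, if you are only interested in a single result, you can use find:

name = item.contents[5].find("a").string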
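
The difference between find_all and find can be sketched as follows. The container markup below is hypothetical, since the question does not show the real page structure, so item here simply stands in for whatever element holds the link. The last lines illustrate one possible reason for an empty result: .string is None when a tag has more than one child node, whereas get_text() still returns the combined text.

```python
from bs4 import BeautifulSoup

# Hypothetical container; the actual page layout is not given in the question.
html = """
<div class="item">
  <a href="http://www.thegolfcourses.net/golfcourses/WI/320815.htm" rel="bookmark">27 Pines Golf Course</a>
</div>
"""

item = BeautifulSoup(html, "html.parser").find("div", class_="item")

# find_all returns a ResultSet (a list subclass), so index it to get one Tag.
links = item.find_all("a")
name_from_list = links[0].string

# find returns the first matching Tag directly, or None when nothing matches.
link = item.find("a")
name_from_find = link.string if link is not None else None

# A tag with several children has no single .string, so it evaluates to None.
nested = BeautifulSoup('<a href="#">3 Parks <b>Fairways</b></a>', "html.parser").a

print(name_from_list)
print(name_from_find)
print(nested.string)
print(nested.get_text())
```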

I tried that, but when I run the code I still get blank output.

@Gonzalo68 This is an inefficient approach, but it might work. >>> x = "text" >>> y = x.split(">")[1] >>> z = y.split("