Python web scraper outputs the number inside brackets


python python-3.x web-scraping


I'm running Python 3.3 on Windows. The code below goes to Yahoo Finance, pulls the stock price, and prints it. The problem I'm having is that it outputs:

['540.04']

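I only want the number, so that I can turn it into a float and use it in formulas. I tried just calling float() on it, but that didn't work. I think I need some code that removes the brackets and apostrophes.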

    from urllib.request import urlopen
    from bs4 import BeautifulSoup
    import re

    htmlfile = urlopen("http://finance.yahoo.com/q?s=AAPL&q1=1")

    Thefind = re.compile ('<span id="yfs_l84_aapl">(.+?)</span>')

    msg=htmlfile.read()

    price = Thefind.findall(str(msg))

    print (price)


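Use Python's built-in functions. Note that strip() is a string method, so this only works when price is a string, not the list that findall() returns:

    float(price.strip("[']"))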
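A minimal sketch of why float() fails here and two ways around it. It assumes price holds the one-element list shown in the question; the value is hard-coded rather than scraped, so the snippet runs offline.

```python
# Hard-coded stand-in for the list that findall() returned in the question.
price = ['540.04']

# float() accepts a string or a number, not a list, so this raises TypeError.
try:
    float(price)
except TypeError as exc:
    print("float(price) failed:", exc)

# Simplest fix: index the list to get the string, then convert it.
value = float(price[0])
print(value)

# strip() is a str method, so it only applies to the list's text form.
same_value = float(str(price).strip("[']"))
print(same_value)
```

Indexing with price[0] is usually the clearer choice, since it does not depend on how Python prints a list.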

The beauty of BeautifulSoup is that you don't have to parse HTML data with regular expressions.

This is the proper way to use BS:

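    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    # Download the page and parse it with the standard-library HTML parser
    html = urlopen("http://finance.yahoo.com/q?s=AAPL&q1=1")
    soup = BeautifulSoup(html, "html.parser")
    # Look up the span that holds the price by its id (lowercase L, not the digit 1)
    my_span = soup.find('span', {'id': 'yfs_l84_aapl'})
    print(my_span.text)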

This prints:

540.04
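The same lookup can be tried without a network connection. The sketch below is only an illustration: SAMPLE_HTML and extract_price are hypothetical names that do not appear in the thread, and the markup is a stand-in for the real page. It also guards against find() returning None, which is what happens when the id is mistyped.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page, so the sketch runs offline.
SAMPLE_HTML = '<html><body><span id="yfs_l84_aapl">540.04</span></body></html>'


def extract_price(html, span_id):
    """Return the span's text as a float, or None if no such span exists."""
    soup = BeautifulSoup(html, "html.parser")
    span = soup.find("span", {"id": span_id})
    if span is None:
        # find() returns None when nothing matches, e.g. a mistyped id.
        return None
    return float(span.text.strip())


print(extract_price(SAMPLE_HTML, "yfs_l84_aapl"))  # id with a lowercase L
print(extract_price(SAMPLE_HTML, "yfs_184_aapl"))  # id with the digit 1
```

Checking for None before touching .text avoids the AttributeError on a NoneType object.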

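The function findall() returns a list. If you only need the first group, select it like this:

    Thefind.findall(str(msg))[0]

To refer to a specific group instead, use a match object. search() is used here rather than match(), because match() only succeeds at the very start of the string:

    Thefind.search(str(msg)).group(1)

Note: group(0) is the entire match, not the first group.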
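To make the difference concrete, here is a small offline sketch. The raw bytes are a hypothetical fragment standing in for what htmlfile.read() returns; read() gives bytes in Python 3, so the sketch decodes them (assuming UTF-8) before applying a str pattern.

```python
import re

# Hypothetical fragment standing in for the bytes returned by htmlfile.read().
raw = b'<div><span id="yfs_l84_aapl">540.04</span></div>'
page = raw.decode("utf-8")  # assumption: the page is UTF-8 encoded

pattern = re.compile(r'<span id="yfs_l84_aapl">(.+?)</span>')

# findall() returns a list of the captured groups, one entry per match.
matches = pattern.findall(page)
first = matches[0]

# search() scans the whole string and returns a match object (or None).
found = pattern.search(page)
whole = found.group(0)     # the entire matched text, tags included
captured = found.group(1)  # only the first parenthesised group

# match() anchors at the start of the string, so it finds nothing here.
at_start = pattern.match(page)

print(matches, first, captured, at_start)
print(float(first))
```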

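I ended up with the output: None, so something is messed up. I used print(my_span) because it wouldn't recognize my_span.text. I got the error AttributeError: 'NoneType' object has no attribute 'string'. I googled it and it seems to be related to the output being None.

I found the real problem: when I wrote l84, I used a one instead of a lowercase "L". They look the same in IDLE. I feel dumb now. I also tried .string. They both work fine. Thanks for the help. I'll look into what Beautiful Soup can do.

Great! :) I've tried parsing HTML with regexp before, and believe me, you won't regret using BS instead! Compared with helldoc sys, it's like heaven.

It's a list, not an array. re.findall(pattern, string[, flags]) returns all non-overlapping matches of pattern in string, as a list of strings.

It doesn't work because the number is in a list, but I learned a new function.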