Python: scraping an HTML span value by class returns an error
def getDOW():
    DowURL = ["https://finance.yahoo.com/quote/%5EDJI?p=^DJI"]
    # requests data on the website(s) above
    page = requests.get(DowURL, headers=headers)
    # parses HTML text from website
    soup = BeautifulSoup(page.content, "html.parser")
    # title = soup.find(class_="D(ib) Fz(18px)").get_text()
    name = soup.find(class_="Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)").get_text()
    print(name)
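Sorry if this has been asked before, but I am new to this and do not understand what is going on. Any help or advice would be much appreciated. I am trying to pull values from several sites. I could not get that to work with a list, so I wrote a separate function for each site (I know how redundant that is) and ran into this error:

raise InvalidSchema("No connection adapters were found for '%s' % url")
requests.exceptions.InvalidSchema: No connection adapters were found for '['https://finance.yahoo.com/quote/%5EDJI?p=^DJI']'

If you look at the page's HTML source, you will see that the element you are interested in is not there. The likely reason is that the content is only loaded after the page has been opened in a browser. There are tools that load the page that way, but fetching data like this is not very efficient. I have done it myself in the past, and it is not a good solution.

Since you seem to be interested in stock prices, you could use the following approach:

import yfinance as yf
import datetime

start = datetime.datetime(2019, 11, 15)
end = datetime.datetime(2019, 11, 16)
data = yf.download('^DJI', start=start, end=end)
print(data)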
Result:

[*********************100%***********************]  1 of 1 downloaded
                Open      High       Low     Close  Adj Close     Volume
Date
2019-11-14  27757.20  27800.71  27676.97  27781.96   27781.96  303970000
2019-11-15  27843.54  28004.89  27843.54  28004.89   28004.89  283720000
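The point above about the element missing from the static HTML also explains a second failure you may hit once the URL problem is fixed: `soup.find(...)` returns `None` when nothing matches, and calling `.get_text()` on `None` raises `AttributeError`. Below is a minimal sketch of a guard, assuming BeautifulSoup is installed. `text_by_class` and `demo` are hypothetical names, and the HTML inside `demo` is made up for illustration.

```python
def text_by_class(soup, css_class):
    """Return the text of the first element whose class matches, or None."""
    element = soup.find(class_=css_class)
    if element is None:
        # Nothing matched: the element is probably added by JavaScript later.
        return None
    return element.get_text()


def demo():
    # Imported here so text_by_class itself has no hard dependency on bs4.
    from bs4 import BeautifulSoup

    html = '<span class="price big">123.45</span>'  # made-up HTML
    soup = BeautifulSoup(html, "html.parser")
    print(text_by_class(soup, "price big"))    # class present in the HTML
    print(text_by_class(soup, "Trsdu(0.3s)"))  # class absent from the HTML
```

If the function returns `None` for the real page, that is a strong hint that the value is not in the HTML that `requests` received.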
I would do it this way:

import pandas as pd
from pandas_datareader import data as wb

start = '2019-02-01'
end = '2020-02-01'
tickers = ['DJIA']

price_data = []
for ticker in tickers:
    # download the daily prices and keep only the two columns of interest
    prices = wb.DataReader(ticker, start=start, end=end, data_source='yahoo')[['Open', 'Adj Close']]
    # tag each row with its ticker so the frames can be stacked
    price_data.append(prices.assign(ticker=ticker)[['ticker', 'Open', 'Adj Close']])

# stack the per-ticker frames and turn the Date index into a regular column
names = pd.concat(price_data)
names = names.reset_index()
print(names)
Result:

           Date ticker          Open     Adj Close
0    2019-02-01   DJIA  25025.310547  25063.890625
1    2019-02-04   DJIA  25062.119141  25239.369141
2    2019-02-05   DJIA  25287.929688  25411.519531
3    2019-02-06   DJIA  25371.570312  25390.300781
4    2019-02-07   DJIA  25265.810547  25169.529297
..          ...    ...           ...           ...
247  2020-01-27   DJIA  28542.490234  28535.800781
248  2020-01-28   DJIA  28594.279297  28722.849609
249  2020-01-29   DJIA  28820.529297  28734.449219
250  2020-01-30   DJIA  28640.160156  28859.439453
251  2020-01-31   DJIA  28813.039062  28256.029297
[252 rows x 4 columns]

Note: you can enter any tickers you want. Change this line:

tickers = ['DJIA']

to this:

tickers = ['MMM',
           'ABT',
           'ABBV',
           'ABMD',
           'ACN',
           'ATVI']

and you will get data for multiple tickers.
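To make the shape of the multi-ticker result concrete, here is a small offline sketch of the same `assign` plus `pd.concat` pattern. It assumes only that pandas is installed. The tickers `AAA` and `BBB`, the prices, and the helper name `stack_by_ticker` are all made up for illustration and are not real market data.

```python
import pandas as pd


def stack_by_ticker(frames):
    """Stack per-ticker price frames into one long table with a ticker column."""
    parts = [
        df.assign(ticker=ticker)[["ticker", "Open", "Adj Close"]]
        for ticker, df in frames.items()
    ]
    # reset_index turns the Date index into an ordinary column
    return pd.concat(parts).reset_index()


# Synthetic stand-ins for what a price download would return per ticker.
dates = pd.DatetimeIndex(["2020-01-30", "2020-01-31"], name="Date")
frames = {
    "AAA": pd.DataFrame({"Open": [1.0, 2.0], "Adj Close": [1.5, 2.5]}, index=dates),
    "BBB": pd.DataFrame({"Open": [10.0, 20.0], "Adj Close": [10.5, 20.5]}, index=dates),
}
long_table = stack_by_ticker(frames)
print(long_table)
```

Each ticker contributes its own block of rows, so filtering on the `ticker` column recovers the series for a single symbol.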
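Why are you using a list of URLs? You need to pass the URL as a string, or index the list: DowURL[0].
Yes, I noticed the brackets were the problem. Thanks @HTF.
Thanks for the suggestion! I will look into it.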
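Since the original goal was to read several sites from one list, the fix from the comments can be extended into a loop that passes one URL string per request. This is only a sketch: `fetch_all` and `DOW_URLS` are hypothetical names, the 10 second timeout is an arbitrary choice, and `requests` is imported lazily so the loop itself can be tried without network access by passing your own `fetch` function.

```python
DOW_URLS = ["https://finance.yahoo.com/quote/%5EDJI?p=^DJI"]


def fetch_all(urls, fetch=None):
    """Fetch each URL separately and return a dict mapping URL to its content."""
    if isinstance(urls, str):
        urls = [urls]  # accept a single URL string as well as a list
    if fetch is None:
        import requests  # imported lazily so the loop can be tried without it

        def fetch(url):
            # requests.get expects one URL string, never the whole list
            return requests.get(url, timeout=10).content

    return {url: fetch(url) for url in urls}
```

Calling `fetch_all(DOW_URLS)` then yields one response body per URL, and each body can be passed to `BeautifulSoup(body, "html.parser")` as in the question.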