Python: Unable to get the full HTML when scraping a site with Beautiful Soup

Tags: python, selenium, web-scraping, beautifulsoup

I'm trying to scrape the site, but I only get part of the HTML. I want to get the token's price, which should be inside:

<span class="text-success">$0.0000000218121</span>

but I get nothing back from:

scraping.findAll("span")
Try this:

from bs4 import BeautifulSoup
from urllib.request import Request, urlopen
from selenium import webdriver
import requests
import time  # new import

def scraping(url):
    browser = webdriver.PhantomJS()
    browser.get(url)
    time.sleep(5)  # wait 5 seconds for the page to load
    html = browser.page_source
    return BeautifulSoup(html, 'lxml')

# Get Html
page = scraping("https://poocoin.app/tokens/0x78bc22a215c1ef8a2e41fa1c39cd7bdc09bd5174")

# Extract price as str
price = page.find("span", class_="text-success").getText()
print(price)  # output: $3.74

PooCoin limits what the PhantomJS view can see, so it's better to use this instead:

from bs4 import BeautifulSoup
from selenium import webdriver
import time
from pyvirtualdisplay import Display
import telegram_send
from api.constants import *

coin_address = '0x4e8a9d0bf525d78fd9e0c88710099f227f6924cf'


def scraping(url):
    display = Display(visible=0, size=(1200, 1200))
    display.start()
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument('--disable-extensions')
    chrome_options.add_argument('--profile-directory=Default')
    chrome_options.add_argument("--incognito")
    chrome_options.add_argument("--disable-plugins-discovery")
    print(chrome_driver_dir)
    driver = webdriver.Chrome(executable_path=chrome_driver_dir, options=chrome_options)
    driver.delete_all_cookies()
    driver.get(url)

    time.sleep(5)  # 5 seconds
    html = driver.page_source
    display.stop()

    return BeautifulSoup(html, 'lxml')

# Get Html
page = scraping("https://poocoin.app/tokens/" + coin_address)

# Extract price as str
prices = page.find_all("span", class_="text-success")
# the element position always changes
price = prices[7].getText()
print(price)
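Since "the element position always changes", indexing prices[7] is brittle. A hedged alternative, sketched here on a hard-coded HTML snippet (the snippet and span contents are illustrative, not taken from the live page), is to pick the one text-success span whose text looks like a dollar amount:

```python
import re
from bs4 import BeautifulSoup

# Hypothetical snapshot: several "text-success" spans, only one holds the price.
html = """
<span class="text-success">+5.2%</span>
<span class="text-success">$0.0000000218121</span>
<span class="text-success">OK</span>
"""
page = BeautifulSoup(html, "html.parser")

# Select by content (text starting with "$" followed by a digit)
# instead of by a fixed index that may shift between page loads.
price = next(
    (s.get_text() for s in page.find_all("span", class_="text-success")
     if re.match(r"^\$\d", s.get_text())),
    None,
)
print(price)  # $0.0000000218121
```

This way a reordering of the spans on the page does not silently pick up the wrong value.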
    

price = page.find("span", class_="text-success")

It returns nothing because the page takes time to load and you fetch the HTML too soon. Check my updated answer (I just ran it successfully).
Does it still work? I tried it and it gives me AttributeError: 'NoneType' object has no attribute 'getText', and I also increased the delay.
Yes, I just ran it again with no problems.
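That AttributeError is what happens when find() returns None because the element is not in the fetched HTML yet. As a minimal sketch (using a hard-coded snippet and Python's built-in html.parser rather than a live browser; the snippet is illustrative), you can guard against a missing element before calling getText():

```python
from bs4 import BeautifulSoup

# Hypothetical snapshot of the page; on the real site this span only
# exists after the JavaScript has rendered the price.
html = '<div><span class="text-success">$0.0000000218121</span></div>'
page = BeautifulSoup(html, "html.parser")

span = page.find("span", class_="text-success")
if span is None:
    price = None  # page not rendered yet: wait longer or retry
else:
    price = span.getText()

print(price)  # $0.0000000218121
```

With the guard in place, a not-yet-rendered page yields None instead of crashing, which makes a retry loop straightforward to add.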