Python, BeautifulSoup: Searching a website for plain text by a given style using BeautifulSoup


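If you click Search right away, the site takes you to a list of courses, times, and so on.

When I pass in other style values I get results, but not with this style.

I'm looking for the time data for each course. I'm using BeautifulSoup, and my call is:

courseTimes = soup.find_all("td", {'style': 'text-align: left; vertical-align: top;'})
print(courseTimes)

but it returns [], i.e. nothing.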

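Edit: Hey, sorry I wasn't clear earlier. This isn't my website, so I'm using BeautifulSoup to parse the HTML. The site contains plain text wrapped in a td element, such as:

9:00AM-10:30AM

Here is my full code: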

from bs4 import BeautifulSoup

def parse_course_listings_for_lectures(self, raw_html):
    soup = BeautifulSoup(raw_html, 'html.parser')
    # A string filter must equal the entire style attribute value
    courseT = soup.find_all("td", {'style': 'text-align: left; vertical-align: top;'})
    print(courseT)

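The code below retrieves the time for each row of your search. One problem is that you need to click the search button to get the data. That can be done with the urllib request module or with Selenium; BeautifulSoup is just one of the tools you can use. Below is a solution for Python 3.x. You will need the correct driver version for whichever browser you decide to use: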

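from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.get('https://my.sa.ucsb.edu/public/curriculum/coursesearch.aspx')

# Click the search button so the page loads the course listing.
# find_element_by_id was removed in Selenium 4.3; find_element(By.ID, ...) replaces it.
availbutton = driver.find_element(By.ID, 'ctl00_pageContent_searchButton')
availbutton.click()

html = driver.page_source
driver.quit()
soup = BeautifulSoup(html, 'lxml')

rowindex = 0
# 36 is hard-coded: the number of CourseInfoRow rows assumed for this result page
while rowindex < 36:
    i = 0
    table_row = soup.find_all('tr', {'class': 'CourseInfoRow'})[rowindex]
    # Iterating a Tag yields every child, whitespace strings included;
    # the time cell is the child at index 15
    for td in table_row:
        if i == 15:
            print(td)
        i = i + 1
    rowindex = rowindex + 1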


Sample output:

<td class="Header Clickable" style="text-align: left; vertical-align: top; white-space: nowrap; padding-left: 5px;
                            padding-right: 5px;">
                            2:00pm - 3:20pm
                        </td>
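
The sample output suggests one likely reason the original find_all call returned []: a string passed for style has to equal the whole attribute value, and this cell's style carries extra declarations (white-space, padding). The following is a minimal sketch of matching on part of the style instead. The HTML is a hypothetical snippet modelled on the sample output, and style_contains is a helper name made up for illustration, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the sample output; the real page may differ.
SAMPLE_HTML = """
<table>
  <tr class="CourseInfoRow">
    <td style="text-align: left; vertical-align: top;">COURSE A</td>
    <td class="Header Clickable" style="text-align: left; vertical-align: top; white-space: nowrap; padding-left: 5px;
        padding-right: 5px;">
        2:00pm - 3:20pm
    </td>
  </tr>
</table>
"""


def style_contains(*fragments):
    """Build a filter that accepts a style value containing every fragment."""
    def matcher(style):
        # BeautifulSoup passes None when the tag has no style attribute
        return style is not None and all(f in style for f in fragments)
    return matcher


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# String filter: matches only the cell whose style is exactly this string
exact = soup.find_all("td", {"style": "text-align: left; vertical-align: top;"})

# Function filter: matches cells whose style merely contains the fragment
loose = soup.find_all("td", style=style_contains("white-space: nowrap"))
times = [td.get_text(strip=True) for td in loose]

print(len(exact), times)
```

Here the exact filter finds only the first cell, while the function filter finds the time cell despite the extra padding declarations. Matching on the class attribute would be another option if the page uses it consistently.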



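Hi, when I try to run this code it says oddrowindex is not defined. Could that be a mistake, or am I missing something? It also prints only one time for me, repeated over and over. Why rowindex < 36? Is it because there are 36 CourseInfoRow rows on that particular course page? How do I make it work for different courses? It seems that < 36 only works when there are 36 CourseInfoRow rows. Never mind, I solved it. Thanks a lot.

Edited the code to correct oddrowindex; it should be rowindex.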
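
On the question of making this work for pages with a different number of rows: one option is to loop over whatever find_all returns instead of counting to a fixed 36. The sketch below assumes a simplified, hypothetical table with three cells per row and the time in the last one. The function name extract_cell_text and its cell_index parameter are illustrative, and the right index for the real page would have to be checked against its HTML.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real page has more columns and different content.
SAMPLE_HTML = """
<table>
  <tr class="HeaderRow"><td>Course</td><td>Days</td><td>Time</td></tr>
  <tr class="CourseInfoRow"><td>COURSE A</td><td>M W</td><td>9:00am - 10:30am</td></tr>
  <tr class="CourseInfoRow"><td>COURSE B</td><td>T R</td><td> 2:00pm - 3:20pm </td></tr>
</table>
"""


def extract_cell_text(html, row_class="CourseInfoRow", cell_index=2):
    """Return the text of one cell from every row that has the given class."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    # find_all returns however many rows exist, so no row count is hard-coded
    for row in soup.find_all("tr", class_=row_class):
        # Only direct td children, so whitespace strings do not shift the index
        cells = row.find_all("td", recursive=False)
        if cell_index < len(cells):
            results.append(cells[cell_index].get_text(strip=True))
    return results


times = extract_cell_text(SAMPLE_HTML)
print(times)
```

With Selenium, the html argument would be driver.page_source taken after clicking the search button.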