Python 2.7: "looks like a filename, not markup. You should probably open this file and pass the filehandle into Beautiful Soup"


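I changed my Python 2.7 routine to accept the file path as a parameter, so that I don't have to duplicate the code in the method for each file path.

When I call my method, I get the following error: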

looks like a filename, not markup. You should probably open this file and pass the filehandle into Beautiful Soup.
  '"%s" looks like a filename, not markup. You should probably open this file and pass the filehandle into Beautiful Soup.' % markup)

My method is:

def extract_data_from_report3(filename):
    html_report_part1 = open(filename,'r').read()
    soup = BeautifulSoup(filename, "html.parser")
    th = soup.find_all('th')
    td = soup.find_all('td')

    headers = [header.get_text(strip=True) for header in soup.find_all("th")]
    rows = [dict(zip(headers, [td.get_text(strip=True) for td in row.find_all("td")]))
        for row in soup.find_all("tr")[1:-1]]
    print(rows)
    return rows

I call the method like this:

rows_part1 =  report.extract_data_from_report3(r"E:\test_runners\selenium_regression_test_5_1_1\TestReport\SeleniumTestReport_part1.html")
print "part1 = "
print rows_part1

How can I pass the filename as a parameter?

You are passing the filename string itself to BeautifulSoup. Pass the contents you have already read from the file instead:

html_report_part1 = open(filename,'r').read()
soup = BeautifulSoup(html_report_part1, "html.parser")
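To illustrate why the original call finds no headers: BeautifulSoup treats any string it receives as markup, so a path string is parsed as plain text and contains no tags. The following is a minimal sketch; the file name and the table markup are made up for illustration, and the path string is never opened.

```python
from bs4 import BeautifulSoup

# Hypothetical values for illustration only; no file is read here.
path_text = "SeleniumTestReport_part1.html"
markup_text = "<table><tr><th>Test</th><th>Result</th></tr></table>"

# The path string is parsed as if it were the document itself.
# Beautiful Soup may also emit the "looks like a filename" warning here.
soup_from_path = BeautifulSoup(path_text, "html.parser")

# Real markup yields the tags you expect.
soup_from_markup = BeautifulSoup(markup_text, "html.parser")

path_headers = soup_from_path.find_all("th")
markup_headers = [th.get_text(strip=True) for th in soup_from_markup.find_all("th")]

print(len(path_headers))
print(markup_headers)
```

The first result is empty because the "document" was just the file name, which is the situation in the question.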

If you want to pass a file handle instead, don't call read; just pass open(filename), or the file handle without calling read on it:

def extract_data_from_report3(filename):
    html_report_part1 = open(filename,'r')
    soup = BeautifulSoup( html_report_part1, "html.parser")

Alternatively:

You can pass html_report_part1 after calling read as suggested, but you don't need to; BeautifulSoup can take a file object directly.

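Does BeautifulSoup close the file itself, or should I put it in a with block?

@Mephy, once you leave the function the file will almost certainly be closed: when no reference to the file object remains, it is closed after being read (CPython closes it when the object is garbage collected), as in the code below. Using a with block is harmless, but not really needed.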

def extract_data_from_report3(filename):
    soup = BeautifulSoup(open(filename), "html.parser")
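If you prefer not to rely on garbage collection to close the file, the same function can be written with a with block. This is a sketch, not the original poster's final code: the name extract_rows and the UTF-8 encoding are assumptions, and the [1:-1] slice simply mirrors the question (skip the header row and the last row).

```python
import io

from bs4 import BeautifulSoup


def extract_rows(filename):
    """Parse an HTML report table into a list of dicts keyed by header text."""
    # io.open accepts an encoding on both Python 2.7 and Python 3.
    # UTF-8 is an assumption; use the encoding your report is written in.
    with io.open(filename, "r", encoding="utf-8") as handle:
        # BeautifulSoup reads the whole handle while constructing the soup,
        # so the soup stays usable after the file is closed.
        soup = BeautifulSoup(handle, "html.parser")

    headers = [th.get_text(strip=True) for th in soup.find_all("th")]
    # As in the question: skip the first (header) row and the last row.
    return [
        dict(zip(headers, [td.get_text(strip=True) for td in row.find_all("td")]))
        for row in soup.find_all("tr")[1:-1]
    ]
```

Call it with the report path, for example extract_rows(r"E:\...\SeleniumTestReport_part1.html"), where the path is whatever file you want to parse.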