Python: converting a tuple to a string after parsing an HTML file

python parsing

I need to save the parsing results to a text file.

import urllib
from bs4 import BeautifulSoup
import urlparse

path = 'A html file saved on desktop'

f = open(path,"r")
if f.mode == 'r':       
    contents = f.read()

soup = BeautifulSoup(contents)
search = soup.findAll('div',attrs={'class':'mf_oH mf_nobr mf_pRel'})
searchtext = str(search)
soup1 = BeautifulSoup(searchtext)   

urls = []
for tag in soup1.findAll('a', href = True):
    raw_url = tag['href'][:-7]
    url = urlparse.urlparse(raw_url)
    urls.append(url)
    print url.path

with open("1.txt", "w+") as outfile:
    for item in urls:
        outfile.write(item + "\n")

However, I get:

Traceback (most recent call last):
  File "c.py", line 26, in <module>
    outfile.write(item + "\n")
TypeError: can only concatenate tuple (not "str") to tuple

How can I convert the tuple to a string and save it in the text file? Thanks.

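The problem is that every item in the list named urls is a tuple, because urlparse.urlparse() returns a tuple of URL components rather than a string. A tuple is an immutable container for other items. When you execute item + "\n", you are asking the interpreter to concatenate a tuple and a string, which is not possible.
Instead, go through the tuples and pick the one field of each item that you want to write to the output file: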
with open("1.txt", "w+") as outfile:
    for item in urls:
        outfile.write(str(item[1]) + "\n") 

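Here, the field at index 1 of each tuple (its second field, which for a urlparse result is the network location, netloc) is first converted to a string, in case it happens to be something else, and then concatenated with "\n". The path that the question prints with url.path is at index 2. If you want to write the whole tuple as it is, you can write the following: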
outfile.write(str(item) + "\n")
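
Because the items come from urlparse, they are named tuples, so a field can also be selected by name instead of by index. The following is only a minimal sketch under some assumptions: it targets Python 3, where urlparse lives in urllib.parse rather than in the urlparse module used in the question; the example URLs are made up; and url_lines is a hypothetical helper, not part of any library.

```python
# Python 3 location of urlparse; the question's Python 2 code uses `import urlparse`.
from urllib.parse import urlparse

# Made-up hrefs standing in for the ones scraped from the HTML file.
raw_urls = ["http://example.com/a/b?x=1", "http://example.com/c"]
urls = [urlparse(raw) for raw in raw_urls]


def url_lines(parsed_urls, field="path"):
    """Hypothetical helper: return the named field of each parsed URL as a string."""
    return [getattr(parsed, field) for parsed in parsed_urls]


# Write one path per line; each line is a plain string, so "+" works.
with open("1.txt", "w") as outfile:
    for line in url_lines(urls):
        outfile.write(line + "\n")
```

If the full URL is wanted rather than a single field, calling geturl() on each item reassembles it into a string, which can then be written the same way.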

Try print(item) and you will see that it is not a string but a tuple. A string can only be concatenated with another string.
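
A quick way to confirm this is to inspect one item directly. The snippet below is a small sketch assuming Python 3 (urllib.parse instead of the question's Python 2 urlparse module) and a made-up URL; it reproduces the TypeError and then shows one way to get a string that can be concatenated.

```python
from urllib.parse import urlparse  # Python 3; Python 2 uses `import urlparse`

item = urlparse("http://example.com/page")  # made-up URL for illustration

# The parse result is a named tuple, so it is an instance of tuple, not str.
print(isinstance(item, tuple))

error_message = None
try:
    item + "\n"  # same operation as in the question
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# geturl() rebuilds the URL as a string, which can be joined with "\n".
joined = item.geturl() + "\n"
```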