Python web scraping: writing to a file fails after scraping

python file-io


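I am practicing web scraping on my own and trying to scrape an online novel series from a Chinese online novel site with Python. After I put my Python code into a function, it seemed to stop working. The code I wrote first is as follows: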


import requests
from bs4 import BeautifulSoup


page = requests.get('https://www.51shucheng.net/zh-tw/wuxia/shediaoyingxiongzhuan')
soup = BeautifulSoup(page.content,'lxml')

page_list = soup.find_all(class_='mulu-list')
pages = page_list[0].find_all('a')
print(pages[0])

for i in range(len(pages)):
    pages[i] = pages[i].get('href')
    

with open("射雕英雄傳1.txt", "w+") as file_object:
    for i in range(len(pages)):
        file_object.write('\n\n\t{}'.format(i+1))
        page = requests.get(pages[i])
        soup = BeautifulSoup(page.content,'lxml')
        content = soup.find(class_='neirong').text
        print(content[0:20])
        file_object.write(content)


with open('射雕英雄傳1.txt') as oldfile, open('射雕英雄傳.txt', 'w') as newfile:
    for line in oldfile:
        if not ('adsbygoogle' in line):
            newfile.write(line)

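And it works fine. However, I wanted to put it in a function, so I made the following changes. Then it stopped working: the file 射雕英雄傳1.txt is still created, but it is empty.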


import requests
from bs4 import BeautifulSoup


def scraping_novel(prefix,bookname):
    page = requests.get('https://www.51shucheng.net/zh-tw/wuxia/{}'.format(prefix))
    soup = BeautifulSoup(page.content,'lxml')
    
    page_list = soup.find_all(class_='mulu-list')
    pages = page_list[0].find_all('a')
    print(pages[0])
    for i in range(len(pages)):
        pages[i] = pages[i].get('href')
        
    with open("{}1.txt".format(bookname), "w+") as file_object:
        for i in range(len(pages)):
            file_object.write('\n\n\t{}'.format(i+1))
            page = requests.get(pages[i])
            soup = BeautifulSoup(page.content,'lxml')
            content = soup.find(class_='neirong').text
            print(content[0:20])
            file_object.write(content)

    with open("{}1.txt".format(bookname)) as oldfile, open("{}1.txt".format(bookname), 'w') as newfile:
        for line in oldfile:
            if not ('adsbygoogle' in line):
                newfile.write(line)    


scraping_novel("shediaoyingxiongzhuan","射雕英雄傳")                


#failed

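I have tried two things:

Changing the file name from Chinese to English, because I thought it might be an encoding problem, but that did not help. In fact, this is not the first time I have scraped a non-English website, and I have never seen anything like this.

Checking the content with the print(content[0:20]) statement on the second-to-last line of the first with statement. The content is perfectly fine, so I think the problem is not in BS but in the file writing. Nothing is written to the output file! By the way, the output file size is zero bytes.

I would appreciate it if someone could tell me what is going on, because I still cannot figure out what went wrong.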
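
Using Python >= 3.6? Do …
As for overwriting the file, though: I guess you cannot do that. Opening the same file for reading and for writing in one with statement does not work, because mode 'w' truncates the file as soon as it is opened.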

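# Create a small test file.
with open("1.txt", "w+") as oldfile:
    oldfile.write('test')

# Reading from one file while writing to a different one: the assert passes.
differentName = "12.txt"
with open("1.txt", "r") as oldfile, open(differentName, 'w') as newfile:
    assert(len(oldfile.readlines()))

# Same name for both: opening with 'w' truncates the file before it is read,
# so readlines() returns an empty list and the assert fails.
sameName = "1.txt"
with open(sameName, "r") as oldfile, open(sameName, 'w') as newfile:
    assert(len(oldfile.readlines()))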

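The typo mentioned by Lydia van Dyke causes the same file to be opened for writing, which truncates it and ends the read stream early. So the loop over the lines of oldfile executes 0 times.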

Btw, there is no error message.
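
If the goal really is to filter a file in place, one common approach is to write the kept lines to a temporary file and then swap it over the original, so the original is never opened with 'w' while it is being read. A minimal sketch, assuming a helper name `remove_lines_in_place` and UTF-8 encoding (neither comes from the original post):

```python
import os
import tempfile


def remove_lines_in_place(path, marker, encoding="utf-8"):
    """Drop every line containing `marker` from the file at `path`.

    Hypothetical helper, not part of the original question or answers.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so that os.replace
    # does not have to cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding=encoding) as tmp, \
                open(path, encoding=encoding) as src:
            for line in src:
                if marker not in line:
                    tmp.write(line)
        # Only now is the original replaced; it was never opened with "w".
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        os.remove(tmp_path)
        raise
```

Called as `remove_lines_in_place("射雕英雄傳1.txt", "adsbygoogle")`, this would strip the ad lines without needing a second output name.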

open("{}1.txt".format(bookname), 'w') as newfile

should become

open("{}.txt".format(bookname), 'w') as newfile

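I think. It looks like a copy-paste error.

@LydiavanDyke Even with your change, I stared at it but did not see it ... for an embarrassingly long time :-)

@user9882001 diff -b or -w should tell you easily.

OK, since both answers now agree that the problem was caused by a typo, this question should be closed.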
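
One way to avoid this kind of typo altogether is to skip the intermediate file and drop the ad lines while the chapters are being written. A rough sketch under that assumption; `write_chapters` and its parameters are hypothetical names, and the chapter texts are passed in as plain strings so the example runs without network access:

```python
def write_chapters(chapters, out_path, marker="adsbygoogle", encoding="utf-8"):
    """Write chapter texts to `out_path`, skipping lines that contain `marker`.

    Hypothetical helper: `chapters` is any iterable of strings, for example
    the `.text` of each chapter's `neirong` element in the question's code.
    """
    with open(out_path, "w", encoding=encoding) as out:
        for number, text in enumerate(chapters, start=1):
            # Same chapter header as in the question's code.
            out.write("\n\n\t{}".format(number))
            for line in text.splitlines(keepends=True):
                if marker not in line:
                    out.write(line)
```

Because there is only one output file, there is no second open() call whose name could accidentally match the first.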
