Python - csvwriter empty result, but print works fine


I wrote a simple script that scrapes unicode text from a website, and I want the results written to a file line by line. The code works except for the write-to-file part: print(item) works perfectly and gives exactly the results I need.

import requests
import unicodecsv, os
from bs4 import BeautifulSoup
import re

countries = ["ar","th","bn","my","chin","de","es","fr","hi","ja","ko","pt","ru","th","tr","vi","zh"]

f = open("lesson_list.txt","wb")
w = unicodecsv.writer(f, encoding='utf-8', delimiter=',', quotechar='"')

for country in countries:

    toi = 1
    print country

    while toi<101:
        print toi,
        url = "http://www.englishspeak.com/"+ country +"/english-lesson.cfm?lessonID=" + str(toi)
        r = requests.get(url)
        soup = BeautifulSoup(r.content)
        soup.unicode

        titles = soup.find_all('font', {"color": "#006633"})

        data = [0]

        for index, item in enumerate(titles):
            tmp = titles[index].encode("utf-8")
            replaced = re.sub(r'<font color="#006633" face="Verdana" size="4">', ' ', tmp)
            replaced = re.sub(r'\n', ' ', replaced)
            replaced = re.sub(r'\r', ' ', replaced)
            replaced = re.sub(r'</font>', ' ', replaced)
            replaced = re.sub(r'\s+', ' ', replaced)
            data[index] = replaced

        toi += 1

        for index, item in enumerate(data):
            print(item)
            w.writerow(item)

Quick look here, but your list is [0], so once you enumerate past the first index there is nowhere to store anything in it (assigning to data[1] on a one-element list raises an IndexError rather than growing the list).

# try rewriting this
data = []

for index, item in enumerate(titles):
    tmp = titles[index].encode("utf-8")
    replaced = re.sub(r'<font color="#006633" face="Verdana" size="4">', ' ', tmp)
    replaced = re.sub(r'\n', ' ', replaced)
    replaced = re.sub(r'\r', ' ', replaced)
    replaced = re.sub(r'</font>', ' ', replaced)
    replaced = re.sub(r'\s+', ' ', replaced)
    data.append(replaced) # use append to add replaced to the list
...
# and writerow with data
print data # if you want to see the data on each loop
w.writerow(data)
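
To see why the original data = [0] pattern fails, here is a minimal standalone illustration (the string values are made up for the example): assigning past the end of a Python list raises an IndexError, while append grows the list one slot at a time.

# index assignment only works for slots that already exist
data = [0]
data[0] = "first item"        # fine, index 0 exists
try:
    data[1] = "second item"   # fails, the list has no index 1
except IndexError as e:
    print(e)                  # list assignment index out of range

# append creates the slots as it goes
data = []
data.append("first item")
data.append("second item")
print(data)                   # ['first item', 'second item']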

I think this should do the trick:

import csv
import re
import requests

from bs4 import BeautifulSoup

countries = ["ar", "th", "bn", "my", "chin", "de", "es", "fr", "hi", "ja", "ko",
             "pt", "ru", "th", "tr", "vi", "zh"]

FILENAME = "lesson_list.txt"


def run():
    for country in countries:
        toi = 1
        while toi < 101:
            url = "http://www.englishspeak.com/" + country + "/english-lesson.cfm?lessonID=" + str(toi)
            r = requests.get(url)
            soup = BeautifulSoup(r.content)
            soup.unicode

            titles = soup.find_all('font', {"color": "#006633"})
            data = []
            for index, item in enumerate(titles):
                replaced = re.sub('<[^>]*>|\n|\r', '', titles[index].encode("utf-8"))
                data.append(replaced)

            toi += 1
            print data, "item"

        csv_export(data)


def csv_export(data, file_name=FILENAME):
    file_item = open(file_name, 'wb')
    dict_writer = csv.writer(file_item)
    dict_writer.writerows(data)

if __name__ == '__main__':
    run()

Probably output buffering... try putting f.close() at the end of your script.

Your avatar made me laugh. Thanks, that did solve the problem. But the data in the file looks like ",Ø,Ø,Ø,³,4,-,Ù,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø" instead of the correct text.

Thanks! That fixed it! I just added

data.append(replaced.decode("utf-8"))

which seems to fix that, but now every letter in the result is separated by a comma, like this: ,บ,ท,เ,ร,ี,ย,น,9,-,ค,ุ,ณ,ต,้,อ,ง,ก,า,ร,ไ,ป,เ,ม,ื,่,อ,ไ,ห,ร,่,?,

@user1695182, you don't need the for/enumerate at all, just write the data with

w.writerow(''.join(data))

Thanks! Following your guideline I tried the above, but it didn't work, so I just put w.writerow(data) and that did the trick!

Ah right, w.writerow takes the list. Glad this helped, cheers ;)

Wow! Thank you, that's a very concise solution! I'll add it to my other scripts, thanks!
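
Putting the thread together, here is a minimal single-page sketch with the fixes the comments converge on. It reuses the question's URL with "th" and lessonID=1 as one example, opens the file in a with-block so it is flushed and closed (which covers the f.close() point), and swaps the manual regex stripping for BeautifulSoup's get_text(), which already returns unicode text; the whole list is passed to writerow, as in the final comment.

import requests
import unicodecsv
from bs4 import BeautifulSoup

# one lesson page, written as one CSV row
with open("lesson_list.txt", "wb") as f:                 # closing the file flushes the buffered rows
    w = unicodecsv.writer(f, encoding='utf-8', delimiter=',', quotechar='"')

    url = "http://www.englishspeak.com/th/english-lesson.cfm?lessonID=1"
    r = requests.get(url)
    soup = BeautifulSoup(r.content)

    titles = soup.find_all('font', {"color": "#006633"})

    data = []
    for item in titles:
        text = item.get_text()                           # unicode text, no encode/decode juggling
        data.append(" ".join(text.split()))              # collapse newlines and repeated spaces
    w.writerow(data)                                     # a list, one title per column

Passing a bare string to writerow is what produced the one-letter-per-column output earlier in the thread, since the csv writer treats a string as a sequence of characters; passing the list keeps one title per field.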