Python - csvwriter empty result, but print works fine


I wrote a simple script that scrapes unicode text from a website, and I want the results written to a file line by line. The code works except for the write-to-file part: print(item) works perfectly and gives exactly the results I need.

import requests
import unicodecsv, os
from bs4 import BeautifulSoup
import re

countries = ["ar","th","bn","my","chin","de","es","fr","hi","ja","ko","pt","ru","th","tr","vi","zh"]

f = open("lesson_list.txt","wb")
w = unicodecsv.writer(f, encoding='utf-8', delimiter=',', quotechar='"')

for country in countries:

    toi = 1
    print country

    while toi<101:
        print toi,
        url = "http://www.englishspeak.com/"+ country +"/english-lesson.cfm?lessonID=" + str(toi)
        r = requests.get(url)
        soup = BeautifulSoup(r.content)
        soup.unicode

        titles = soup.find_all('font', {"color": "#006633"})

        data = [0]

        for index, item in enumerate(titles):
            tmp = titles[index].encode("utf-8")
            replaced = re.sub(r'<font color="#006633" face="Verdana" size="4">', ' ', tmp)
            replaced = re.sub(r'\n', ' ', replaced)
            replaced = re.sub(r'\r', ' ', replaced)
            replaced = re.sub(r'</font>', ' ', replaced)
            replaced = re.sub(r'\s+', ' ', replaced)
            data[index] = replaced

        toi += 1

        for index, item in enumerate(data):
            print(item)
            w.writerow(item)

Quick look here, but your list is [0], so once you enumerate past the first index there is nowhere to store anything in it (assigning to data[1] on a one-element list raises an IndexError rather than growing the list).

# try rewriting this
data = []

for index, item in enumerate(titles):
    tmp = titles[index].encode("utf-8")
    replaced = re.sub(r'<font color="#006633" face="Verdana" size="4">', ' ', tmp)
    replaced = re.sub(r'\n', ' ', replaced)
    replaced = re.sub(r'\r', ' ', replaced)
    replaced = re.sub(r'</font>', ' ', replaced)
    replaced = re.sub(r'\s+', ' ', replaced)
    data.append(replaced) # use append to add replaced to the list
...
# and writerow with data
print data # if you want to see the data on each loop
w.writerow(data)
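
To see why the original data = [0] pattern fails, here is a minimal standalone illustration (the string values are made up for the example): assigning past the end of a Python list raises an IndexError, while append grows the list one slot at a time.

# index assignment only works for slots that already exist
data = [0]
data[0] = "first item"        # fine, index 0 exists
try:
    data[1] = "second item"   # fails, the list has no index 1
except IndexError as e:
    print(e)                  # list assignment index out of range

# append creates the slots as it goes
data = []
data.append("first item")
data.append("second item")
print(data)                   # ['first item', 'second item']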

I think this should do the trick:

import csv
import re
import requests

from bs4 import BeautifulSoup

countries = ["ar", "th", "bn", "my", "chin", "de", "es", "fr", "hi", "ja", "ko",
             "pt", "ru", "th", "tr", "vi", "zh"]

FILENAME = "lesson_list.txt"


def run():
    for country in countries:
        toi = 1
        while toi < 101:
            url = "http://www.englishspeak.com/" + country + "/english-lesson.cfm?lessonID=" + str(toi)
            r = requests.get(url)
            soup = BeautifulSoup(r.content)
            soup.unicode

            titles = soup.find_all('font', {"color": "#006633"})
            data = []
            for index, item in enumerate(titles):
                replaced = re.sub('<[^>]*>|\n|\r', '', titles[index].encode("utf-8"))
                data.append(replaced)

            toi += 1
            print data, "item"

        csv_export(data)


def csv_export(data, file_name=FILENAME):
    file_item = open(file_name, 'wb')
    dict_writer = csv.writer(file_item)
    dict_writer.writerows(data)

if __name__ == '__main__':
    run()

Probably output buffering... try putting f.close() at the end of your script.

Your avatar made me laugh. Thanks, that did solve the problem. But the data in the file looks like ",Ø,Ø,Ø,³,4,-,Ù,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø,Ø" instead of the correct text.

Thanks! That fixed it! I just added

data.append(replaced.decode("utf-8"))

which seems to fix that, but now every letter in the result is separated by a comma, like this: ,บ,ท,เ,ร,ี,ย,น,9,-,ค,ุ,ณ,ต,้,อ,ง,ก,า,ร,ไ,ป,เ,ม,ื,่,อ,ไ,ห,ร,่,?,

@user1695182, you don't need the for/enumerate at all, just write the data with

w.writerow(''.join(data))

Thanks! Following your guideline I tried the above, but it didn't work, so I just put w.writerow(data) and that did the trick!

Ah right, w.writerow takes the list. Glad this helped, cheers ;)

Wow! Thank you, that's a very concise solution! I'll add it to my other scripts, thanks!
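
Putting the thread together, here is a minimal single-page sketch with the fixes the comments converge on. It reuses the question's URL with "th" and lessonID=1 as one example, opens the file in a with-block so it is flushed and closed (which covers the f.close() point), and swaps the manual regex stripping for BeautifulSoup's get_text(), which already returns unicode text; the whole list is passed to writerow, as in the final comment.

import requests
import unicodecsv
from bs4 import BeautifulSoup

# one lesson page, written as one CSV row
with open("lesson_list.txt", "wb") as f:                 # closing the file flushes the buffered rows
    w = unicodecsv.writer(f, encoding='utf-8', delimiter=',', quotechar='"')

    url = "http://www.englishspeak.com/th/english-lesson.cfm?lessonID=1"
    r = requests.get(url)
    soup = BeautifulSoup(r.content)

    titles = soup.find_all('font', {"color": "#006633"})

    data = []
    for item in titles:
        text = item.get_text()                           # unicode text, no encode/decode juggling
        data.append(" ".join(text.split()))              # collapse newlines and repeated spaces
    w.writerow(data)                                     # a list, one title per column

Passing a bare string to writerow is what produced the one-letter-per-column output earlier in the thread, since the csv writer treats a string as a sequence of characters; passing the list keeps one title per field.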