
Python 2.7 Pandas only writes the last row to the CSV file

Tags: python-2.7, dataframe, web-scraping, python-requests, export-to-csv

I'm scraping URLs from a txt file and exporting the results to a CSV file. But after the whole process runs, my code only writes the information from the last URL. I guess I'm missing a loop, but where? Here is my code:

import requests
from bs4 import BeautifulSoup
import pandas as pd
from urllib import urlopen

file = open('urls.txt', 'r')
filelines = (line.strip() for line in file)
for code in filelines:
    site = urlopen(code)
    soup = BeautifulSoup(site, "html.parser")
    final = soup.find_all("span", {"class": "bd js-title-main-info"})
    print final

records = []
for pagetxt in final:
    print pagetxt.text
    records.append((pagetxt.text))
df = pd.DataFrame(records, columns=['product name'])  
df.to_csv('test.csv', index=False, encoding='utf-8') 

Thanks!

When you read the URLs from the file, each pass through the loop overwrites the variable final, so by the time you build records only the last URL's data is left. Try appending the data earlier, inside the loop (I've marked the changes with ######):

import requests
from bs4 import BeautifulSoup
import pandas as pd
from urllib import urlopen

file = open('urls.txt', 'r')
filelines = (line.strip() for line in file)
records = []                         ######
for code in filelines:
    site = urlopen(code)
    soup = BeautifulSoup(site, "html.parser")
    final = soup.find_all("span", {"class": "bd js-title-main-info"})
    print final

    for pagetxt in final:               ######
        print pagetxt.text              ######
        records.append(pagetxt.text)    ######

df = pd.DataFrame(records, columns=['product name'])  
df.to_csv('test.csv', index=False, encoding='utf-8')
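
A side note on style: here is a minimal sketch of the same fix in a slightly tidier form, using a with block so the input file is closed automatically, and requests (which you already import but never use) in place of urlopen. The class name bd js-title-main-info and the file names are taken straight from your snippet; adjust them if your pages differ:

import requests
from bs4 import BeautifulSoup
import pandas as pd

records = []
with open('urls.txt', 'r') as f:          # file is closed automatically
    for code in (line.strip() for line in f):
        site = requests.get(code)          # fetch the page over HTTP
        soup = BeautifulSoup(site.text, "html.parser")
        # append inside the loop so every page's results are kept
        for pagetxt in soup.find_all("span", {"class": "bd js-title-main-info"}):
            records.append(pagetxt.text)

df = pd.DataFrame(records, columns=['product name'])
df.to_csv('test.csv', index=False, encoding='utf-8')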