
Saving array output as CSV in Python


I'm trying to scrape data from a website. I'm using a loop to extract the data and store it in variables, but I can't manage to save it to a CSV file. Being new to Python and BeautifulSoup, I haven't gotten very far. Here's the code:

import requests
from bs4 import BeautifulSoup
import csv

r = "https://sofia.businessrun.bg/en/results-2018/"
content = requests.get(r)

soup = BeautifulSoup(content.text, 'html.parser')


for i in range (1,5):
    team_name= soup.find_all(class_="column-3")
    team_time= soup.find_all(class_="column-5")


for i in range (1,5):
  print (team_name[i].text)
  print (team_time[i].text)

with open("new_file.csv","w+") as my_csv:
    csvWriter = csv.writer(my_csv,delimiter=',')
    csvWriter.writerows(team_name)

Any help would be appreciated.
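For reference, the original csv approach can also work once the tag text is pulled out before writing. Below is a minimal sketch, assuming the page keeps the same column-3 / column-5 classes and that names and times line up row by row:

import csv

import requests
from bs4 import BeautifulSoup

url = "https://sofia.businessrun.bg/en/results-2018/"
soup = BeautifulSoup(requests.get(url).text, 'html.parser')

# find_all already returns every matching cell, so one call per column is enough
team_names = [tag.text.strip() for tag in soup.find_all(class_="column-3")]
team_times = [tag.text.strip() for tag in soup.find_all(class_="column-5")]

# newline="" avoids blank lines on Windows; zip pairs each name with its time
with open("new_file.csv", "w", newline="", encoding="utf-8") as my_csv:
    csvWriter = csv.writer(my_csv, delimiter=',')
    csvWriter.writerows(zip(team_names, team_times))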

I found another way to do the scraping and save the result to a CSV by using pandas. Here's the code:

import requests

# I changed this
import pandas as pd

from bs4 import BeautifulSoup
import csv

r = "https://sofia.businessrun.bg/en/results-2018/"
content = requests.get(r)

soup = BeautifulSoup(content.text, 'html.parser')


# find_all already returns every matching cell, so one call per column is enough
team_name = soup.find_all(class_="column-3")
team_time = soup.find_all(class_="column-5")

# I changed this to have strings in place of the tags
tn_list = [str(x) for x in team_name]
tt_list = [str(x) for x in team_time]

for i in range (1,5):
    print(team_name[i].text)
    print(team_time[i].text)

# I put the result in a dataframe
df = pd.DataFrame({"teamname" : tn_list, "teamtime" : tt_list})

# I use a regex to clean your data (get rid of the HTML tags);
# regex=True keeps this working on pandas 2.x, where str.replace is literal by default
df.teamname = df.teamname.str.replace("<[^>]*>", "", regex=True)
df.teamtime = df.teamtime.str.replace("<[^>]*>", "", regex=True)

# The first row is actually the column name
df.columns = df.iloc[0]
df = df.iloc[1:]

# I send it to a csv
df.to_csv(r"path\to\new_file.csv")

This should work properly.
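As an aside, pandas can often read HTML tables directly with pandas.read_html, which skips BeautifulSoup and the regex cleanup entirely. This is only a sketch: it assumes the results sit in a regular <table> element (not verified for this page) and that lxml or html5lib is installed:

import io

import pandas as pd
import requests

url = "https://sofia.businessrun.bg/en/results-2018/"
html = requests.get(url).text

# read_html returns one DataFrame per <table> on the page;
# taking tables[0] assumes the results table is the first one
tables = pd.read_html(io.StringIO(html))
tables[0].to_csv("new_file.csv", index=False)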

What happens when you run it? Are there any errors?
It works great! Thank you very much.