Download files from FTP or HTTP using filenames from a CSV (Python 3)
I have a CSV file of products for an e-commerce site I work on, along with FTP access to the corresponding image for each product (~15K products). I would like to use Python to pull only the images listed in the CSV, over either FTP or HTTP, and save them locally.
import urllib.request
import urllib.parse
import re

url = 'http://www.fakesite.com/pimages/filename.jpg'
split = urllib.parse.urlsplit(url)
filename = split.path.split("/")[-1]
urllib.request.urlretrieve(url, filename)
print(filename)
saveFile = open(filename, 'r')
saveFile.close()

import csv

with open('test.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=",")
    images = []
    for row in readCSV:
        image = row[14]
        print(image)
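The code I currently have can extract the filename from a URL and save the file under that filename. It can also extract the image filenames from the CSV file. (The filename and the image name are exactly the same.) What I need it to do is append the filename from the CSV to the end of the URL, and then save that file under the filename.
I have since progressed to this: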
import urllib.request
import urllib.parse
import re
import os
import csv

with open('test.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=",")
    images = []
    for row in readCSV:
        image = row[14]
        images.append(image)

x = 'http://www.fakesite.com/pimages/'
url = os.path.join(x, image)
split = urllib.parse.urlsplit(url)
filename = split.path.split("/")[-1]
urllib.request.urlretrieve(url, filename)
saveFile = open(filename, 'r')
saveFile.close()
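Now this is good. It works fine. It pulls the correct filename from the CSV file, appends it to the end of the URL, downloads the file, and saves it under that filename.
However, I can't seem to figure out how to make this work for a CSV file with multiple rows. Right now it takes the last row and pulls the relevant information. Ideally, I would use a CSV file containing all the products, and it would go through and download every image, not just the last one.
You're doing some strange things.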
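The reason only the last image gets downloaded is that the download code sits after the `for` loop, so it runs exactly once, with `image` still holding the value from the final row. A tiny illustration of that behaviour, using made-up rows instead of the real CSV (the filenames here are hypothetical):

```python
# Hypothetical rows standing in for the CSV contents.
rows = [["a.jpg"], ["b.jpg"], ["c.jpg"]]

images = []
for row in rows:
    image = row[0]
    images.append(image)

# After the loop has finished, `image` still refers to the value
# assigned in the last iteration.
last_seen = image

# Work that should happen for every row has to live inside a loop.
processed = []
for image in images:
    processed.append("http://www.fakesite.com/pimages/" + image)
```

Here `last_seen` ends up as the final filename, while `processed` gets one URL per row.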
import urllib.request
import csv

# the images list should be outside the with block
images = []
IMAGE_COLUMN = 14

with open('test.csv') as csvfile:
    # read csv
    readCSV = csv.reader(csvfile, delimiter=",")
    for row in readCSV:
        # I guess 14 is the column-index of the image-name like 'image.jpg'
        # I've put it in some constant
        # now append all the image-names into the list
        images.append(row[IMAGE_COLUMN])
        # no need for the following
        # image = row[14]
        # images.append(image)

# make sure, root_url ends with a slash
# x was some strange name for a url
root_url = 'http://www.fakesite.com/pimages/'

# iterate through the list
for image in images:
    # you don't need os.path.join, because that's operating system dependent.
    # you don't need to urlsplit, because you have created the url yourself.
    # you don't need to split the filename as it is the image name
    # with the following line, the root_url must end with a slash
    url = root_url + image
    # urlretrieve saves the file as whatever image is into the current directory
    urllib.request.urlretrieve(url, image)
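If you would rather not depend on the root URL ending with a slash, or if some image names might contain spaces, the standard library's `urllib.parse` can build the URL for you. This is only a sketch: `build_image_url` is a helper name invented for this example, and it assumes the CSV holds plain filenames rather than full paths.

```python
from urllib.parse import quote, urljoin

ROOT_URL = 'http://www.fakesite.com/pimages/'


def build_image_url(root_url, image_name):
    """Join root_url and image_name into a single URL (hypothetical helper)."""
    # urljoin replaces the last path segment when the base has no trailing
    # slash, so add one if it is missing.
    if not root_url.endswith('/'):
        root_url += '/'
    # quote() percent-encodes characters such as spaces; stripping leading
    # slashes keeps urljoin from treating the name as an absolute path.
    return urljoin(root_url, quote(image_name.lstrip('/')))
```

For example, `build_image_url(ROOT_URL, 'my image.jpg')` yields a URL ending in `my%20image.jpg`, while the local filename passed to `urlretrieve` can stay as the unencoded name.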
Or in short, this is all you need:
import urllib.request
import csv

IMAGE_COLUMN = 14
ROOT_URL = 'http://www.fakesite.com/pimages/'

images = []
with open('test.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=",")
    for row in readCSV:
        images.append(row[IMAGE_COLUMN])

for image in images:
    url = ROOT_URL + image
    urllib.request.urlretrieve(url, image)
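With ~15K products, a single missing image or a short row would stop the loop above with an exception. A possible, more defensive variant is sketched below. The function names, the `has_header` flag, and the `fetch` parameter are all assumptions made for this example and not part of the original code; `fetch` exists only so the download step can be swapped out, for instance when trying the logic without network access.

```python
import csv
import urllib.error
import urllib.request

IMAGE_COLUMN = 14
ROOT_URL = 'http://www.fakesite.com/pimages/'


def read_image_names(csv_path, column=IMAGE_COLUMN, has_header=False):
    """Return the non-empty values found in `column` of the CSV file."""
    names = []
    with open(csv_path, newline='') as csvfile:
        reader = csv.reader(csvfile, delimiter=',')
        if has_header:
            next(reader, None)  # skip the header row, if the file has one
        for row in reader:
            # Skip short rows and blank cells instead of raising IndexError.
            if len(row) > column and row[column].strip():
                names.append(row[column].strip())
    return names


def download_images(names, root_url=ROOT_URL, fetch=urllib.request.urlretrieve):
    """Try to download every name; return a list of (name, error) failures."""
    failed = []
    for name in names:
        try:
            fetch(root_url + name, name)
        except (urllib.error.URLError, OSError) as exc:
            # HTTPError (e.g. a 404) is a subclass of URLError:
            # record the failure and carry on with the next image.
            failed.append((name, exc))
    return failed
```

Calling `download_images(read_image_names('test.csv'))` would then attempt every image and return the ones that could not be fetched, so they can be retried or reported afterwards. `urllib.request` also understands `ftp://` URLs, so the same loop may work against the FTP server if `ROOT_URL` points there, assuming the credentials can be supplied in the URL.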
Wow! That worked. Sorry for doing strange things; I'm very new to Python. Thanks for your help. This is my next question, which I've posted below.