Download files from FTP or HTTP using file names from a CSV (Python 3)


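I have a CSV file of the products for an e-commerce site I work on, as well as FTP access to the corresponding image for each product (~15K products).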

I'd like to use Python to pull only the images listed in the CSV, from either FTP or HTTP, and save them locally.

import urllib.request
import urllib.parse
import re

url='http://www.fakesite.com/pimages/filename.jpg'

split = urllib.parse.urlsplit(url)
filename = split.path.split("/")[-1]
urllib.request.urlretrieve(url, filename)

print(filename)

saveFile = open(filename,'r')
saveFile.close()

import csv

with open('test.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=",")

    images = []

    for row in readCSV:
        image = row[14]

print(image)
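The code I currently have can extract the file name from a URL and save the file under that name. It can also extract the image file names from the CSV file. (The file names and the image names are exactly the same.) What I need it to do is append the file name from the CSV to the end of the URL, and then save the file under that file name.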

I have since progressed to this:

import urllib.request
import urllib.parse
import re
import os
import csv

with open('test.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=",")

    images = []

    for row in readCSV:
        image = row[14]

        images.append(image)


x ='http://www.fakesite.com/pimages/'

url = os.path.join (x,image)

split = urllib.parse.urlsplit(url)
filename = split.path.split("/")[-1]
urllib.request.urlretrieve(url,filename)



saveFile = open(filename,'r')
saveFile.close()
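Now this is great. It works well: it pulls the correct file name from the CSV file, appends it to the end of the URL, downloads the file, and saves it under that file name.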


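However, I can't seem to figure out how to make this work for a CSV file with multiple rows. As it stands, it takes only the last row and pulls the relevant information from it. Ideally, I would use the CSV file with all of the products in it, and it would go through and download every image, not just the last one.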
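
For reference, the "only the last row" behaviour can be reproduced without any CSV or network access. This is a minimal sketch; the `rows` list and its file names are made up to stand in for the CSV contents. It suggests the cause is that everything after the loop uses the loop variable `image`, which by then holds only the final value, rather than the `images` list.

```python
# Hypothetical rows standing in for the CSV contents (file names are made up).
rows = [["a.jpg"], ["b.jpg"], ["c.jpg"]]

images = []
for row in rows:
    image = row[0]
    images.append(image)

# After the loop, `image` still holds only the value from the final iteration,
# while `images` holds every value collected along the way.
last_only = image
print(last_only)
print(images)
```

So any download step placed after the loop and written against `image` runs once, for the last row only.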

You are doing some strange things.

import urllib.request
import csv

# the images list should be outside the with block
images = []
IMAGE_COLUMN = 14

with open('test.csv') as csvfile:
    # read csv
    readCSV = csv.reader(csvfile, delimiter=",")
    for row in readCSV:
        # I guess 14 is the column-index of the image-name like 'image.jpg'
        # I've put it in some constant  

        # now append all the image-names into the list
        images.append(row[IMAGE_COLUMN])

        # no need for the following
        # image = row[14]
        # images.append(image)

# make sure root_url ends with a slash
# x was a strange name for a URL
root_url = 'http://www.fakesite.com/pimages/'

# iterate through the list
for image in images:
    # you don't need os.path.join, because that's operating system dependent.
    # you don't need to urlsplit, because you have created the url yourself.
    # you don't need to split the filename as it is the image name
    # with the following line, the root_url must end with a slash
    url = root_url + image

    # urlretrieve saves the file as whatever image is into the current directory
    urllib.request.urlretrieve(url, image)
Or, in short, this is all you need:

import urllib.request
import csv

IMAGE_COLUMN = 14
ROOT_URL = 'http://www.fakesite.com/pimages/'
images = []

with open('test.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=",")
    for row in readCSV:
        images.append(row[IMAGE_COLUMN])

for image in images:
    url = ROOT_URL + image
    urllib.request.urlretrieve(url, image)
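
With ~15K products, a single bad row or missing image would stop the short version partway through. Below is a slightly more defensive sketch of the same idea, not a definitive implementation. The helper names `read_image_names`, `build_url`, and `download_all` are made up for illustration, and it assumes that column 14 holds a bare file name and that rows with a short or empty image cell should be skipped. If the CSV has a header row, the header text would be treated as a file name and reported as a failed download.

```python
import csv
import urllib.parse
import urllib.request

IMAGE_COLUMN = 14
ROOT_URL = 'http://www.fakesite.com/pimages/'


def read_image_names(csv_path, column=IMAGE_COLUMN):
    """Return the non-empty file names found in one column of the CSV."""
    names = []
    # newline='' is what the csv module documentation recommends for file objects
    with open(csv_path, newline='') as csvfile:
        for row in csv.reader(csvfile, delimiter=','):
            # skip rows that are too short or whose image cell is blank
            if len(row) > column and row[column].strip():
                names.append(row[column].strip())
    return names


def build_url(root_url, name):
    """Join root and file name, percent-encoding spaces and similar characters."""
    # root_url must end with a slash, otherwise urljoin replaces its last segment
    return urllib.parse.urljoin(root_url, urllib.parse.quote(name))


def download_all(csv_path, root_url=ROOT_URL):
    """Try to download every listed image; return the names that failed."""
    failed = []
    for name in read_image_names(csv_path):
        try:
            urllib.request.urlretrieve(build_url(root_url, name), name)
        except OSError as exc:
            # URLError and HTTPError are both subclasses of OSError
            print(f'could not download {name}: {exc}')
            failed.append(name)
    return failed
```

Calling `download_all('test.csv')` would then save each image into the current directory and return the list of names that could not be fetched, so they can be retried or checked by hand.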

Wow! That works. I'm sorry for doing strange things; I'm very new to Python. Thank you for your help. Here is my next question, which I have posted below.