How to remove the beginning of a link and add a forward slash in Python


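I have some links that I scraped from a website. The problem is that the links are not quite right: they will not download the data unless I make two changes:

1) remove the VM300:1 prefix at the start

2) add a forward slash after .au

Is there a way to do this automatically? I have about a thousand links, so fixing them by hand is not practical.

Here is a sample of my URLs: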

urls = [
    "VM300:1 https://www.powerwater.com.au__data/assets/excel_doc/0011/172775/Market_Information_System_Control_daily_trading_day_190130.xlsx",
    "VM300:1 https://www.powerwater.com.au__data/assets/excel_doc/0004/172732/Market_Information_System_Control_daily_trading_day_190129.xlsx",
    "VM300:1 https://www.powerwater.com.au__data/assets/excel_doc/0010/172675/Market_Information_System_Control_daily_trading_day_190128.xlsx",
    "VM300:1 https://www.powerwater.com.au__data/assets/excel_doc/0009/172674/Market_Information_System_Control_daily_trading_day_190127.xlsx",
    "VM300:1 https://www.powerwater.com.au__data/assets/excel_doc/0008/172673/Market_Information_System_Control_daily_trading_day_190126.xlsx",
    "VM300:1 https://www.powerwater.com.au__data/assets/excel_doc/0007/172672/Market_Information_System_Control_daily_trading_day_190125.xlsx",
    "VM300:1 https://www.powerwater.com.au__data/assets/excel_doc/0011/172595/Market_Information_System_Control_daily_trading_day_190124.xlsx"
]

Edit 1

Error:

Traceback (most recent call last):
  File "C:/Users/george/Desktop/NT/stack NT.py", line 19, in <module>
    r = requests.get(urls)
  File "C:\Python27\lib\site-packages\requests\api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "C:\Python27\lib\site-packages\requests\api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 640, in send
    adapter = self.get_adapter(url=request.url)
  File "C:\Python27\lib\site-packages\requests\sessions.py", line 731, in get_adapter
    raise InvalidSchema("No connection adapters were found for '%s'" % url)
InvalidSchema: No connection adapters were found for '['https://www.powerwater.com.au/__data/assets/excel_doc/0011/172775/Market_Information_System_Control_daily_trading_day_190130.xlsx', 'https://www.powerwater.com.au/__data/assets/excel_doc/0004/172732/Market_Information_System_Control_daily_trading_day_190129.xlsx', 'https://www.powerwater.com.au/__data/assets/excel_doc/0010/172675/Market_Information_System_Control_daily_trading_day_190128.xlsx', 'https://www.powerwater.com.au/__data/assets/excel_doc/0009/172674/Market_Information_System_Control_daily_trading_day_190127.xlsx', 'https://www.powerwater.com.au/__data/assets/excel_doc/0008/172673/Market_Information_System_Control_daily_trading_day_190126.xlsx', 'https://www.powerwater.com.au/__data/assets/excel_doc/0007/172672/Market_Information_System_Control_daily_trading_day_190125.xlsx', 'https://www.powerwater.com.au/__data/assets/excel_doc/0011/172595/Market_Information_System_Control_daily_trading_day_190124.xlsx']'

Thanks

Use a list comprehension with split and replace:

urls = [x.split()[1].replace('.au__', '.au/__') for x in urls]

Another idea, using replace twice:

urls = [x.replace('VM300:1 ','').replace('.au__', '.au/__') for x in urls]
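
If the cleanup has to run over about a thousand links, it may help to wrap the same two steps in a small function. This is only a sketch of the approach above: the helper name clean_url is made up for illustration, and it assumes every entry ends with the link itself and that the only missing slash is the one between .au and __data.

```python
def clean_url(raw):
    """Strip a leading token such as 'VM300:1' and add the missing '/' after '.au'."""
    # Keep only the last whitespace-separated token, i.e. the link itself.
    url = raw.split()[-1]
    # Insert the slash once; an already-correct '.au/__' does not contain '.au__',
    # so running this on a fixed URL leaves it unchanged.
    return url.replace('.au__', '.au/__', 1)


urls = [
    "VM300:1 https://www.powerwater.com.au__data/assets/excel_doc/0011/172775/Market_Information_System_Control_daily_trading_day_190130.xlsx",
    "VM300:1 https://www.powerwater.com.au__data/assets/excel_doc/0004/172732/Market_Information_System_Control_daily_trading_day_190129.xlsx",
]

cleaned = [clean_url(u) for u in urls]
for u in cleaned:
    print(u)
```

Taking the last token with split()[-1] rather than split()[1] means an entry that has already lost its prefix is still handled.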


How about this:

[url.split()[-1].replace('.au__', '.au/__') for url in urls]
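
Or maybe urls = [x.split()[1].replace('.au__', '.au/') for x in urls], depending on what they need.

@Shinratensei - I just tested the output links and the files download correctly, so the __ is needed.

@Shinratensei - No, thanks. That was my first idea too, but the links are wrong and return an error page (page not found).

@newtoR - There is a typo: r = requests.get(urls) should be r = requests.get(url) (drop the trailing s).

@newtoR - and also (path(url)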
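
The InvalidSchema error in the traceback comes from passing the whole list to requests.get, which expects a single URL string. A rough sketch of requesting the links one at a time follows. The helper names download_all and filename_from_url and the fetch parameter are hypothetical, introduced here only for illustration, and the sketch assumes Python 3 (the traceback shows Python 2.7, where urlsplit lives in the urlparse module instead).

```python
import os
from urllib.parse import urlsplit


def filename_from_url(url):
    """Return the last path component of the URL, e.g. '..._190130.xlsx'."""
    return os.path.basename(urlsplit(url).path)


def download_all(urls, fetch, out_dir="."):
    """Fetch each URL separately and save the bytes under its own file name.

    fetch is any callable that takes one URL string and returns bytes;
    it is a parameter here so the loop itself stays library-agnostic.
    """
    saved = []
    for url in urls:  # one request per URL, never the whole list at once
        data = fetch(url)
        path = os.path.join(out_dir, filename_from_url(url))
        with open(path, "wb") as fh:
            fh.write(data)
        saved.append(path)
    return saved
```

With requests installed, a call along the lines of download_all(urls, lambda u: requests.get(u).content) would issue one request per cleaned link.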