
Force saving an xml file as xls format in Python


I have a piece of code here that downloads this fund data in Excel 2004 xml format:

import urllib2
url = 'https://www.ishares.com/us/258100/fund-download.dl'
s = urllib2.urlopen(url)
contents = s.read()
file = open("export.xml", 'w')
file.write(contents)
file.close()
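My goal is to programmatically convert this file to .xls and then read it into a pandas DataFrame. I know I could parse the file with Python's xml library. However, I noticed that if I open the xml file and manually save it with the xls file extension, pandas can read it and I get the result I want.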
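For reference, the xml-library route mentioned above can be sketched with the standard library alone. This is a minimal sketch that assumes the download uses the Excel "XML Spreadsheet" layout (Workbook > Worksheet > Table > Row > Cell > Data); the `SAMPLE` document and the `read_rows` helper are hypothetical and exist only for illustration. It ignores `ss:Index` attributes, so rows with skipped cells would need extra handling.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Excel "XML Spreadsheet" format.
NS = {"ss": "urn:schemas-microsoft-com:office:spreadsheet"}

# Tiny hypothetical sample; a real download has more rows, sheets and styles.
SAMPLE = """<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Holdings">
    <Table>
      <Row>
        <Cell><Data ss:Type="String">Name</Data></Cell>
        <Cell><Data ss:Type="String">Price</Data></Cell>
      </Row>
      <Row>
        <Cell><Data ss:Type="String">Bond A</Data></Cell>
        <Cell><Data ss:Type="Number">101.5</Data></Cell>
      </Row>
    </Table>
  </Worksheet>
</Workbook>"""


def read_rows(xml_text, sheet_index=0):
    """Return one worksheet as a list of rows, each a list of cell strings."""
    root = ET.fromstring(xml_text)
    sheet = root.findall("ss:Worksheet", NS)[sheet_index]
    rows = []
    for row in sheet.iterfind("ss:Table/ss:Row", NS):
        cells = []
        for cell in row.iterfind("ss:Cell", NS):
            data = cell.find("ss:Data", NS)
            # Empty cells have no <Data> child.
            cells.append(data.text if data is not None else None)
        rows.append(cells)
    return rows


rows = read_rows(SAMPLE)
print(rows)
```

The resulting list could then be passed to pandas, for example `pd.DataFrame(rows[1:], columns=rows[0])`, with numeric columns converted afterwards.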

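I also tried renaming the file extension with the code below, but this does not force a save in the new format: the file remains the underlying xml document, just with an xls file extension.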

import os
import sys
folder = '~/models'
for filename in os.listdir(folder):
    if filename.startswith('export'):
        infilename = filename
        newname = infilename.replace('newfile.xls', 'f.xls')
        output = os.rename(infilename, newname)
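A small sketch of why renaming cannot work: `os.rename` changes only the file name, never the bytes on disk. The temporary folder and file names below are hypothetical and used only for the demonstration. Two side notes on the snippet above: `os.listdir` does not expand `~` (that needs `os.path.expanduser`), and `os.rename` returns `None`.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    src = os.path.join(folder, "export.xml")
    dst = os.path.join(folder, "export.xls")

    # Write a stand-in for the downloaded XML document.
    with open(src, "wb") as f:
        f.write(b'<?xml version="1.0"?><Workbook/>')

    # Rename: only the directory entry changes, the content is untouched.
    os.rename(src, dst)

    with open(dst, "rb") as f:
        head = f.read(5)

# The "xls" file still starts with the XML declaration.
still_xml = head == b"<?xml"
print(still_xml)
```

An actual conversion has to go through something that understands both formats, such as Excel itself or a parser that rewrites the data.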

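With Excel for Windows, consider having Python connect to the Excel object library over COM using the win32com module. Specifically, use Excel's OpenXML and SaveAs methods to save the downloaded xml as csv: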


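With Excel for Mac, consider a VBA solution, since VBA is the language most commonly used to interface with the Excel object library. The code below downloads the iShares XML and then saves it as csv, using the OpenXML and SaveAs methods, so that it can be imported into pandas.

Note: this has not been tested on a Mac, but hopefully the Microsoft.XMLHTTP object is available.

VBA (saved in a macro-enabled workbook)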

Option Explicit

Sub DownloadXML()
On Error GoTo ErrHandle
    Dim wb As Workbook
    Dim xmlDoc As Object
    Dim xmlfile As String, csvfile As String

    xmlfile = ActiveWorkbook.Path & "\file.xml"
    csvfile = ActiveWorkbook.Path & "\file.csv"

    Call DownloadFile("https://www.ishares.com/us/258100/fund-download.dl", xmlfile)

    Set wb = Excel.Workbooks.OpenXML(xmlfile)
    wb.SaveAs csvfile, 6
    wb.Close True

ExitHandle:
    Set wb = Nothing
    Set xmlDoc = Nothing
    Exit Sub

ErrHandle:
    MsgBox Err.Number & " - " & Err.Description, vbCritical
    Resume ExitHandle
End Sub

Function DownloadFile(url As String, filePath As String)
    Dim WinHttpReq As Object, oStream As Object

    Set WinHttpReq = CreateObject("Microsoft.XMLHTTP")
    WinHttpReq.Open "GET", url, False
    WinHttpReq.send

    If WinHttpReq.Status = 200 Then
        Set oStream = CreateObject("ADODB.Stream")
        oStream.Open
        oStream.Type = 1
        oStream.Write WinHttpReq.responseBody
        oStream.SaveToFile filePath, 2 ' 1 = no overwrite, 2 = overwrite
        oStream.Close
    End If

    Set WinHttpReq = Nothing
    Set oStream = Nothing
End Function

Python


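I found that the website I was using has developed an API, which makes the web scraping unnecessary. The data can then be requested with Python's requests module.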


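"... if I open the XML file and save it manually": using what application? Excel? If it is Excel and performance is not a concern, you can script through OLE from Python the same conversion you are now doing by hand.

@Sophos Yes, I save it manually with Excel. Thanks, I will look into oletools.

I should have specified earlier that I am running this script on mac os, so I cannot use the win32com client.

I actually figured that, since you referenced Excel 2004 and there is no such Windows version. The answer may still help future readers. Consider building the same macro version and calling it from Python through the command line.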
import os
import win32com.client as win32    
import requests as r
import pandas as pd

cd = os.path.dirname(os.path.abspath(__file__))

url = "http://www.ishares.com/us/258100/fund-download.dl"
xmlfile = os.path.join(cd, 'iSharesDownload.xml')
csvfile = os.path.join(cd, 'iSharesDownload.csv')

# DOWNLOAD FILE
try:
    rqpage = r.get(url)
    with open(xmlfile, 'wb') as f:
        f.write(rqpage.content)    
except Exception as e:
    print(e)    
finally:
    rqpage = None

# EXCEL COM TO SAVE EXCEL XML AS CSV
if os.path.exists(csvfile):
    os.remove(csvfile)
try:
    excel = win32.gencache.EnsureDispatch('Excel.Application')
    wb = excel.Workbooks.OpenXML(xmlfile)
    wb.SaveAs(csvfile, 6)
    wb.Close(True)    
except Exception as e:
    print(e)    
finally:
    # RELEASES RESOURCES
    wb = None
    excel = None

# IMPORT CSV INTO PANDAS DATAFRAME
df = pd.read_csv(csvfile, skiprows=8)
print(df.describe())

#        Weight (%)       Price  Coupon (%)     YTM (%)  Yield to Worst (%)    Duration
# count  625.000000  625.000000  625.000000  625.000000          625.000000  625.000000
# mean     0.159888  101.298768    6.500256    5.881168            5.313760    2.128688
# std      0.126833   10.469460    1.932744    4.059226            4.224268    1.283360
# min     -0.110000    0.000000    0.000000    0.000000           -8.030000    0.000000
# 25%      0.090000  100.380000    5.130000    3.430000            3.070000    0.970000
# 50%      0.130000  102.940000    6.380000    4.930000            3.910000    2.240000
# 75%      0.190000  105.000000    7.630000    6.820000            6.070000    3.260000
# max      1.750000  128.750000   12.500000   40.900000           40.900000    5.060000
import pandas as pd

csvfile = "/path/to/file.csv"

# IMPORT CSV INTO PANDAS DATAFRAME
df = pd.read_csv(csvfile, skiprows=8)
print(df.describe())

#        Weight (%)       Price  Coupon (%)     YTM (%)  Yield to Worst (%)    Duration
# count  625.000000  625.000000  625.000000  625.000000          625.000000  625.000000
# mean     0.159888  101.298768    6.500256    5.881168            5.313760    2.128688
# std      0.126833   10.469460    1.932744    4.059226            4.224268    1.283360
# min     -0.110000    0.000000    0.000000    0.000000           -8.030000    0.000000
# 25%      0.090000  100.380000    5.130000    3.430000            3.070000    0.970000
# 50%      0.130000  102.940000    6.380000    4.930000            3.910000    2.240000
# 75%      0.190000  105.000000    7.630000    6.820000            6.070000    3.260000
# max      1.750000  128.750000   12.500000   40.900000           40.900000    5.060000
import requests

url = "https://www.blackrock.com/tools/hackathon/performance"
# tickers: an iterable of ticker symbols, defined elsewhere
for ticker in tickers:
    params = {'identifiers': ticker,
              'returnsType': 'MONTHLY'}
    request = requests.get(url, params=params)
    json = request.json()