Python pandas: import from Excel, export to HDF5

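I'm working with pandas and PyTables. First, I import a table from Excel that contains integer and float columns, as well as other columns containing strings and even tuples. The Excel import offers a limited number of options and, unfortunately, unlike the CSV import, data types cannot be specified during the import; they have to be converted from their inferred types afterwards.

That said, all non-numeric values are apparently imported as unicode text, which is incompatible with the later export to HDF5. Is there a simple way to convert all unicode columns (and all column headers) to an HDF5-compatible string format?

More details: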

>>> metaFrame.head()
                               ProjectName Company ContactName  \
LocationID                                                       
935          PCS Petaluma High School Site  Testco   Test Name   
937            PCS Casa Grande High School  Testco   Test Name   
3465               FUSD Fowler High School  Testco   Test Name   
3466             FUSD Sutter Middle School  Testco   Test Name   
3467        FUSD Fremont Elementary School  Testco   Test Name   

                      Contactemail  \
LocationID                           
935         test.address@email.com   
937         test.address@email.com   
3465        test.address@email.com   
3466        test.address@email.com   
3467        test.address@email.com   

                                                         Link  Systemsize(kW)  \
LocationID                                                                      
935         https://internal.testco.com/locations/935/syst...             NaN   
937         https://internal.testco.com/locations/937/syst...          675.39   
3465        https://internal.testco.com/locations/3465/sys...          384.30   
3466        https://internal.testco.com/locations/3466/sys...          198.90   
3467        https://internal.testco.com/locations/3467/sys...           35.10   

           SystemCheckStartdate SystemCheckActive  \
LocationID                                          
935         2013-10-01 00:00:00              True   
937         2013-10-01 00:00:00              True   
3465        2013-10-01 00:00:00              True   
3466        2013-10-01 00:00:00              True   
3467        2013-10-01 00:00:00              True   

            YTDProductionPriortostartdate  NumberofInverters/cktsmonitored  \
LocationID                                                                   
935                                   NaN                              NaN   
937                                   NaN                              NaN   
3465                                  NaN                              NaN   
3466                                  NaN                              NaN   
3467                                  NaN                              NaN   

                                                  InverterMfg InverterModel  \
LocationID                                                                    
935                                     PV Powered : PVP260KW           NaN   
937                                     PV Powered : PVP260KW           NaN   
3465        Advanced Energy Industries : Solaron 333kW (31...           NaN   
3466                                    PV Powered : PVP260KW           NaN   
3467                                 PV Powered : PVP35KW-480           NaN   

            InverterCECefficiency ModuleMfg Modulemodel  \
LocationID                                                
935                          97.0       NaN         NaN   
937                          97.0       NaN         NaN   
3465                         97.5       NaN         NaN   
3466                         97.0       NaN         NaN   
3467                         96.0       NaN         NaN   

            Moduleirradiancefactor  Moduleirradiancefactorslope  \
LocationID                                                        
935                            NaN                          NaN   
937                            NaN                          NaN   
3465                           NaN                          NaN   
3466                           NaN                          NaN   
3467                           NaN                          NaN   

            StraightLineIntercept  ModuleTemp-PwrDerate MeterDK      
LocationID                                                           
935                           NaN                 0.005    3291 ...  
937                           NaN                 0.005   11548 ...  
3465                          NaN                 0.005   19248 ...  
3466                          NaN                 0.005   15846 ...  
3467                          NaN                 0.005   15847 ...  

[5 rows x 27 columns]

>>> metaFrame.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 43 entries, 935 to 5844
Data columns (total 27 columns):
ProjectName                        43  non-null values
Company                            43  non-null values
ContactName                        43  non-null values
Contactemail                       43  non-null values
Link                               43  non-null values
Systemsize(kW)                     42  non-null values
SystemCheckStartdate               37  non-null values
SystemCheckActive                  43  non-null values
YTDProductionPriortostartdate      0  non-null values
NumberofInverters/cktsmonitored    2  non-null values
InverterMfg                        42  non-null values
InverterModel                      8  non-null values
InverterCECefficiency              33  non-null values
ModuleMfg                          0  non-null values
Modulemodel                        0  non-null values
Moduleirradiancefactor             0  non-null values
Moduleirradiancefactorslope        0  non-null values
StraightLineIntercept              0  non-null values
ModuleTemp-PwrDerate               43  non-null values
MeterDK                            43  non-null values
Genfieldname                       43  non-null values
WSDK                               34  non-null values
WSirradianceField                  43  non-null values
WSCellTempField                    43  non-null values
MiscDerate                         1  non-null values
InverterDKs                        37  non-null values
Invertergenfields                  37  non-null values
dtypes: bool(1), datetime64[ns](1), float64(9), object(16)
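
One possible way to do the conversion asked about above is sketched below. It is only a sketch under stated assumptions, not a confirmed fix: the helper names `_to_text` and `stringify_object_columns` and the small `demo` frame are hypothetical, column labels are assumed to stay unique after conversion, and the approach simply turns every non-missing value in an object-dtype column into a plain `str` (so a tuple becomes its text representation and is not round-trippable as a tuple).

```python
import math

import pandas as pd


def _to_text(value):
    """Convert one cell to str, leaving missing values (None / float NaN) as they are."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return value
    return str(value)


def stringify_object_columns(frame):
    """Return a copy whose column labels and object-dtype cells are plain str.

    Hypothetical helper: numeric, bool and datetime columns are not touched.
    """
    out = frame.copy()
    # Column headers: coerce every label to str.
    out.columns = [str(label) for label in out.columns]
    for name in out.columns:
        # Only object-dtype columns can hold mixed text / tuple values.
        if out[name].dtype == object:
            out[name] = out[name].map(_to_text)
    return out


# Hypothetical stand-in for a frame read from Excel (values loosely modelled on the output above).
demo = pd.DataFrame(
    {
        "Company": ["Testco", None],
        "MeterDK": [(3291, 1), 11548],  # mixed tuple / int column -> object dtype
        "Systemsize(kW)": [675.39, 384.30],
        5: [True, False],  # non-string column label
    },
    index=pd.Index([935, 937], name="LocationID"),
)
converted = stringify_object_columns(demo)

# With PyTables installed, the export could then be attempted along these lines
# (the file name and key are made up for illustration):
# converted.to_hdf("meta.h5", key="meta", format="table")
```

Whether this is enough for the export depends on the pandas and PyTables versions in use; columns that are entirely empty may still need separate handling.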