Subsampling an .hdf5 file with multiple datasets (tags: hdf5, h5py)


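I'm trying to extract some "rows" from a large .h5 file to create a smaller sample file.

To make sure my sample looks like the original file, I draw the rows at random.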

import h5py
import numpy as np

# `args` and `target` are defined elsewhere in the script (not shown).

# Get length of files and prepare samples
source_file = h5py.File(args.data_path, "r")
dataset = source_file['X']
indices = np.sort(np.random.choice(dataset.shape[0], args.nb_rows))

# Checking we're extracting a subsample
if args.nb_rows > dataset.shape[0]:
    raise ValueError("Can't extract more rows than dataset contains. Dataset has %s rows" % dataset.shape[0])

target_file = h5py.File(target, "w")
for k in source_file.keys():
    dataset = source_file[k]
    dataset = dataset[indices, :, :, :]
    dest_dataset = target_file.create_dataset(k, shape=dataset.shape, dtype=np.float32)
    dest_dataset.write_direct(dataset)
target_file.close()
source_file.close()

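However, when nb_rows gets large (say, more than 10,000), I get

TypeError("Indexing elements must be in increasing order")

The indices are sorted, so I don't think this error should occur. Am I misunderstanding something?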

I think you're getting duplicates.

Obviously, in the args.nb_rows > dataset.shape[0] case there will be duplicates:

In [499]: np.random.choice(10, 20)
Out[499]: array([2, 4, 1, 5, 2, 8, 4, 3, 7, 0, 2, 6, 6, 8, 9, 3, 8, 4, 2, 5])
In [500]: np.sort(np.random.choice(10, 20))
Out[500]: array([1, 1, 1, 2, 2, 2, 4, 4, 4, 5, 5, 5, 5, 6, 6, 7, 8, 8, 8, 9])
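Sorting alone does not remove repeats, so a sorted array can still fail a "must be increasing" requirement if that requirement is strict. A minimal sketch of such a check follows. The helper name is_strictly_increasing is made up for illustration, and reading the requirement as strict is an assumption based on the error message:

```python
import numpy as np

def is_strictly_increasing(indices):
    """Return True only if every index is larger than the one before it."""
    indices = np.asarray(indices)
    # np.diff gives the gaps between neighbours; a gap of 0 means a repeat.
    return bool(np.all(np.diff(indices) > 0))

sorted_with_repeats = np.sort(np.array([2, 4, 1, 5, 2, 8]))  # [1 2 2 4 5 8]
print(is_strictly_increasing(sorted_with_repeats))  # False: 2 appears twice
print(is_strictly_increasing([0, 3, 7]))            # True
```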

But even when the number is smaller, you can still get duplicates:

In [502]: np.sort(np.random.choice(10, 9))
Out[502]: array([0, 0, 1, 1, 1, 5, 5, 9, 9])
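This may also explain why the error only shows up for larger nb_rows: when sampling with replacement, the chance of drawing at least one repeat grows quickly with the sample size. A rough sketch of that calculation is below. The function name and the dataset size of 1,000,000 rows are assumptions for illustration, since the real size is not given in the question:

```python
def prob_no_duplicates(n_rows, n_samples):
    """Probability that n_samples draws with replacement from n_rows are all distinct."""
    if n_samples > n_rows:
        return 0.0  # more draws than rows: a repeat is unavoidable
    p = 1.0
    for i in range(n_samples):
        # The (i+1)-th draw must avoid the i values already drawn.
        p *= (n_rows - i) / n_rows
    return p

print(prob_no_duplicates(10, 9))             # small, as in the example above
print(prob_no_duplicates(1000000, 100))      # close to 1
print(prob_no_duplicates(1000000, 10000))    # practically 0
```

With a hypothetical million-row dataset, a sample of 100 rows will almost never contain a repeat, while a sample of 10,000 rows almost always will.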

Turn off replace:
In [504]: np.sort(np.random.choice(10, 9, replace=False))
Out[504]: array([0, 1, 2, 3, 4, 5, 6, 7, 8])
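Putting this together with the code from the question, a possible version of the subsampling step could look like the sketch below. It assumes that every top-level entry of the file is a dataset with the same number of rows as 'X', and that the selected rows fit in memory. The function name subsample_h5 and its parameters are invented for this example, and unlike the original it keeps each dataset's dtype instead of casting to float32:

```python
import h5py
import numpy as np

def subsample_h5(source_path, target_path, nb_rows, key="X"):
    """Copy the same nb_rows randomly chosen rows of every dataset into a new file."""
    with h5py.File(source_path, "r") as src, h5py.File(target_path, "w") as dst:
        n_total = src[key].shape[0]
        if nb_rows > n_total:
            raise ValueError(
                "Can't extract more rows than dataset contains. "
                "Dataset has %s rows" % n_total
            )
        # replace=False gives distinct indices; sorting makes them increasing.
        indices = np.sort(np.random.choice(n_total, nb_rows, replace=False))
        for name, dataset in src.items():
            # Index the first axis only; the remaining axes are kept whole.
            subset = dataset[indices]
            dst.create_dataset(name, data=subset)
    return indices

# Example call (paths are placeholders):
# subsample_h5("big.h5", "sample.h5", nb_rows=10000)
```

Doing the size check before drawing the indices matters here, because np.random.choice with replace=False raises its own error when asked for more values than are available.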

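Is this the case for all versions of h5py and Python? I don't understand why fetching duplicate or even unordered indices works without any problem on a setup using Python 3.7, but not on another machine using Python 3.5...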
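Whatever a given installation accepts, one way to avoid depending on that behavior is to read only the distinct, sorted rows and rebuild the requested order in memory. A sketch, assuming the selection fits in memory: take_rows is a hypothetical helper, and a NumPy array stands in for the h5py dataset so the snippet runs without a file.

```python
import numpy as np

def take_rows(dataset, indices):
    """Return dataset rows for 1-D indices that may be repeated or unordered."""
    indices = np.asarray(indices)
    # unique_idx is sorted and duplicate-free; inverse maps it back to the request.
    unique_idx, inverse = np.unique(indices, return_inverse=True)
    rows = dataset[unique_idx]   # the only read from the dataset
    return rows[inverse]         # restore repeats and original order in memory

data = np.arange(20).reshape(10, 2)   # stand-in for an h5py dataset
wanted = np.array([7, 2, 2, 9, 0])
print(take_rows(data, wanted))
```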



