Python: efficiency of re.findall on large datasets


python performance


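I am working through the MIT OpenCourseWare algorithms course. One lecture mentioned that we have to be careful when using re.findall, because re is generally an exponential-complexity algorithm.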

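Is this something to worry about when parsing large files or datasets? Are there alternatives to regular expressions that can extract patterns from data efficiently?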
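To make the lecture's warning concrete, here is a minimal sketch (not part of the original question). CPython's re module uses a backtracking matcher, so a pattern with nested quantifiers can take time that grows exponentially with the input length when the match fails. The patterns and input sizes below are chosen purely for illustration; most everyday patterns do not behave this way.

```python
import re
import time

def time_failed_match(pattern, n):
    """Time re.search on 'a' * n + 'b', an input the pattern cannot match."""
    text = "a" * n + "b"
    start = time.perf_counter()
    result = re.search(pattern, text)
    return result, time.perf_counter() - start

timings = []
for n in (10, 14, 18):
    # Nested quantifiers: the engine tries many ways to split the run of 'a's.
    nested, t_nested = time_failed_match(r"(a+)+$", n)
    # Accepts the same strings, but without the nested repetition.
    simple, t_simple = time_failed_match(r"a+$", n)
    timings.append((n, nested, simple, t_nested, t_simple))
    print(f"n={n:2d}  nested={t_nested:.5f}s  simple={t_simple:.5f}s")
```

Running this should show the nested pattern slowing down sharply as n grows while the simple pattern stays fast. The exact timings depend on the machine and Python version.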

It depends on what you are trying to do.

In general, use the simplest tool that gets the job done.

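I would expect the in operator (a plain substring test) to be more efficient than a regular expression, but it does not support wildcards, repetition, and so on.

If the pattern you are looking for always fits on a single line, you can search one line at a time, processing each line (and letting it go out of memory) before moving on to the next.

If you only need to check the beginning or end of a string, use mystring.startswith() or mystring.endswith(); these methods are more efficient.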
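The suggestions above can be sketched as follows. This is only an illustration: the function name scan_lines, the sample log text, and the search strings are all hypothetical, and io.StringIO stands in for a large file opened with open().

```python
import io

def scan_lines(lines, needle, prefix, suffix):
    """Check each line with plain string methods, one line at a time."""
    hits = {"contains": [], "starts": [], "ends": []}
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if needle in line:            # substring test, no regex involved
            hits["contains"].append(lineno)
        if line.startswith(prefix):   # cheap check at the start of the line
            hits["starts"].append(lineno)
        if line.endswith(suffix):     # cheap check at the end of the line
            hits["ends"].append(lineno)
    return hits

# A file object yields lines lazily, so only one line is held at a time.
sample = io.StringIO("ERROR disk full\nINFO ok\nWARN retry ERROR\nINFO done.\n")
result = scan_lines(sample, needle="ERROR", prefix="INFO", suffix="done.")
print(result)
```

With a real file, passing the open file object as lines keeps memory use roughly constant regardless of file size, as long as individual lines are not huge.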
You may also be able to split your data into more manageable chunks.
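One way to read "manageable chunks" is to read fixed-size blocks and carry a small overlap between them, so that a match straddling a block boundary is not lost. The sketch below assumes a fixed search string rather than a regular expression. The helper name find_in_chunks and the chunk size are illustrative choices, not something the original answer specifies.

```python
import io

def find_in_chunks(stream, needle, chunk_size=1 << 16):
    """Yield absolute offsets of needle in a text stream, read chunk by chunk."""
    if not needle:
        raise ValueError("needle must be non-empty")
    overlap = len(needle) - 1
    tail = ""     # trailing characters carried over from the previous buffer
    offset = 0    # absolute position in the stream where `tail` begins
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf = tail + chunk
        pos = buf.find(needle)
        while pos != -1:
            yield offset + pos
            pos = buf.find(needle, pos + 1)
        # Keep just enough characters to catch a match crossing the boundary.
        keep = min(overlap, len(buf))
        tail = buf[len(buf) - keep:]
        offset += len(buf) - keep

# Tiny chunk size to force matches across chunk boundaries.
offsets = list(find_in_chunks(io.StringIO("xxabxxabxx"), "ab", chunk_size=3))
print(offsets)
```

Because the carried-over tail is always shorter than the search string, a match can never be reported twice.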
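If you need a multi-line search that is not anchored at the start or end and that involves wildcards or repetition, you are probably stuck with regular expressions.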
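When a regular expression is unavoidable, a minimal sketch like the one below may still help on large inputs: compile the pattern once and use finditer, which yields matches lazily instead of building the full list that re.findall returns. The BEGIN/END record format and the names RECORD and iter_records are hypothetical, made up for this example.

```python
import re

# Hypothetical multi-line record format, used only for illustration.
# re.DOTALL lets "." match newlines so one record can span several lines.
RECORD = re.compile(r"BEGIN (\w+)\n(.*?)\nEND", re.DOTALL)

def iter_records(text):
    """Yield (name, body) pairs one at a time rather than as a full list."""
    for match in RECORD.finditer(text):
        yield match.group(1), match.group(2)

sample = "BEGIN x\nline 1\nline 2\nEND\nnoise\nBEGIN y\nline 3\nEND\n"
records = list(iter_records(sample))
for name, body in records:
    print(name, repr(body))
```

The pattern uses a single lazy quantifier and no nested repetition, which avoids the kind of pattern that tends to trigger heavy backtracking.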



