Python: extracting different links from a text file? (Python, Regex) - Fatal编程技术网



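I have an unstructured .txt file, like the one shown below, that contains many different links. The links differ because each one carries its own signature:

What I want is to extract every link that starts with http://web.alphorm.com

I used the following regular expression: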

matchObj = re.findall(r'(http:// web.alphorm.com/.*&Key-Pair-Id=APKAJF2PMCJPGKXG2GEA)"}',
                      string)

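But it doesn't really give me what I want. It narrows down the text file and returns the links I'm looking for, but they come mixed with other unwanted links and text.

What is wrong?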

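The .* in your regex is greedy, which means the regex engine matches from the first occurrence of http://web.alphorm.com/ to the last occurrence of &Key-Pair-Id=APKAJF2PMCJPGKXG2GEA, including everything in between.

Try this:

matchObj = re.findall(r'(http://web.alphorm.com/.*?&Key-Pair-Id=APKAJF2PMCJPGKXG2GEA)"}', string)

Adding the ? makes the match lazy, so it consumes as few characters as possible.

Note: I also removed the space between http:// and web.alphorm.com, because I assumed it was a typo.

Do you really have a space between http:// and web.alphorm.com? Please edit your question and include some actual sample data from the text file.

Have you tried parsing it that way?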
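The difference between the greedy and the lazy quantifier described in the answer can be seen in a small sketch. The original file contents were never posted, so the sample text below (file names, Signature values, the surrounding JSON-like wrapping, and the other.example.com link) is entirely made up for illustration. The sketch also uses re.escape so that the dots in the host name match only literal dots, and shows a negated character class as one possible stricter alternative, assuming the links never contain a double quote.

```python
import re

# Hypothetical sample data: the real .txt file was not shown in the question.
KEY = "&Key-Pair-Id=APKAJF2PMCJPGKXG2GEA"
text = (
    '{"url":"http://web.alphorm.com/a.mp4?Signature=AAA' + KEY + '"} junk '
    '{"url":"http://other.example.com/x"} more junk '
    '{"url":"http://web.alphorm.com/b.mp4?Signature=BBB' + KEY + '"}'
)

# Escape the literal parts so that "." and "-" are not treated as regex syntax.
prefix = re.escape("http://web.alphorm.com/")
suffix = re.escape(KEY)

# Greedy: .* runs to the LAST occurrence of the suffix, swallowing everything between.
greedy = re.findall("(" + prefix + ".*" + suffix + ')"}', text)

# Lazy: .*? stops at the FIRST occurrence of the suffix after each prefix.
lazy = re.findall("(" + prefix + ".*?" + suffix + ')"}', text)

# Stricter variant (assumption: a link never contains a double quote),
# so a match can never run past the end of the quoted value.
strict = re.findall("(" + prefix + '[^"]*' + suffix + ')"}', text)

print(len(greedy), "greedy match(es)")
for link in lazy:
    print(link)
```

On this made-up input the greedy pattern returns a single oversized match that includes the unwanted text, while the lazy and negated-class patterns each return the two links separately. The negated class is only a suggestion here; whether it fits depends on how the real file delimits its links.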

