Extracting text outside HTML tags with PHP (php, regex) - Fatal编程技术网


php regex


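I am trying to use preg_match() to extract text that is not enclosed in tags such as <p> or <a>. The text is retrieved from a database, and I am using PHP. For example: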
This should be extracted <p>I do not want this</p> This should be extracted <a>This may appear after other tags and I do not want this</a>

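Only the two "This should be extracted" segments should be returned; the text inside the tags should be skipped. The regex I was given reports a pattern error when I paste it into regex101.com.
Any help with this would be greatly appreciated.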
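You can use PHP's DOMDocument and DOMXPath to get the values you want. The trick is to wrap the HTML from the database in (for example) a <div> tag. It can then be loaded into a DOMDocument, and DOMXPath can be used to search for the children of the <div> that are plain text, using the text() path: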
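$html = 'This should be extracted <p>I do not want this</p> This should also be extracted <a>This may appear after other tags and I do not want this</a>';
$doc = new DOMDocument();
// Wrap the fragment in a <div> so it has a single root element;
// the flags stop libxml from adding a doctype or <html>/<body> wrappers.
$doc->loadHTML("<div>$html</div>", LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);
$xpath = new DOMXPath($doc);
$texts = array();
// Select only the text nodes that are direct children of the wrapping <div>.
foreach ($xpath->query('/div/text()') as $text) {
    $texts[] = $text->nodeValue;
}
print_r($texts);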

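Output:
Array ( 
    [0] => This should be extracted
    [1] =>  This should also be extracted 
)

Comments:
It is better to use a DOM parser; regular expressions for HTML are always fragile.
For the reasons why @barmar is right, see the linked question and its answers.
@barmar Thanks for your answer. My text comes from a database rather than from an HTML or XML file, so it has no body or head tags. I checked the parser, and it accesses nodes with getElementsByTagName, but the text I want is not inside a tag. Is there a way to get text that is not in a tag?
@claris Are you saying that your database contains HTML markup?
@Funk 49 Niner Yes, unfortunately. I am working with a legacy system, so the database contains HTML markup.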
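
On the question raised in the comments about reaching text that is not inside any tag: in the DOM, such text is still a node, just not an element, so it can also be reached without XPath by walking the direct children of the wrapping <div> and keeping only text nodes. The following is a minimal sketch under that assumption. The function name topLevelText is hypothetical, and the trimming and the skipping of whitespace-only nodes are illustrative additions that are not part of the answer above.

```php
<?php
// Hypothetical helper: the name and the trimming behaviour are illustrative choices.
function topLevelText(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally rather than emitting them for messy legacy markup.
    $previous = libxml_use_internal_errors(true);
    // Same wrapping trick as in the answer: the <div> becomes the document's root element.
    $doc->loadHTML("<div>$html</div>", LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    // The direct children of the <div> are a mix of text nodes and element nodes.
    foreach ($doc->documentElement->childNodes as $node) {
        if ($node->nodeType === XML_TEXT_NODE) {
            $text = trim($node->nodeValue);
            if ($text !== '') {
                $texts[] = $text; // keep only non-empty text outside any tag
            }
        }
    }
    return $texts;
}

$html = 'This should be extracted <p>I do not want this</p> This should also be extracted <a>This may appear after other tags and I do not want this</a>';
print_r(topLevelText($html));
```

Text nested inside <p> or <a> is never visited because only the first level of children is inspected, which mirrors what the /div/text() path selects.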



