Extract some XML tags from a string using PHP
I have the following function:
function translate($params) {
    $xmldata = '<?xml version="1.0" encoding="UTF-8" ?><root>' . html_entity_decode($params['data']) . '</root>';
    $lang = ucfirst(strtolower($params['lang']));
    if (simplexml_load_string($xmldata) === FALSE) {
        return $params['data'];
    } else {
        $langxmlobj = new SimpleXMLElement($xmldata);
        if ($langxmlobj->$lang) {
            return ($langxmlobj->$lang);
        } else {
            return $params['data'];
        }
    }
}
However, when the string contains any other tags:
$params['data'] = '<English><h1>Hello</h1></English><French><h1>Bonjour</h1></French>';
$params['lang'] = 'English';
it outputs nothing.
I would like it to output:
<h1>Hello</h1> or any other tag within the <LanguageQuotes>
I'm pulling my hair out; any ideas?
Version 2:
It does not work when the string looks like this:
$data = '<French><li><span class="pull-right">25 GB</span>Espace disque</French><English><li><span class="pull-right">25 GB</span>Disk Space</English>
<French><li><span class="pull-right">YES</span>PHP 5, MySQL 5</French><English><li><span class="pull-right">YES</span>PHP 5, MySQL 5</English>
<French><li><span class="pull-right">100</span>Bases de données</French><English><li><span class="pull-right">100</span>Databases</English>
<French><li><span class="pull-right">∞</span>E-Mails</French><English><li><span class="pull-right">∞</span>E-mails</English>';
This approach might help you. I'm not wrapping the data in XML; I don't think it is needed here. You only need to find the data between two custom tags.
/**
* $matches[0] -> Returns string with the custom tag
* $matches[1] -> Returns string without the custom tag
*
* @param string $data
* @param string $tag
* @return string
*/
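function find_between_custom_tag($data, $tag) {
    // The "s" modifier lets "." match line breaks, so multi-line content is captured too
    $regex = '/<' . preg_quote($tag, '/') . '>(.*?)<\/' . preg_quote($tag, '/') . '>/s';
    // Return an empty string instead of raising a notice when the tag is absent
    return preg_match($regex, $data, $matches) ? $matches[1] : '';
}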
$data = '<English><h1>Hello</h1></English><French><h1>Bonjour</h1></French>';
$tag = 'English';
echo '<pre>';
echo htmlspecialchars( find_between_custom_tag($data, $tag) );
echo '</pre>';
Output:
<h1>Hello</h1>
Your question has two parts. The fragment can be loaded with DOMDocument::loadHtml(), which tolerates the unclosed <li> elements:
$data = '<French><li><span class="pull-right">25 GB</span>Espace disque</French><English><li><span class="pull-right">25 GB</span>Disk Space</English>
<French><li><span class="pull-right">YES</span>PHP 5, MySQL 5</French><English><li><span class="pull-right">YES</span>PHP 5, MySQL 5</English>
<French><li><span class="pull-right">100</span>Bases de données</French><English><li><span class="pull-right">100</span>Databases</English>
<French><li><span class="pull-right">∞</span>E-Mails</French><English><li><span class="pull-right">∞</span>E-mails</English>';
$html_data =
'<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head>
<body>'.$data.'</body>';
libxml_use_internal_errors(TRUE);
$dom = new DOMDocument();
$dom->loadHtml($html_data);
$dom->formatOutput = TRUE;
echo $dom->saveXml();
Output:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<body>
<french>
<li><span class="pull-right">25 GB</span>Espace disque</li>
</french>
<english>
<li><span class="pull-right">25 GB</span>Disk Space</li>
</english>
<french>
<li><span class="pull-right">YES</span>PHP 5, MySQL 5</li>
</french>
<english>
<li><span class="pull-right">YES</span>PHP 5, MySQL 5</li>
</english>
...
</body>
</html>
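The resulting body element can be imported into SimpleXML, or the nodes can be fetched directly and saved as an HTML fragment: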
$xpath = new DOMXpath($dom);
$root = simplexml_import_dom($xpath->evaluate('/html/body')->item(0));
var_dump($root);
$xpath = new DOMXpath($dom);
$string = '';
foreach ($xpath->evaluate('/html/body/*[name() = "english"]/*') as $node) {
$string .= $dom->saveHtml($node);
}
echo $string;
// Regex-based alternative: collect the contents of every block for one language
function extractXMLLang($data, $ce) {
    // Map language abbreviations to the tag names used in the data
    $all = array(
        "en" => "english",
        "fr" => "french",
    );
    // Fall back to English for unknown abbreviations
    $lang = isset($all[$ce]) ? $all[$ce] : 'english';
    // "i" ignores the case of the tag name, "s" lets "." span line breaks
    $re = "/<" . $lang . ">(.*?)<\/" . $lang . ">/is";
    preg_match_all($re, $data, $matches);
    // Concatenate the contents of all matching blocks
    return implode('', $matches[1]);
}
//Load your XML data
$test = '
<english>This is in english</english>
<english><div><span>This is also in english</span></div></english>
<french><div><span>This is some text</span></div></french>
<french><span>Regex Power!</span></french>
';
$str = '<?xml version="1.0" encoding="UTF-8" ?><root></root>';
echo $str.extractXMLLang($test,'en');
Output of the XPath loop (the english nodes saved as an HTML fragment):
<li>
<span class="pull-right">25 GB</span>Disk Space</li><li>
<span class="pull-right">YES</span>PHP 5, MySQL 5</li><li>
<span class="pull-right">100</span>Databases</li><li>
<span class="pull-right">∞</span>E-mails</li>
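The DOM steps shown above (load the fragment as HTML, select the language element with XPath, save its children) can be combined into a single helper. This is only a minimal sketch: the function name extract_language_html() is hypothetical, and it assumes that $lang is a trusted tag name, since it is placed into the XPath expression unescaped.

```php
<?php
// Hypothetical helper (not part of the original answers): returns the inner
// HTML of every <$lang> block found in $data.
function extract_language_html(string $data, string $lang): string
{
    $html = '<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head>'
          . '<body>' . $data . '</body>';

    // Collect parser warnings (unknown tags, unclosed <li>) instead of printing them
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $result = '';
    // The HTML parser lowercases element names, so compare against the lowercase tag.
    // node() also selects plain text, so blocks without nested tags are returned too.
    $query = sprintf('/html/body/*[name() = "%s"]/node()', strtolower($lang));
    foreach ($xpath->evaluate($query) as $node) {
        $result .= $dom->saveHTML($node);
    }
    return $result;
}

echo extract_language_html(
    '<English><h1>Hello</h1></English><French><h1>Bonjour</h1></French>',
    'English'
);
```

Because the HTML parser repairs the markup, unclosed elements such as the <li> items in version 2 come back closed, which may or may not be what you want.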
I'm not sure whether this suits your purpose, but you can use a regex to check the tags, as in the regex-based function shown earlier.
This will correctly return all the tags inside the language. Just call it as mentioned:
extractXMLLang(string, language abbreviation)
In version 2, your XML is invalid because you are using unclosed HTML inside the XML tags.
If you want to keep the HTML inside the XML, you need to replace the special characters of the HTML code with HTML entities. You can use the function htmlspecialchars() for that. Alternatively, you can use htmlentities(), which replaces more characters.
The function html_entity_decode() converts HTML entities back into their characters.
Example:
$htmlSpecialFrench = htmlspecialchars('<li><span class="pull-right">25 GB</span>Espace disque');
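The same has to be done for the English value, which is stored in $htmlSpecialEnglish in this example.
The converted HTML can then be placed inside the XML tags without interfering with the XML syntax:
$data = "<French>$htmlSpecialFrench</French><English>$htmlSpecialEnglish</English>";
To get the original HTML back from $data, first extract the value of the selected language with your extraction function. Then you can decode the converted HTML with html_entity_decode().
I don't know exactly what you mean, but this might help:
Copy the script and paste it into the Design tab, then take the script from the Code tab (use Dreamweaver for this).
Example:
<?php
$params= '<English><h1>Hello</h1></English><French><h1>Bonjour</h1></French>';
print $params;
?>
&lt;h1&gt; for <h1>
&lt;/h1&gt; for </h1>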
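Returning to the entity-encoding suggestion (htmlspecialchars() before storing, decoding after extraction), the round trip can be sketched as follows. The sample values and variable names are assumptions for illustration. The sketch parses the data with SimpleXML rather than the question's translate() function, because translate() calls html_entity_decode() on the whole input and would undo the encoding before parsing. It also uses htmlspecialchars() rather than htmlentities(), since named entities such as &eacute; are not defined in plain XML.

```php
<?php
// Hypothetical sample values; the tag names mirror the question's data.
$htmlFrench  = '<li><span class="pull-right">25 GB</span>Espace disque';
$htmlEnglish = '<li><span class="pull-right">25 GB</span>Disk Space';

// Encode the HTML so that <, >, & and quotes no longer act as XML markup
$data = '<French>' . htmlspecialchars($htmlFrench) . '</French>'
      . '<English>' . htmlspecialchars($htmlEnglish) . '</English>';

$xml = simplexml_load_string(
    '<?xml version="1.0" encoding="UTF-8" ?><root>' . $data . '</root>'
);

// Casting the element to a string returns its text content. The XML parser
// has already turned the entities back into characters, so no extra
// html_entity_decode() call is needed on this path.
$english = (string) $xml->English;

echo $english;
```

If the value is extracted with a regex instead of an XML parser, the entities are still encoded at that point, and html_entity_decode() (or htmlspecialchars_decode()) has to be applied to the extracted string.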
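Shouldn't you encode the HTML string inside the XML? In your version 2 $data, the HTML is not well-formed at all; the closing </li> tags are missing.
@akond No, it can be any structure; it does not have to be valid XML.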