Parsing XML with PHP: duplicate node names
I am parsing an XML file with the simplexml_load_file() function. I run into a problem when two nodes have the same name. My XML file structure is:
<company>
<name>Test Co</name>
<link href="link1" rel="self"/>
<link href="link2" rel="www"/>
</company>
How can I echo link2 for each company?

You can use XPath:
<?php
$sxml = simplexml_load_string('<company>
<name>Test Co</name>
<link href="link1" rel="self"/>
<link href="link2" rel="www"/>
</company>');
// xpath() returns an array of the matching SimpleXMLElement objects
$link = $sxml->xpath('//link[@rel="www"]');
var_dump($link);
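The var_dump() above prints the whole matched element. As a minimal sketch of pulling out just the URL, the loop below casts each matched element's href attribute to a string. The `$hrefs` variable is only an illustrative name, not part of the original answer.

```php
<?php
$sxml = simplexml_load_string('<company>
<name>Test Co</name>
<link href="link1" rel="self"/>
<link href="link2" rel="www"/>
</company>');

// Collect the href of every <link> whose rel attribute is "www"
$hrefs = array();
foreach ($sxml->xpath('//link[@rel="www"]') as $link) {
    // Attributes are SimpleXMLElement objects, so cast to get a plain string
    $hrefs[] = (string) $link['href'];
}

echo implode("\n", $hrefs), "\n";
```

Because the match is on the rel attribute, this keeps working if the order of the link elements changes.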
Alternatively, read the second link element by index. Note that you must cast the result to a string:
<?php
$xml = '<?xml version="1.0" ?>
<company>
<name>Test Co</name>
<link href="link1" rel="self"/>
<link href="link2" rel="www"/>
</company>';
$parameters = new SimpleXmlElement($xml);
echo (string)$parameters->link[1]['href']; // note you must cast here!
This will output 'link2' (without the quotes).
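Also, you may want to check beforehand that these elements exist, but I will leave that to you, since it is beyond the scope of the question ;)

You can try my class, which uses SimpleXML: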
[title] => Xml By sec.php
[item] => Array
(
[0] => Array
(
[name] => Test Co
[link] => Array
(
[0] => Array
(
[href] => link1
[rel] => self
)
[1] => Array
(
[href] => link2
[rel] => www
)
)
)
[1] => Array
(
[name] => Test Co
[link] => Array
(
[0] => Array
(
[href] => link1
[rel] => self
)
[1] => Array
(
[href] => link2
[rel] => www
)
)
)
[2] => Array
(
[name] => Test Co
[link] => Array
(
[0] => Array
(
[href] => link1
[rel] => self
)
[1] => Array
(
[href] => link2
[rel] => www
)
)
)
)
)
Here is the code:
<?php
class xml_grabber
{
private $xml_file = '' ;
private $xml_link = '' ;
private $xml_dom = '' ;
private $xml_type = '' ;
private $xml_content = '' ;
private $xml_errors = array() ;
public $xml_stack = 0 ;
public $only_text = 0 ;
private $xml_connect_count = 0 ;
private $xml_columns = array() ; // declared here because set_columns()/get_columns() use it
public function __construct($link_file_com = '')
{
if(!$link_file_com)
{
$this->xml_errors['construct'] = 'No Xml In Construct' ;
return false;
}
elseif(!function_exists('simplexml_load_file') || !function_exists('simplexml_load_string') || !function_exists('simplexml_import_dom'))
{
$this->xml_errors['functions'] = 'simple xml function not exists' ;
return false;
}
else
{
$this->set_xml(trim($link_file_com)) ;
}
}
public function only_text( $val = 0 )
{
$this->only_text = $val ;
}
public function set_xml($xml)
{
if(isset($xml[3])) // string offsets use []; the {} form was removed in PHP 8
{
if(file_exists($xml))
{
$this->xml_type = 1 ;
$this->xml_file = $xml ;
}
elseif(filter_var($xml, FILTER_VALIDATE_URL))
{
$this->xml_type = 2 ;
$this->xml_link = $xml ;
}
else
{
$this->xml_type = 3 ;
$this->xml_dom = $xml ;
}
}
else
{
$this->xml_type = '' ;
}
}
public function get_xml()
{
if($this->xml_type == '')
{
return false ;
}
elseif($this->xml_type == 1)
{
return $this->xml_file ;
}
elseif($this->xml_type == 2)
{
return $this->xml_link ;
}
elseif($this->xml_type == 3)
{
return $this->xml_dom ;
}
}
public function set_columns($new_columns= array())
{
return $this->xml_columns = $new_columns ;
}
public function get_columns()
{
return $this->xml_columns ;
}
public function load()
{
if($this->xml_type == '')
{
$this->xml_errors['loader'] = 'Unknown XML type' ;
return false;
}
elseif($this->xml_type == 1)
{
$xml = simplexml_load_file($this->xml_file,null, LIBXML_NOCDATA) ;
$this->xml_content = $xml ;
}
elseif($this->xml_type == 2)
{
$con = $this->connect($this->xml_link);
$this->xml_content = simplexml_load_string(trim($con),null, LIBXML_NOCDATA) ;
}
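elseif($this->xml_type == 3)
{
// parse the raw XML string so that fetch() has content to work with
$this->xml_content = simplexml_load_string(trim($this->xml_dom),null, LIBXML_NOCDATA) ;
}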
}
public function fetch($return = 'array')
{
$xml = array() ;
if($this->xml_content)
{
print_r ( $this->get_attribute($this->xml_content)) ;
}
}
public function get_attribute($object)
{
if( (is_array( $object ) || is_object( $object )) && count( $object ) > 0 )
{
foreach( $object as $k => $val )
{
$count = count( $val ) ;
$cou = count( $object->$k ) ;
if( $count > 0 and is_array( $val ) || is_object( $val ) )
{
if( $cou > 1 )
{
$result['item'][] = $this->get_attribute( $val ) ;
}
else
{
$result[$k] = $this->get_attribute( $val ) ;
}
}
else
{
$attr = $val->attributes() ;
if( count( $attr ) > 0 )
{
$var = array() ; // reset per element so attributes do not leak between nodes
foreach( $attr as $kk => $vv )
{
$var[$kk] = (string) $vv ;
}
if( $cou > 1 )
{
$result[$k][] = $var ;
}
else
{
$result[$k] = $var ;
}
}
else
{
$result[$k] = (string) $val ;
}
}
}
}
else
{
$result[] = $object ;
}
return $result ;
}
/*
public function fetch($return = 'array')
{
if($this->xml_content)
{
$rss_feed = $this->xml_content ;
$rss_title = (string) $rss_feed->channel->title ;
$rss_link = (string) $rss_feed->channel->link ;
$rss_cat = (string) $rss_feed->channel->category ;
$rss_image = (string) $rss_feed->channel->image->url ;
$rss_summary =
array
(
'info' =>
array(
'title'=>$rss_title ,
'link'=>$rss_link ,
'cat'=>$rss_cat ,
'image'=>$rss_image
) ,
'item' => array()
) ;
if(is_array($rss_feed->channel) or is_object($rss_feed->channel))
{
if(is_array($rss_feed->channel->item) or is_object($rss_feed->channel->item))
{
foreach($rss_feed->channel->item as $item)
{
if($item->enclosure && $item->enclosure->attributes())
{
$image0 = $item->enclosure->attributes() ;
$image_url = $image0 ['url'] ;
}
$result = array() ;
foreach($item as $k=>$v)
{
$result[strtolower($k)] = strip_tags((string) $v) ;
}
if(isset($image_url{1}))
{
$result['image0'] = $image_url ;
}
$rss_summary['item'][] = $result ;
}
}
elseif(is_array($rss_feed->channel->entry) or is_object($rss_feed->channel->entry))
{
foreach($rss_feed->channel->entry as $item)
{
if($item->enclosure && $item->enclosure->attributes())
{
$image0 = $item->enclosure->attributes() ;
$image_url = $image0 ['url'] ;
}
$result = array() ;
foreach($item as $k=>$v)
{
$result[strtolower($k)] = (string) $v ;
}
if(isset($image_url{1}))
{
$result['image0'] = $image_url ;
}
$rss_summary['item'][] = $result ;
}
}
}
else
{
if(is_array($rss_feed->item) or is_object($rss_feed->item))
{
foreach($rss_feed->item as $item)
{
if($item->enclosure && $item->enclosure->attributes())
{
$image0 = $item->enclosure->attributes() ;
$image_url = $image0 ['url'] ;
}
$result = array() ;
foreach($item as $k=>$v)
{
$result[strtolower($k)] = (string) $v ;
}
if(isset($image_url{1}))
{
$result['image0'] = $image_url ;
}
$rss_summary['item'][] = $result ;
}
}
elseif(is_array($rss_feed->entry) or is_object($rss_feed->entry))
{
foreach($rss_feed->entry as $item)
{
if($item->enclosure && $item->enclosure->attributes())
{
$image0 = $item->enclosure->attributes() ;
$image_url = $image0 ['url'] ;
}
$result = array() ;
foreach($item as $k=>$v)
{
$result[strtolower($k)] = (string) $v ;
}
if(isset($image_url{1}))
{
$result['image0'] = $image_url ;
}
$rss_summary['item'][] = $result ;
}
}
}
if($return == 'json')
{
return json_encode($rss_summary) ;
}
elseif($return == 'serialize')
{
return serialize($rss_summary) ;
}
elseif($return == 'xml')
{
return xml_encode($rss_summary) ;
}
else
{
return $rss_summary ;
}
}
else
{
$this->xml_errors['fetch'] = 'No Xml Content' ;
}
}
*/
protected function connect($link,$post='')
{
if(!filter_var($link, FILTER_VALIDATE_URL))
{
$this->xml_errors['connect'] = 'Not Vaild Link To Get data' ;
return false ;
}
$con = false ; // defined up front so the checks below also work without cURL
if(function_exists('curl_init'))
{
$cu = curl_init();
curl_setopt($cu,CURLOPT_URL,$link);
curl_setopt($cu,CURLOPT_AUTOREFERER,true);
if($post != '')
{
curl_setopt($cu,CURLOPT_HEADER, 1);
curl_setopt($cu,CURLOPT_POST,true); // CURLOPT_POST expects a boolean
curl_setopt($cu,CURLOPT_POSTFIELDS,$post);
}
curl_setopt($cu,CURLOPT_USERAGENT,$_SERVER['HTTP_USER_AGENT']);
curl_setopt($cu,CURLOPT_SSL_VERIFYPEER,false);
curl_setopt($cu,CURLOPT_SSL_VERIFYHOST,false);
curl_setopt($cu,CURLOPT_RETURNTRANSFER,true);
$co = curl_exec($cu) ;
if($co)
{
$con = $co ;
}
else
{
$this->xml_errors['connect'] = 'No Result From Curl' ;
$this->xml_errors['curl'] = curl_error($cu);
}
curl_close($cu) ;
}
if(!$con and function_exists('ini_get'))
{
$url_fopen = ini_get('allow_url_fopen') ;
if($url_fopen == 0)
{
if(function_exists('ini_set'))
{
ini_set('allow_url_fopen', 1) ;
}
$check_fopen = 1 ;
}
else
{
$check_fopen = 0 ;
}
if($check_fopen == 1)
{
$url_fopen = ini_get('allow_url_fopen') ;
}
if($url_fopen == 1)
{
if(function_exists('file_get_contents') and !$con)
{
$con = @file_get_contents($link) ;
if($con)
{
$con ;
}
else
{
$this->xml_errors['connect'] = 'No Result From file_get_contents' ;
}
}
elseif(function_exists('readfile') and !$con)
{
$con = @readfile($link);
if($con)
{
$con ;
}
else
{
$this->xml_errors['connect'] = 'No Result From readfile' ;
}
}
elseif(function_exists('file') and !$con)
{
$con = @file($link) ;
if($con)
{
$con ;
}
else
{
$this->xml_errors['connect'] = 'No Result From file' ;
}
}
}
}
if(!$con)
{
$this->xml_errors['connect'] = 'Curl And Allow Url Fopen Disabled On Server' ;
return false ;
}
elseif(stripos($con,'DDoS protection by CloudFlare') and $this->xml_connect_count < 1)
{
echo '<h2> Curl Answer No : '.$this->xml_connect_count.' </h2>'.$con.'<br />';
preg_match('/'.preg_quote('.val(').'(.*?)'.preg_quote(');').'/is', $con, $match);
$matc = str_replace('+', " + ", $match[1]) ;
$matc = str_replace('*', " * ", $matc) ;
$post = eval ("return $matc ;") ;
$html=new DOMDocument();
$html->loadHTML($con) ;
$xpath = new DOMXPath($html);
$tags = $xpath->query('//input[@type="hidden"]');
$fields = array();
foreach ($tags as $tag) {
$fields[trim($tag->getAttribute('name'))] = trim($tag->getAttribute('value')) ;
}
$post_data = 'act=jschl&jschl_vc='.$fields['jschl_vc'].'&jschl_answer='.$post ;
echo '<h2> I will post this info to curl </h2>'.$post_data.'<br /> ' ;
$this->xml_connect_count = $this->xml_connect_count + 1 ;
sleep(5) ;
$con = $this->connect($link , $post_data) ;
echo '<h2> Curl Answer No : 1 </h2>'. $con.'<br />' ;
preg_match('/^Set-Cookie: (.*?);/m', $con, $m);
echo '<h2> Cookie from Result </h2>' ;var_dump(parse_url($m[1]));
}
else
{
return $con ;
}
}
public function get_error()
{
return $this->xml_errors ;
}
}
?>
When you load an XML file into SimpleXML, the returned object represents the XML root node, in this case <company>.
So you only need to take the second link tag and read its href:
$parameters = simplexml_load_file("my.xml");
// you need to cast here, as this is actually a SimpleXMLElement, not a string
echo (string)$parameters->link[1]['href'];
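Index-based access like the line above assumes that a second link element and its href attribute exist. A minimal sketch of a guard, assuming the same sample document as in the question, could look like this. The helper name `link_href_at` is hypothetical and not part of SimpleXML.

```php
<?php
// Hypothetical helper (not part of SimpleXML): returns the href of the
// <link> at position $index, or null if the link or its href is missing
function link_href_at(SimpleXMLElement $company, $index)
{
    if (isset($company->link[$index]) && isset($company->link[$index]['href'])) {
        return (string) $company->link[$index]['href'];
    }
    return null;
}

// Same sample document as in the question
$xml = '<company>
<name>Test Co</name>
<link href="link1" rel="self"/>
<link href="link2" rel="www"/>
</company>';

$company = simplexml_load_string($xml);

// simplexml_load_string() returns false when the XML cannot be parsed
if ($company === false) {
    exit("Could not parse the XML\n");
}

$second  = link_href_at($company, 1); // the second <link> exists
$missing = link_href_at($company, 5); // there is no sixth <link>

echo $second, "\n";
var_dump($missing);
```

Returning null for a missing element lets the caller decide how to handle it, rather than echoing an empty string without noticing.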
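I only want to get the link2 URL.
Why do you want the second link? Is that really what you want, or do you want the link with rel="www"?
Note that this solution depends on the order of the elements. I suggest using the XPath approach, especially if you are not building the XML yourself.
@MarkusI. If the XML has this simple structure with only a few nodes every time, I would actually vote for the solution above. Although I like XPath a lot, I find it more useful for drilling down to specific nodes deep in a tree.
I prefer XPath because he said this XML comes from an external source... I think he chose link2 because of the rel attribute. If another link is added, you could run into trouble if you rely on the order...
This is useless in this case, because you would need to know the value of 'link2' in advance (which the OP certainly won't)... If you are going to use XPath, you should tell it to find the second 'link' node.
This looks far too complicated for this question. What exactly does your class do? It looks nice, but I don't think it helps much here. Also, the XML here is just XML, not RSS, and has nothing to do with RSS.
Thanks, I just checked his problem against my class because it is very useful to me. I added my class to get feedback from Stack Overflow users, which will also help me develop a good RSS reader :)
You really should get in the habit of casting to string here... In this case echo does it for you, but if you don't know that you need to cast, you will later assign the value instead of echoing it and wonder why it isn't a string.
@quickshiftin: Yes, that's right, SimpleXML actually returns a SimpleXMLElement, not a string. I'll add the cast.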
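The comments contrast selecting the link by position with selecting it by its rel attribute. The sketch below shows both for several companies. The `<companies>` wrapper, the second company, and its link values are assumptions made for illustration, since the question only shows a single `<company>` element.

```php
<?php
// Assumed structure: a hypothetical <companies> root holding several <company> nodes.
// The second company lists its links in the opposite order on purpose.
$xml = '<companies>
<company>
<name>Test Co</name>
<link href="link1" rel="self"/>
<link href="link2" rel="www"/>
</company>
<company>
<name>Other Co</name>
<link href="link4" rel="www"/>
<link href="link3" rel="self"/>
</company>
</companies>';

$root = simplexml_load_string($xml);

$byRel = array(); // href chosen by rel="www"
$byPos = array(); // href chosen by position (second <link>)

foreach ($root->company as $company) {
    $name = (string) $company->name;

    // Relative XPath: evaluated with the current <company> as context node
    $matches = $company->xpath('link[@rel="www"]');
    if ($matches) {
        $byRel[$name] = (string) $matches[0]['href'];
    }

    // XPath positions are 1-based, so link[2] is the second <link>
    $second = $company->xpath('link[2]');
    if ($second) {
        $byPos[$name] = (string) $second[0]['href'];
    }
}

print_r($byRel);
print_r($byPos);
```

For the second company the two selections differ, which is the ordering problem the comments warn about.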