PHP5: Loading an XML document into a DOM object tree and processing it

piaoling  2011-01-29 23:27:40

articles.xml:
<?xml version="1.0" encoding="iso-8859-1" ?>
<articles>
    <item>  
        <title>PHP Weekly: Issue # 172</title>  
        <link>http://www.zend.com/zend/week/week172.php</link>  
    </item>
    <item>  
        <title>Tutorial: Develop rock-solid code in PHP: Part three</title>  
        <link>http://www.zend.com/zend/tut/tut-hatwar3.php</link>  
    </item>
</articles>


Create a DOM object and load articles.xml into the DOM tree:

$dom = new DomDocument();
$dom->load("articles.xml");
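A minimal sketch of the loading step with basic error handling. To stay self-contained it parses the sample document from a string with loadXML() instead of reading articles.xml from disk; the variable names are illustrative only.

```php
<?php
// The sample document from above, inlined so the sketch runs without a file.
$xml = <<<XML
<?xml version="1.0" encoding="iso-8859-1" ?>
<articles>
    <item>
        <title>PHP Weekly: Issue # 172</title>
        <link>http://www.zend.com/zend/week/week172.php</link>
    </item>
    <item>
        <title>Tutorial: Develop rock-solid code in PHP: Part three</title>
        <link>http://www.zend.com/zend/tut/tut-hatwar3.php</link>
    </item>
</articles>
XML;

// Collect parser errors in a buffer instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
if (!$dom->loadXML($xml)) {
    // Report every problem libxml found, then stop.
    foreach (libxml_get_errors() as $error) {
        print "Parse error: " . trim($error->message) . "\n";
    }
    exit(1);
}

// The root element of the tree is available as documentElement.
print $dom->documentElement->nodeName . "\n"; // prints "articles"
```

The same check works with load(), which also returns false when the document cannot be read or parsed.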

You can also load the XML document through a PHP stream:

$dom->load("file:///articles.xml");

(or any other type of stream, as appropriate).

Output the XML to the browser or to standard output:

print $dom->saveXML();

Save the XML to a file:

print $dom->save("newfile.xml");

(Note that save() returns the number of bytes written, so printing its result sends the file size to stdout.)
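In practice the return value of save() is more useful as a success check than as output. The following sketch assumes a temporary file as the target (the path is a placeholder, not part of the original example) and reloads the file to confirm the round trip.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<articles><item><title>PHP Weekly: Issue # 172</title></item></articles>');

// Placeholder target: a temporary file, so the sketch leaves nothing behind.
$path = tempnam(sys_get_temp_dir(), 'dom');

// save() returns the number of bytes written, or false on failure.
$bytes = $dom->save($path);
if ($bytes === false) {
    print "Could not write $path\n";
} else {
    print "Wrote $bytes bytes\n";
}

// Load the saved file again to confirm it is a well-formed document.
$reloaded = new DOMDocument();
$reloaded->load($path);
print $reloaded->documentElement->nodeName . "\n";

unlink($path);
```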

There's not much functionality in this example, of course, so let's do something more useful: let's grab all the titles. There are different ways to do this, the easiest one being to use getElementsByTagName($tagname):

$titles = $dom->getElementsByTagName("title");
foreach($titles as $node) {
   print $node->textContent . " ";
}

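The textContent property is a shortcut for quickly reading an element's text content. It comes from DOM Level 3 and is not part of the earlier W3C DOM levels.

With DOM Level 2 alone, the equivalent would be:

$node->firstChild->data;

(This works only if you are sure that firstChild is the text node you want; otherwise you have to iterate over all the child nodes to find the text node you need.)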
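A short sketch of that iteration, for the case where firstChild is not the text node. The helper firstText() is hypothetical, written for this illustration, and the sample markup is made up so that a comment precedes the text.

```php
<?php
// Hypothetical helper: return the data of the first non-blank text node
// among the direct children of $node, or null if there is none.
function firstText(DOMNode $node)
{
    foreach ($node->childNodes as $child) {
        if ($child->nodeType === XML_TEXT_NODE && trim($child->data) !== '') {
            return $child->data;
        }
    }
    return null;
}

$dom = new DOMDocument();
$dom->loadXML('<item><title><!-- note -->PHP Weekly</title></item>');

$title = $dom->getElementsByTagName('title')->item(0);

// Here firstChild is the comment node, so $title->firstChild->data
// would give the comment text rather than the title.
print firstText($title) . "\n"; // prints "PHP Weekly"
```

This version skips comments and whitespace-only text nodes; CDATA sections have their own node type and would need an extra check.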

One other thing to notice is that getElementsByTagName() returns a DOMNodeList, not an array as the similar function get_elements_by_tagname() did in PHP 4. As the example shows, you can still loop over it with foreach. You can also access a node directly with $titles->item(0), which returns the first title element.
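The sketch below shows the node list in use on a shortened, made-up document: length, item(), and a copy into a plain array for code that expects one.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<articles>'
    . '<item><title>First</title></item>'
    . '<item><title>Second</title></item>'
    . '</articles>'
);

$titles = $dom->getElementsByTagName('title');

// length holds the number of matched nodes.
print $titles->length . "\n"; // prints 2

// item() is indexed from 0 and returns null when the index is out of range.
print $titles->item(0)->textContent . "\n"; // prints "First"
$missing = $titles->item(5);                // null

// If a real array is needed, the list can be copied into one.
$asArray = iterator_to_array($titles);
print count($asArray) . "\n"; // prints 2
```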

Another approach to getting all the titles would be to loop through the nodes starting with the root element. As you can see, this is way more complicated, but it's also more flexible should you need more than just the title elements.

foreach ($dom->documentElement->childNodes as $articles) {
    //if node is an element (nodeType == 1) and the name is "item" loop further
    if ($articles->nodeType == 1 && $articles->nodeName == "item") {
        foreach ($articles->childNodes  as $item) {
            //if node is an element and the name is "title", print it.
            if ($item->nodeType == 1 && $item->nodeName == "title") {
                print $item->textContent . " ";
            }
        }
    }
}
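To illustrate the flexibility mentioned above, here is a sketch of the same traversal that collects every child element of each item (title and link) into an array. It uses the XML_ELEMENT_NODE constant in place of the literal 1; the array layout is an assumption made for this example.

```php
<?php
$xml = <<<XML
<articles>
    <item>
        <title>PHP Weekly: Issue # 172</title>
        <link>http://www.zend.com/zend/week/week172.php</link>
    </item>
    <item>
        <title>Tutorial: Develop rock-solid code in PHP: Part three</title>
        <link>http://www.zend.com/zend/tut/tut-hatwar3.php</link>
    </item>
</articles>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$articles = array();
foreach ($dom->documentElement->childNodes as $item) {
    // Skip whitespace text nodes and anything that is not an <item> element.
    if ($item->nodeType !== XML_ELEMENT_NODE || $item->nodeName !== 'item') {
        continue;
    }
    $entry = array();
    foreach ($item->childNodes as $field) {
        // Keep each child element under its tag name, e.g. "title" or "link".
        if ($field->nodeType === XML_ELEMENT_NODE) {
            $entry[$field->nodeName] = $field->textContent;
        }
    }
    $articles[] = $entry;
}

foreach ($articles as $article) {
    print $article['title'] . ' -> ' . $article['link'] . "\n";
}
```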



This article comes from a CSDN blog; please credit the source when reposting: http://blog.csdn.net/httpnet/archive/2005/07/27/436513.aspx
