Using PHP Functions in XPath Expressions

Disclaimer: this article expects familiarity with using the DOM extension [1] and XPath expressions [2].

The (now documented [3]) DOMXPath::registerPHPFunctions method is available as of PHP 5.3.0 (it was added to the code base back in December 2006). It allows the use of PHP functions (and static methods) within XPath queries, complementing the normal set of XPath functions [2].

Description

void DOMXPath::registerPHPFunctions ([ string|array $restrict] )

Enables the use of PHP functions as XPath functions.

Parameters

restrict
Use this parameter to only allow certain functions to be called from XPath; it can be either a string (a function name) or an array of function names.

Return Values

No value is returned.

Examples

Note: The following examples load a sample XML document called book.xml with the following contents:

<?xml version="1.0" encoding="UTF-8"?>
<books>
    <book>
        <title>PHP Basics</title>
        <author>Jim Smith</author>
        <author>Jane Smith</author>
    </book>
    <book>
        <title>PHP Secrets</title>
        <author>Jenny Smythe</author>
    </book>
    <book>
        <title>XML basics</title>
        <author>Joe Black</author>
    </book>
</books>

Example #1: Call a PHP function in XPath with php:functionString

This example demonstrates the basic use of DOMXPath::registerPHPFunctions by replicating the substring XPath function. The first thing that needs to be done is to register the php namespace prefix with the associated URI http://php.net/xpath. Don’t question it; it just needs to be done!

Next, we call DOMXPath::registerPHPFunctions on our object. If no arguments are used, as in this example, then the range of functions allowed to be called is not restricted: you can call any [4] function from XPath. I would always advise restricting the functions which can be called (see example 2).

Within our XPath query, we use php:functionString, which allows us to name a function and provide some parameters (or indeed, no parameters) to be passed to that function. There are two flavours which can be used here: php:functionString, which passes an XML node/attribute as a string, and php:function (see example 3), which passes an array of XML node/attribute objects (in DOMElement / DOMAttr / etc. form) to the function. In this example the PHP function substr is called, passing along the book’s title (in string form), an offset of 0 and a length of 3. This returns the first 3 characters of the book’s title, which are then compared to the string PHP in order to filter our list of books down to those with titles starting with PHP.

<?php
$dom = new DOMDocument;
$dom->load('book.xml');
 
$xpath = new DOMXPath($dom);
 
// Register the php: namespace (required)
$xpath->registerNamespace("php", "http://php.net/xpath");
 
// Register PHP functions (no restrictions)
$xpath->registerPHPFunctions();
 
// Call substr function on the book title
$nodes = $xpath->query('//book[php:functionString("substr", title, 0, 3) = "PHP"]');
 
echo "Found {$nodes->length} books starting with 'PHP':\n";
foreach ($nodes as $node) {
	$title  = $node->getElementsByTagName("title")->item(0)->nodeValue;
	$author = $node->getElementsByTagName("author")->item(0)->nodeValue;
	echo "$title by $author\n";
}
?>

The example will output something like:

Found 2 books starting with 'PHP':
PHP Basics by Jim Smith
PHP Secrets by Jenny Smythe
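
The difference between the two flavours described above can be seen directly by asking PHP what type of value it receives. The following is only a minimal sketch: it uses an inline copy of part of book.xml (via loadXML) so that it is self-contained, and it borrows the core gettype function purely for illustration.

```php
<?php
// Inline copy of part of book.xml so the sketch is self-contained
$xml = <<<XML
<books>
    <book>
        <title>PHP Basics</title>
        <author>Jim Smith</author>
        <author>Jane Smith</author>
    </book>
</books>
XML;

$dom = new DOMDocument;
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);

// Register the php: namespace (required)
$xpath->registerNamespace("php", "http://php.net/xpath");

// Only gettype may be called from XPath
$xpath->registerPHPFunctions("gettype");

// Same function, same argument, two different flavours
$asString = $xpath->evaluate('php:functionString("gettype", //book[1]/title)');
$asNodes  = $xpath->evaluate('php:function("gettype", //book[1]/title)');

echo "php:functionString passes a(n): $asString\n";
echo "php:function passes a(n): $asNodes\n";
```

This should report string for php:functionString and array for php:function, which is why substr works with the former while example 3 needs the latter.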

Example #2: Restricting the functions available to XPath

To restrict the functions made available to XPath, provide either a string containing the name of the single function that you wish to allow, or an array of strings containing function names, as the restrict parameter (note: static methods can also be used, e.g. “Classname::method”). If functions were registered with restrict and XPath calls a function which is not one of them, an E_WARNING will be raised stating Not allowed to call handler ‘function()’ (where function is the name of the function that cannot be called).

<?php
$dom = new DOMDocument;
$dom->load('book.xml');
 
$xpath = new DOMXPath($dom);
 
// Register the php: namespace (required)
$xpath->registerNamespace("php", "http://php.net/xpath");
 
// Register PHP functions (restricted to strtoupper only)
$xpath->registerPHPFunctions("strtoupper");
 
// Get first book's title in uppercase
$title = $xpath->evaluate('php:functionString("strtoupper", //book[1]/title)');
echo $title;
 
// Try a function not in our restrictions list
$fail = $xpath->evaluate('php:functionString("strtolower", //book[1]/title)');
 
?>

The example will output something like:

PHP BASICS
Warning: DOMXPath::evaluate(): Not allowed to call handler 'strtolower()'. in example2.php on line 18
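
Example 2 passes a single function name; the array form and the “Classname::method” form mentioned above might look something like the sketch below. The TitleHelper class and its shout method are hypothetical names invented for this illustration, and the XML is loaded inline to keep the snippet self-contained.

```php
<?php
// Hypothetical helper class, used only to show a static method being called
class TitleHelper
{
    public static function shout($text)
    {
        return strtoupper($text) . "!";
    }
}

$dom = new DOMDocument;
$dom->loadXML('<books><book><title>PHP Basics</title></book></books>');

$xpath = new DOMXPath($dom);

// Register the php: namespace (required)
$xpath->registerNamespace("php", "http://php.net/xpath");

// Allow exactly two callables: one core function and one static method
$xpath->registerPHPFunctions(array("strtolower", "TitleHelper::shout"));

$lower = $xpath->evaluate('php:functionString("strtolower", //book[1]/title)');
$shout = $xpath->evaluate('php:functionString("TitleHelper::shout", //book[1]/title)');

echo "$lower\n";
echo "$shout\n";
```

The names used inside the XPath expression are written exactly as they were registered, which seems the safest habit when working with a restricted list.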

Example #3: Passing DOM objects using php:function

Up to now, both examples have used php:functionString. As mentioned above, instead of passing a string value to the PHP function, it is possible to use php:function to pass along an array of DOM* objects and manipulate them as you please.

<?php
$dom = new DOMDocument;
$dom->load('book.xml');
 
$xpath = new DOMXPath($dom);
 
// Register the php: namespace (required)
$xpath->registerNamespace("php", "http://php.net/xpath");
 
// Register PHP functions (restricted to example3 only)
$xpath->registerPHPFunctions("example3");
 
function example3($nodes) {
    // Return true if more than one author
    return count($nodes) > 1;
}
// Filter books with multiple authors
$books = $xpath->query('//book[php:function("example3", author)]');
 
echo "Books with multiple authors:\n";
foreach ($books as $book) {
    echo $book->getElementsByTagName("title")->item(0)->nodeValue . "\n";
}
?>

The example will output something like:

Books with multiple authors:
PHP Basics
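
Example 3 only counts the nodes it is given. As a further sketch of manipulating the DOM* objects themselves, the callback below reads each node's nodeValue and also accepts an extra string argument from the XPath expression. The has_surname function is a hypothetical helper written for this illustration, and the XML is an inline copy of book.xml.

```php
<?php
// Hypothetical callback: true if any node's text ends with the given surname
function has_surname($nodes, $surname)
{
    foreach ($nodes as $node) {
        // Each $node is a DOMElement here, so nodeValue holds the author's name
        if (substr($node->nodeValue, -strlen($surname)) === $surname) {
            return true;
        }
    }
    return false;
}

$xml = <<<XML
<books>
    <book><title>PHP Basics</title><author>Jim Smith</author><author>Jane Smith</author></book>
    <book><title>PHP Secrets</title><author>Jenny Smythe</author></book>
    <book><title>XML basics</title><author>Joe Black</author></book>
</books>
XML;

$dom = new DOMDocument;
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);

// Register the php: namespace (required)
$xpath->registerNamespace("php", "http://php.net/xpath");

// Only has_surname may be called from XPath
$xpath->registerPHPFunctions("has_surname");

// The author node set arrives as an array of objects, "Smith" as a plain string
$books = $xpath->query('//book[php:function("has_surname", author, "Smith")]');

$titles = array();
foreach ($books as $book) {
    $titles[] = $book->getElementsByTagName("title")->item(0)->nodeValue;
}
echo implode("\n", $titles) . "\n";
```

With the sample data, only PHP Basics should be listed, since Smythe and Black do not end in Smith.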

Summary

To summarise, here is a quick run-down. In PHP, make sure to register the php namespace (with the URI http://php.net/xpath) and then register your PHP functions (whether core, extension or user-defined) or static methods with DOMXPath::registerPHPFunctions. In XPath, use php:functionString or php:function to call the PHP function.

If you have made use of this feature, or want to know more, then do feel free to comment. Thanks for reading.

Footnotes

  1. http://php.net/dom
  2. http://schlitt.info/opensource/blog/0704_xpath.html
  3. The documentation page is available at http://php.net/domxpath.registerphpfunctions
  4. In truth, some functions are not suitable such as those that return non-scalar values (and cannot be cast to one) which XPath will not understand.
