Problems Encountered With PHP DOM Functions

Recently we have been making heavy use of PHP's excellent DOM functions, which hugely improve on the DOM XML functions from PHP 4. The DOM functions have been used for two purposes:

  • As part of a PHP framework, we have been building an XSLT-based templating engine, which will replace our much slower and less flexible regular-expression-based engine.
  • For a variety of SERP crawling SEO tools, including Rank Check, a tool allowing you to check your search engine rankings in a number of countries.

Naturally, whenever you explore a new programming avenue you are going to come up against a few brick walls. Because the DOM functions are sparsely documented, we have written up a few of the problems that cropped up.

Problems Parsing Invalid HTML or XML

When loading invalid HTML or XML into your DOM object, you will be presented with PHP warnings. If you are scraping web pages, the chances are you will get a lot of warnings similar to: Warning: DOMDocument::loadHTML() [function.DOMDocument-loadHTML]: htmlParseEntityRef: no name in Entity, line: 32 in C:\www\md_framework\rankings.php on line 174. Although these warnings will not stop your script from running, they will obscure any data your script may output. To combat this problem, simply use the error control operator (@) when loading the markup.

Writing @$dom->loadHTML($html) instead of $dom->loadHTML($html) suppresses the warnings raised while the markup is loaded into your DOM object.
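The original listing for this section did not survive, so the following is only a minimal sketch of the technique. The markup string and the variable names ($html, $doc, $loaded, $paragraph) are assumptions made for illustration. The bare ampersand is there to provoke the kind of entity warning quoted above.

```php
<?php
// Hypothetical invalid markup: the bare "&" is not a valid entity reference.
$html = '<html><body><p>Fish & chips</p></body></html>';

$doc = new DOMDocument();

// The @ operator hides the warnings that loadHTML() would otherwise print.
// The parser still recovers from the error and builds the document.
$loaded = @$doc->loadHTML($html);

// The document can be queried as usual after loading.
$paragraph = $doc->getElementsByTagName('p')->item(0);
echo $paragraph->textContent, "\n";
```

Note that @ hides every warning raised by that call, including ones you may care about. If you need to inspect the parser errors rather than discard them, PHP also provides libxml_use_internal_errors(true) together with libxml_get_errors().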

DOMDocument->getElementById Can't Find IDs in XML

One of the most essential functions in DOM manipulation, familiar to all JavaScript gurus, is getElementById. It may come as a surprise to anyone using the DOMDocument->getElementById method on XML documents that it doesn't work out of the box: an attribute named 'id' is not treated as an ID unless it has been declared as one. There are three solutions to this problem:

  • Use the DOMElement->setIdAttribute function on the DOM element in question. This allows you to use the DOMDocument->getElementById method on that XML element.
  • Declare 'id' as an attribute of type ID in your XML's DTD.
  • Use XPath to find the element by its id attribute.

An XPath query such as //*[@id='foo_bar'] will search your document and find elements with the id 'foo_bar'.
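The original listing is missing here as well, so this is a sketch of the XPath and setIdAttribute options under stated assumptions. The XML string, the element names, and the variable names are hypothetical. Only the id 'foo_bar' comes from the text above.

```php
<?php
// Hypothetical XML document with no DTD.
$xml = '<root><item id="foo_bar">Hello</item><item id="other">World</item></root>';

$doc = new DOMDocument();
$doc->loadXML($xml);

// Without a DTD or setIdAttribute(), "id" is just an ordinary attribute,
// so the lookup finds nothing.
$before = $doc->getElementById('foo_bar');

// XPath option: match any element whose id attribute equals 'foo_bar'.
$xpath = new DOMXPath($doc);
$viaXPath = $xpath->query("//*[@id='foo_bar']")->item(0);

// setIdAttribute option: flag the attribute as an ID on this element,
// after which getElementById() can find it.
$viaXPath->setIdAttribute('id', true);
$after = $doc->getElementById('foo_bar');

echo $after->textContent, "\n";
```

XPath is the least intrusive of the three options, since it needs neither a DTD nor a pass over the elements to flag their attributes.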

First Argument is Expected to be a Valid Callback

This one has nothing to do with the DOM, but I thought I'd throw it in anyway as it came up at the same time. You will probably only encounter this error when you are creating a beastly class, as it involves a member of the Function Handling Functions.

When using the call_user_func_array function to call a method of an object you may get the error First argument is expected to be a valid callback. This will happen if the method you are calling is private and the call is made from outside the class. Usually you would expect an error stating that you are trying to access a private member of an object, but because the call goes through call_user_func_array you get the invalid callback message instead. You get this error because call_user_func_array, when used from outside the class, is not allowed to call private methods of an object.
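The behaviour described above can be sketched as follows. The Report class and its method names are hypothetical, invented only for this illustration. The sketch uses is_callable to check a callback before calling it, which avoids triggering the error at all.

```php
<?php
// Hypothetical class with one public and one private method.
class Report
{
    public function summary($name)
    {
        return "summary for $name";
    }

    private function secret($name)
    {
        return "secret for $name";
    }

    public function viaInside($name)
    {
        // Inside the class scope the private method is a valid callback.
        return call_user_func_array(array($this, 'secret'), array($name));
    }
}

$report = new Report();

// is_callable() respects visibility from the calling scope, so it can be
// used to test a callback before handing it to call_user_func_array().
$publicOk  = is_callable(array($report, 'summary'));
$privateOk = is_callable(array($report, 'secret'));

$result = call_user_func_array(array($report, 'summary'), array('March'));
$inside = $report->viaInside('March');

var_dump($publicOk, $privateOk);
echo $result, "\n", $inside, "\n";
```

If the method really must be called from outside, making it public or exposing it through a public wrapper, as viaInside does here, are the simplest ways around the problem.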

2 Comments

Danp129

thanks a lot!

15 October 2009

maarlin

Many thanks!! :) I've been bothering with it few months and didn't know how to work with invalid HTML, your article really helped... (either it's so simple... :D ) little pitty, that I haven't found it earlier...

02 April 2010
