So, let’s assume you’ve got a PHP project where you’re scraping pages and trying to parse fields out of the DOM. Up till now, I’ve just used regular expressions because they’re easy. I avoided trying to parse HTML as XML using SimpleXML because there are just too many cases where it would fail due to invalid tags. Well, I feel like an idiot. It turns out there’s a great extension built into PHP for parsing HTML, even when it isn’t well-formed, and it’s the DOM extension.
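To give a rough idea of what that looks like, here is a minimal sketch using `DOMDocument::loadHTML()` and `DOMXPath`. The markup, the `title` class, and the variable names are all made up for illustration; a real scraper would load the HTML it fetched instead of a hard-coded string.

```php
<?php
// Hypothetical, deliberately sloppy markup: the <li> and <p> tags are never closed.
$html = '<html><body><h1 class="title">Hello</h1>'
      . '<ul><li>One<li>Two</ul><p>Unclosed paragraph</body></html>';

// Collect parser warnings internally instead of printing them for every bad tag.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html); // the HTML parser tolerates markup that is not well-formed

// Discard the warnings we collected; we only care about the resulting tree.
libxml_clear_errors();

// Query the parsed tree with XPath instead of regular expressions.
$xpath = new DOMXPath($doc);

// Grab the text of the first <h1 class="title">.
$title = $xpath->query('//h1[@class="title"]')->item(0)->textContent;

// Collect the text of every <li> in the document.
$items = [];
foreach ($xpath->query('//li') as $li) {
    $items[] = trim($li->textContent);
}

echo $title, PHP_EOL;
echo implode(', ', $items), PHP_EOL;
```

The call to `libxml_use_internal_errors(true)` matters for scraped pages: without it, PHP emits a warning for each piece of invalid markup the parser runs into.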