In this post, let’s see how you can parse XML data using PHP.
On today's web, XML is commonly used for two purposes:
- to publish blog post feeds
- to publish website sitemaps
To show how it works, let me take a blog post feed as the example. Here is the RSS feed of one of my blogs, which is built with WordPress.
It lists the posts recently published on the blog. I will take this document as input, parse it with PHP, and display the posts on a separate web page. That is essentially how a feed reader application like Feedly works.
Using cURL to fetch a remote XML feed
We will use cURL to fetch the feed from the remote URL:
<?php
$url = "https://www.coralnodes.com/feed/";
$handle = curl_init();
We need to set a couple of options. The first is CURLOPT_URL, which tells cURL to request the URL we defined above. Then set CURLOPT_RETURNTRANSFER to true so that curl_exec() returns the response as a string instead of printing it directly.
curl_setopt($handle, CURLOPT_URL, $url);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
Also set CURLOPT_FOLLOWLOCATION to true so that the request follows any redirects to the final destination, for instance an HTTP to HTTPS redirect or a non-www to www redirect.
curl_setopt($handle, CURLOPT_FOLLOWLOCATION, true);
In our case we have given the exact URL, so omitting this option should not cause a problem. Then execute the request by calling the curl_exec() function with the handle variable. It returns false if the request fails, so it is worth checking the result before parsing it. Finally, let's close the connection by calling the curl_close() function.
$res = curl_exec($handle);
curl_close($handle);
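The snippet above assumes the request always succeeds. As a minimal sketch of the same steps with basic error handling, the fetch could be wrapped in a function. Note that fetch_feed() and its timeout parameter are hypothetical names introduced here for illustration; they are not part of the original code.

```php
<?php
/**
 * Fetch a URL with cURL and return the response body, or throw on failure.
 * fetch_feed() is a hypothetical helper used only for this sketch.
 */
function fetch_feed(string $url, int $timeoutSeconds = 10): string
{
    $handle = curl_init();
    curl_setopt_array($handle, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow HTTP to HTTPS and similar redirects
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ]);

    $body = curl_exec($handle);

    if ($body === false) {
        // Transport-level failure: DNS error, refused connection, timeout, etc.
        throw new RuntimeException('cURL request failed: ' . curl_error($handle));
    }

    // A completed request can still carry an error status such as 404 or 500.
    $status = curl_getinfo($handle, CURLINFO_RESPONSE_CODE);
    if ($status >= 400) {
        throw new RuntimeException("Unexpected HTTP status: $status");
    }

    // curl_close() is omitted: since PHP 8.0 it has no effect, and the
    // handle is released when $handle goes out of scope.
    return $body;
}
```

It would be called as `$res = fetch_feed("https://www.coralnodes.com/feed/");` inside a try/catch block, so that a failed request produces a readable message rather than a parse error later on.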
Parsing using SimpleXMLElement
Next we want to parse this response. Fortunately, PHP provides a built-in class called SimpleXMLElement to do that.
$feed = new SimpleXMLElement($res);
The constructor of the class accepts the XML document as a string.
In other words, SimpleXMLElement converts the XML elements into PHP objects, so that we can work with them through the properties and methods the class provides.
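To make that concrete without relying on a live feed, here is a small self-contained sketch. The XML string is made up for the example, and the error handling is one possible approach rather than the only one: the constructor throws an Exception when the string is not well-formed XML.

```php
<?php
// A tiny, made-up RSS document so the example runs without network access.
$xml = <<<XML
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item><title>First post</title><link>https://example.com/first</link></item>
    <item><title>Second post</title><link>https://example.com/second</link></item>
  </channel>
</rss>
XML;

// Collect libxml errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

try {
    $feed = new SimpleXMLElement($xml);
} catch (Exception $e) {
    // Log what libxml complained about, then stop.
    foreach (libxml_get_errors() as $error) {
        error_log(trim($error->message));
    }
    libxml_clear_errors();
    exit("Could not parse the feed.\n");
}

// Child elements become properties; cast to string to get their text.
$blogTitle  = (string) $feed->channel->title;
$firstTitle = (string) $feed->channel->item[0]->title;
$itemCount  = count($feed->channel->item);

echo "$blogTitle has $itemCount items, starting with \"$firstTitle\".\n";
// prints: Example Blog has 2 items, starting with "First post".
```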
Let's see how we can display the post title and description using these objects. Close the PHP tag and then start the HTML document.
$feed = new SimpleXMLElement($res);
?><!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>XML Parsing</title>
</head>
<body>
</body>
</html>
To get the posts, we have to iterate over the item elements inside channel. So let's open a foreach loop inside PHP tags:
<body>
<?php foreach($feed->channel->item as $item) : ?>
<article>
<h2><?= $item->title ?></h2>
<p><?= $item->description ?></p>
</article>
<?php endforeach; ?>
</body>
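Reload the page and we can see a list of blog posts with their titles and descriptions. Suppose I also want to show the publication date and the author's name below each post. The author is stored in the dc:creator element, which belongs to the dc namespace, so we read it through the children() method: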
<div>
<?php
$dt = new DateTime((string) $item->pubDate); // cast the element to a plain string first
$pub_date = $dt->format('l, F d Y');
?>
written by <?= $item->children('dc', true)->creator ?> on <?= $pub_date ?>
</div>
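The namespace lookup is easier to follow in isolation. The following sketch uses a made-up feed item and assumes the dc prefix is bound to the Dublin Core namespace URI shown in the code. It also escapes the title before output, which the template above does not do.

```php
<?php
// Made-up feed item; the dc prefix is bound to the Dublin Core namespace.
$xml = <<<XML
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item>
      <title>Hello &amp; welcome</title>
      <pubDate>Mon, 01 Jan 2024 10:00:00 +0000</pubDate>
      <dc:creator>Jane Doe</dc:creator>
    </item>
  </channel>
</rss>
XML;

$feed = new SimpleXMLElement($xml);
$item = $feed->channel->item[0];

// Plain property access skips prefixed elements, so this is an empty string.
$withoutNamespace = (string) $item->creator;

// Look the element up by prefix (second argument true means "this is a prefix").
$author = (string) $item->children('dc', true)->creator;

// The same lookup by namespace URI, independent of the prefix the feed chose.
$authorByUri = (string) $item->children('http://purl.org/dc/elements/1.1/')->creator;

// Cast to string before handing the value to DateTime.
$date    = new DateTime((string) $item->pubDate);
$pubDate = $date->format('l, F d Y');

// Escape text before placing it in HTML.
$title = htmlspecialchars((string) $item->title, ENT_QUOTES, 'UTF-8');

echo "$title, written by $author on $pubDate\n";
// prints: Hello &amp; welcome, written by Jane Doe on Monday, January 01 2024
```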
Creating XML
That’s how you can parse an XML document. Next let’s see how you can create an XML document using PHP.
For instance, to create an XML sitemap from the above feed data, you can do it like this:
<?php
$url = "https://www.coralnodes.com/feed/";
$handle = curl_init();
curl_setopt($handle, CURLOPT_URL, $url);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
curl_setopt($handle, CURLOPT_FOLLOWLOCATION, true);
$response = curl_exec($handle);
curl_close($handle);
$feed = new SimpleXMLElement($response);
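// Declare the sitemap namespace on the root element itself.
$sitemap = new SimpleXMLElement('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"/>');
foreach ($feed->channel->item as $item) {
    // $entry avoids overwriting the $url string defined at the top.
    $entry = $sitemap->addChild("url");
    // addChild() does not escape "&", so escape the link first.
    $entry->addChild("loc", htmlspecialchars((string) $item->link, ENT_XML1));
    $entry->addChild("changefreq", "monthly");
}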
$saved_sitemap = $sitemap->asXML();
echo $saved_sitemap;
file_put_contents("sitemap.xml", $saved_sitemap);
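One detail worth checking when generating XML this way is how special characters in the values are handled, because addChild() does not escape ampersands on its own. The sketch below builds a sitemap from a hypothetical list of URLs (standing in for the feed links), escapes each value, and then parses the result again to confirm it is well-formed.

```php
<?php
// Hypothetical URLs standing in for the feed's <link> values.
$links = [
    'https://example.com/first-post/',
    'https://example.com/search?q=xml&page=2', // note the ampersand
];

$sitemap = new SimpleXMLElement(
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"/>'
);

foreach ($links as $link) {
    $entry = $sitemap->addChild('url');
    // Escape the value first, since addChild() leaves "&" as it is.
    $entry->addChild('loc', htmlspecialchars($link, ENT_XML1, 'UTF-8'));
    $entry->addChild('changefreq', 'monthly');
}

$xml = $sitemap->asXML();

// Round-trip check: parse the generated string and read the values back.
$parsed = new SimpleXMLElement($xml);
$count  = count($parsed->url);
$second = (string) $parsed->url[1]->loc;

echo "$count URLs, second is $second\n";
```

If the escaping step is left out, a URL containing a query string with several parameters is the typical case that produces a warning and an empty loc element.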
Searching XML using XPath
Suppose I want to find and display the title of every item in the above feed data. The xpath() method takes an XPath expression and returns an array of the matching elements:
<?php
$url = "https://www.coralnodes.com/feed/";
$handle = curl_init();
curl_setopt($handle, CURLOPT_URL, $url);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
curl_setopt($handle, CURLOPT_FOLLOWLOCATION, true);
$response = curl_exec($handle);
curl_close($handle);
$feed = new SimpleXMLElement($response);
$titles = $feed->xpath('/rss/channel/item/title');
foreach($titles as $title) {
echo $title . "<br>";
}
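XPath expressions can also filter the results and reach namespaced elements. As a short sketch on made-up data (the category values and author names are invented for the example), the first query uses a predicate and the second one registers the dc prefix explicitly before using it:

```php
<?php
// Made-up feed with three items in two categories.
$xml = <<<XML
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item><title>Parsing XML</title><category>PHP</category><dc:creator>Jane</dc:creator></item>
    <item><title>Styling forms</title><category>CSS</category><dc:creator>Sam</dc:creator></item>
    <item><title>Using cURL</title><category>PHP</category><dc:creator>Jane</dc:creator></item>
  </channel>
</rss>
XML;

$feed = new SimpleXMLElement($xml);

// Predicate: only titles of items whose <category> is exactly "PHP".
$matches   = $feed->xpath('/rss/channel/item[category="PHP"]/title');
$phpTitles = array_map('strval', $matches);

// Register the prefix explicitly so the query does not depend on
// how the feed itself declares it.
$feed->registerXPathNamespace('dc', 'http://purl.org/dc/elements/1.1/');
$creators = array_map('strval', $feed->xpath('//item/dc:creator'));
$authors  = array_values(array_unique($creators));

echo implode(', ', $phpTitles) . "\n"; // prints: Parsing XML, Using cURL
echo implode(', ', $authors) . "\n";   // prints: Jane, Sam
```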