Creating Threaded Comments With PHP And Postgresql Recursive Query
Contents
Abstract
In previous tutorials, the storing of hierachical data has been demonstrated using Modified Preorder Tree Traversal or MPTT as shown in http://www.phpro.org/tutorials/Managing-Hierarchical-Data-with-PHP-and-MySQL.html. Whilst this system works well and has the advantage of space minimisation, all the maintenance for the hierachy is passed to the application level, and requires quite a few SQL queries, thus trading off many queries at retrieval time for many queries at update time.
A good alternative to the Nested Set Model could be the use of an adjancency list style table with a recursive query to order the items. Parent/child searches are still possible, but in a single recursivie query.
For the purposes of this tutorial, the database used will be PostGreSQL, however, the logic is simply applied to MySQL or SQLite, and some references to these will be made to help on your journey.
Database
The table is simply a table that maintains the hierachy of parent-child relationships in the base table. In this example, a threaded comments system is modelled.
Create a database called blog, and in the database, create an articles table, and a comments table using the following SQL.
CREATE TABLE articles( id SERIAL PRIMARY KEY NOT NULL, title VARCHAR(45) NOT NULL, content TEXT ); CREATE TABLE comments( id SERIAL PRIMARY KEY NOT NULL, content VARCHAR(250) NOT NULL, article_id INT NOT NULL REFERENCES articles(id), posted_time TIMESTAMP DEFAULT NOW(), parent_id INT NOT NULL DEFAULT 0 ); CREATE INDEX comments_parent_id_idx ON comments(parent_id);
The comments table parent_id field will be used to create the hierachical relationship for comments for a given article. For this reason, an index is placed on the table to expediate lookups.
Meanwhile, some seed data for the articles table.
INSERT INTO articles (title, content) VALUES ('Article One', 'This is Article One' ); INSERT INTO articles (title, content) VALUES ('Article Two', 'The Second Article' ); INSERT INTO articles (title, content) VALUES ('Article Three', 'The Third Article Is Weak' ); INSERT INTO articles (title, content) VALUES ('Article Four', 'The Forth Is Strong In This One' );
The articles table should look like this:
id | title | content ----+---------------+--------------------------------- 1 | Article One | This is Article One 2 | Article Two | The Second Article 3 | Article Three | The Third Article Is Weak 4 | Article Four | The Forth Is Strong In This One
Inserting Data
Creating a comment is done in the same way as been done since time immemorial. There is no magic here, so lets create a few comments for the articles above.
select * from comments; id | content | article_id | posted_time | parent_id ----+--------------------------------+------------+----------------------------+----------- 1 | First comment on article one. | 1 | 2016-04-02 19:03:22.765217 | 0 2 | First comment on article two | 2 | 2016-04-02 19:03:22.797671 | 0 3 | First comment on article three | 3 | 2016-04-02 19:03:22.805283 | 0 4 | Second comment on article one | 1 | 2016-04-02 19:03:22.814202 | 0 5 | Second comment on article two | 2 | 2016-04-02 19:03:22.825285 | 0
The above output shows the comments with the parent id, which will become important later, and parent ID is zero for all, denoting this as a root comment, and has no parent.
With the addition of child comments to the comments table, a hierachy begins to emerge.
INSERT INTO comments (content, article_id, parent_id) VALUES ('Comment on first comment in article one', 1, 1 ) RETURNING id; INSERT INTO comments (content, article_id, parent_id) VALUES ('Another comment on first comment in article one', 1, 1 ) RETURNING id;
Retrieving Data
As the comments table needs only a parent_id field to create relationships, much in the same way as an adjacency list, a mechanism is needed to build a tree and iterate through it recursively. PostGreSQL provides Common Table Expressions (CTE)s. A CTE provides a temporary table in memory which exists only for the lifetime of the query.
The recursive query should be used any time multiple levels of descendants is required, such as with the comments table, which returns all comments under a specified one, in tree order.
The order can be defined as a path, and can be created within the CTE.
-- fetch entire tree for an article WITH RECURSIVE cte (id, content, path, parent_id, depth) AS ( SELECT id, content, array[id] AS path, parent_id, 1 AS depth FROM comments WHERE parent_id=0 AND article_id=1 UNION ALL SELECT comments.id, comments.content, cte.path || comments.id, comments.parent_id, cte.depth + 1 AS depth FROM comments JOIN cte ON comments.parent_id = cte.id ) SELECT id, content, path, depth FROM cte ORDER BY path;
The above code will display all the comments for the first article. Note that there are two comments with the parent_id of zero, denoting these are seperate threads, the first thread having two child comments.
id | content | path | depth ----+-------------------------------------------------+-------+------- 1 | First comment on article one. | {1} | 1 6 | Comment on first comment in article one | {1,6} | 2 7 | Another comment on first comment in article one | {1,7} | 2 4 | Second comment on article one | {4} | 1
By adding some more comments, the full power of this method is easy to see.
Now when the recursive query is run, the hierachical nature of the results is seen
id | content | path | depth ----+-------------------------------------------------+-----------+------- 1 | First comment on article one. | {1} | 1 6 | Comment on first comment in article one | {1,6} | 2 8 | Second comment on first comment in article one | {1,6,8} | 3 9 | This comment is four deep on article one. | {1,6,8,9} | 4 7 | Another comment on first comment in article one | {1,7} | 2 4 | Second comment on article one | {4} | 1
The tree has become evident, and the full path of each comment is easily seen, along with the depth. This will allow for the creation of some userland code to put this into a webpage, such that an article may have comments. By simply changing the article ID in the query, the results for the second article can be seen.
id | content | path | depth ----+-------------------------------+------+------- 2 | First comment on article two | {2} | 1 5 | Second comment on article two | {5} | 1
Creating a HTML List
Now that the data can be represented, there needs to be a way to show the data to the public on a web page.
With a simple function, a HTML list can be created, with the use of PHP PDO. First, lets fetch the comments for the first article
<?php
// database parameters
$dsn = 'pgsql:dbname=cccrm;host=127.0.0.1';
$user = 'kevin';
$password = 'secret_password';
try
{
// connect to database
$dbh = new PDO($dsn, $user, $password);
// fetch comments for first arcticle
$comments = fetchComments( $dbh, 1 );
// format the results
// show the results
print_r( $comments );
}
catch (PDOException $e)
{
echo 'Error: ' . $e->getMessage();
}
/**
*
* Fetch all the comments for an article
*
* @param object $dbh The db object
* @param int $article_id
* @return array
*
*/
function fetchComments( $dbh, $article_id )
{
$sql = "WITH RECURSIVE cte (id, content, path, parent_id, depth) AS (
SELECT id,
content,
array[id] AS path,
parent_id,
1 AS depth
FROM comments
WHERE parent_id=0
AND article_id=:article_id
UNION ALL
SELECT comments.id,
comments.content,
cte.path || comments.id,
comments.parent_id,
cte.depth + 1 AS depth
FROM comments
JOIN cte ON comments.parent_id = cte.id
)
SELECT id, content, path, depth FROM cte
ORDER BY path";
// prepare
$stmt = $dbh->prepare( $sql );
// bind the article_id
$stmt->bindParam( 'article_id', $article_id, PDO::PARAM_INT );
// run the query
$stmt->execute();
// fetch the results
$result = $stmt->fetchAll( PDO::FETCH_ASSOC );
// return the result
return $result;
}
?>
The above code will fetch the tree data, and dump the array of data, to show what is required to create the HTML code to represent the data.
Array ( [0] => Array ( [id] => 1 [content] => First comment on article one. [path] => {1} [depth] => 1 ) [1] => Array ( [id] => 6 [content] => Comment on first comment in article one [path] => {1,6} [depth] => 2 ) [2] => Array ( [id] => 8 [content] => Second comment on first comment in article one [path] => {1,6,8} [depth] => 3 ) [3] => Array ( [id] => 9 [content] => This comment is four deep on article one. [path] => {1,6,8,9} [depth] => 4 ) [4] => Array ( [id] => 7 [content] => Another comment on first comment in article one [path] => {1,7} [depth] => 2 ) [5] => Array ( [id] => 4 [content] => Second comment on article one [path] => {4} [depth] => 1 ) )
So now, all that is left to do is to format the array of data. To do this, the tree needs be formatted into a PHP array for iteration.
Lets jump right into it with the full script
<?php
// database parameters
$dsn = 'pgsql:dbname=cccrm;host=127.0.0.1';
$user = 'kevin';
$password = 'super_secret';
try
{
// connect to database
$dbh = new PDO($dsn, $user, $password);
// fetch comments for first arcticle
$comments = fetchComments( $dbh, 1 );
// format the results
echo toList( $comments );
}
catch (PDOException $e)
{
echo 'Error: ' . $e->getMessage();
}
/**
*
* Fetch all the comments for an article
*
* @param object $dbh The db object
* @param int $article_id
* @return array
*
*/
function fetchComments( $dbh, $article_id )
{
$sql = "WITH RECURSIVE cte (id, content, path, parent_id, depth) AS (
SELECT id,
content,
array[id] AS path,
parent_id,
1 AS depth
FROM comments
WHERE parent_id=0
AND article_id=:article_id
UNION ALL
SELECT comments.id,
comments.content,
cte.path || comments.id,
comments.parent_id,
cte.depth + 1 AS depth
FROM comments
JOIN cte ON comments.parent_id = cte.id
)
SELECT id, content, path, depth FROM cte
ORDER BY path";
// prepare
$stmt = $dbh->prepare( $sql );
// bind the article_id
$stmt->bindParam( 'article_id', $article_id, PDO::PARAM_INT );
// run the query
$stmt->execute();
// fetch the results
$result = $stmt->fetchAll( PDO::FETCH_ASSOC );
// return the result
return $result;
}
/**
*
* Creates a list of comments and replies
*
* @param array $comments
* @return string The formatted HTML list
*
*/
function toList( $comments )
{
// a place holder for the threads
$threads = [];
// loop over the comments
foreach ( $comments as $comment )
{
// a container for replies
$replies = &$threads;
// get each of the IDs from the path
$ids = explode( ',', trim( $comment['path'], '{}' ) );
// pop the id off the end of the array
$commentId = array_pop($ids);
// if we have ID's
if ($ids)
{
// loop over the ids
foreach ( $ids as $id )
{
// check if there is any replies to this comment
if( !isset( $replies[$id] ) )
{
// if not, move along
$replies[$id] = ['comment' => null, 'replies' => []];
}
// add to the replies array
$replies = &$replies[$id]['replies'];
}
}
// and add the comment, with ID to the replies array
$replies[$commentId] = ['comment' => $comment, 'replies' => []];
}
// create an unordered list
$html = "<ul>\n";
// loop over the threads
foreach ( $threads as $thread )
{
// get the formatted thread
$html .= getThread( $thread );
}
// close the list
$html .= "</ul>\n";
// and return with finished product
return $html;
}
/**
*
* Get a thread and all replies under it
*
* @param array $thread
* @return string
*
*/
function getThread($thread)
{
$out = "<li>{$thread['comment']['content']}</li>\n";
if ( $thread['replies'] )
{
$out .= "<li>\n<ul>\n";
foreach ( $thread['replies'] as $reply )
{
$out .= getThread( $reply );
}
$out .= "</ul>\n</li>\n";
}
// return the thread in an HTML list
return $out;
}
?>
The hard work is done by adding some recursion with the getThread() function. Each thread is examined and if found to have replies, the function re-examinse one again, to see if there is yet more replies to each reply.
The resulting list code looks like this.
<li>First comment on article one.</li>
<li>
<ul>
<li>Comment on first comment in article one</li>
<li>
<ul>
<li>Second comment on first comment in article one</li>
<li>
<ul>
<li>This comment is four deep on article one.</li>
</ul>
</li>
</ul>
</li>
<li>Another comment on first comment in article one</li>
</ul>
</li>
<li>Second comment on article one</li>
</ul>
And the list itself, shows clearly the tree structure for the first article
- First comment on article one.
-
- Comment on first comment in article one
-
- Second comment on first comment in article one
-
- This comment is four deep on article one.
- Another comment on first comment in article one
- Second comment on article one
This shows the completed list, and now, any article ID can be used to gain a list of fully threaded comments. But wait, there is more.
Whilst the above code works well, and gets the job done, it could be nicer.. What if, instead of having strings of HTML code within the functions, the list was created programatically using DOM? Yes, that would be nicer. So...
Create Nested Tree With DOM
In the above code, a few simple functions were strung together to achieve the goal of displaying the HTML lists. Here, a different approach is used, and some reusable code in the form of a class which extends the PHP DomDocument class to do the grunt work.
<?php
// Fire up the commentsList class
$list = new commentsList( $comments );
// and simply echo
echo $list;
class commentsList extends DOMDocument
{
/**
* @var string $menu
* @access private
*/
public $root, $comments;
/**
*
* Constructor, Calls parent and sets root node
*
* @access public
* @param string $version
* @param string $encoding
*
*/
public function __construct( $comments )
{
parent::__construct();
// set the comments
$this->comments = $comments;
// format the created HTML
$this->formatOutput = true;
// create the root container
$this->root = $this->appendChild( parent::createElement( 'ul' ) );
// create the treads
$threads = $this->makeThreads();
// create the ul from the treads
$this->toList( $threads );
}
/**
*
* Create the treads array
*
* @access public
*
*/
public function makeThreads()
{
$threads = [];
foreach ($this->comments as $comment)
{
$replies = &$threads;
$ids = explode( ',', trim($comment['path'], '{}' ) );
$commentId = array_pop( $ids );
if ($ids)
{
foreach ( $ids as $id )
{
if ( !isset( $replies[$id] ) )
{
$replies[$id] = ['comment' => null, 'replies' => []];
}
$replies = &$replies[$id]['replies'];
}
}
$replies[$commentId] = ['comment' => $comment, 'replies' => []];
}
return $threads;
}
/**
*
* loop over each thread and set it
*
* @access public
* @param array $threads
*
*/
public function toList( array $threads )
{
foreach ( $threads as $thread )
{
$this-> setThread( $thread );
}
}
/**
*
* Use DOM to create the li
*
* @access public
* @param array $thread
* @param object $node
*
*/
function setThread( $thread, $node=null )
{
$node = is_null( $node ) ? $this->root : $node;
$li = $this->createElement( 'li', $thread['comment']['content'] );
$node->appendChild( $li );
if ( $thread['replies'] )
{
$subli = $this->createElement( 'li' );
$ul = $this->createElement( 'ul' );
foreach ( $thread['replies'] as $reply )
{
self::setThread( $reply, $ul );
}
$subli->appendChild( $ul );
$node->appendChild( $subli );
}
}
/**
*
* Return a string representation of the menu
*
* @access public
* @return string
*
*/
public function __toString()
{
return $this->saveHTML();
}
} // end of class
?>
Credits
A big shout out to Macca on efnet #php for not interfering with the creation of this tutorial.