PHPRO.ORG

Reciprocal Links

By Kevin Waterson

Contents

  1. Abstract
  2. The Database
  3. Collecting Referers
  4. Adding Links
  5. Listing Reciprocal Links

Abstract

This tutorial looks at the process of building reciprocal links, that is, links generated from websites that link to your own page. Links back to a page can be detected from the HTTP Referer header, which PHP exposes through the superglobal variable $_SERVER['HTTP_REFERER']. Care should be taken when using this variable: it is sent by the client, can be forged or omitted, and, as such, should not be trusted. Remember the three golden rules of PHPRO.ORG:

NEVER TRUST USER INPUT

NEVER TRUST USER INPUT

NEVER TRUST USER INPUT

Because the referer will be inserted into a database, it is crucial that proper sanitizing and escaping are used to prevent malicious use.
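As a rough illustration of treating the referer as untrusted input, the sketch below cleans and validates the value before it goes anywhere near the database. The cleanReferer() function is hypothetical and not part of the class in this tutorial; the http/https restriction and the length check against the 1024-character column are assumptions about what a sensible policy might look like.

```php
<?php
/**
 * Hypothetical helper (not part of the tutorial's class): returns a cleaned
 * http(s) URL, or null when the referer is missing or not a usable URL.
 */
function cleanReferer(?string $raw): ?string
{
    if ($raw === null || $raw === '') {
        return null;
    }

    // Strip characters that are not legal in a URL.
    $url = filter_var($raw, FILTER_SANITIZE_URL);

    // Reject anything that does not parse as a complete URL.
    if ($url === false || filter_var($url, FILTER_VALIDATE_URL) === false) {
        return null;
    }

    // Assumed policy: only accept web links, which rejects schemes
    // such as javascript: or ftp:.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    if ($scheme !== 'http' && $scheme !== 'https') {
        return null;
    }

    // Respect the 1024-character limit of reciprical_link_name.
    return strlen($url) <= 1024 ? $url : null;
}

// The header is optional, so fall back to null when it is absent.
$referer = cleanReferer($_SERVER['HTTP_REFERER'] ?? null);
```

A null result simply means there is nothing worth storing for this request.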

The Database

For collecting the referer links, a MySQL database will be used to store and manipulate them. The reciprical_links table that will store them is quite basic and requires only three fields.

CREATE TABLE reciprical_links (
  reciprical_link_id int(11) NOT NULL AUTO_INCREMENT,
  reciprical_link_name varchar(1024) NOT NULL,
  reciprical_link_active tinyint(1) NOT NULL DEFAULT '0',
  PRIMARY KEY (reciprical_link_id),
  UNIQUE KEY reciprical_link_name (reciprical_link_name)
) ENGINE=InnoDB;

The above table defines three fields for id, name and active. The reciprical_link_name field, which stores the referer URL, is made unique so that duplicates do not occur. This is a benefit when inserting links from referers, as it can save a second call to the database.

Without the UNIQUE constraint, one call to the database would be needed to check whether the URL exists and, if it is not already in the database, a second call to perform the INSERT. Making the field UNIQUE prevents duplicates, but MySQL then raises an error when an attempt is made to INSERT a link that already exists.

To counter this error, INSERT IGNORE is used, which causes MySQL to skip the duplicate row without raising an error, so no INSERT is performed.

The reciprical_link_active field is a simple flag that contains 1 for an active link and 0 for an inactive one.

Collecting Referers

The task of collecting referers is quite simple. Calling the insertReferer method puts the $_SERVER['HTTP_REFERER'] value into the database. The method sanitizes the URL with FILTER_SANITIZE_URL, which strips characters that are not legal in a URL, and uses a prepared statement so that the value cannot alter the SQL query.

It is also here that INSERT IGNORE is used to save a second database query to check if the referring URL already exists in the table.
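The single-query approach can be sketched as a standalone function. This is an illustration only: storeReferer() is a hypothetical name, and it assumes $db is a PDO connection to the MySQL database that holds the reciprical_links table. It also shows one way to find out whether the row was new, by checking the affected row count.

```php
<?php
/**
 * Hypothetical helper: stores a referer and reports whether a new row was
 * created. Assumes $db is a PDO connection to the MySQL database that holds
 * the reciprical_links table.
 */
function storeReferer(PDO $db, string $referer): bool
{
    $sql = "INSERT IGNORE INTO reciprical_links (reciprical_link_name)
            VALUES (:reciprical_link_name)";

    $stmt = $db->prepare($sql);

    // The value is bound as a parameter, never concatenated into the SQL string.
    $stmt->bindValue(':reciprical_link_name', $referer, PDO::PARAM_STR);
    $stmt->execute();

    // MySQL reports 1 affected row for a new link and 0 for an ignored duplicate.
    return $stmt->rowCount() === 1;
}
```

One thing to keep in mind is that IGNORE does more than skip duplicate keys: MySQL also downgrades some other errors for that statement to warnings, so it is worth validating the value before inserting it.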

Adding Links

The addLinks method performs the task of checking if the link is real, and updating the database to reflect this. It first calls the getLinks() method to get all the referers from the database that are not active, then loops over each link and requests the page that it refers to. The parsePage() method fetches the page and verifies that a link to the home_link actually exists. If there is indeed a link on the page, the reciprical_link_active field is updated to one (1) to record this.
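The check for a link back can be sketched on its own, without any network access, by working on an HTML string. Note that this is a variation rather than the class's method: parsePage() searches each href for the home_link string, whereas the hypothetical pageLinksTo() below compares the host part of each href, which avoids matching a URL that merely mentions the home site in its query string.

```php
<?php
/**
 * Hypothetical helper: reports whether any <a href> in $html points at
 * $homeHost or one of its subdomains.
 */
function pageLinksTo(string $html, string $homeHost): bool
{
    if (trim($html) === '') {
        return false;
    }

    // Collect parser warnings instead of printing them; real pages are
    // rarely valid HTML.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $homeHost = strtolower($homeHost);
    $suffix = '.' . $homeHost;

    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $host = parse_url($anchor->getAttribute('href'), PHP_URL_HOST);
        if (!is_string($host)) {
            continue; // relative link, mailto:, or unparsable href
        }
        $host = strtolower($host);

        // Accept the exact host or any subdomain of it.
        if ($host === $homeHost || substr($host, -strlen($suffix)) === $suffix) {
            return true;
        }
    }

    return false;
}

$html = '<p><a href="http://www.phpro.org/tutorials/">PHP tutorials</a></p>';
$found = pageLinksTo($html, 'phpro.org'); // true for this sample markup
```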

Of course, this process takes quite a bit of time because every inactive referer requires an HTTP request. For a large database of inactive referers, this function should not be used more than once a day, perhaps by running a script from cron.

When the list of referers has been traversed and the database updated, the referer links that remain inactive are deleted to prevent bloating in the database.

Listing Reciprocal Links

Creating a list of reciprocal links is now simply a matter of selecting the active links from the database and formatting them in a desired fashion. A getDomain() method is provided in the class to get the scheme and host of a URL, which can then be used as a link or as the text for a link.
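The formatting step might look something like the sketch below. Both functions are hypothetical: domainOf() mirrors what getDomain() does but returns null when the URL has no scheme or host, and linkHtml() escapes the stored value before printing it, since the link originally came from user input.

```php
<?php
/**
 * Hypothetical variant of getDomain(): returns "scheme://host", or null when
 * the URL has no scheme or host to report.
 */
function domainOf(string $link): ?string
{
    $parts = parse_url($link);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return null;
    }
    return $parts['scheme'] . '://' . $parts['host'];
}

/**
 * Formats one stored link as an HTML anchor, escaping both the href and
 * the link text.
 */
function linkHtml(string $link): string
{
    // Fall back to the full link when no domain can be extracted.
    $text = domainOf($link) ?? $link;

    return '<a href="' . htmlspecialchars($link, ENT_QUOTES, 'UTF-8') . '">'
        . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . '</a>';
}

$anchor = linkHtml('http://example.com/links.html?a=1&b=2');
// $anchor is now:
// <a href="http://example.com/links.html?a=1&amp;b=2">http://example.com</a>
```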


<?php

class recipricalLink
{
    /**
     * @var string the home link to look for on referring pages
     */
    private $home_link;

    /**
     * Set the home link
     *
     * @access public
     *
     * @param string $link
     */
    public function setHomeLink($link)
    {
        $this->home_link = $link;
    }

    
    /**
     * Put referer into db
     *
     * @access public
     *
     * @param string $referer
     */
    public function insertReferer($referer)
    {
        /*** clean up the referer ***/
        $referer = filter_var($referer, FILTER_SANITIZE_URL);

        /*** connection ***/
        $db = db::getInstance();

        /*** IGNORE skips the insert when the link already exists ***/
        $sql = "INSERT
            IGNORE
            INTO
            reciprical_links(
            reciprical_link_name)
            VALUES(
            :reciprical_link_name)";

        $stmt = $db->prepare($sql);
        $stmt->bindParam(':reciprical_link_name', $referer, PDO::PARAM_STR);
        $stmt->execute();
    }


    
    /**
     * Get reciprocal links from db
     *
     * @access public
     *
     * @param int $active 1 for active links, 0 for inactive links
     *
     * @return array
     */
    public function getLinks($active)
    {
        $db = db::getInstance();

        $sql = "SELECT
            reciprical_link_id,
            reciprical_link_name
            FROM
            reciprical_links
            WHERE
            reciprical_link_active = :reciprical_link_active";

        $stmt = $db->prepare($sql);
        $stmt->bindParam(':reciprical_link_active', $active, PDO::PARAM_INT);
        $stmt->execute();

        return $stmt->fetchAll(PDO::FETCH_ASSOC);
    }

    
    /**
     * Check each inactive referer page for a link back and activate it
     *
     * @access public
     */
    public function addLinks()
    {
        /*** get inactive links ***/
        $links = $this->getLinks(0);

        /*** loop through the links ***/
        foreach ($links as $link)
        {
            /*** check if the link is on the referer page ***/
            if ($this->parsePage($link['reciprical_link_name']) == true)
            {
                /*** activate the link ***/
                $this->activateLink($link['reciprical_link_name']);
            }
        }

        /*** to tidy up, delete any remaining inactive links ***/
        $this->deleteInactiveLinks();
    }


    
    /**
     * Delete inactive links
     *
     * @access private
     */
    private function deleteInactiveLinks()
    {
        $sql = "DELETE FROM reciprical_links WHERE reciprical_link_active = 0";
        db::getInstance()->query($sql);
    }

    
    /**
     * Update link status to active
     *
     * @access private
     *
     * @param string $link
     */
    private function activateLink($link)
    {
        $sql = "UPDATE
            reciprical_links
            SET
            reciprical_link_active = 1
            WHERE
            reciprical_link_name = :reciprical_link_name";

        $db = db::getInstance();
        $stmt = $db->prepare($sql);
        $stmt->bindParam(':reciprical_link_name', $link);
        $stmt->execute();
    }

    
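    /**
     * Parse url and search for link back
     *
     * @access private
     *
     * @param string $link
     *
     * @return bool
     */
    private function parsePage($link)
    {
        /*** fetch the page, give up if it cannot be retrieved ***/
        $html = @file_get_contents($link);
        if ($html === false || trim($html) === '')
        {
            return false;
        }

        $dom = new DOMDocument;
        /*** must be set before the document is loaded ***/
        $dom->preserveWhiteSpace = false;
        /*** suppress warnings from badly formed HTML ***/
        @$dom->loadHTML($html);

        $links = $dom->getElementsByTagName('a');
        foreach ($links as $tag)
        {
            /*** strpos returns 0 for a match at position zero, so compare with false ***/
            if (strpos($tag->getAttribute('href'), $this->home_link) !== false)
            {
                return true;
            }
        }

        /*** if no url is found ***/
        return false;
    }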


    
    /**
     * Get the domain part of a url
     *
     * @access public
     *
     * @param string $link
     *
     * @return string
     */
    public function getDomain($link)
    {
        /*** get the url parts ***/
        $parts = parse_url($link);

        /*** return the scheme and host ***/
        return $parts['scheme'].'://'.$parts['host'];
    }

} /*** end of class ***/


class db
{
    /*** Declare instance ***/
    private static $instance = NULL;

    /**
    *
    * the constructor is set to private
    * so nobody can create a new instance using new
    *
    */
    private function __construct()
    {
        /*** maybe set the db name here later ***/
    }

    /**
    *
    * Return DB instance or create initial connection
    *
    * @return object (PDO)
    *
    * @access public
    *
    */
    public static function getInstance()
    {
        if (!self::$instance)
        {
            $hostname = 'localhost';
            $dbname = 'test';
            $db_username = 'username';
            $db_password = 'password';
            $db_port = 3306;

            self::$instance = new PDO("mysql:host=$hostname;port=$db_port;dbname=$dbname", $db_username, $db_password);
            self::$instance->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
        }
        return self::$instance;
    }

    /**
    *
    * Like the constructor, we make __clone private
    * so nobody can clone the instance
    *
    */
    private function __clone()
    {
    }

} /*** end of class ***/

Example Usage


<?php

try
{
    if (isset($_SERVER['HTTP_REFERER']))
    {
        /*** a new object ***/
        $rl = new recipricalLink;

        /*** insert a referer ***/
        $rl->insertReferer($_SERVER['HTTP_REFERER']);

        /*** set the home link ***/
        $rl->setHomeLink('phpro.org');

        /*** check the referers; on a busy site run this from cron instead ***/
        $rl->addLinks();

        /*** get active referers from db ***/
        $links = $rl->getLinks(1);
        foreach ($links as $link)
        {
            /*** escape the stored values, they came from user input ***/
            echo '<a href="'.htmlspecialchars($link['reciprical_link_name'], ENT_QUOTES).'">'.htmlspecialchars($rl->getDomain($link['reciprical_link_name']), ENT_QUOTES).'</a><br />';
        }
    }
    else
    {
        echo 'No Referer';
    }
}
catch (Exception $e)
{
    echo $e->getMessage();
}

?>