How to make a tag cloud in PHP, MySQL and CSS

Update May 2011: Almost 4 years on, this post continues to get heaps of traffic and comments so i’m going to attempt to answer some of the most FAQ. I have added the MySQL table structure and demonstrated how a basic search page would work to increment the search term counter. I’ve also added some pretty CSS3 and added a last search date for each term. If you just want the files to download, skip to the end.

Tag Cloud
Sample Tag Cloud

Today I wanted to make a tag cloud, but all my searching proved fruitless – so I decided to stop being lazy and just work it out myself!

First you’ll need a database that stores the keywords and search terms.

Each time a user performs a search, we will check to see if the term has been searched before. If it has, we will increment the counter for that term. If the term has not been searched for before, we will insert the search term with a counter value of 1.

Lets begin with a basic MySQL table structure for the search table:

CREATE TABLE <code>search</code> (
  <code>id</code> int(11) NOT NULL AUTO_INCREMENT,
  <code>term</code> varchar(255) NOT NULL DEFAULT '',
  <code>counter</code> int(11) NOT NULL DEFAULT '1',
  <code>last_search</code> datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (<code>id</code>),
  KEY <code>term</code> (<code>term</code>)
) ENGINE=InnoDB;

Now lets make a search page. Here is some basic HTML to get started with:

<form id="search" action="?action=search" method="get">
  <input id="term" type="text" name="term" />
  <input id="submit" type="submit" name="submit" value="Search" />
</form>

In the next section, we are going to accomplish three things:

  1. Connect to the database
  2. Handle new searches so that the counter can be updated
  3. query the database searches to build a PHP array of search terms
// connect to the database
$conn = mysql_connect('localhost', 'username', 'password');
if (!$conn or !mysql_select_db('my_database', $conn)) die('cannot connect to db');

// handle new searches
if (isset($_GET['action']) and $_GET['action'] == 'search')
{
    // get the current time
    $now = date("Y-m-d H:i:s");

    // get the submitted term and prepare it for the database query
    $term = mysql_real_escape_string(strip_tags(trim($_GET['term'])));

    // check if the term has been submitted before
    if (mysql_result(mysql_query("SELECT COUNT(id) FROM search WHERE term = '$term'"), 0) > 0)
    {
        // the term exists - update the counter and the last search timestamp
        mysql_query("UPDATE search SET counter = counter+1, last_search = '$now' WHERE term = '$term'");
    } else {
        // the term does not exist - insert a new record
        mysql_query("INSERT INTO search (term, last_search) VALUES ('$term', '$now')");
    }
}

// prepare the tag cloud array for display
$terms = array(); // create empty array
$maximum = 0; // $maximum is the highest counter for a search term

$query = mysql_query("SELECT term, counter FROM search ORDER BY counter DESC LIMIT 30");

while ($row = mysql_fetch_array($query))
{
	$term = $row['term'];
	$counter = $row['counter'];

	// update $maximum if this term is more popular than the previous terms
	if ($counter > $maximum) $maximum = $counter;

	$terms[] = array('term' => $term, 'counter' => $counter);

}

// shuffle terms unless you want to retain the order of highest to lowest
shuffle($terms);

Next we’ll setup a bit of css. Feel free to adjust these as you see fit.

#tagcloud {
	width: 300px;
	background:#CFE3FF;
	color:#0066FF;
	padding: 10px;
	border: 1px solid #559DFF;
	text-align:center;
    -moz-border-radius: 4px;
    -webkit-border-radius: 4px;
    border-radius: 4px;
}

#tagcloud a:link, #tagcloud a:visited {
	text-decoration:none;
    color: #333;
}

#tagcloud a:hover {
	text-decoration: underline;
}

#tagcloud span {
	padding: 4px;
}

#tagcloud .smallest {
	font-size: x-small;
}

#tagcloud .small {
	font-size: small;
}

#tagcloud .medium {
	font-size:medium;
}

#tagcloud .large {
	font-size:large;
}

#tagcloud .largest {
	font-size:larger;
}

Finally we just want to loop through the array and display it with the appropriate css class.


<h3>Popular Searches</h3>
<div id="tagcloud">
<?php 
// start looping through the tags
foreach ($terms as $term):
 // determine the popularity of this term as a percentage
 $percent = floor(($term&#91;'counter'&#93; / $maximum) * 100);

 // determine the class for this term based on the percentage
 if ($percent < 20): 
   $class = 'smallest'; 
 elseif ($percent >= 20 and $percent < 40):
   $class = 'small'; 
 elseif ($percent >= 40 and $percent < 60):
   $class = 'medium';
 elseif ($percent >= 60 and $percent < 80):
   $class = 'large';
 else:
 $class = 'largest';
 endif;
?>
<span class="<?php echo $class; ?>">
  <a href="search.php?search=<?php echo $term&#91;'term'&#93;; ?>"><?php echo $term&#91;'term'&#93;; ?></a>
</span>
<?php endforeach; ?>
</div>

And thats it! Hope this saves someone some time. Add your comments below!

I have combined this tutorial into a single file,  download the full example here.

What is a Cron Job?

Q: What is a cron job?

A: A cron job is an automated task that runs on a linux server, similar to Scheduled Tasks on windows.

The main idea is that you can run a process without you or someone else manually setting it off – my first experience with cron jobs was when working with an online auction website. As you can imagine, an auction runs for a fixed time period. So lets say an auction runs for 7 days. What happens at the end of 7 days? The database says this auction is closed. If the auction did not sell, whether the reserve was not met, or their were no bids at all, then we want to automatically relist that auction.

There are two options:

a) run a process when a page load occurs… sucks because the page load is gunna take longer due to the extra processes occurring

b) run a cron job every x minutes, looking for auctions that have passed their end time, and either relist them or close them down permanently

I tend to code in PHP and MySQL, which means that I create a PHP file that does all the maintenance tasks required and I run it at set intervals. The cool thing is, if you want to be running different tasks at multiple times of the day/week/year, just setup multiple cron jobs.

Thats great, but how the hell do I setup a cron job?

I’m glad you asked!

If you are comfortable working with command line, then you might want to visit a site like this. If, like me, you have access to cPanel, life is much, much easier.

How to setup a cronjob in cPanel

1. login to cPanel

2. click on the cron jobs tab (near the bottom)

3. Select your time you want the script to run. The options are pretty straightforward. If you select Minutes > 30 that means the script will run at xx:30am/pm. You can also hold the Ctrl key to select multiple minutes/hours etc.

4. Enter the “command to run”. For my purposes, I generally want to load a php file. It doesn’t hurt to put these files outside of the public web folder “just in case”. Something like this should do the trick on a standard cPanel/WHM server:
/usr/local/bin/php -f /home/myusername/scripts/ascript.php

The first part is the “path to php“. You might want to ask your hosting providers what this is, if the above doesn’t work. The second part is the absolute path to your file. By default on a cPanel server the structure is /home/your_account_username/ where you can modify everything inside your account folder. Generally contained in the account username folder is a www/public_html folder with your live web files.

Hope this is of some help – cron jobs are a breeze – once they are running as desired its fun dreaming up new ways to make them more and more sophisticated… after all, now the server is actively taking away the daily grind!

If you have any questions do not hesitate to ask below.

MySQL Reserved Words

The database language MySQL has a number of reserved words that should not be used in queries or for field names. Some of the big ones are:

  • Add
  • All
  • Between
  • Both
  • Check
  • Current_Date
  • Drop
  • For
  • From
  • Group
  • INT
  • Key
  • Load
  • Lock
  • Null
  • On
  • Out
  • Real
  • Repeat
  • Table
  • Use
  • Write

And so it goes. I came across the official list when I couldn’t work out why this query wouldn’t work:
[SQL]SELECT COUNT(id) FROM messages WHERE read = ‘0′ AND to = $logged_in_id[/SQL]

The reason? Both “to” and “read” are reserved words and the query failed. See the full list here:

http://dev.mysql.com/doc/refman/5.0/en/reserved-words.html