HighlyStructured.com is a blog website run by Mike D'Agostino about search engine positioning, online marketing, php/MySQL, tae kwon do, and various other topics.

Creating a Static HTML Archive from a MySQL Database Using PHP

December 05, 2005

In an earlier entry I described how this site was created. In essence, I've created my own content management system which uses a MySQL database to store all the data, and a PHP script that converts the data into static HTML. This ensures that all the data stored in the database will get picked up by the search engines. But, there is much more involved than that and I feel as though I owe more of an explanation.

Benefits of creating static HTML pages from data stored in a database

The number one benefit is that all your precious content stored in your database will have a better chance of getting picked up on Google and the other search engines and will give your pages a better chance of ranking higher. On other levels, using this method allows you to build in to your database all kinds of opportunities to optimize your web pages.

Database structure

This website uses a small number of tables in the MySQL database. Here's a description:

Base elements:
This table contains basic elements that will be used on every page of the website. These include the header image (right now the HighlyStructured.com logo), the footer element, along with beginning and ending HTML code (explained further below). Think of this table as holding global variables.

As the name implies, this table contains all the categories that all the postings on the site can fall in to. In addition to just the names of the categories, I've included room to specify the content for the title tag, keywords and description META tags, and filename.

This table contains the data for every posting on this site. Again, along with the basic name/description field, I also have room to enter in individual title tags, keywords and description META tags, and the filename.

Static pages:
This table contains all the data for the "static" pages of the site. When I say static I mean pages that appear above the "Categories" navigation section and appear on every page. Just like the Postings database table, these pages have fields for the title tag, keywords and description META tags, and filename. In addition, I've included a "rank" field which allows me to specify the order in which these links will appear.

Now the database is set up and I've constructed my administrative interface that allows me to update everything. I've designed a basic layout and I've converted that layout to HTML using style sheets. In the past I've used tables to construct my layouts (and I still use them for most projects today), although I've been experimenting with CSS and I'm starting to move towards using CSS to control my layouts.

Creating the static HTML pages

Now that everything is set up it is time to convert the database elements to static HTML pages. In essence, what I'm doing is recreating the pages exactly as they would appear if I were using PHP to display the pages. Only I'm storing the resulting code in a variable and writing that variable to a file using PHP.

The first step is to create the individual posting pages and "static" pages. My method is to first call the base elements. These specify the beginning HTML code (DOC TYPE, etc.) that appear before and at the beginning of the HEAD tag.

Next I call each individual posting. Remember that I've built in to the database fields for the title tag, keywords and description META tags, and even the filename. So, I cycle through each posting, get the META tags etc., use the code from the predefined layout I created using HTML/CSS, get in the footer and ending HTML code (stored in the base elements table), and store all that code in one variable.

Finally I use FOPEN to write the contents of the variable to a file with the filename specified for that record.

Now all the individual posting pages are created. The next step is to create the category pages. The method is exactly the same, except you start off by calling the categories first. Once a category is retrieved, I go through all the postings to find which ones correspond to that specific category, and it is added to the category page code. Finally the code (which is again stored in a variable), is written to a file using PHP.

The next step is to create the date archive pages. This is a little tricky as the dates are actually generated from the postings themselves (for example, if there are no postings from February 2006, "February 2006" will not appear as a link). But, since PHP can do anything, it is possible to create a script to organize all the postings according to date. These pages are constructed in a similar fashion to the category pages.

Finally, once all the "static" pages, postings, category pages, and date pages are created, I then create the index.html page. Follow the same method as creating a posting page, except I use the last few postings as the main content for the page.

In addition...

I've included in each table (for the "static" pages, category pages, and individual postings) a field for "active". If active is set to "yes", a static page will be generated. If it is set to "no", the static page will not be generated. This is great except what happens if the record was set to generate a static page (thus creating a file on the web server), and then was switched to be inactive? The file that had previously been created on the web server still exists. The solution is that before any static pages are generated, I have a script that goes in to the web server and deletes all pages that contain the extension ".html". This way I can ensure no static pages exist even if they were generated in the past.


I feel as though I'm only beginning to scratch the surface. I believe there are other elements I can store in the database to help optimize the pages better for Google and the other search engines. I'm using CSS to control the layout of the site so it should make it easier to manipulate the layout and create some more creative designs without having to re-write the static HTML generator scripts and the HTML/CSS code around it.

Technorati Tags:       

Recent Articles



Other Blogs of Interest