Web site design guidelines
This article for web site developers and maintainers describes and explains a set of guidelines for implementing a web site.
This page has now been superseded by my newer page on Web site Architecture.
summary
This article is not about implementing web page design; it is about implementing web site design, and it explains ten guidelines for implementing a web site. These guidelines, for web site developers and maintainers, are designed to meet the requirements that I described in Web site architecture requirements. The guidelines on this page are:
- never change a page's URL
- all URLs shall be at the site's root level
- all internal links shall be relative URLs, consisting of just a file name
- URLs shall use unabbreviated words or common acronyms
- each URL file name shall be the same as, or based on, the page title
- URL file names shall only use the characters a-z, 0-9 and _
- URL file names should have a .html extension
- digits shall only be used in URLs if there is a compelling reason to do so
- for small sites all of the files shall be stored in the site's root directory
- for large sites the files shall be stored in a hierarchical directory structure
navigation model guidelines
A site's navigation model is the logical structure presented to the user as 'the way the site is organised'. I have not written any guidelines for this yet; just remember that
- unlike the file systems the pages are stored on, web sites are not intrinsically hierarchical
- your choice of navigation model should reflect the relationship between the pages on your site
- the 'top' of your site is whichever page the user sees first, so the navigation model must not be based on a particular starting point.
Chances are, you should think of your web site as just a... um... web of pages.
URL organisation guidelines
GL1. Never change a page's URL.
This is the big one - 'every page should exist for ever at the same URL'. This is not as hard as it sounds if you know how to separate page content from URLs.
GL2. All URLs shall be at the site's root level.
URLs should be at the site's root level because of the requirements that
- 'there should be some limit to the length of URLs'
- 'URLs should be "choppable", so that the result is a valid URL'
- 'small sites should have a simple structure'.
This gives you URLs like http://your.site.com/some_page.html rather than URLs like http://your.site.com/lots/of/directories/page.html.
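As a rough illustration of this guideline, the following sketch checks whether a URL sits at the site's root level by counting the slashes in its path. The function name is_root_level is an assumption made for this example; it is not part of any standard tool.

```python
from urllib.parse import urlsplit


def is_root_level(url: str) -> bool:
    """Return True if the URL's path has no directory components.

    Hypothetical helper: a root-level path such as "/some_page.html"
    contains at most one slash (the leading one).
    """
    path = urlsplit(url).path
    return path.count("/") <= 1


print(is_root_level("http://your.site.com/some_page.html"))
print(is_root_level("http://your.site.com/lots/of/directories/page.html"))
```

The first call should report a root-level URL and the second should not, matching the two example URLs above.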
GL3. All internal links shall be relative URLs, consisting of just a file name.
You only need relative links consisting of just a file name, like <A href="just_a_file_name.html">, because of the previous guideline. If you include a line like
<BASE href="http://your.site.com/index.html">
in the HEAD element of each page then all links are resolved relative to your main page. This makes your pages more portable, because you use relative URLs, and makes your links less likely to break, because they are simpler.
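One way to check a page against this guideline is to scan it for internal links that are not bare file names. The sketch below uses Python's standard html.parser and urllib.parse modules. The names LinkCollector, is_bare_file_name and non_conforming_internal_links are made up for this example, and treating a link as 'internal' when it resolves to the same host as the BASE URL is an assumption, not something the guideline spells out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Same value as the BASE href shown above.
BASE = "http://your.site.com/index.html"


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names,
        # so <A HREF=...> is matched as well.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def is_bare_file_name(href: str) -> bool:
    """True if href has no directory part and no scheme."""
    return "/" not in href and ":" not in href


def non_conforming_internal_links(html: str, base: str = BASE) -> list:
    """Return internal links that are not just a file name."""
    collector = LinkCollector()
    collector.feed(html)
    site = urlsplit(base).netloc
    bad = []
    for href in collector.hrefs:
        # Assumption: a link is internal if it resolves to the same host.
        internal = urlsplit(urljoin(base, href)).netloc == site
        if internal and not is_bare_file_name(href):
            bad.append(href)
    return bad
```

For a page that links to just_a_file_name.html, to /lots/of/page.html and to a page on another host, only the second link should be reported. External links are ignored because the guideline only covers internal ones.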
URL naming guidelines
GL4. URLs shall use unabbreviated words or common acronyms.
This is because 'URLs should "read well"' for the user. Words should be spelled correctly; 'common acronyms' are those that you might find in a dictionary, not any acronym that the webmaster happens to understand. This approach makes URLs easier to remember and use correctly, which helps the webmaster as much as the user, who shouldn't ever need to use the URL anyway.
GL5. Each URL file name shall be the same as, or based on, the page title.
This is also because 'URLs should "read well"'. It also gives the webmaster and the user a clue about which page a URL points to.
GL6. URL file names shall only use the characters a-z, 0-9 and _.
This makes the URLs simpler by reducing the number of possible variations, which makes them less likely to be broken. In particular, it makes it easy to get the case right. Underscores should be used to separate words, which makes the URL easier to read.
GL7. URL file names should have a .html extension.
Again, this makes the URLs simpler by reducing the number of possible variations. If you are using some other extension for server-processed pages, such as tcl, pl, shtml or php3, it is a good idea to get the server to use html as an alias for the real extension, so that your URLs don't change if you switch extension.
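GL6 and GL7 together can be expressed as a single pattern. The sketch below is one possible check; the names FILE_NAME_PATTERN and is_valid_file_name are assumptions made for this example.

```python
import re

# One or more of a-z, 0-9 and _ (GL6), followed by the .html extension (GL7).
# The dot before the extension is the only other character allowed.
FILE_NAME_PATTERN = re.compile(r"[a-z0-9_]+\.html")


def is_valid_file_name(name: str) -> bool:
    """Return True if the whole file name matches GL6 and GL7."""
    return FILE_NAME_PATTERN.fullmatch(name) is not None


for candidate in ("some_page.html", "Some-Page.HTML", "some_page.php3"):
    print(candidate, is_valid_file_name(candidate))
```

Only the first candidate should pass: the second uses upper case and a hyphen, and the third has a different extension.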
GL8. Digits shall only be used in URLs if there is a compelling reason to do so.
This is just a sensible naming convention - you should use meaningful names. For example, if you have review.html, review2.html and review3.html then 2 and 3 probably only mean 'another' and 'yet another'.
how to construct a URL file name
To construct a URL file name that follows the above guidelines, simply
- take the page title
- convert the title to lower case
- separate words with underscores, instead of spaces or hyphens
- remove all punctuation, and other non-alphanumeric characters
- add the file extension html
- if the name is too long - more than three or four words - then remove any words that are not essential to the meaning of the title and are not required to make the file name unique.
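The steps above can be sketched as a small function. This is only an illustration: the name url_file_name and the drop_words parameter are assumptions, and deciding which words are 'not essential' is left to the caller, since that step needs human judgement.

```python
import re


def url_file_name(title: str, drop_words=frozenset()) -> str:
    """Build a URL file name from a page title, following the steps above.

    drop_words is a hypothetical parameter: the caller lists any words
    that are not essential to the meaning of the title.
    """
    # Convert the title to lower case.
    name = title.lower()
    # Separate words with underscores, instead of spaces or hyphens.
    name = re.sub(r"[\s\-]+", "_", name)
    # Remove punctuation and any other character outside a-z, 0-9 and _.
    name = re.sub(r"[^a-z0-9_]", "", name)
    # Drop empty pieces and any words the caller considers inessential.
    words = [w for w in name.split("_") if w and w not in drop_words]
    # Add the file extension.
    return "_".join(words) + ".html"


print(url_file_name("Web site design guidelines"))
print(url_file_name("How to construct a URL file name",
                    drop_words={"how", "to", "a"}))
```

Note that this sketch simply discards letters outside a-z, such as accented characters, rather than transliterating them.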
file system guidelines
GL9. For small sites all of the files shall be stored in the site's root directory.
If you don't want to do anything fancy then you shouldn't have to - because 'small sites should have a simple structure'. If your site only consists of four pages then you probably only want to have four HTML files together in the same directory, with URLs that point directly at the files.
GL10. For large sites the files shall be stored in a hierarchical directory structure.
This guideline is here because 'the site maintainers should be able to find files, however big the site'. Unlike the other guidelines, this one requires a little trickery if you want to follow it, while still following all of the others - you need to know...
how to separate page content from URLs
What you need is something like Server Side Includes, or server-side scripting like PHP or Active Server Pages. The idea is that the URL works like a shortcut, or alias, to where the real HTML content is stored, elsewhere in the file system.
For example, with Server Side Includes, the file at http://your.site.com/some_page.html might only contain the single line
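<!--#include virtual="/deep/in/the/file/system/some_file" -->
where some_file does not even need to be an HTML document - it could be a script, say. Note that Apache's mod_include needs the virtual attribute here, because its file attribute does not accept an absolute path.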
With this approach, you can move some_file anywhere you like and you only have to edit some_page.html, rather than changing the URL.
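The same idea can be sketched with server-side scripting instead of Server Side Includes: a lookup table maps each fixed URL file name to wherever the content currently lives. The names content_path and move_content, and the path /elsewhere/some_file, are hypothetical and only serve this illustration.

```python
# Maps the public, permanent URL file name to the current content location.
content_map = {
    "some_page.html": "/deep/in/the/file/system/some_file",
}


def content_path(url_file_name: str, mapping: dict) -> str:
    """Look up where the content for a URL file name is stored."""
    return mapping[url_file_name]


def move_content(url_file_name: str, new_path: str, mapping: dict) -> None:
    """Record a new content location; the URL file name stays the same."""
    mapping[url_file_name] = new_path


print(content_path("some_page.html", content_map))
# Hypothetical new location, used only for illustration.
move_content("some_page.html", "/elsewhere/some_file", content_map)
print(content_path("some_page.html", content_map))
```

After the move, the key some_page.html is unchanged, so the page's URL still satisfies GL1 even though the content now lives somewhere else.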
further reading
- Static Site Development, by Philip Greenspun.
- What is web architecture? (And why should I care?). This article from webreview.com explains the point of having these kinds of guidelines.