Web-Building with XHTML and XSLT
What We Did
XSLT is an open standard defined by
the World Wide Web Consortium [W3C]. It provides a powerful
template-driven syntax for converting XML documents from one format into
another, including into non-XML formats such as text and HTML.
XHTML is a reformulation of HTML4
as valid XML. This means that HTML source files, if constructed in this
way, can be treated as XML files.
We have put these two technologies together to create a simple mechanism for
creating rich, easily maintainable websites.
- The source documents are kept very simple so it is very easy to write
pages, concentrating on the content without worrying about the look and
feel.
- The design elements are kept entirely separate: menus, color
schemes, fonts and layout all exist in separate files to make it easy to
work on these elements individually.
- The site can be automatically reformatted for individual browsers or
different devices (such as handheld or WAP phones): the page author
does not need to worry about this.
- The same techniques can be used for data driven pages.
Read on to find out how to do it.
Step 1: Converting from HTML to XHTML
It's not as difficult as you think - here are two techniques to get you
started:
Writing new XHTML pages
If you use an up-to-date editor such as Microsoft FrontPage 2000 or
Macromedia Dreamweaver 3, tell it to use lower case for elements and attributes.
FrontPage 2000: Menu Tools, Page Options: tab HTML Source:
Formatting section: select Tag names are lowercase, attribute names are
lowercase
Dreamweaver 3: Menu Edit, Preferences: Category HTML Format, set Case
for Tags and Case for Attributes to lowercase.
Create a blank .htm file and edit it in your usual editor. When you've
finished, use the following tips to help convert it to XML:
- Rename the file so that it has the extension ".xml"
- Search and replace "<br>" with "<br />"
- Search and replace "&nbsp;" with "&#160;"
- Change the document headers as per Summary of Issues below
- If you open the .xml file in Internet Explorer 5, it will show you the
first error: go through and correct the errors.
- To get a more complete list of errors, use the W3C
validator.
Converting existing HTML pages
Existing HTML pages often contain errors, as well as predating the new
standards. These errors can go unnoticed for ages because browsers like
Internet Explorer are very tolerant and usually work out how you meant
the page to be.
Help is at hand: the royalty-free tool from the W3C called HTML
TIDY is designed to do just this. It fixes existing errors and makes the page
correctly formed.
Summary of XHTML issues
The W3C XHTML page does tell you
everything you need to know fairly clearly, so this is just a recap of the main
points:
- to add extra white space, use '&#160;' not '&nbsp;'
- use numeric codes instead of named special characters (like the space above):
XML does not allow most of the named codes that HTML does. This
applies to the copyright symbol and so on, eg use '&#169;' not '&copy;'
- all attributes must be in quotes, so <table width="90%">
not <table width=90%>
- all tags must be closed, so an extra trailing ' /' is needed on tags
that are not commonly closed, eg <br /> not <br>
- some common html editors make a mess of lists, especially nested ones like
this
To be XHTML1 compliant and pass the W3C
validator there are a couple of extra gotchas to watch out for:
- all tags and attributes must be lower case (contrary to HTML4) [most good
editors will allow you to set them to use lower case by default]
- certain tags are required and not optional: <title> within
<head> is required for example
- you must include a full document type declaration at the top of the file to
indicate the type of document, as follows:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
- certain tags cannot be nested within other tags: broadly speaking
non-block tags can't contain block tags, so for example
<a>,<span> etc cannot contain <div>
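The points above can be pulled together in one small page. This is only a sketch: the title, text and link target are made up for illustration. The xmlns attribute that the XHTML 1.0 specification expects on <html> is left out here, as an assumption, so that un-prefixed paths such as html/head/* in the Step 2 stylesheet still match; if you add it, those paths need a namespace prefix.

```xml
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
  <head>
    <!-- <title> is required inside <head> -->
    <title>Example page</title>
  </head>
  <body>
    <!-- tag and attribute names are lower case; attribute values are quoted -->
    <table width="90%">
      <tr>
        <!-- &#160; is the numeric code for a non-breaking space -->
        <td>First&#160;cell</td>
      </tr>
    </table>
    <!-- empty elements carry a trailing slash; &#169; is the copyright sign -->
    <p>Line one<br />Line two &#169; Example Ltd</p>
    <!-- a nested list sits inside the <li> of its parent list -->
    <ul>
      <li>Outer item
        <ul>
          <li>Inner item</li>
        </ul>
      </li>
    </ul>
    <!-- the block tag contains the non-block tag, not the other way round -->
    <div><a href="page2.html">A link inside a div</a></div>
  </body>
</html>
```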
Step 2: Using XSLT to Process XHTML
Now here is the clever stuff: applying templates to your XHTML to
produce XHTML output which is your original page plus style elements and any
modifications and conversions.
Here is about the simplest possible template for doing this:
<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template name="myxsl" match="/">
    <html>
      <head>
        <xsl:apply-templates select="html/head/*"/>
        <script language="javascript"
                type="text/javascript" src="myscript.js"></script>
      </head>
      <body>
        <xsl:apply-templates select="html/body/*"/>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
- The first two lines declare this as a stylesheet.
- The first <xsl:template> section matches the root of the input
document and starts outputting an html document structure. In the
middle it calls <xsl:apply-templates> which searches for templates
which match the items in the relevant section of the source document.
[The name attribute myxsl is used in the next step]
- The second <xsl:template> section matches anything and everything
and copies it to the output document.
Given a valid input document, all this stylesheet should do is add a script
reference to the head of the original document:
<script language="javascript"
type="text/javascript" src="myscript.js"></script>
Here are some examples of how you could - and we have - built on this
principle:
- Add standard menus and page structure to the page
- Build page menus or table of contents by analysing the source page
- Change or omit tags in the source page: to do this you would add
another <xsl:template> section for each tag you wanted to
change: this would be used in preference to the match-everything
template so you can customise the output for that tag. You can use
this technique to modify the source document for compatibility with
different clients - or to do whatever you like!
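A minimal sketch of the last point, assuming you want to change one tag and drop another. The choice of <b>, <strong> and <font> is purely illustrative and not taken from our own stylesheets. A template that names a specific element has a higher default priority than the match-everything template, so it is chosen for that element.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Match-everything template: copies any node it is applied to -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Change a tag: every <b> in the source becomes <strong>,
       keeping its attributes and content (illustrative choice) -->
  <xsl:template match="b">
    <strong>
      <xsl:apply-templates select="@* | node()"/>
    </strong>
  </xsl:template>

  <!-- Omit a tag: drop the <font> wrapper and its attributes,
       but still process whatever it contains (illustrative choice) -->
  <xsl:template match="font">
    <xsl:apply-templates select="node()"/>
  </xsl:template>

</xsl:stylesheet>
```

The two extra templates can equally be added to the stylesheet from this step; they are shown in a stylesheet of their own only to keep the example complete.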
Step 3: Different Pages for Different Browsers
There are several possible ways of doing this:
- Generate different static versions of your site and redirect your users to
the most appropriate one.
- Use a server-side script (eg an ASP page) to detect the browser type and
generate output using the most appropriate stylesheet.
- Use a filter component that intercepts requests for .xml files.
On this site we've used the third option, using the Microsoft XSL
ISAPI Filter which caches stylesheets in compiled form and implements its
own error handling mechanism.
Example: including IE4/5 specific code.
Here's an example MS XSL ISAPI Filter configuration file which calls one stylesheet for Internet
Explorer 4 or 5 and a different stylesheet for other browsers:
<server-styles-config>
  <!-- for IE 4 based browsers -->
  <device browser="IE" version="4.0">
    <stylesheet href="myxsl.xsl"/>
  </device>
  <!-- for IE 5 based browsers -->
  <device browser="IE" version="5.0">
    <stylesheet href="myxsl.xsl"/>
  </device>
  <!-- for non IE4/5 browsers -->
  <device>
    <stylesheet href="nomsdhtml.xsl"/>
  </device>
</server-styles-config>
Here's how you might use this with the stylesheet we wrote in Step 2,
which we shall call myxsl.xsl. We will omit the script unless the client is Internet Explorer 4 or 5:
- Add this parameter immediately after the <xsl:template name="myxsl" match="/"> line to indicate whether we should include the script [we've
given a default of yes here]:
<xsl:param name="msdhtml" select="'yes'" />
- Wrap the script element in an <xsl:if> so that it is output only if the parameter is yes:
<xsl:if test="$msdhtml='yes'">
  <script language="javascript" type="text/javascript"
          src="myscript.js"></script>
</xsl:if>
- Create the stylesheet nomsdhtml.xsl for other browsers (this imports the
first stylesheet and calls it by name, passing 'no' as the parameter; note the
inner single quotes, which make the value a string rather than a path):
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:import href="myxsl.xsl" />
  <xsl:template match="/">
    <xsl:call-template name="myxsl">
      <xsl:with-param name="msdhtml" select="'no'" />
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
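For reference, this is roughly how myxsl.xsl looks once the first two changes have been applied to the Step 2 stylesheet. It is a sketch assembled from the fragments above rather than a copy of our production file.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:template name="myxsl" match="/">
    <!-- 'yes' unless a caller passes another value with xsl:with-param -->
    <xsl:param name="msdhtml" select="'yes'"/>
    <html>
      <head>
        <xsl:apply-templates select="html/head/*"/>
        <!-- the script reference is only output when msdhtml is 'yes' -->
        <xsl:if test="$msdhtml='yes'">
          <script language="javascript" type="text/javascript"
                  src="myscript.js"></script>
        </xsl:if>
      </head>
      <body>
        <xsl:apply-templates select="html/body/*"/>
      </body>
    </html>
  </xsl:template>

  <!-- Match-everything template: copies the rest of the page unchanged -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

When this stylesheet is used directly, the parameter takes its default and the script is included. When it is called by name from nomsdhtml.xsl, the value passed in replaces the default and the script is left out.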
Appendix: Recommended Reading and Downloads
Microsoft XML Tools:
World Wide Web Consortium Tools and Info:
eWay example files:
- to see the source XHTML file for an eWay page, eg using Internet Explorer
5
- change the 'www' and 'html' in the address bar to read 'xml', ie go to http://xml.eway.co.uk/sitetech.xml
to see the xml for this page
- return to /sitetech.html
to view this page properly
- (The address /sitetech.xml gives an alternative rendering of this page using the Microsoft XSL ISAPI filter. We stopped using this mainly because the extension .xml was being bypassed by search engines. Also, a page that is browser-independent on the client side rather than the server side is superior: it can be cached by proxy servers and search engines, sent by email, and still be viewed correctly without having to revisit the server to get the right version for a particular browser.)
You will see that the source page contains none of the style/page layout, and no menus (neither the main menu at the top nor the bookmark menu in the page).
- more examples soon!