If you know the term Common Look and Feel (CLF) 2.0, then chances are you know that the deadline (December 2008) is fast approaching. Across many Government of Canada sites, progress is being made to convert content to meet the new standards. Being fully compliant means far more than just converting HTML pages; it also means converting legacy PDFs and updating applications and dynamic content. For the past year and a half, I’ve been helping a group of webmasters convert static pages from CLF 1 to CLF 2.
One of the breakthroughs that helped us speed up conversion from CLF 1 to CLF 2 was the use of Dreamweaver and regular expressions (regex). Essentially, this is just a fancy ‘search and replace’ that can run through a whole bunch of pages at once. It doesn’t do everything automatically, but it does take care of the repetitive parts.
The way it works is this: you open an old web page that needs to be converted and identify the parts you’d like to copy, for instance the page title, the metatags, the body of the page, the date modified and so on. Then you write a regex that splits the page into these component parts. Next, you create a blank CLF 2 template page with variables in the spots where the old content should go. With a click, the search and replace runs through a section of the site and voilà! You have new CLF 2 pages. The last step is to verify that everything worked: clean up old deprecated tags so the markup meets XHTML 1.0 Strict (this can be done with a click in Dreamweaver), validate, then publish.
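For readers who think better in code, here is a minimal sketch of that workflow in Python rather than Dreamweaver. Everything in it is an assumption made for illustration: the sample page, the stripped-down template (a real CLF 2 template is far larger), and the names OLD_PAGE, NEW_TEMPLATE, PAGE_PARTS and convert.

```python
import re

# Hypothetical old page; the comment markers mirror the ones used in this post.
OLD_PAGE = """<html lang="en-ca">
<head>
<title>Travel Advice</title>
</head>
<body>
<!-- Content Begins Here -->
<p>Check your passport before you leave.</p>
<!-- Content Ends Here -->
</body>
</html>"""

# Hypothetical, heavily simplified stand-in for a CLF 2 template.
# {title} and {body} play the role of the variables in the template page.
NEW_TEMPLATE = """<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<title>{title}</title>
</head>
<body>
<div class="center">
{body}
</div>
</body>
</html>"""

# Split the old page into the parts we want to keep.
# re.DOTALL lets "." match newlines, so a part can span several lines.
PAGE_PARTS = re.compile(
    r"<title>(?P<title>.*?)</title>"
    r".*?<!-- Content Begins Here -->(?P<body>.*?)<!-- Content Ends Here -->",
    re.DOTALL,
)


def convert(old_html: str) -> str:
    """Copy the title and body of an old page into the new template."""
    match = PAGE_PARTS.search(old_html)
    if match is None:
        # Pages with a different structure need their own pattern.
        raise ValueError("page does not follow the expected structure")
    return NEW_TEMPLATE.format(
        title=match.group("title").strip(),
        body=match.group("body").strip(),
    )


new_page = convert(OLD_PAGE)
print(new_page)
```

A page that doesn’t match raises an error instead of being converted badly, which is the scripted version of the "verify that everything worked" step.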
You don’t need Dreamweaver to run a search and replace using regular expressions. Other text and HTML editors, such as UltraEdit, will also do the job. It’s important to note, though, that not all regex engines are the same, so the syntax varies depending on which tool you’re using. For Dreamweaver, the code we use looks like the following. Keep in mind that this code varies from section to section of our site depending on how the old pages were structured.
(<%@)([^]*)(<title>)([^]*)(</title>)([^]*)(<html lang="en-ca">)([^]*)(name="dc.creator" content=")([^]*)(">\n<meta name="dc.title")([^]*)(<!-- Content Begins Here -->)([^]*)(<!-- Content Ends Here -->)([^]*)
Now, before you start thinking, “Whoa! This is way too complicated, I need to be a programmer to understand this!” No worries, it’s actually pretty simple. Each pair of parentheses () matches one part of your old HTML 4 web page. For instance, the match might be something very specific, such as a title tag, or it could be anything that falls between the previous match and the next one. That second kind is the ([^]*) group. Think of it as a wild card that says, “OK, I don’t care what’s there, but I do know that it starts at this spot and ends at this other spot.” Whatever each pair of parentheses matches is stored in memory and can then be pasted into the new template by using a dollar sign ($) followed by the group’s number. The regex above has sixteen groups, so the variable $4 will hold the text of the page’s title and nothing else.
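To make the group numbering concrete, here is a sketch of the same pattern ported to Python. The port is not a straight copy: Python rejects Dreamweaver’s [^]* wild card, so I have swapped in [\s\S]*? (a lazy "match anything, newlines included"), with a greedy version at the end to swallow the rest of the page. The sample page is contrived purely to follow the order of the pattern, and the names ANY, PATTERN, SAMPLE_PAGE and the replacement string are made up for this example.

```python
import re

# Dreamweaver's wild card is not valid in Python's engine: the "]" right
# after "[^" is read as a literal, so the character set is never closed.
try:
    re.compile(r"([^]*)")
    dreamweaver_wildcard_ok = True
except re.error:
    dreamweaver_wildcard_ok = False

# Stand-in wild card: any characters including newlines, as few as possible.
ANY = r"([\s\S]*?)"

# Same sixteen groups, in the same order, as the Dreamweaver pattern.
PATTERN = re.compile(
    r"(<%@)" + ANY
    + r"(<title>)" + ANY
    + r"(</title>)" + ANY
    + r'(<html lang="en-ca">)' + ANY
    + r'(name="dc.creator" content=")' + ANY
    + r'(">\n<meta name="dc.title")' + ANY
    + r"(<!-- Content Begins Here -->)" + ANY
    + r"(<!-- Content Ends Here -->)" + r"([\s\S]*)"  # greedy: rest of page
)

# Hypothetical page, laid out only to match the order of the pattern.
SAMPLE_PAGE = """<%@ Language="VBScript" %>
<title>Travel Advice</title>
<html lang="en-ca">
<meta name="dc.creator" content="Web Team">
<meta name="dc.title" content="Travel Advice">
<!-- Content Begins Here -->
<p>Check your passport before you leave.</p>
<!-- Content Ends Here -->
</html>
"""

match = PATTERN.search(SAMPLE_PAGE)
group_count = PATTERN.groups   # number of bracketed groups in the pattern
title = match.group(4)         # what Dreamweaver calls $4
creator = match.group(10)      # $10
body = match.group(14)         # $14

# Python writes $4 as \g<4> in the replacement text.
converted = PATTERN.sub(r"<h1>\g<4></h1>\g<14>", SAMPLE_PAGE)

print(group_count, repr(title), repr(creator))
print(converted)
```

The failed compile at the top is a small demonstration of why a pattern written for one tool usually needs adjusting before it will run in another.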
Trying to explain regex and how this all works in greater detail here would be pretty silly. If you have a sense of what I’m talking about and think this is something you’d like to try for converting your old government pages, give me a shout and we can discuss it further.