Future web: XHTML 2

Background

The current common recommendation is that websites be written in standards-compliant XHTML 1.1 and CSS 2 with optional JavaScript 1.5 support. What’s next for the web?

The XTech conference generated a lot of buzz this past summer in the web-techie community. There were presentations on XHTML 2 and the so-called HTML 5, which are both extrapolations of current popular standards, from groups which are (sort-of) competing for mindshare. There were also talks by technologists trying to transcend HTML entirely: both Microsoft’s XAML and Mozilla’s XUL are XML-and-script models for rapid GUI application development.

I hope to talk about all of these eventually, but right now I’m interested in XHTML 2. This article will delve into a few of the more interesting changes that are proposed in that draft. For readers from the future (hi cool future people!), here’s a permalink to the latest XHTML 2 docs, which is the same as the above as of this writing.

XHTML 2

The main goal of XHTML 1.0 and 1.1 was to be as HTML-like as possible while enjoying the improved parsability of XML, which is stricter than HTML.

In contrast, XHTML 2 is a small evolutionary step away from HTML. The goals are to reduce the amount of presentation information encoded in the data (instead pushing presentation details to CSS), reduce the amount of scripting needed to accomplish common tasks, improve accessibility, and improve hooks for semantic content (that is, metadata).

Sections

HTML included h1, h2, ... h6 tags to mark headers along with p, span and div tags to mark sections. Unfortunately, the h1 through h6 and p tags have strong presentation meanings (titles and paragraphs) and are therefore harder than necessary to use for general blocks of text.

XHTML 2 generalizes this with the section and h tags. The section tags can nest (unlike p tags) so you no longer need to specify a header level for the h tag — it’s implied by the depth of the section. For example:

<!-- HTML -->
<h1>Title</h1>
<p>Intro text...</p>
<h2>Chapter 1</h2>
<p> Chapter body </p>

<!-- XHTML 2 -->
<section>
  <h> Title </h>
  <p>Intro text...</p>
  <section>
    <h> Chapter 1 </h>
    <p> Chapter body </p>
  </section>
</section>

The gains from this slightly-more-verbose syntax include:

  1. The nested sections are much easier to parse for a non-human. It would be trivial to write a program to extract Chapter 1, for example.
  2. The header levels are now relative, so if you later demote Chapter 1 to a subsection, you don’t need to renumber the header tags.
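
To make the first point concrete, here is a minimal Python sketch that extracts a section by its heading and derives header levels purely from nesting. It parses the un-namespaced snippet above with the standard library's xml.etree.ElementTree; a real XHTML 2 document would carry a namespace, and the helper names (find_section, heading_levels) are made up for this illustration.

```python
import xml.etree.ElementTree as ET

DOC = """
<section>
  <h> Title </h>
  <p>Intro text...</p>
  <section>
    <h> Chapter 1 </h>
    <p> Chapter body </p>
  </section>
</section>
"""

def find_section(root, heading):
    """Return the first <section> whose own <h> child matches heading."""
    for section in root.iter("section"):
        h = section.find("h")
        if h is not None and (h.text or "").strip() == heading:
            return section
    return None

def heading_levels(section, depth=1):
    """Yield (depth, heading text) pairs; depth comes from nesting alone."""
    h = section.find("h")
    if h is not None:
        yield depth, (h.text or "").strip()
    for child in section.findall("section"):
        yield from heading_levels(child, depth + 1)

root = ET.fromstring(DOC)
chapter = find_section(root, "Chapter 1")
chapter_body = chapter.find("p").text.strip()
levels = list(heading_levels(root))
```

Moving the inner section somewhere else in the tree would change its computed depth without touching a single h tag, which is the second gain in the list.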

XHTML 2 also changes the p tag a little to allow nested lists (like ul and ol) and the like. This brings it closer to the human concept of a paragraph as a small block of related ideas than to the page-layout concept of a paragraph as text offset from other text by whitespace.

Code

XHTML 2 offers a new blockcode tag for presenting computer source code verbatim, instead of the generic pre tag offered by HTML. This is a minor change, but it can be a big plus for search engines looking for code examples. In conjunction with metadata markup, an author could indicate the language of the source code. For example (this is probably bad syntax):

<blockcode xmlns:prog="http://example.com/ProgrammingLanguages/"
                      property="prog:language" content="Perl5">
    sub hello {
        print "Hello, World!\n";
    }
</blockcode>

Semantic text

In HTML, the tt tag is commonly used for keyboard input, variable names, blocks of code and more. XHTML 2 adds or enhances a few tags to disambiguate those usages: kbd, var and code, respectively.

Anchors

XHTML 2 allows href attributes to be added to any tag instead of just to a tags. This allows you to link, say, a blockquote to the source of the quote, or to link an image without wrapping it in an a tag. In the case of the quote, it would be even better to use the new cite attribute. cite works just like href but makes it clearer that the specified URL is included as a reference rather than as a recommended navigation point.

XHTML 2 also pairs href with hreflang to let users know that the target may be in a different language. In the same vein, hreftype can specify the MIME type of the target, for example application/pdf. HTML 4 offers a type attribute for this on a and link tags, but it is rarely used, so authors often tell their users the target type in prose. For example: <a href="foo.pdf">Foo</a> (PDF).

Embedding

HTML coders are very familiar with alt attributes on images: text that is displayed if the image is unavailable or if the user has a text-only browser. XHTML 2 generalizes that idea via embedding. Consider this example from the XHTML 2 docs:

<p src="holiday.png" srctype="image/png">
    <span src="holiday.gif" srctype="image/gif">
        An image of us on holiday.
    </span>
</p>

This means:

  1. Load and display holiday.png
  2. If that fails, load and display holiday.gif
  3. If that fails, display the text “An image of us on holiday.”
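
The fallback chain above can be sketched in a few lines of Python. This is only a toy model of what a user agent might do, not the algorithm from the spec: the resolve function and its can_load callback are hypothetical names, and the rule "try the first nested element that has a src" is my own simplification.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<p src="holiday.png" srctype="image/png">
    <span src="holiday.gif" srctype="image/gif">
        An image of us on holiday.
    </span>
</p>
"""

def resolve(element, can_load):
    """Return ("resource", url) for the first loadable src, else ("text", fallback)."""
    src = element.get("src")
    if src is not None and can_load(src, element.get("srctype")):
        return ("resource", src)
    # The element's own src failed (or is absent): try a nested fallback.
    for child in element:
        if child.get("src") is not None:
            return resolve(child, can_load)
    # Nothing loadable is left: fall back to the whitespace-normalized text.
    text = " ".join("".join(element.itertext()).split())
    return ("text", text)

root = ET.fromstring(SNIPPET)
png_ok = resolve(root, lambda url, mime: True)
gif_only = resolve(root, lambda url, mime: mime == "image/gif")
nothing = resolve(root, lambda url, mime: False)
```

The three calls at the end correspond to the three numbered steps: everything loads, only GIFs load, and nothing loads.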

The src attribute can be applied to any tag. Consequently, the img tag is no longer special. This is like how a is no longer special since href attributes can be applied to any tag.

Media

HTML coders are familiar with the media attribute on link tags, which lets you, say, write a CSS file that applies only to media="print". Now, media can be applied to any tag. For example:

<span src="photo.jpg" media="screen">Me at work</span>
<span src="photo-hires.jpg" media="print">Me at work</span>

Personally, I think this one may be a step backward since it entangles presentation and data. But it’s undeniably clever. Doing this stuff inline is easier than in a separate CSS file that has to declare display: none or set a background image on some tag.

Metadata

XHTML 2 has copious support for metadata. If you’re interested, I recommend just reading the Metainformation Module in the XHTML 2 specification. I’ll just share a few clever examples:

<link media="print" title="The manual in PostScript"
  hreftype="application/postscript"
  rel="alternate" href="http://example.com/manual/postscript.ps"/>

<meta property="dc:creator">Chris Dolan</meta>
<meta property="dc:created" datatype="xsd:date">2005-10-28</meta>
<link rel="author" href="http://www.chrisdolan.net/" />

Note: the dc: prefix is a reference to the Dublin Core XML namespace.
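
Metadata like this is easy for programs to harvest. Here is a rough Python sketch, assuming the tags are wrapped in a plain, un-namespaced head element and that prefixes such as dc: are kept as opaque strings rather than resolved against a namespace. The collect_metadata function is a name I made up.

```python
import xml.etree.ElementTree as ET

HEAD = """
<head>
  <meta property="dc:creator">Chris Dolan</meta>
  <meta property="dc:created" datatype="xsd:date">2005-10-28</meta>
  <link rel="author" href="http://www.chrisdolan.net/" />
</head>
"""

def collect_metadata(head):
    """Map each meta property to its text and each link rel to its href."""
    properties = {
        meta.get("property"): (meta.text or "").strip()
        for meta in head.iter("meta")
    }
    links = {link.get("rel"): link.get("href") for link in head.iter("link")}
    return properties, links

properties, links = collect_metadata(ET.fromstring(HEAD))
```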

Accessibility

XHTML 2 introduces “roles” to assist with accessibility. You can, for example, tag your search box as follows to make it easier for a user to get to it (note the form below is not quite real XHTML 2 syntax):

<html>
  <head>
    <access key="s" title="Search" targetrole="search" />
  </head>
  <body>
    ...
    <form role="search" action="search.cgi">
      Search: <input type="text" name="q" />
    </form>
    ...
  </body>
</html>

There is a predefined, standard set of roles (like “search”) that any page can implement.
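
As a thought experiment, here is how a user agent might map an access key to the element carrying the matching role. It is a Python sketch built on my not-quite-real markup above, so treat both the markup and the target_for_key helper as assumptions rather than anything the spec defines.

```python
import xml.etree.ElementTree as ET

PAGE = """
<html>
  <head>
    <access key="s" title="Search" targetrole="search" />
  </head>
  <body>
    <form role="search" action="search.cgi">
      Search: <input type="text" name="q" />
    </form>
  </body>
</html>
"""

def target_for_key(root, key):
    """Return the first element whose role matches the access entry for key."""
    for access in root.iter("access"):
        if access.get("key") == key:
            role = access.get("targetrole")
            for element in root.iter():
                if element.get("role") == role:
                    return element
    return None

root = ET.fromstring(PAGE)
target = target_for_key(root, "s")
```

Because the lookup goes through the role rather than an id, the same key binding would work on any page that marks up its search form with role="search".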

Forms and Scripting

XHTML 2 uses XForms, which is intended to be an improvement on HTML forms. I haven’t delved into this that much, but a key point seems to be separation of model from interface. Also, the new system allows for some standard filters that can be implemented without JavaScript.

XHTML 2 also uses XML Events. This generalizes the JavaScript-specific event notation (think onclick="JS code...") to support any event in any language. You can optionally specify listeners to events separately from the tags that generate those events.

A major gain from the XML Events change is that the mutually-incompatible Netscape and IE event handling models are discarded in favor of a standard model. Yay!

The script tag is now called handler and is intended to support any language that the user agent can support. (Want to code your XHTML page in Perl or Python?)

An unrelated note on scripting: document.write doesn’t work. Instead, you have to add nodes via the XML DOM API. When it comes to limiting the problem of script injection and cross-site scripting attacks, this is a good thing.

PAR: Packaging for Perl applications

What is the best way to distribute a GUI application to users?

The three main choices are via an installer, via a standalone executable or via source. These choices vary a lot across platforms. Windows prefers installers, especially .msi files. Macs are quite happy with .app files, which are usually shipped on disk images. Most Linux variants use installers (.deb and .rpm) but some prefer source (e.g. Gentoo).

What if that application is written in Perl?

Perl is not typically considered a GUI language, but it does have bindings for GUI toolkits including Tk, wxWindows, Qt and GTK. Perl can be useful in the GUI realm as a rapid-development foundation or simply to add a couple of dialogs to a mostly-background process. A major barrier to entry, however, is that most platforms do not bundle these GUI toolkits with Perl and some platforms do not bundle Perl at all. Perl itself is most often distributed via installers, but the add-on modules that usually accompany any sophisticated Perl project are typically delivered as source. This poses a problem for most Windows users and many Mac users for whom this is too low-level a task to be tolerated. Only in the sysadmin-rich world of Linux and other Unixes are sudo cpan install Foo commands routinely tolerated.

The PAR project attempts to solve the problem of bundling the myriad files that usually compose a Perl application into a manageable monolith. The initial effort was modelled closely on the JAR concept that has proven to be a success in the Java community. As such, PAR files are simply ZIP files with manifests. If you have PAR installed on your computer, you can write Perl code that looks like this:

#!perl -w
use PAR 'foo.par';
use Foo;
...

and if Foo.pm is enclosed inside the foo.par file, it will be compiled from that source. Even more interesting, you can say:

#!perl -w
use PAR 'http://www.example.com/foo.par';
use Foo;
...

which will cause the foo.par archive to be downloaded and cached locally.
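
Since a PAR file is just a ZIP with a manifest, any ZIP library can show the general shape. The Python sketch below builds a tiny archive in memory with a MANIFEST and a lib/Foo.pm entry and reads it back. It is only meant to illustrate the container format: the build_archive helper is hypothetical, the manifest is a bare list of paths, and I make no claim that PAR would accept this exact archive.

```python
import io
import zipfile

def build_archive(files):
    """Pack {path: source} into an in-memory ZIP with a MANIFEST listing."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The manifest simply lists every path in the archive, itself included.
        manifest = "\n".join(["MANIFEST", *sorted(files)]) + "\n"
        archive.writestr("MANIFEST", manifest)
        for path, source in files.items():
            archive.writestr(path, source)
    return buffer.getvalue()

data = build_archive({"lib/Foo.pm": "package Foo;\nsub hello { return 'hi' }\n1;\n"})

with zipfile.ZipFile(io.BytesIO(data)) as archive:
    names = archive.namelist()
    manifest = archive.read("MANIFEST").decode()
    foo_source = archive.read("lib/Foo.pm").decode()
```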

You may have noticed the sticky phrase above “If you have PAR installed…” That is a catch-22 of sorts. PAR helps users to skip the software installation steps, but first they have to … wait for it … install software.

To get around this, PAR takes another page from the ZIP playbook: self-extracting executables. The PAR distribution comes with a program called pp that allows a developer to wrap the core of Perl and any additional project-specific Perl modules into a PAR file with a main.pl and a .exe header to bootstrap the whole thing. What this gets you (on Windows in this example) is something like a Perl.exe with all of its modules embedded inside.

Here’s a simple example. Consider your basic Hello World application:

---- hello.pl ----
#!perl -w
use strict;
use Tk;
my $mw = MainWindow->new;
$mw->Label(-text => 'Hello, world!')->pack;
$mw->Button(-text => 'Quit', -command => sub { exit })->pack;
MainLoop;

On a Mac, you have to have Tk installed (perhaps via fink install tk-pm586 if you’re on Tiger) and X11 running (perhaps via open /Applications/Utilities/X11.app). When you do so and run perl hello.pl you get something like this:

[Screenshot: helloworld.pl]

Now, say you want to give this cool new application to other Mac users. Telling them to first install Fink, Tk and X11 just for “Hello, World!” is ludicrous. Instead, you can build an executable like so:

/sw/bin/pp -o hello hello.pl

That creates a 3 MB executable called hello that includes the entire Perl and Tk. Send it to a friend who has a Mac (and X11, since we used a version of Tk that isn’t Aqua-friendly) and they can run it. If I were to make a Windows version of this it would be even easier on end users — on Windows, Tk binds directly to the native GUI so even the X11 prerequisite is not required.

Another benefit is version independence. The executable above is built against Perl 5.8.6 on Mac OS X 10.4. It should also work well on 10.3 or 10.2, even though those OSes shipped with older versions of Perl, because every part of 5.8.6 that was needed for Hello World is included in the EXE.

If you download that executable, you can open it with any Zip tool. For example:

% zipinfo hello
Archive:  hello   3013468 bytes   689 files
drwxr-xr-x  2.0 unx        0 b- stor 23-Oct-05 14:21 lib/
drwxr-xr-x  2.0 unx        0 b- stor 23-Oct-05 14:21 script/
-rw-r--r--  2.0 unx    20016 b- defN 23-Oct-05 14:21 MANIFEST
-rw-r--r--  2.0 unx      210 b- defN 23-Oct-05 14:21 META.yml
-rw-r--r--  2.0 unx     4971 b- defN 23-Oct-05 14:21 lib/AutoLoader.pm
-rw-r--r--  2.0 unx     4145 b- defN 23-Oct-05 14:21 lib/Carp.pm
... [snipped 679 lines] ...
-rw-r--r--  2.0 unx    12966 b- defN 23-Oct-05 14:21 lib/warnings.pm
-rw-r--r--  2.0 unx      787 b- defN 23-Oct-05 14:21 lib/warnings/register.pm
-rw-r--r--  2.0 unx      186 t- defN 23-May-05 22:22 script/hello.pl
-rw-r--r--  2.0 unx      262 b- defN 23-Oct-05 14:21 script/main.pl
689 files, 2742583 bytes uncompressed, 1078413 bytes compressed:  60.7%

(Note: you may see that the file sizes don’t match. That’s because the EXE also contains the whole Perl interpreter outside of the ZIP portion. That adds an extra 200% to file size in this case.)
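
The reason a ZIP tool can open an executable at all is that the ZIP directory lives at the end of the file, so native code can sit in front of it. The Python sketch below mimics that layout with a block of filler bytes standing in for the bootstrap code. The stub size and file contents are invented for illustration and have nothing to do with what pp actually emits.

```python
import io
import zipfile

def make_zip(files):
    """Return the bytes of a ZIP archive holding {path: content}."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, content in files.items():
            archive.writestr(path, content)
    return buffer.getvalue()

# Filler bytes standing in for the native bootstrap code (an assumption).
stub = b"\x00" * 4096
payload = make_zip({"script/hello.pl": "print 'Hello, world!';\n"})
executable = stub + payload

# zipfile locates the directory from the end of the file, so the
# prepended stub does not prevent it from reading the archive.
with zipfile.ZipFile(io.BytesIO(executable)) as archive:
    names = archive.namelist()
    script = archive.read("script/hello.pl").decode()

# Bytes in the file that the ZIP listing does not account for.
overhead = len(executable) - len(payload)
```

The overhead value is the same kind of mismatch as in the note above: bytes that are in the file but not in the ZIP portion.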

Is it fast? No, the file needs to be unzipped prior to use (which happens automatically, of course). Is it compact? No, 3 MB for Hello World is almost silly. But is it convenient? Yes. And that is often the most important quality when shipping software to users.

An interesting consequence of this distribution model is that the executable contains all of the source code. For some companies this may represent a problem (with some possible solutions listed at par.perl.org) but it is also a benefit in that you may satisfy any GPL requirements without having to offer a separate source download.


An important note for Windows is that, thanks to ActiveState.com, you do not need a C compiler to build Perl yourself. They provide an installable package which includes Tk pre-built. See links on par.perl.org for pre-compiled installers for PAR.