Writing the Web: di Iorio and Vitali: JoDI

Abstract

Ted Nelson's Xanadu remains an influential example of the way a world wide hypertext system should have been, allowing free access to hypertext pages for content customization and editing. This is still impossible or unacceptably difficult on the World Wide Web. Yet, the Web cannot be replaced, given the amount of data and tools that rely on its basic protocols and languages. The vision presented here is of an evolution of the Web where, within the current framework of technologies and tools, every Web page can be edited and customized, links can be created, and collaboration can be set up. In a way, this is a vision of Xanadu coming to life again, but within the framework of Web technologies, styles and tools. It is a vision of the best possible approach to a fully writable, distributed hypertext system within the limitations of real-life protocols. This writable web, already partially available with blogs and wikis, is enhanced through the implementation of xanalogical storage to take care of individual changes to documents, and mechanisms for transclusions. IsaWiki, a client-server system being developed at the University of Bologna, is presented and shown to adhere to this vision of the writable web, and as being a first step in that direction.

1 Introduction

Hypertext pioneer Ted Nelson has always been rightly praised for the appeal and power of his visions, particularly for the universal, all-encompassing, all-permeating medium he called hypertext, and for the intellectual and commercial possibilities that the hypertext system he was devising, Xanadu (Nelson 1987), would bring forth.

Although hypertext never quite emerged in the form envisioned by Nelson, the World Wide Web came into being as a simplification of Xanadu in terms of functionalities, but not of scale (Bieber et al. 1997). The completely distributed architecture, the simple and general transmission protocol, and the wide availability of free servers have undoubtedly helped in effectively spreading the Web to a planetary dimension.

The success enjoyed by the Web in the past years clearly prevents, in our mind, a completely different hypertext system from taking over, regardless of the many drawbacks and limitations that the Web shows: too much content is already on the Web, too many tools exist for creating, updating, publishing, indexing, searching, integrating and processing Web content. To think that a new alternative system could completely replace the Web to everyone's satisfaction is impossible. Rather, the Web allows incremental improvements by correctly exploiting the current Web architecture, protocols and tools (Vitali 1998). Server-based functionalities, client-based scripting and browser enhancements are now possible and even easy to implement, and can be deployed to a worldwide audience leveraging the tools that we are using already.

Conklin's (1987) seminal paper identified three different types of hypertext systems: idea-collectors, browsing systems and publishing/editing macro-systems. Xanadu, the only macro system mentioned by Conklin but which was not implemented at the time, will be discussed in depth later in this paper. In terms of the other types of systems, Conklin first proposed the major distinction we can see in hypertext systems: the distinction between hypermedia as a tool for the intellectual worker to organize, edit and play with all the documents and data encountered in daily activities, or hypermedia as a tool to access and browse, in a non-linear fashion, documents and data that have been carefully crafted and organized by others.

Although at the very beginning Tim Berners-Lee's creation, the Web client-server system on the NeXT computer, had a browsing and authoring tool as its client, today's Web is clearly just a browsing system (Cailliau and Ashman 1999). Data and information created and organized by the intellectual workers on a computer are created with all kinds of tools, but are not on the Web or integrated with the Web without a special effort. Creating new content for the Web is an autonomous task, and a very special task in itself, requiring special tools, special competences and special network setup and connections. Frequently, it is a task for professionals.

The main thesis of this paper is that it is desirable to make the creation of Web content an integral and natural part of the daily chores of an intellectual worker, integrated with the normal production and management of data and information, making the Web not just a publishing medium but fundamentally a collector and organizer of personal data and documents. In Conklin's terms, we wish to turn the Web into an idea collector and, by doing so, to actually turn it into a publishing/editing macro system, a Xanadu with a different name.

In fact, we strongly believe that the Next Big Thing in hypertext research is the complete integration of reading and editing functionalities in hypertext data, allowing readers to customize, comment and share modifications to documents and texts regardless of their ownership and storage. We believe that the Web is the only reasonable architecture where this can happen (for the simple reason that most of the interesting data is on the Web and not on other types of hypertext systems), and that providing easy creation, uploading and access is just one of the steps towards that goal.

In fact there are several steps, in our mind, if the Web is to evolve to what we wish for. Briefly, we could summarize these as follows:

editing tools integrated with the browsing environment
publication on the Web integrated with local desktop data management

Nothing unheard-of so far. But also, and most importantly:

editing tools for Web content integrated with the standard desktop tools
customization tools for existing Web content, and mechanisms for publishing and sharing customized Web content.

This last item is crucial both for the subtle points in implementing it, and for the number of features and services it guarantees. It is our contention that these features, plus a working and integrated payment system for published content, are at the heart of the Xanadu system, regardless of the implementation choices that were foreseen by Nelson.

This plan, in our mind, does not simply get us to the Xanadu dream drawn out by Ted Nelson in the mid-1960s. Rather, it updates his dream to take into account the enormous success and availability of Web data, and to exploit a number of advanced technologies that are now widely available to the majority of computer users, i.e. all those that regularly employ latest generation browsers.

In the following we discuss each item in detail, providing a few hints on how we believe they should be integrated into the current Web architecture. The last part of the paper is dedicated to IsaWiki, a tool we are developing at the University of Bologna, taking these ideas to implementation.

2 Easy editing and publishing on the Web

Creating content for the web is not easy: authors need to master a number of different technologies (such as HTML, CSS, Javascript, HTTP, server-side scripting languages such as PHP or Perl or ASP, and more recently XHTML, XML, XSLT, etc.). Authors also need access to a Web server, and to show some competence with Internet protocols (such as FTP or WebDAV). Furthermore, the expected quality of readable Web sites has increased enormously, so it is now expected that Web sites have carefully planned pages, orientation, navigation aids, layout, graphics, services, advertisement and server-side applications.

In short, creating good Web pages is in most cases a job for experts spending a large amount of energy and dedication on these tasks. Far from being a natural extension of everyone's daily activities on ideas and documents, publishing Web content is a complex and difficult task usually performed by professionals specifically hired for this.

Tools exist to ease this work, from HTML editors to graphic editors to fully-fledged content management systems (CMSs). HTML and graphic editors are often tools for professionals, and require competence with the tools themselves and the languages they rely on to get even half-decent results. CMSs (Browning et al. 2000) are appropriate for large-scale publishing tasks, and require rather advanced expertise for installation and set up. They might mitigate the technical requirements for producing content, but are meant for producing regular and professional content, and do not usually ease any other job but authoring. Many desktop applications allow direct conversion of their documents into HTML pages (e.g. Microsoft Word), but their graphic quality and overall presentation control are so poor and unpredictable as to make them residual tools for emergency publishing.

The W3C project called Amaya, a direct descendant of the first Web client, is a Web browser/editor that allows documents to be created and updated directly on the Web while browsing. Unfortunately the interface is complex enough to make it little more than a showcase for the W3C technologies, rather than a serious competitor in the browser arena. Furthermore, it does not solve a fundamental issue: editing pages for which we do not have author access rights.

The most important effort towards a writable Web is certainly WebDAV (Goland et al. 1999), an HTTP extension proposed by IETF that allows users to edit and manage files collaboratively on remote Web servers. WebDAV adds methods and headers to HTTP to move, copy, modify and delete remote files in a network file-system, and provides resource locking and versioning (Clemm et al. 2002) but requires software designed (or extended) to support this new protocol explicitly.

There are two additional approaches to a writable Web that are worth mentioning: blogs and wikis. These come very close to offering full write access to the Web, providing really usable writing spaces that are immediately available on the Web and do not rely, in most cases, on additional protocols to HTTP and require no tool but the browser itself.

Weblogs (or Blogs, Blood 2000) are tools for fast editing and publishing of personal diaries, targeted towards individuals and small communities. They are also among the favorite tools for journalists and consultants, providing them with a bidirectional channel with their readers to provide services, receive feedback, create discussion forums, and in general build reader communities. Editing is mostly based on Web forms, and in some cases on WYSIWYG editors written in Javascript and DHTML, available on the Web pages themselves. Blogs have a fairly limited variety of document types, the systems being targeted towards smallish and frequent notes displayed together and sorted by date. Nonetheless, they are simple and fun to use and read.

Wikis (Cunningham and Leuf 2001) take the writable Web one step further, in that they pose themselves as collaborative tools for shared writing and browsing on the Web, allowing every reader to access and edit any page of the site, through simple Web forms and an intuitive text-based syntax for typographical effects. Wikis come rather close to Nelson's initial idea of a global publishing medium open to customization and individual contributions: wikis are characterized by simple interfaces, raw layouts and an open editing philosophy that encourages participation and (to some extent) absorbs malicious exploitation of their soft-security mechanism. An ill-intentioned anonymous reader can in fact modify or even delete previous contents, but thanks to the internal revision tracking and differencing mechanisms, any damage they may cause is not permanent. Wikis, like Xanadu, store every version of a document and allow readers to browse history, retrieve old versions and display differences between them, so in successful wikis an army of self-appointed "wiki gnomes" arises and takes it upon itself to rebuild the damaged documents, making any destruction look futile and short-lived.
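The revision tracking and differencing just described can be illustrated with a minimal sketch. It assumes a hypothetical in-memory WikiPage class (not the data structure of any particular wiki engine) that keeps every saved revision, and it uses Python's standard difflib module to display differences. Reverting is modelled as saving an old revision again, so that the vandalized version also remains in the history.

```python
import difflib
from dataclasses import dataclass, field


@dataclass
class WikiPage:
    """Hypothetical wiki page that keeps every saved revision."""
    revisions: list = field(default_factory=list)

    def save(self, text: str) -> int:
        """Store a new revision and return its index."""
        self.revisions.append(text)
        return len(self.revisions) - 1

    def current(self) -> str:
        """Return the newest revision."""
        return self.revisions[-1]

    def diff(self, old: int, new: int) -> list:
        """Line-based differences between two stored revisions."""
        return list(difflib.unified_diff(
            self.revisions[old].splitlines(),
            self.revisions[new].splitlines(),
            fromfile=f"rev{old}", tofile=f"rev{new}", lineterm=""))

    def revert(self, index: int) -> int:
        """Restore an old revision by saving it again as the newest one."""
        return self.save(self.revisions[index])


page = WikiPage()
good = page.save("Xanadu is a hypertext system.\nIt was devised by Ted Nelson.")
bad = page.save("")            # an ill-intentioned reader blanks the page
restored = page.revert(good)   # a "wiki gnome" repairs the damage
```

Because nothing is ever overwritten, the repair costs a single call, while the deleted text remains visible in page.diff(good, bad).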

Blogs and wikis are limited in that the overall aesthetics of the created pages is inherently meager: there is no mistaking a wiki for a professionally created and handcrafted Web site; blogs may have a slightly more sophisticated layout and graphic presentation, but still do not come close to the complexity and graphical perfection of traditional Web sites. In many ways this is not really a problem, because these tools are meant to allow the Web to be writable by non-professionals and without specialized tools, and this they do fairly well. On the other hand, wikis and blogs are not a mechanism to create all Web sites, but a new genre of sites, clearly segregated from others.

As we have seen, mechanisms for writing the Web exist and are becoming more widespread. Technical solutions for editing are varied and creative: wikis have text entered in a plain HTML input form, where users write using a rather intuitive text-based syntax, and send the new document to the server-side application with a plain HTTP POST method through the push of a button. Blogs also propose plain HTML forms for editing in old and non-standard browsers (accepting a simplified version of HTML), but in some cases implement a sophisticated WYSIWYG editing tool in Javascript when edited within last-generation browsers.

3 Integrating desktop tools

Having easy tools to use for writing the Web is not enough. If we still consider writing for the Web to be different to writing documents for printing, or note-taking, or creating presentations, then no real integration of the Web in our daily activities can be obtained.

We have complex software installed on our computer that we use for our daily work: word processors, presentation editors, spreadsheets, graphic editors, email systems, etc. We have become accustomed to these tools, and we regularly use the sophisticated functionality implemented by these tools. No HTML editor, no browser-based WYSIWYG text editor, can match these features (for instance spell-checking, drawing tables, customizability, footnotes, etc.). Furthermore, we have already produced hundreds of megabytes of documents with these tools, which we keep accessing and modifying and filling in and updating: standard letters, document templates, frequent presentations, etc. Making these documents accessible on the Web means being able to share them, to create collaborations, to gain access to them even outside the office.

Of course, it is easy to put a word processing file on a Web server, for people to download; yet, this document would not be integrated with the Web, but just a downloadable black box that still requires local tools to access the content and edit it. In this context the Web is little more than a dignified FTP. At the same time, it would be easy to convert the document and place the converted file on the Web as an HTML document; yet, the HTML version would be completely segregated from the original copy of the document, and it would need to be updated manually every time the original is modified. Furthermore, there is an established expectation of sophisticated presentation and uniformity in graphs and layout among all the pages of a Web site, which automatic conversion would not be able to guarantee.

Here is an apparent paradox: the document cannot be placed on the Web in its original format because it could only be downloaded and not browsed, and yet it cannot be converted to HTML because it would be ugly and irredeemably separated from the original copy. We propose two basic principles to overcome this paradox: strong and rigid separation of content and presentation, and the individuation of a generic data model amenable to automatic and sophisticated conversion from and to the original data format.

3.1 Separation of content and presentation

The enormous success of the Web introduced an unexpected ingredient in the preparation of a successful hypertext document: the presentation. In most early hypertext systems (including Xanadu, and even early Web pages) documents were basically scrollable white pages, as might be produced from a word processor. The mature Web saw graphic designers take over and change for good this aspect of Web pages. Drawing inspiration more from advertisements, leaflets and glossy magazines than from books and scientific journals, they introduced decoration, colors, typography and layout. Logos, decorative graphics, rulers, colored or patterned backgrounds, layout tables, graphic buttons and navigation bars are so much part of the current Web that they cannot be abandoned in sophisticated Web sites.

Unfortunately, early HTML was not meant for presentation, and was hardly adequate for the effects that designers were looking for. A whole array of smart and complicated HTML tricks arose and irredeemably polluted the source code of professional Web pages. The actual content had to be accurately massaged and coerced into the slots made available by the overly complex layout tables that are current Web pages. This task was often manual, relying on commercial tools or on deep and delicate immersion in the actual HTML source code.

Insight was gained from experiences drawn from the field of declarative markup languages (such as SGML and XML): content and presentation need to be clearly and rigidly divided. The actual document only needs to contain the hard textual content, and a transformation phase is implemented to create the final page, by inserting the content into a generic and graphic-intensive layout, which is then displayed to readers. The transition from hand-crafted Web pages to template-driven automatic page generation has been one of the major recent advances in the Web.

It is also an important lesson for our improved hypertext system: layout and presentation cannot be ignored, but they can be managed separately. Let the desktop tools generate plain, ugly, content-only documents; a separate transformation process will create pleasing and sophisticated Web pages through the automatic association of a page template. The transformation needs to be automatic and on-the-fly, just before final delivery of the document to the browser, so as to keep the original document unchanged.
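As a minimal sketch of this separation, the fragment below wraps content-only paragraphs in a page template at delivery time. The template, its two slots ($title and $body) and the render function are assumptions made for illustration; a real system would presumably use a richer template language such as XSLT.

```python
from html import escape
from string import Template

# Hypothetical site-wide page template; $title and $body are the only slots.
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head>"
    "<body><div class='banner'>My Site</div>"
    "<div class='content'>$body</div></body></html>"
)


def render(title: str, paragraphs: list) -> str:
    """Wrap content-only paragraphs in the presentation template."""
    body = "".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return PAGE_TEMPLATE.substitute(title=escape(title), body=body)


# The content-only document, as a desktop tool might produce it.
content = [
    "Plain text from a desktop tool.",
    "Ampersands & angle <brackets> are escaped.",
]
page = render("Report", content)
```

The content list is never modified: the decorated page is a derived artifact that can be regenerated whenever the template or the content changes.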

3.2 Generic data model

Declarative markup languages such as SGML and XML stress the advantages of providing a semantic label to each significant fragment of a document, rather than polluting the text with transient and media-specific formatting instructions. Specific formatting instructions can then be applied later, after precise identification of the final medium and presentation details.

What do a wiki document, an HTML page, an XML document, Microsoft Word or PowerPoint or Excel documents, have in common? A core set of simple elements (e.g. paragraphs and sections in Word, table cells in Excel, pages and list elements in PowerPoint) to which decorations and typographical properties are applied. Since, as discussed in the previous subsection (3.1), layout and presentation are taken care of separately, we can safely ignore the decoration and typographical properties of these documents, and just consider these simple elements.

It is easy to draw from this the concept of a generic data model, which captures the fundamentals of a data document, and ignores the irrelevancies such as the typographical properties. The advantage of the generic data model is that it is easy to extract the relevant parts of a document, and it is easy to rebuild the original format afterwards, by adding (more or less randomly) the typographical properties that were extracted before. Simplifications and loss of information are inevitable, but these should be concentrated on the transient aspects of the data format, which can be safely ignored for our purposes.

So the generic data format is composed of blocks (e.g. paragraphs) containing text and inline elements (e.g. bold and italics), of tables containing individual cells, of collections of records (e.g. a drawing composed of many simple graphical elements in a vector-based graphic language), and little else. Each of these elements needs to be labeled so as to create a descriptive markup language, and may have special formatting instructions attached, to cater for special formatting needs that have to survive into the final page.

Figure 1. Selecting the style caption in MS Word

For instance, according to this generic data format, the caption accompanying Figure 1 is a paragraph with style 'caption' in MS Word, the fragment <p class='caption'>Figure 1. Selecting the style <span style='font-style: italic'>caption</span> in MS Word</p> in HTML, and the fragment <caption>Figure 1. Selecting the style <italic>caption</italic> in MS Word</caption> in XML. They are all equivalent, and can be easily transformed back and forth from one format to the others.
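The equivalence can be sketched in a few lines of code. The Block and Inline classes below are hypothetical names for two of the elements of the generic data format (a labeled block containing inline runs); the two serializers reproduce the HTML and XML fragments quoted above. Conversion from and to MS Word is omitted, since it would depend on the Word file format.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Inline:
    text: str
    style: str = ""   # "" for plain text, or an inline label such as "italic"


@dataclass
class Block:
    label: str        # semantic label of the block, e.g. "caption"
    content: list     # list of Inline runs


def to_html(block: Block) -> str:
    """Serialize a block as an HTML paragraph with a class attribute."""
    parts = []
    for run in block.content:
        text = escape(run.text)
        if run.style == "italic":
            text = f"<span style='font-style: italic'>{text}</span>"
        parts.append(text)
    return f"<p class='{block.label}'>{''.join(parts)}</p>"


def to_xml(block: Block) -> str:
    """Serialize a block as descriptive XML, using labels as element names."""
    parts = []
    for run in block.content:
        text = escape(run.text)
        parts.append(f"<{run.style}>{text}</{run.style}>" if run.style else text)
    return f"<{block.label}>{''.join(parts)}</{block.label}>"


caption = Block("caption", [
    Inline("Figure 1. Selecting the style "),
    Inline("caption", "italic"),
    Inline(" in MS Word"),
])
```

Both serializers read the same in-memory structure, which is what makes the back-and-forth transformation possible.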

We do not go deeper into the details of the generic data format here, but it is possible to obtain a generic data format preserving most of the important aspects of each data format with minimal sacrifices (e.g. arbitrary element nesting in XML).

4 Customizing Web content

The hard part in our proposal is the integration of the editing architecture described in the previous section with the existing Web, in particular allowing editor access to Web pages and sites we have no control over.

The basic idea discussed in this section is to let readers access just about any page on any Web site, edit it, customize it, and save the modified version as a document on the local Web server. This page is customarily served any time the user asks for the original URL.

We cannot pretend there are only technical issues in this statement: the general architecture of this customization has to be clarified (e.g. do I lose the modifications to the original page that happened after I created my local copy?), the user interface issues need to be specified (e.g. how do I ask for the original page, ignoring the local copy?), and, more importantly, the ethical and legal issues need to be considered as well: would it be legal and ethical to modify and customize my view of someone else's pages, possibly covered by copyright or containing sensitive material? Would it be legal to share my modifications of someone else's pages with my friends and readers?

Given the objections that have been raised on the topic of deep linking or Microsoft SmartTags (Ard and Musil 2001), two much less controversial issues without doubt, we can only expect that Web customization would be fought strenuously and bitterly by all copyright holders. Yet, we hope that the discipline of fair use could be employed here as well, clearly differentiating those who make correct use of Web customization mechanisms and those who do not. Of course, there are communities and situations where this feature would be acceptable and even desirable: scientific and technical communities, intranet Web resources, discussion groups, and so on. These would largely benefit from shared modifications, and could constitute the first example of correct usage of them.

Legal issues apart, we shall discuss now what exactly we mean by Web customization, and we shall provide a few proposals for the architecture of such a system. Since, as we have already mentioned, we are only interested in modifying the actual content of a page, and the average HTML page now contains much more than that in terms of layout, decoration, advertisements, etc., we also need a mechanism to identify the content and let the reader modify just that, possibly using the concept of generic data format previously described.

Web customization, in our view, is a server-based service for registered users to allow local editing of any Web page. By subscribing to the service and activating it, an interface gizmo would appear during normal navigation, providing two basic services: editing, and access to edited pages.

By selecting the edit command, a content editor would appear and allow the user to add, delete and modify the text content of the page. Ideally the system would be smart enough to allow in-place editing and to differentiate between the actual content and the presentational parts of the page. Every modification to the page is recorded, with author and time information. On saving the changes, the modifications would be sent to the server and made available to all subscribers or only to the author, according to user preferences.

Access to a modified page is checked for every page visited. Every time the user surfs to a page, the browser would check with the Web customization server. If a local copy is present the server would send it and it would be displayed in the browser instead of the original page. Multiple versions of the same page can be kept and displayed individually. If the original page has changed on the original server, it would constitute just another version of the document, displayed by default or accessible from the version menu according to the user's preferences.
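The lookup just described can be sketched as follows. CustomizationStore is a hypothetical in-memory stand-in for the Web customization server, and the rule that the most recent stored version wins is an assumption made here for brevity; the text leaves the default to user preferences.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    author: str
    timestamp: float
    content: str


@dataclass
class CustomizationStore:
    """Hypothetical in-memory stand-in for the customization server."""
    versions: dict = field(default_factory=dict)  # URL -> list of Version

    def save(self, url, author, timestamp, content):
        """Record a modified copy together with author and time information."""
        self.versions.setdefault(url, []).append(
            Version(author, timestamp, content))

    def resolve(self, url, original, prefer_original=False):
        """Return the content to display for `url`.

        The latest stored version replaces the original page, unless no
        version exists or the user prefers the original.
        """
        stored = self.versions.get(url, [])
        if not stored or prefer_original:
            return original
        return max(stored, key=lambda v: v.timestamp).content


store = CustomizationStore()
url = "http://example.org/page"   # placeholder address
store.save(url, "alice", 1.0, "first edit")
store.save(url, "alice", 2.0, "second edit")
```

Here store.resolve(url, "original text") yields the second edit, while passing prefer_original=True, or asking for a URL that was never customized, yields the original page.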

An analogous service would be made available for user-defined links and comments on Web pages: Rather than storing text and text modifications, it would store link information and pointers to places in the document, but the overall mechanism would otherwise be identical.

4.1 An architecture for Web customization

How is such a service, as just described, implemented? Certainly not as a new client application, server application, transmission protocol or markup language. The Web is now a mature environment, where thousands of different applications carry out a vast number of different tasks, each assuming correct behavior and a number of shared assumptions from the other applications it is interoperating with. Creating yet another standalone application or transmission protocol will not serve our purposes (Vitali 1998).

Another reason to believe that starting from scratch is not an adequate approach is that applications and protocols already exist that can be adapted to our purposes in a compatible manner and with limited effort. It would make little sense to invent when existing technologies can be adapted.

The features we are trying to describe here come closest to resembling Nelson's original vision for Xanadu. Since then work directly inspired by Nelson's idea has attempted to adapt such systems to the Web. We should start by mentioning the work of the original Xanadu developer team (with other contributions), which developed the original code and data structures, such as Udanax Green and Udanax Gold (Udanax.com), two open source projects that began in 1999. Nelson and others worked to enhance existing systems (firstly the Web), rather than to provide a completely new environment. Examples of this activity are

HTS (Hyper-Transaction System), a transpublishing system under development by Nelson at Keio University
Pam's (1997) fine-grained transclusions RFC, a proposal to apply transpublishing to HTML
OSMIC (Nelson 1996), a versioning system for managing document history and branching.

Although transpublishing was designed before the Web, Nelson (2001) proposed a solution to adapt this vision to current Web technologies, based on a particular file format called VLIT (Virtual Literary Format). VLIT is a set of specifications to allow transpublishing: a VLIT file is a sequence of references to spans of text contained in external resources. A specialized server parses a VLIT file and puts together the retrieved fragments to form the final document. Furthermore, VLIT provides powerful two-way linking mechanisms, allows inclusions to be expressed in different modes (and syntaxes), and makes it possible to change how a span is displayed once it is converted to HTML. Some applications have been developed (or extended) to support VLIT: for instance, VLit Chrome is a client-side tool for viewing, editing and creating VLIT files with Mozilla. On the server side, GNU Eprints, generic and highly configurable Web-based archive software developed by the University of Southampton, has been experimentally extended to interpret a VLIT file and generate the composite documents.
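The idea of a document as a sequence of span references can be sketched as follows. This is not the VLIT syntax: the Span record (source, start offset, length), the doc:// addresses and the fetch callback are all assumptions made for illustration, and remote resources are replaced by an in-memory dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    source: str   # identifier of the external resource
    start: int    # offset of the first character of the span
    length: int   # number of characters in the span


def assemble(spans, fetch):
    """Concatenate the referenced fragments.

    `fetch` maps a source identifier to the full text of that resource.
    """
    return "".join(
        fetch(s.source)[s.start:s.start + s.length] for s in spans)


# In-memory stand-in for remote documents (hypothetical addresses).
SOURCES = {
    "doc://a": "Everything is deeply intertwingled.",
    "doc://b": "The Web is a browsing system.",
}

# A virtual document: no text of its own, only references to spans.
virtual_doc = [Span("doc://b", 0, 8), Span("doc://a", 11, 24)]
text = assemble(virtual_doc, SOURCES.__getitem__)
```

The resulting text is "The Web is deeply intertwingled.", built entirely from fragments of the two sources, neither of which is modified or copied into the virtual document.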

We must also mention systems and protocols that allow personal intervention on external materials on the Web. These include annotation systems such as CritLink, which allows users to comment on every Web page, adding annotations (through forms) inserted on-the-fly into the original document through a non-transparent HTTP proxy.

Another, slightly different example is Hunter Gatherer (schraefel et al. 2002), a Web application that collects content fragments from within Web pages and manages the resulting information collections. Hunter Gatherer is integrated in the browser, providing a simple interface to select fragments from the accessed pages (smaller-than-page-sized information components), to store references to these components, and to manage these collections. No content is copied into the collections. The collections are only composed of references to the original fragments through a mechanism called Aggregated URLs, which are easy to share and retrieve.

The W3C has proposed a shared Web annotation system (and an underlying protocol to make it work), Annotea (Koivunen 2004). This exploits existing technologies such as RDF, XPointer, XLink and HTTP to allow users to attach their own comments or bookmarks to any Web document. Annotations are external to the original pages and can be stored in pre-established annotation servers. Annotea clients, such as Amaya or Annozilla, request the annotations (expressed in RDF syntax) from these servers and show them side-by-side with the original unmodified documents. It is worth mentioning that annotation servers have been proposed in many papers (see for instance Roscheisen et al. (1995)), and were used in a discontinued version of a major browser, version 2.6 of NCSA Mosaic.

The W3C standard XLink (DeRose et al. 2001) enhances links in Web pages by introducing, among other improvements, the possibility of expressing a link externally to the resources it connects. External linkbases can now be created that do not modify the content of the pages, yet can be shown within the pages during browsing. Our own XLink-Proxy (Ciancarini et al. 2002), based on XLink and a non-transparent HTTP proxy mechanism, allows every user to add links from and to every Web page, regardless of ownership and access rights.

Relying on an HTTP proxy for on-the-fly addition of content or links to an external Web page has been proposed several times: we could just mention CritLink (Yee 2002), several projects at the University of Southampton (e.g. DeRoure et al. (1996) and DeRoure et al. (1999)), our own Xlink-Proxy (Ciancarini et al. 2002), and Goate (Duncan and Ashman 2002). A big advantage of HTTP proxies is that they require absolutely no intervention on the client machine (except for the proxy configuration, of course), but there are disadvantages as well. First, interposing a proxy between browser and server will double all network connections, since the browser will have to ask the proxy, which will in turn ask the server, for every resource. This clearly slows down the response time and doubles the risk of timeouts. Second, the proxy will be consulted not only for HTML and XML documents, which is appropriate, but also for every other Web resource, such as images, animations, javascript and CSS files, not to mention binaries and downloads of all sorts. Finally, a proxy as an elective intermediary for content modification will not work in those situations where another proxy is being used (e.g. when the browser is behind a firewall), unless an explicit chain of proxies is set up and managed.

Some or most of these limitations can be overcome. For instance, smart proxy implementations exist that reduce proxy-caused latency in HTTP response-request pairs, and special configuration files (using the Navigator Proxy Auto-Config file format) can be specified to make the browser select different proxies (or no proxy at all) based on properties of the URL to be loaded. The problems associated with proxy chaining, on the other hand, need to be solved at the infrastructure level, by convincing the managers of either proxy to specify the other in the connection chain. This may pose some difficulties, especially when dealing with firewalls protecting large organizations or whole countries. All in all, using a proxy for on-the-fly addition of content or links appears to be difficult and somewhat prone to malfunctions or policy restrictions. A better solution, in our view, is required.
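The URL-based proxy selection mentioned above can be sketched as a simple decision function. Actual Proxy Auto-Config files are written in JavaScript; the Python fragment below only mirrors the logic, returning the two result strings that such files use ("PROXY host:port" and "DIRECT"). The proxy address and the list of document suffixes are assumptions.

```python
from urllib.parse import urlparse

# Hypothetical address of the content-modifying proxy.
CUSTOMIZATION_PROXY = "PROXY proxy.example.org:8080"

# Assumed heuristic: only paths that look like documents go through the proxy.
DOCUMENT_SUFFIXES = (".html", ".htm", ".xml", "/")


def choose_proxy(url: str) -> str:
    """Route documents through the proxy and fetch everything else directly."""
    path = urlparse(url).path or "/"
    if path.lower().endswith(DOCUMENT_SUFFIXES):
        return CUSTOMIZATION_PROXY
    return "DIRECT"
```

With this rule, choose_proxy("http://example.org/index.html") selects the proxy, while images, scripts and downloads such as "http://example.org/logo.png" bypass it. A suffix test is only a heuristic, since the media type of a resource is not known until the response arrives.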

Fortunately, both major browsers of the current generation (Mozilla and Internet Explorer) have a number of customization mechanisms that come in handy for this kind of functionality. For example, both make it quite easy to create interface gizmos (menus, toolbars, sidebars, etc.) that reside with the browser, do not interfere with normal navigation (as a frame, for instance, would do), and do not tax the transfer protocol (as would happen with a proxy). Just a few hundred lines of XUL (for Mozilla) or Visual Basic (for Explorer) are necessary to create a simple application that provides an interface in a sidebar and monitors the URL loaded in the main browser window. Whenever the URL changes, this process could start an autonomous HTTP connection to a content or link server, download the required data, and insert it in the actual document downloaded in parallel from its natural server. This architecture is completely transparent to firewalls, other proxies, and efficiency issues in the server of the original document.

4.2 Page analysis and content determination

As discussed previously, presentation (i.e. careful addition of graphics, decoration, typographical effects and layout to obtain a more pleasing display) is a consolidated aspect of the mature Web.

Presentation is a problem in our architecture not only because readers expect it when accessing a Web site, but because there is no easy way of telling content from presentation. Section 3.1 discussed the issue of automatically adding sophisticated presentation to raw content. Here the opposite case, of separating the presentation from the content, is considered.

The architecture we have described so far considers presentation to be a transient characteristic of a Web page. Of course, there are many Web pages where the presentation is much more important than the content, or where content is practically indistinguishable from the presentation. Graphic pages, animated sites, etc., place little stress on content, but there are a lot of Web pages where content is clearly relevant and independent of the particular display. In many cases, pages whose content is for all practical purposes identical may vary considerably in source code, because of site redesign, differences in decoration, or a change in the underlying software platform.

On the other hand, we believe the user wishing to edit and customize a Web page is mainly interested in its content, and would prefer to be spared most of the details of the presentation, except possibly for smaller decisions (such as requesting a few words to be put in bold or italics). The HTML code provides no help in separating content from presentation, so we have to resort to empirical mechanisms to identify the HTML code containing real content, and the HTML code providing decoration and layout.

Oddly enough, similar problems arise in completely different situations, for instance re-flowing applications to convert Web pages to fit small PDA screens. Algorithms have been developed for automatic analysis of parts of an HTML document. Rahman et al. (2001) proposed efficient algorithms to extract the content of Web pages and build a faithful reproduction of the original pages with the important content intact, but on a smaller scale. Kunze and Rosner (2001) proposed linguistic analysis; others have proposed structural analysis (Vijjappu et al. 2001) or even mixed techniques (Soderland 1997). Chen et al. (2003) examined the structure of the page, searching for high-level content blocks according to their location, dimension and shape. Gupta et al. (2003) sought a heuristic to determine the roles of each part of the document by computing, e.g., the ratio between words and anchors in a navigation link list. Yang and Zhang (2001) proposed semantic analysis of the structure of Web pages, and suggested classifying pages according to the number of typical layouts or layout details that are used consistently by page designers throughout the Web.
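To make the flavour of such heuristics concrete, the following is a minimal Python sketch of a link-to-text ratio test, loosely inspired by the word/anchor ratio mentioned above. It is not the algorithm of Gupta et al.: the class and function names and the 0.5 threshold are our own illustrative assumptions, and a real analyser would combine several clues rather than rely on one.

```python
from html.parser import HTMLParser


class LinkRatioParser(HTMLParser):
    """Count words that occur inside and outside <a> elements."""

    def __init__(self):
        super().__init__()
        self.anchor_depth = 0   # > 0 while we are inside an <a> element
        self.anchor_words = 0   # words that are part of link text
        self.total_words = 0    # all words in the fragment

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchor_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_depth > 0:
            self.anchor_depth -= 1

    def handle_data(self, data):
        count = len(data.split())
        self.total_words += count
        if self.anchor_depth > 0:
            self.anchor_words += count


def link_ratio(fragment):
    """Fraction of the words in an HTML fragment that are link text."""
    parser = LinkRatioParser()
    parser.feed(fragment)
    parser.close()
    if parser.total_words == 0:
        return 0.0
    return parser.anchor_words / parser.total_words


def looks_like_navigation(fragment, threshold=0.5):
    """Assumed rule of thumb: mostly-link fragments are navigation, not content."""
    return link_ratio(fragment) > threshold
```

Under these assumptions, a list of menu links scores close to 1 and would be treated as layout, while a paragraph containing an occasional link scores close to 0 and would be offered to the user for editing.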

Inevitably these techniques are imprecise, limited, and subject to continuous evolution in the style of Web page design. Nonetheless, there exist a number of reliable and persistent clues that help in determining the parts of an HTML source that are surely content and the parts that are surely presentation and layout. Once these parts are determined, it is possible to provide an editing environment for the content, and (within limits) to re-flow the modified content within the original presentation and layout, so as to make the modified page appear natural and consistent.

Recent browsers even allow users to edit content directly within the Web page, through properties of the page elements, such as contentEditable in Internet Explorer and designMode in Mozilla. This allows inline editing of the content within the actual page that hosts the original content, with the same styles and layouts. Other browsers cannot offer this functionality, so it is necessary to foresee alternative input mechanisms for those applications. In this case, a server-side mechanism has to extract the content of the page and place it within a form element in a fictitious HTML page, which is then displayed to the user.
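The server-side fallback can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function name, the /save action URL and the field names are hypothetical, and the extraction of the content from the original page is taken as already done.

```python
from html import escape


def editing_form(content, source_url, action="/save"):
    """Wrap already-extracted page content in a plain HTML editing form.

    The action URL and the field names are placeholders for illustration.
    """
    return (
        "<html><body>\n"
        f'<form method="post" action="{escape(action, quote=True)}">\n'
        # Remember which page the edited content belongs to.
        f'<input type="hidden" name="source" value="{escape(source_url, quote=True)}">\n'
        # Escape the content so that markup in it is shown, not interpreted.
        f'<textarea name="content" rows="20" cols="80">{escape(content)}</textarea>\n'
        '<input type="submit" value="Save">\n'
        "</form>\n"
        "</body></html>"
    )
```

The resulting page works in any browser that supports forms, at the price of losing the original styles and layout while editing.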

4.3 Customization, versioning, and transclusions

So far we have discussed customization of Web content as an atomic operation that takes the whole content of the Web page as its input and produces the whole content of the modified page as its output.

In Nelson's Xanadu, change tracking (i.e. the ability to identify each individual modification to the previous version of a document) is a fundamental mechanism for two major functionalities described in the system: versioning and transclusions (Nelson 1997). Xanadu was to be built on the concept of xanalogical storage, a system to store Xanadu documents not as whole blocks on a file system, but as a list of references to fragments combined into the final document, on demand. Each fragment represents an individual change to a document, stored separately and individually, in order to allow reconstruction of the original, final and any intermediate state of the document throughout the history of its editing. Nothing is ever deleted, everything is added, and each version of a document is but a list of pointers to the data that was part of that specific version.
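The principle can be illustrated with a deliberately small Python sketch. It is not Xanadu's actual storage design: the class name, the in-memory lists and the integer identifiers are assumptions made for the example. What it does show is the two properties described above, namely that fragments are only ever appended and that a version is nothing but a list of pointers.

```python
class FragmentStore:
    """Append-only pool of text fragments; nothing is ever deleted."""

    def __init__(self):
        self._fragments = []   # every piece of text ever written
        self._versions = []    # each version is a list of fragment indexes

    def add_fragment(self, text):
        """Store a new fragment and return its permanent identifier."""
        self._fragments.append(text)
        return len(self._fragments) - 1

    def commit(self, fragment_ids):
        """Record a version as an ordered list of pointers to fragments."""
        self._versions.append(list(fragment_ids))
        return len(self._versions) - 1

    def render(self, version):
        """Assemble the text of a version on demand from its pointers."""
        return "".join(self._fragments[i] for i in self._versions[version])


store = FragmentStore()
a = store.add_fragment("The Web is ")
b = store.add_fragment("read-only.")
v0 = store.commit([a, b])        # original version
c = store.add_fragment("writable.")
v1 = store.commit([a, c])        # the edit reuses fragment a and adds c
```

After the edit, both store.render(v0) and store.render(v1) remain available, and the fragment shared by the two versions is stored only once.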

This mechanism has many advantages, mainly transclusions, or the reuse, in whole or in part, of content from another document. This differs from pure copy and paste in that the document stores only a reference to the external material. The software is expected to fetch the current content and place it inline with the main material, so that the included content is always current and updated with respect to its source document. In fact, even versioning in Xanadu is implemented using a form of transclusion, whereby each subsequent version of a document is a new document specifying the new content, plus a transclusion of each fragment of the previous version that has survived the editing.
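The difference from copy and paste can be shown with a short Python sketch. The data model is an assumption made for illustration (documents as lists of pieces, a Transclusion record naming a whole source document); transcluding only part of a document would additionally require a stable addressing scheme, which is omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transclusion:
    """A reference to another document, resolved only when rendering."""
    source: str  # identifier of the document to include


def render(doc_id, documents, _seen=()):
    """Assemble a document, fetching transcluded material at render time."""
    if doc_id in _seen:
        raise ValueError(f"circular transclusion through {doc_id!r}")
    parts = []
    for piece in documents[doc_id]:
        if isinstance(piece, Transclusion):
            # Fetch the *current* content of the source document.
            parts.append(render(piece.source, documents, _seen + (doc_id,)))
        else:
            parts.append(piece)
    return "".join(parts)


documents = {
    "quote": ["Nothing is ever deleted."],
    "essay": ["Nelson's rule: ", Transclusion("quote")],
}
```

Because the essay stores only the reference, a later change to the quoted document shows up the next time the essay is rendered, with no edit to the essay itself.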

In software engineering, the collection of the differences between two subsequent versions of a source file is called a delta. Tools such as diff can create the delta between two entire versions, or it can be accumulated by capturing the information while the author edits the work (MS Word, for instance, can store change-tracking information within the document during editing). Deltas can be stored externally to the document they refer to (e.g. the output of diff), or internally (the MS Word approach). Internal deltas grow in complexity and fragility when multiple versions need to be compared to each other. We proposed a language for internal deltas for HTML documents (Vitali and Durand 1995). External deltas require a pointer mechanism for determining with great exactness the point in the document to which the single modification is attached. XPointer (DeRose et al. 2001) is one such language, and there are implementations for several architectures (we provided two such implementations (Vitali et al. 2002)).
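As an illustration of an external delta, the sketch below uses Python's standard difflib module to compute the differences between two word sequences and to re-apply them to the older version. The tuple layout of the delta and the function names are our own choices for the example, and positions are plain word offsets rather than XPointer expressions.

```python
import difflib


def make_delta(old_tokens, new_tokens):
    """Return an external delta: a list of (op, start, end, replacement).

    start and end are offsets into old_tokens; unchanged runs are omitted.
    """
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    delta = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            delta.append((op, i1, i2, new_tokens[j1:j2]))
    return delta


def apply_delta(old_tokens, delta):
    """Rebuild the newer version from the older one plus the delta."""
    out = []
    cursor = 0
    for _op, start, end, replacement in delta:
        out.extend(old_tokens[cursor:start])  # copy the unchanged run
        out.extend(replacement)               # then the inserted/replaced words
        cursor = end                          # skip the deleted/replaced words
    out.extend(old_tokens[cursor:])
    return out


old = "The Web is a read-only medium".split()
new = "The Web is a writable medium".split()
delta = make_delta(old, new)
```

Here apply_delta(old, delta) reproduces new. The example also shows why external deltas depend on exact addressing: every offset in the delta refers to a position in one specific older version, and is meaningless against any other.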

Managing multiple versions is fundamental in collaborative writing systems (such as wikis) because it is the only safe solution against inadvertent or malicious manipulation of the content. Provision of deltas is also useful, since it allows the actual contribution of an author to a document to be determined. Most wikis provide version management only, with the whole document as the smallest versioned unit; other wikis extend these functionalities by exploiting fine-grained addressing mechanisms, closer to the byte-oriented version management typical of xanalogical systems. For instance, PurpleWiki, based on Purple (Kim 2001), is meant to manage HTML documents and make them addressable at the paragraph level. It does this by automatically creating name anchors with static and hierarchical addresses at the beginning of each text node. This pointing mechanism adds granular addressability to HTML documents and allows full control of content fragments. So PurpleWiki merges the benefits provided by a wiki (universally editable pages, automatic linking mechanism and version management) with a fine-grained addressing mechanism that can provide change-tracking and, ultimately, transclusions.
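The idea of static paragraph-level addresses can be sketched as follows in Python. This is not Purple's actual numbering scheme: the identifier format, the dictionary representation of paragraphs and the function names are assumptions for illustration, and the hierarchical part of the addresses is left out for brevity.

```python
from html import escape
from itertools import count


def assign_ids(paragraphs, counter):
    """Give each paragraph lacking an id the next free one.

    Existing ids are never changed, so addresses stay valid across edits.
    """
    for para in paragraphs:
        if para.get("id") is None:
            para["id"] = next(counter)
    return paragraphs


def to_html(paragraphs):
    """Emit each paragraph with a name anchor at its beginning."""
    return "\n".join(
        f'<p><a name="p{para["id"]}"></a>{escape(para["text"])}</p>'
        for para in paragraphs
    )


counter = count(1)  # in a real system this counter would be persisted
doc = [{"text": "First paragraph."}, {"text": "Second paragraph."}]
assign_ids(doc, counter)
doc.insert(1, {"text": "Inserted later."})
assign_ids(doc, counter)  # only the new paragraph receives a fresh id
```

The paragraph inserted in the middle receives a new identifier, while the two original paragraphs keep theirs, so links and transclusions that point at them continue to resolve after the edit.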

5 Conclusions

The Web we have been using in recent years is nothing more than a read-only medium, where production and update of content lie on a completely different level from browsing and reading. The success of blogs and wikis, on the other hand, demonstrates that there is a deeply felt need for users and readers to be directly involved in the creation of content.

The idea behind the original Xanadu project, and most of the subsequent projects we have discussed in this paper, adds another dimension to the ease of publishing new content: the idea of allowing people access to edit everyone else's content easily and freely. Technical issues aside, there is one reason the vision described in this article could never come true: fear of the legal consequences. Content providers have been shown to jealously guard their content and the presentation they have provided with it. Discussions surrounding the issue of deep linking make it clear that many Web content providers are not happy to allow others to play with their data. It is quite possible that customization of content would be perceived as a step too far.

Reassurances of fair use, correct presentation of modifications, and fair management of copyright royalties could help in calming the legitimate fears of content providers. Maybe, had Tim Berners-Lee considered transclusion as a fundamental functionality right from the beginning, as he did for links, the idea of user customization might already have been accepted and become part of the very idea of a hypertext system.

Another set of problems that has emerged with the success of the Web is without doubt connected to the relationship between content and presentation, and the level of graphical sophistication users now expect from Web sites. Template-based Web page design can help in providing consistent and sophisticated layout and presentation to new Web content, but creates difficult problems in allowing quotations, personal links and transclusions.

The ideas presented in this paper do not come out of the blue, of course. We are implementing a system, called IsaWiki, that can be considered a research prototype for the immediate application of the ideas described. IsaWiki draws its inspiration directly from ISA (Vitali 2003) and XanaWord (Di Iorio and Vitali 2003), two previous research prototypes that have been used as the starting points for this endeavor.

IsaWiki draws from the experiences of these systems, and a number of wiki systems currently available, to deliver an implementation of the architecture described so far.

The server deals with storage of both local documents and local versions of documents residing on different servers, and provides conversion services from and to a number of well-known data formats (such as MS Word, HTML, XML, Wiki, etc.) by relying on the generic data model discussed in section 3.2. Whenever a document is requested in HTML and has no associated layout, IsaWiki can add a layout from the many created with a graphic application. The server also provides differencing and versioning services for different versions of the same document, being able to identify and show all the modifications that a document has undergone. Currently the server runs as a PHP application on an Apache HTTP server.

The client is composed of a sidebar preinstalled on the user's browser. The client provides services such as navigation monitoring (by which the browser can determine whether a personalized version of a Web page exists on the IsaWiki server), page customization tools (by which the reader can access, modify and customize the actual content of the page), a page analysis tool (used to determine the actual content areas of a Web document, and ignore the layout parts), and tools to save different versions of a document, access them individually and compare any number of them.

IsaWiki takes these issues into account, providing an easy environment to create sophisticated graphical templates, to add content to new Web pages using either Web-based editors or well-known commercial tools (such as MS Word), to allow customization of Web pages via transclusion mechanisms, and to separate content fragments clearly from presentation fragments in allowing customization of pages. To our knowledge, no other tool or research prototype deals with such a large set of objectives.
