HTML 5 contains a dizzying array of features. Below I created a cheat sheet identifying features that I think are likely to have some relevance to the peer to peer web. This is mostly for my own reference. Note that not all these features are actually part of HTML 5. Some were standardized separately. Some haven’t finished standardization. But whatever, this gives me a sense of the landscape.
1 Playing with HTML
Microdata A standard for putting annotations inside of HTML to identify strings as values such as names, addresses, phone numbers, a user’s home page, etc. This is primarily meant to help search engines extract more data from web pages but one can imagine calendar, address book and other apps pulling information from this data as well.
RDFa A markup language for markup languages. Think microdata but with URLs used for attribute names. The real difference is the data model: the one in RDFa is based on RDF, which essentially describes a tuple based graph (look up RDF, you can drown for days in data about it). The Microdata format is simpler. Which one is more important? Whichever one ends up getting more adoption. :)
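To give a feel for the "tuple based graph" idea, here is a minimal sketch of RDF-style statements as (subject, predicate, object) triples. The example.org URLs and the objectsOf helper are made up for illustration; the predicate URLs follow the FOAF vocabulary naming pattern. This models the data only and is not an RDFa parser.

```typescript
// One statement in the graph: [subject, predicate, object].
type Triple = [string, string, string];

// Vocabulary prefix used for predicate names (URLs as attribute names).
const FOAF = "http://xmlns.com/foaf/0.1/";

// Hypothetical data: two people, one of whom knows the other.
const graph: Triple[] = [
  ["http://example.org/alice", FOAF + "name", "Alice"],
  ["http://example.org/alice", FOAF + "knows", "http://example.org/bob"],
  ["http://example.org/bob", FOAF + "name", "Bob"],
];

// Follow one kind of edge out of one node and return what it points at.
function objectsOf(triples: Triple[], subject: string, predicate: string): string[] {
  return triples
    .filter(([s, p]) => s === subject && p === predicate)
    .map(([, , o]) => o);
}

// Who does Alice know, and what are they called?
const knownByAlice = objectsOf(graph, "http://example.org/alice", FOAF + "knows");
const knownNames = knownByAlice.map((person) => objectsOf(graph, person, FOAF + "name")[0]);
```

Because objects can themselves be subjects of other triples, chaining lookups like this is what walks the graph.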
Selectors L1 Cleans up the APIs for selecting elements in the DOM from Javascript by matching them against CSS selectors (querySelector and querySelectorAll).
HTML Templates These are named chunks of inert HTML. Literally there is a template element with an id and whatever is inside of it isn’t rendered. But one can refer to the template tag’s content by its ID and use it to shove HTML inside other places. Basically this is a quoting mechanism to make it really easy to ‘hide’ HTML that is written as HTML (not quoted inside a string) and available for use elsewhere. A second key feature is the content tag which specifies parts of the template that are to be filled in at runtime. Content tags can either be replaced programmatically or they can include select attributes to specify locations in the parent context where they should take values from. The point is that you can create a template that essentially fills itself in when it’s instantiated.
Shadow DOM A self contained DOM that one can hang off a host element in another DOM. The main purpose of the Shadow DOM is to let one create styles and other display artifacts without having to worry about altering the page one is hosted in. The work in the Shadow DOM is only visible within the context and content (for things like HTML Templates) of the host element. The point is that one can create a fancy presentation with style sheets and the rest of it and shove it into an arbitrary document (with potentially radically different styling) and know that all the UX fun will only apply to the host element and its children and won’t mess up the rest of the page. It’s an encapsulation mechanism.
Custom Elements One can dynamically declare new HTML element definitions. These new elements can inherit from existing elements and have their own life cycle callbacks to let one do fun things like dynamically put new content into the element at run time when it is instantiated. Thus one could have a “user-name” element whose callback would look up the user’s name and replace the element’s contents with it. HTML Templates and the Shadow DOM make this feature even more powerful by allowing one to declare predefined structures to be filled inside the custom element and the whole thing can be shoved into a shadow DOM so its details (and style sheets) don’t leak out to the rest of the document.
2 Networking
online/offline events API to let the app know if it’s online or offline (i.e. connected or not connected to the Internet) and to register for events to let it know when that status changes.
Web Messaging This is a standard API for cross-document messaging (not to be confused with the cross-site scripting attack), that is, sending messages between web pages in the same browser that were loaded from different domains. The sender names the origin it expects the receiving page to have, and the receiver checks the origin attached to each incoming message to decide which domains it is willing to accept messages from. Note that putting this in Networking is a little misleading since the whole point of Web Messaging is to communicate between local pages from different domains without ever having to hit the wire.
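A minimal sketch of the receiving side's origin check. The IncomingMessage interface mirrors just the origin and data fields of a browser message event so the logic can run anywhere; the trusted origin and the acceptMessage name are assumptions for illustration.

```typescript
// The two fields of a browser message event that this sketch needs.
interface IncomingMessage {
  origin: string;
  data: unknown;
}

// Hypothetical list of origins this page is willing to listen to.
const trustedOrigins: ReadonlyArray<string> = ["https://partner.example"];

// Returns the payload if the sender is trusted, otherwise undefined.
function acceptMessage(message: IncomingMessage): unknown {
  if (trustedOrigins.indexOf(message.origin) === -1) {
    return undefined; // drop messages from origins we did not opt in to
  }
  return message.data;
}
```

In a page this check would sit inside a "message" event listener on window, and the sending page would call postMessage on the target window with the payload and the target origin it expects.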
Web Sockets This is a new protocol that sends full duplex messages over a TCP link. The protocol starts off with an HTTP request so it can tunnel through firewalls/proxies etc. But the HTTP syntax is quickly dropped in favor of Web Sockets’ own syntax. Personally I’d like to just stick with HTTP rather than having to invent a new REST style protocol on top of Web Sockets. Also the Web Sockets API only supports establishing an outgoing connection, not accepting an incoming one.
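For reference, the HTTP request that opens a Web Socket connection looks roughly like what the sketch below builds. The function name and its parameters are hypothetical; the header names are the ones the Web Socket protocol (RFC 6455) uses. A browser sends this for you when a script creates a WebSocket object, so this is only meant to show what goes over the wire.

```typescript
// Build the text of a Web Socket opening handshake. `key` must be a
// base64-encoded 16-byte random value; it is passed in here to keep the
// sketch free of platform-specific random/base64 helpers.
function buildOpeningHandshake(host: string, path: string, key: string): string {
  return [
    `GET ${path} HTTP/1.1`,
    `Host: ${host}`,
    "Upgrade: websocket",
    "Connection: Upgrade",
    `Sec-WebSocket-Key: ${key}`,
    "Sec-WebSocket-Version: 13",
    "", // blank line that ends the HTTP headers
    "",
  ].join("\r\n");
}

// Sample key value taken from the example in RFC 6455.
const handshake = buildOpeningHandshake(
  "server.example.com",
  "/chat",
  "dGhlIHNhbXBsZSBub25jZQ=="
);
```

If the server agrees it answers with a 101 Switching Protocols response, and from then on both sides exchange Web Socket frames instead of HTTP messages.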
Server Sent Events This is essentially a Javascript API for comet style communication. The assumption is that the client will send a request to the server and the server will hold the response open and write a stream of events into it, encoded using a format specified in the API (text/event-stream). This allows the client initiated connection to be used to push messages from the server to the client.
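The wire format is simple enough to sketch. The helper below (its name and signature are assumptions for illustration) encodes one message the way a server would write it into a text/event-stream response: an optional event line, one data line per line of payload, and a blank line to mark the end of the event.

```typescript
// Encode one message in the text/event-stream format.
function formatServerSentEvent(data: string, eventName?: string): string {
  const lines: string[] = [];
  if (eventName !== undefined) {
    lines.push(`event: ${eventName}`);
  }
  // A payload containing newlines is split across several "data:" lines.
  for (const line of data.split("\n")) {
    lines.push(`data: ${line}`);
  }
  // The trailing blank line tells the client the event is complete.
  return lines.join("\n") + "\n\n";
}

const simpleEvent = formatServerSentEvent("hello");
const namedEvent = formatServerSentEvent("first\nsecond", "update");
```

On the browser side the matching client is the EventSource object, which reads the stream and fires an event for each complete message.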
WebRTC A set of Javascript APIs to enable Real Time Communications. This is really a wrapper around RTP and STUN/ICE with various codecs for sound/video. RTP is a transport protocol optimized for streaming media such as audio and video. What’s interesting about WebRTC is that there is a native C++ API interface defined as part of the standard for plugging in external video/audio support.
XMLHttpRequest A bunch of new features for our old friend XMLHttpRequest. One can now specify how one wants to get back the response body, as text, an array buffer, a blob or a document. And, symmetrically, it’s now possible to send requests with a content body of type DOMString, Document, FormData, Blob, File or ArrayBuffer.
Cross Origin Resource Sharing (CORS) Specifies an HTTP response header, Access-Control-Allow-Origin, that says which origins scripts need to be loaded from in order to be allowed to make requests to that server. In other words if foo.bar.com wants to receive requests from scripts loaded in the browser from ick.blah.com then a GET/OPTIONS request to foo.bar.com would return an Access-Control-Allow-Origin header naming ick.blah.com (the value is the full origin, scheme included, e.g. http://ick.blah.com). And yes, * is also supported as a response value. So now scripts can make cross domain calls if their origin is specified in the Access-Control-Allow-Origin header of the target. Remember, this is different from Web Messaging. Web Messaging is about scripts running in the same browser instance that were loaded from different domains talking directly to each other inside the browser. CORS is about relaxing the cross domain access rule for XMLHttpRequest so that a script loaded in the browser from domain X can make an XMLHttpRequest (i.e. on the wire) to a location in domain Y.
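A minimal sketch of the server side decision, assuming the server keeps an allow list of origins. The corsHeaders function and the allow list are hypothetical names; the Origin request header and the Access-Control-Allow-Origin and Vary response headers are the real ones.

```typescript
// Hypothetical allow list kept by the server at foo.bar.com.
const allowedOrigins = new Set<string>(["http://ick.blah.com"]);

// Returns the CORS response headers for a request, given the value of
// the request's Origin header (undefined when the header is absent).
function corsHeaders(requestOrigin: string | undefined): Record<string, string> {
  if (requestOrigin !== undefined && allowedOrigins.has(requestOrigin)) {
    return {
      "Access-Control-Allow-Origin": requestOrigin,
      // Tell caches that the response differs depending on the Origin header.
      "Vary": "Origin",
    };
  }
  // No header at all: the browser will not hand the response to the script.
  return {};
}
```

Echoing back only origins from the list is a tighter alternative to answering every request with *.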
Network Service Discovery This is a fairly early proposal for a Javascript API to let a webpage perform discovery using Zeroconf, UPnP or DIAL. DIAL, btw, is just a profile on top of SSDP 1.1 (aka UPnP).
3 Javascript Fun
Web Workers This allows for a web page to spawn separate Javascript threads that can do work in the background and communicate back with the spawning page via message passing. Near as I can tell (I’m not smart enough to read the specs) the lifetime of a web worker is tied to the lifetime of the document (or documents in the case of shared workers) that created it. So while web workers might be useful in general they don’t help to solve a key peer to peer web issue of needing essentially infinitely long workers hanging out in the background doing good things like synchronization and receiving incoming requests.
ArrayBuffer A data type to hold a fixed length binary buffer. The buffer has no methods of its own for reading or writing bytes; it is interacted with by creating an ArrayBufferView which specifies a subset of the ArrayBuffer and what type to map the buffer’s contents to (types include signed and unsigned ints and floats of various widths). A single ArrayBuffer can have multiple ArrayBufferViews. Note that the views share the underlying memory: manipulating the contents through one ArrayBufferView changes the original ArrayBuffer and shows up in every other view over the same bytes (no copy is made).
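A short sketch of how views relate to the buffer they sit on. Two typed array views over one ArrayBuffer see each other's writes, and an independent copy has to be requested explicitly with slice. The variable names are just for illustration.

```typescript
// One 8-byte buffer, viewed two different ways.
const buffer = new ArrayBuffer(8);

// View all 8 bytes as unsigned 8-bit integers.
const bytes = new Uint8Array(buffer);

// View only bytes 4..7 as a single unsigned 32-bit integer
// (byteOffset = 4, length = 1 element).
const word = new Uint32Array(buffer, 4, 1);

// Writing through one view...
word[0] = 0xffffffff;

// ...is visible through the other, because both map the same memory.
const sharedBytes = Array.from(bytes); // [0, 0, 0, 0, 255, 255, 255, 255]

// To get independent storage, copy the buffer explicitly.
const copiedBuffer = buffer.slice(0);
const copiedBytes = new Uint8Array(copiedBuffer);
copiedBytes[4] = 0;

// The original buffer is unaffected by writes to the copy.
const originalByte = bytes[4]; // still 255
```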
4 Storage
AppCache An API that lets the web site tell the browser which pages need to be persistently stored in the cache for the app to work offline. It also provides APIs to query the current state of the cache as well as APIs to register for cache related events.
Web Storage This provides storage for name/value pairs that are kept on the local client. This is intended to handle scenarios that weren’t well addressed by cookies. The first type of web storage is session storage, which is unique to a particular tab or window. So if the browser has multiple tabs open to a site each would have its own session storage. The second is local storage, which can hold megabytes of data (browsers typically allow around 5 MB per origin) and is available to all pages from the same origin. It’s meant for long term local storage of things like documents that are being edited locally.
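A minimal sketch of the usual pattern for keeping structured data in web storage. Stored values are strings, so objects get serialized with JSON on the way in and parsed on the way out. KeyValueStore is a cut-down stand-in for the browser's Storage interface (getItem and setItem are the real method names), and MemoryStore, saveDraft and loadDraft are hypothetical helpers so the example runs without a browser.

```typescript
// The subset of the browser Storage interface used below. In a page,
// window.localStorage and window.sessionStorage both satisfy it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory stand-in so the sketch runs outside a browser.
class MemoryStore implements KeyValueStore {
  private items = new Map<string, string>();
  getItem(key: string): string | null {
    const value = this.items.get(key);
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.items.set(key, value);
  }
}

interface Draft {
  title: string;
  body: string;
}

// Serialize the draft to a string before storing it.
function saveDraft(store: KeyValueStore, id: string, draft: Draft): void {
  store.setItem(`draft:${id}`, JSON.stringify(draft));
}

// Parse the stored string back into an object; null if nothing was saved.
function loadDraft(store: KeyValueStore, id: string): Draft | null {
  const raw = store.getItem(`draft:${id}`);
  return raw === null ? null : (JSON.parse(raw) as Draft);
}
```

Passing window.localStorage instead of a MemoryStore would make the draft survive a browser restart; passing window.sessionStorage would scope it to the current tab.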
Web SQL This is an API proposal to make SQL style capabilities available to web pages. The standardization effort is apparently on hold due to disagreements within the standards group.
Indexed Database This is meant as a competitor to Web SQL. This API creates a local high performance transacted DB that stores records which contain name/value pairs. A primary name can be identified and indexes can be built on any of the names in a record. One can then query on the indices.
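To illustrate the data model only (this is not the Indexed Database API, which is asynchronous and transaction based), here is a small in-memory sketch of records stored under a primary key with one secondary index. The Contact type, the ContactStore class and its methods are all made up for the example.

```typescript
interface Contact {
  id: number; // primary key
  name: string;
  city: string; // indexed field
}

class ContactStore {
  private byId = new Map<number, Contact>(); // primary key -> record
  private byCity = new Map<string, Set<number>>(); // index: city -> primary keys

  // Insert or overwrite a record, keeping the index in step.
  put(record: Contact): void {
    const previous = this.byId.get(record.id);
    if (previous !== undefined) {
      // Remove the stale index entry for the record being replaced.
      const oldKeys = this.byCity.get(previous.city);
      if (oldKeys !== undefined) {
        oldKeys.delete(previous.id);
      }
    }
    this.byId.set(record.id, record);
    let keys = this.byCity.get(record.city);
    if (keys === undefined) {
      keys = new Set<number>();
      this.byCity.set(record.city, keys);
    }
    keys.add(record.id);
  }

  // Lookup by primary key.
  get(id: number): Contact | undefined {
    return this.byId.get(id);
  }

  // Query through the index instead of scanning every record.
  findByCity(city: string): Contact[] {
    const keys = this.byCity.get(city);
    const result: Contact[] = [];
    if (keys !== undefined) {
      keys.forEach((id) => {
        const record = this.byId.get(id);
        if (record !== undefined) {
          result.push(record);
        }
      });
    }
    return result;
  }
}
```

The index costs a little extra work on every write in exchange for lookups that don't have to touch every record.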
File API This provides a standardized set of extensions to enable one to ask a user to select one or more files and to then get back meta data about those files along with file handles. It also provides standard events to handle dragging and dropping file objects onto the web page. Once a file handle has been retrieved then it is possible to use standard APIs to read the file in as a string, byte array, ArrayBuffer or data URL as well as monitor the reading of the file (if it’s large) and read it in pieces instead of all at once. Near as I can tell there is no way to just provide a file path and get a file back through this API. There is a hint that one should be able to use a file URL through XMLHttpRequest but browser origin rules for files are a mess so in practice this isn’t terribly safe.
FileSystem API This one is only supported by Google and Opera but it seems interesting. It proposes allowing an app to ask for the moral equivalent of a local disk which can either be scoped to the session lifetime or be persistent. The app could then read/write/whatever it wanted in that local disk which would only be visible to it. There are all sorts of quota issues here that the browser would have to prompt the user to help with.
Contacts API This was a fairly cool idea for how to let websites ask for part of a user’s address book. The website would fire off a Javascript request in the webpage which would tell the browser to get the user to pick whatever contacts they wanted to share and then return that to the Javascript call. It’s not clear how alive this work is since it’s built on web intents, a framework that is no longer being pursued in the W3C.