Contemporary Web Development
Over the years, there have been a number of shifts in my thinking about how web applications should work. Back when the web was young, we used shell scripts to output fully formed pages, and used Berkeley DB to store values for future use. An app would have as many databases as datatypes. It scaled well when you had one server and one host.
Quickly came the need to scale databases off of www and onto multiple web servers (remember when www was just a hostname?), and big guns like Oracle came out and allowed us to make use of a bunch of CPUs on a Sparc server. Even in the late '90s, 16 cores were seldom enough. Having an enterprise database meant we could do all the heavy lifting in the DB, freeing the web frontends to be cheap pizza boxes sufficient to handle the flood of AOL dial-up users with sucky connections.
By 2000, we had adopted fancy templating engines written in C/C++ that allowed us to serve millions of users on commodity 4x AMD servers. Some less performance-critical web applications ran in PHP, Perl, or even Python, but those sites also ran MySQL and usually had .com in the company letterhead. Big iron still ruled the DB side, but now involved clusters of DB servers partitioned to handle load.
After the bubble burst, people coding web servers in C/C++ were few and far between, and some of us dabbled in other languages like OCaml, but most of what we did no longer involved templating. By about 2002, nearly all of the UI was in the browser. We became heavy users of Flash, and used XML sockets connected to backend chat servers speaking XMPP, and later cheaper URL-encoded string protocols. Backends were primarily C/C++, but also leveraged embedded Perl and Python to script the bits that changed more frequently as the business grew. Often this took the form of chat bots that communicated business-logic decisions to the customers and data stores.
By 2004, consistent hashing was all the rage! Our main application was now almost entirely in the browser. The backend was a custom C++ chat server with a distributed object store. The object store persisted complex C++ objects to a database, and used a partitioned keyspace to distribute objects. Each node could support more than one keyspace, but in practice we ran multiple nodes on a single machine, with shadow nodes providing failover support. With a 65k-partition object space and nodes running in triplicate, we never had enough hardware to exhaust it in practice. Since nodes managed live C++ objects and each used S2S messaging, the entire system was eventually consistent. In practice this worked very well, as each node also had its own DB shard, allowing for ACID-compliant persistence on the data-store side. Technically we were using Postgres as a distributed key/value store, but we never talked about it that way.
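The core of a scheme like that can be sketched in a few lines: hash each key into a fixed 65,536-partition space, then assign each partition to a primary node plus shadow replicas. This is a minimal illustration, not the actual system; the hash choice, node names, and round-robin replica placement are all assumptions for the sketch.

```python
import hashlib

PARTITIONS = 65536  # the 65k partition object space described above

def partition_for(key: str) -> int:
    """Hash a key into one of 65536 partitions (stable across runs)."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:2], "big")  # first two bytes: 0..65535

def nodes_for(partition: int, nodes: list, replicas: int = 3) -> list:
    """Place a partition on `replicas` nodes (a primary plus shadow nodes)."""
    start = partition % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]

# Hypothetical node names for illustration.
nodes = ["node-a", "node-b", "node-c", "node-d"]
p = partition_for("customer:42")
print(p, nodes_for(p, nodes))
```

Because the partition count is fixed and far larger than the node count, adding hardware only means reassigning partitions to nodes; keys never need to be rehashed, which is the property that made this style of partitioning so attractive.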
2010 and the development of better mobile browsers meant less C/Java and more WebKit views embedded in apps. At this point the JS frameworks developed into widgets inside of iframes, entirely self-contained and communicating via cross-document messaging. This was a huge advancement, since each button, every form field, every video, and every sound effect could live in an isolated context, meaning reuse suffered no ill effect from namespace collisions. Only in the messaging hub did you have to worry about naming your endpoints. Even CSS ceased to be an issue, as each widget had its own space, and changing a button could only change that button. A typical app would have 100+ iframes on an iPad and be as snappy as 100 divs. This allowed apps to actually reuse components. It also meant the main event loop needed some central planning. Gone was reliance on DOM events; instead, old-school controllers modeling keyboard, mouse, and touchpad sent messages directly to widgets upon activation. It became the domain of each widget to respond accordingly, and the role of the main screen to compose the scene.
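The hub pattern is simple enough to sketch. Here it is in Python rather than browser JavaScript (in the browser, `window.postMessage` plays the role of `send`); the endpoint names and handlers are hypothetical, and the only place a name has to be unique is inside the hub itself.

```python
class MessageHub:
    """Central hub: the one place where endpoint names must be unique."""

    def __init__(self):
        self._widgets = {}

    def register(self, endpoint, handler):
        # Each widget claims one endpoint name; collisions fail loudly.
        if endpoint in self._widgets:
            raise ValueError(f"endpoint collision: {endpoint}")
        self._widgets[endpoint] = handler

    def send(self, endpoint, message):
        # Analogous to posting a message into a single isolated iframe.
        return self._widgets[endpoint](message)

# Hypothetical usage: a controller routes an input event to one widget.
hub = MessageHub()
hub.register("buy-button", lambda msg: f"buy-button handled {msg!r}")
print(hub.send("buy-button", "touch:down"))
```

Everything else about a widget, its DOM, its CSS, its internal names, stays private to its own frame; only the endpoint string leaks out.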
Also in 2010, local storage, SQLite, and client-side caches moved data out of the central database and onto the edge. By replicating data from a client-side database to a central store, we created a better offline experience and solved the consistency-of-experience problem by making the user the authoritative source. In aggregate, even at 2MB of data per client, 10,000 users bring with them 20GB of storage, 100k bring 200GB, and 1M bring 2TB. In other words, just as clients render their own pages, storage scales linearly with clients as well.
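The back-of-the-envelope math above is worth making explicit, since the linearity is the whole point:

```python
# Aggregate edge storage grows linearly with the client count.
per_client_mb = 2
for clients in (10_000, 100_000, 1_000_000):
    total_gb = clients * per_client_mb // 1000  # using 1000 MB per GB
    print(f"{clients:>9,} clients x {per_client_mb}MB = {total_gb:,}GB")
# ->    10,000 clients x 2MB = 20GB
# ->   100,000 clients x 2MB = 200GB
# -> 1,000,000 clients x 2MB = 2,000GB (2TB)
```

Every new user shows up with their own disk, just as they show up with their own CPU for rendering.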
So what will 2012 bring?
- transparent remote method invocation
- edge-side, conditionally consistent databases
- ubiquitous messaging with multi-actor dispatch
- bundled component architecture with client-side hosting
In short, we will see a no-server movement, where clients use the web server only to establish a locatable endpoint so that other apps can communicate with them when they are online. And that will be a far cry from a bunch of echo statements in a shell script.
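In that world the server shrinks to a rendezvous directory: it maps an app to a live endpoint while the client is online, and nothing more. A minimal sketch of that role, with hypothetical app IDs and endpoint addresses:

```python
class Rendezvous:
    """Sketch of a 'no server' server: it only maps app IDs to endpoints."""

    def __init__(self):
        self._online = {}

    def announce(self, app_id, endpoint):
        self._online[app_id] = endpoint   # client comes online

    def depart(self, app_id):
        self._online.pop(app_id, None)    # client goes offline

    def locate(self, app_id):
        return self._online.get(app_id)   # peers discover each other here

# Hypothetical usage: the address is illustrative (TEST-NET IP), not real.
directory = Rendezvous()
directory.announce("alice-notes", "wss://203.0.113.5:9000")
print(directory.locate("alice-notes"))  # -> wss://203.0.113.5:9000
directory.depart("alice-notes")
print(directory.locate("alice-notes"))  # -> None: peer is unreachable
```

All actual data and logic stay on the clients; the server never sees a page, a template, or a row of application data.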