The Simplest Web That Could Possibly Work

This post originally appeared on the Software Carpentry website.

A new web tool called If This Then That (IFTTT) generated a flurry of interest this week (see, for example, Scott Hanselman's blog post). Simply put, IFTTT lets you connect web services together using if-then rules. It's as simple as:

1. Click on "this" to start.
2. Select "Twitter", and tick off boxes to create a rule (such as "when Alan Turing tweets").
3. Click on "that".
4. Select what you want to happen (e.g., echo the tweet to your own Facebook page).
5. There is no step 5.

That's the beauty of it: there is no step 5. It knows how to work with all sorts of services, from Instapaper and Pinboard to weather reports (yes, weather reports), and there is no step 5. Just as Twitter stripped blogs down to their bare essentials, IFTTT takes graphical workflow tools like Yahoo! Pipes and says, "What's the simplest version of this idea that could possibly be useful?"
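To make the if-then idea concrete, here is a minimal sketch of a rule engine in Python. It is not how IFTTT is implemented; the `Rule` class, the `run_rules` function, the event dictionaries, and the `facebook_wall` list are all made up for illustration, and no real service is contacted.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    """One if-this-then-that rule: a trigger paired with an action."""
    trigger: Callable[[dict], bool]  # "this": does the event match?
    action: Callable[[dict], None]   # "that": what to do when it does


def run_rules(rules, events):
    """Check every event against every rule; return how many actions fired."""
    fired = 0
    for event in events:
        for rule in rules:
            if rule.trigger(event):
                rule.action(event)
                fired += 1
    return fired


# Hypothetical events, standing in for what a real service would report.
events = [
    {"service": "twitter", "user": "alan_turing", "text": "Can machines think?"},
    {"service": "twitter", "user": "someone_else", "text": "Lunch."},
]

facebook_wall = []  # stand-in for "my own Facebook page"

rules = [
    Rule(
        trigger=lambda e: e["service"] == "twitter" and e["user"] == "alan_turing",
        action=lambda e: facebook_wall.append(e["text"]),
    )
]

fired = run_rules(rules, events)
```

The whole model is a list of (trigger, action) pairs. Everything that makes the real thing hard lives outside this sketch, in getting events out of one service and pushing actions into another.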

What's the equivalent for scientists? Michael Nielsen's answer is, "IFTTT itself." "Facebook for scientists" and "Twitter for scientists" projects have foundered because we already have Facebook and Twitter, and they already do most of what scientists need. Why would IFTTT be any different?

But as Cameron Neylon pointed out:

The problem on the research side is the variety of "output types". There are lots of inconsistent and non-standard outputs that can never quite be connected up the way you want, formats wrong, header broken, wrapped up in the wrong ASCII encoding or whatever.

Once again, it comes back to the third of Jon Udell's principles of computational thinking: knowing the difference between structured and unstructured data. A lot of what's on scientists' hard drives cannot be understood by programs, despite being digital. A lot of the rest falls into the "long tail" trap: so few people (and files) use the format that writing tools to handle it isn't economical (by which I mean that general-purpose services won't bother, and the scientists whose data is in that format are too busy getting their next paper out to get around to it). This is scientific computing's "last mile"; if we really want to make the world writable, we need to focus on this rather than petaflops.
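A small sketch of the structured/unstructured distinction, using Python's standard `csv` module. The column names (`sample`, `temperature_c`), the readings, and the `read_temperatures` function are invented for this example; the point is only that the same facts are trivially machine-readable in one form and invisible to the same program in the other.

```python
import csv
import io

# The same two measurements, written two ways (values are made up).
structured = "sample,temperature_c\nA1,21.5\nA2,22.1\n"
unstructured = "Sample A1 was about 21.5 degrees; A2 ran a bit warmer (22.1)."


def read_temperatures(text):
    """Return {sample: temperature} from CSV text with the assumed header."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["sample"]: float(row["temperature_c"]) for row in reader}


temps = read_temperatures(structured)        # both readings recovered
nothing = read_temperatures(unstructured)    # no rows: the program sees nothing
```

Both strings are digital, and a person can read either one, but only the first can be connected to anything else without someone writing a one-off parser for it.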