Not on the Shelves

This post originally appeared on the Software Carpentry website.

Every few years, I indulge in a bit of sympathetic magic by writing reviews of books that don't actually exist, in the hope that it will inspire someone to write them. Previous versions, written in 1997, 2003, and 2009, led to Beautiful Code, Making Software, The Architecture of Open Source Applications, and a few other books as well. I'd welcome comments on what isn't in this list that you really wish you could read.

Software Carpentry for Scientists and Engineers

This book is an introduction to lab skills for scientific computing aimed at graduate students and professionals whose backgrounds are in science, engineering, medicine, and related fields. The four core topics (task automation, version control, structured programming, and data management) are introduced via tutorials on the Unix shell, Git, Python, and SQL, then elaborated on with further tutorials on using the web to share data, creating reproducible workflows, and testing software when the right answer isn't actually known. While it necessarily glosses over many fine points, the book does give readers a useful toolkit and a sense of where to go next.

Note: we have a beta version of the first half of this, and hope to deliver it mid-2014.

Big, Fast, Cheap, or Good: A Student's Guide to Software Engineering

The two dominant undergraduate textbooks in software engineering leave out a lot of the things real software engineers do, and have only a tenuous relationship with the realities of undergraduate student life. In contrast, this short book focuses on empirical results in software engineering research, the design and construction of actual open source applications, and a development process that makes sense for students who are developing in teams for the first time while time-slicing commitments to several courses.

A Practical Introduction to Debugging

Most programmers spend a large part of their time debugging, but most books only show working code, and never discuss how to prevent, diagnose, and fix errors. Most books ostensibly about debugging are either high-level handwaving ("Make sure you're solving the right problem"), user's guides for particular debugging tools, or out of date. (The one notable exception, Zeller's Why Programs Fail, is an excellent read, but too advanced for most undergraduates.) This book fills that gap by combining an exploration of how debugging tools actually work with dozens of case studies showing how to apply them to real-world problems. And while the author only occasionally makes this explicit, the book also shows how to write programs that are easier to fix.

Software Tools for the World-Wide Web

Software Tools and its sequel Software Tools in Pascal were among the most influential books in the history of computing, as they introduced a whole generation of programmers to the Unix philosophy of tool-based computing. In retrospect, one of the reasons that philosophy succeeded was its reliance on a universal data format (strings of ASCII text) and communication protocol (standard input and standard output). This book starts from the now-commonplace observation that HTTP and data formats like XML and JSON have taken their place, and goes on to build a suite of ever-more-sophisticated tools for assembling web-based applications that use them. Drawing from sources as diverse as Jon Udell's Seven Ways to Think Like the Web, Microsoft PowerShell, and the Kinetic Rule Language, the authors present a vision in which syndication of distributed streams of events is the new normal.
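To make the analogy concrete, here is a minimal sketch of what a Unix-style filter might look like if each line of text carried one JSON record instead of plain ASCII. Since the book doesn't exist, nothing here comes from it: the function names (read_json_lines, select, write_json_lines, run_filter) and the sample events are all hypothetical, and the streams are in-memory stand-ins for standard input and standard output.

```python
import io
import json


def read_json_lines(stream):
    """Parse one JSON object per non-blank line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def select(records, key, value):
    """Pass through only the records whose `key` field equals `value`."""
    for record in records:
        if record.get(key) == value:
            yield record


def write_json_lines(records, stream):
    """Write each record back out as a single line of JSON."""
    for record in records:
        stream.write(json.dumps(record, sort_keys=True) + "\n")


def run_filter(source, sink, key, value):
    """Hypothetical tool: read records, keep the matching ones, write them out."""
    write_json_lines(select(read_json_lines(source), key, value), sink)


# Made-up sample data; in a real pipeline `source` and `sink` would be
# sys.stdin and sys.stdout so the tool could be chained with others.
source = io.StringIO(
    '{"id": 1, "status": "open"}\n'
    '{"id": 2, "status": "closed"}\n'
)
sink = io.StringIO()
run_filter(source, sink, "status", "open")
print(sink.getvalue(), end="")
```

Because the tool reads and writes the same line-per-record format, its output could feed another filter of the same kind, which is the property that made text pipelines composable in the first place.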

Computing and the Law: A Guide for the Perplexed

The legal aspects of software have always been complicated; the web has done nothing to make them simpler. This book seeks to help programmers understand the rules (or lack thereof) they have to live with by tracing the historical development of patents, copyrights, privacy, and professional liability from the Industrial Revolution to the present day. Aimed squarely at people with no prior exposure to legal terminology, it explains concepts clearly and provides examples for each.

Difference Engines

Modern version control systems handle text well, but are much clumsier when it comes to images, MP3s, spreadsheets, and other so-called "binary" files. The reason is simple: those formats are supported by tools for reading and writing, but not for differencing and merging. This survey describes a collection of open source libraries (the "engines" of the title) that can handle many of those formats in a more-or-less uniform way. Readers will enjoy the combination of theory (such as proofs of some algorithms' performance characteristics) and practice (the design and implementation of the tools themselves).
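As a rough illustration of why format-aware differencing matters, the sketch below compares two versions of a small table by row identity rather than line by line. It is not one of the "engines" the review imagines: diff_rows and load_rows are hypothetical names, the CSV data is made up, and a real spreadsheet would need a proper parser for its file format.

```python
import csv
import io


def load_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def diff_rows(old, new, key):
    """Compare two tables, matching rows by the value in the `key` column."""
    old_by_key = {row[key]: row for row in old}
    new_by_key = {row[key]: row for row in new}
    added = [row for k, row in new_by_key.items() if k not in old_by_key]
    removed = [row for k, row in old_by_key.items() if k not in new_by_key]
    changed = []
    for k, before in old_by_key.items():
        after = new_by_key.get(k)
        if after is not None and after != before:
            # Record which columns differ for this row.
            fields = sorted(
                f for f in set(before) | set(after)
                if before.get(f) != after.get(f)
            )
            changed.append((k, fields))
    return {"added": added, "removed": removed, "changed": changed}


# Made-up sample data standing in for two versions of a spreadsheet.
old_table = load_rows("id,name,qty\n1,bolt,10\n2,nut,5\n")
new_table = load_rows("id,name,qty\n1,bolt,12\n3,washer,7\n")
result = diff_rows(old_table, new_table, key="id")
print(result["changed"])
```

A line-based diff of the same two files would only report that lines changed; matching rows on a key lets the tool say that row 1's quantity changed, row 2 was removed, and row 3 was added.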
