How Much Of This Should Scientists Understand?
This post originally appeared on the Software Carpentry website.
Let's start with the problem description:
All of the Software Carpentry course material (including lecture notes, code samples, data files, and images) is stored in a Subversion repository. That's currently hosted at the University of Toronto, but I'd like to move it to the software-carpentry.org domain (along with this blog). However, software-carpentry.org is hosted with site5.com, who only provide one shell account per domain on cheap plans like the one I bought.
Why is this a problem? Because when someone wants to commit to the repository, they have to authenticate themselves. I could let everyone who's writing material for the course share a single user ID and password, but that would be an administration nightmare (as well as a security risk). Site5 does have a workaround based on public/private keys (sketched below), but it's fairly complicated, which means it could break in lots of hard-to-diagnose ways. Another option would be to use the mod_dav_svn plugin for Apache, but Site5 doesn't support per-domain Apache modules either. Dreamhost.com does, so I may be switching hosts in a few weeks.
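For readers who want a concrete picture, the usual key-based trick for this situation (and I'm assuming Site5's workaround is some variant of it) puts each contributor's public key in the shared account's ~/.ssh/authorized_keys file, with a forced command that maps the key to a Subversion identity. Roughly, with made-up usernames and truncated keys:

```
# ~/.ssh/authorized_keys on the single shared shell account
# Each line ties one contributor's key to one Subversion identity.
command="svnserve -t --tunnel-user=alice",no-port-forwarding,no-pty ssh-rsa AAAAB3... alice@example.org
command="svnserve -t --tunnel-user=bob",no-port-forwarding,no-pty ssh-rsa AAAAB3... bob@example.org
```

The mod_dav_svn alternative would instead serve the repository over HTTP with an Apache configuration block along these lines (paths and names here are illustrative, not our actual setup):

```
# Apache configuration for serving a Subversion repository via mod_dav_svn
<Location /svn>
    DAV svn
    SVNPath /home/swc/repo
    AuthType Basic
    AuthName "Software Carpentry repository"
    AuthUserFile /home/swc/svn-passwords
    Require valid-user
</Location>
```

Neither is rocket science, but both are exactly the kind of fiddly, failure-prone plumbing this post is about.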
So: how much of this should the average research scientist be expected to understand? If the answer is "none", then how are they supposed to make sensible decisions about moving their work online? If the answer is "all", where does the time come from? (It takes me 30 seconds to read the two paragraphs above; it would take many hours of instruction to teach people enough to do the analysis themselves.) And if the answer is "some", then which parts? To what depth? And who takes care of the rest on scientists' behalf?