Creating Assessments and What to do With the Data
This post originally appeared on the Software Carpentry website.
Our new assessment questionnaires are up for your review, so I thought I'd talk a little bit about the work behind the scenes and some lessons learned. Over the past couple of months I've had the opportunity to interview participants who attended the Michigan State University Software Carpentry Bootcamp in May 2012, a little over a year ago. The goal of the interviews was two-fold: to develop assessment questionnaires and to gather data.
Feedback
As I completed interviews I began noticing trends in the responses. Participants indicated that the most useful topic was the command line, and that in retrospect they would have liked more time allocated to it. In particular, subtopics such as piping, scripting, and miscellaneous tricks were the most useful. Least useful was SQL, with comments that its usefulness was understood but it "wasn't for my research." Participants either had no need for databases or had a simpler problem that didn't require complex data querying.
By the end of a typical workshop participants have been given a whirlwind tour of various technologies and practices with a minimal amount of hands-on experience. In this case most participants felt abandoned, with little idea of how to apply the material to their own research or how to integrate it into, or migrate, their current workflow. For those who are not extremely self-motivated, plunging into a new field and hitting the ground running with new skills and concepts is a daunting task. Outlining a way forward, even with one specific participant's project as an example, could prove invaluable for long-term impact.
Interviewing
In conducting interviews I had the chance to try several methods of interviewing and collecting data. The choice of how to record responses had the most direct effect. I first recorded audio, which allowed me to conduct the interview freely without worrying about taking notes. On the other hand, the responses were then difficult to access or code for analysis, and the prospect of transcribing the audio was not thrilling. Transcribing participant responses directly on my laptop let me capture answers while listening. An unexpected benefit of this approach was that I often spent a few more seconds typing after the participant had finished replying, which gave them time to add to or clarify their answer. This extra time proved quite valuable in eliciting higher-quality responses.
The more I learned about research interviewing, the more I realized how difficult it is to conduct a proper interview. The goal is to avoid biasing responses while following a carefully crafted script. This is best achieved through an inhumanly unresponsive, stoic disposition: no smiling, nodding, murmurs of understanding, or facial expressions, positive or negative. As you can imagine, this requires a lot of tongue-biting for the conversationally inclined. I'll be conducting remote interviews soon, which will let me test whether audio-only or audio-plus-video calls make this part of interviewing any easier.
Analysis Questions
When enough data becomes available we'll try to answer questions such as:
- Did participants' self-perceived ability to solve problems in SWC topics change, and if so, how?
- Are there any correlations between participant difficulties, their improvement, and their prior knowledge that we could use to our advantage as instructors?
- On an individual bootcamp basis, were there discrepancies between instructor-perceived success and participant-perceived success? Are those differences correlated with anything we've measured?
- What infrastructure (the software set up on participants' machines) works best or worst for a workshop?
- What relation, if any, exists between the pacing of instruction and participants' self-perceived ability to solve example problems in SWC topics?
- Which pedagogical approaches worked best for which types of participants? (example-based, principle-based, etc.)
Analysis Approach
There will largely be two types of data here: categorical and ordinal. The categorical data come from the "What was used" questions, and the ordinal data from the scale-style questions, such as "Were there problems?" with a typical Likert-style response ranging from "None" to "Lots of problems." Correlation, regression, and dimensional reduction will be some of my first tools for assessing the data and answering our questions; a minimal sketch of that kind of first pass is below. I would appreciate any feedback or pointers about analysis questions or analysis approaches.
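As a rough illustration, here is a minimal sketch in Python of what that first pass might look like, assuming the questionnaire responses have been exported to a CSV file. The file name and column names (bootcamp_responses.csv, pace, setup_problems, and so on) are hypothetical placeholders, not the real questionnaire fields. Spearman rank correlation suits the ordinal Likert-style items because they are ordered but not interval-scaled, a chi-squared test screens categorical-versus-ordinal relationships, and PCA gives a quick look at dimensional reduction.

```python
# A first-pass analysis sketch. Everything below assumes a hypothetical CSV
# export of questionnaire responses; the file name and column names are
# placeholders, not the actual questionnaire fields.
import pandas as pd
from scipy import stats
from sklearn.decomposition import PCA

# One row per participant; Likert-style items coded 1 (low) to 5 (high).
responses = pd.read_csv("bootcamp_responses.csv")
likert_items = ["setup_problems", "pace",
                "ability_before", "ability_after"]

# Spearman rank correlation is a reasonable choice for ordinal data:
# does pacing track with self-rated ability after the bootcamp?
rho, p = stats.spearmanr(responses["pace"], responses["ability_after"])
print(f"pace vs. ability after: rho={rho:.2f}, p={p:.3f}")

# Categorical vs. ordinal (e.g., operating system vs. setup problems)
# can be screened with a chi-squared test on a contingency table.
table = pd.crosstab(responses["operating_system"],
                    responses["setup_problems"])
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"OS vs. setup problems: chi2={chi2:.2f}, p={p:.3f}")

# Dimensional reduction: project the Likert items onto two components
# to see whether participants cluster by overall experience.
components = PCA(n_components=2).fit_transform(responses[likert_items])
print(components[:5])
```

None of this is specific to these particular questions; the same pattern (rank correlation for ordinal pairs, contingency tables for categorical factors, a low-dimensional projection for an overview) should carry over once the real questionnaire data are in hand.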