When Do Workshops Work? A Response to the 'Null Effects' paper from Feldon et al.
This post originally appeared on the Software Carpentry website.
Author: Karen R. Word
Contributors: Kari Jordan, Erin Becker, Jason Williams, Pamela Reynolds, Amy Hodge, Maxim Belkin, Ben Marwick, and Tracy Teal.
“Null effects of boot camps and short-format training for PhD students in life sciences” is the provocative title of a recent article in the Proceedings of the National Academy of Sciences. Those of us who enthusiastically design and deliver short-format training promptly took note, then scratched our heads a bit. We waited a little for a response, wondering if one or more of the programs that participated in the study might step up to their own defense. Nothing happened. We thought about letting it go - we’ve got our own programs, with distinct goals, and our own assessment data, so maybe this broad-brush study isn’t so important. But … it keeps being raised. Someone will bring it up here and there, asking what we think about it. Whenever this paper comes up in conversation, its title certainly throws some weight around.
So, do workshops work? However certain we may be about the value of our own programs, it seems important to have a little sit-down with this paper and talk about what it means to us, what it doesn’t mean and, most importantly, what it does not address at all: the question of what you can do with a short course [1] when a short course is all you’ve got.
The premise: Spacing instruction over time is better for learning
When given a choice between teaching a two-day short course versus stretching those same hours and content across several weeks of repeated meetings, you can expect to get a lot more learning out of the longer course. This point, described as a core premise for the PNAS study, is essentially irreproachable. There is abundant evidence that distributing instruction over time maximizes learning in comparison with the “massed practice” that occurs when teaching is concentrated into an intensive short-format course.
The problem: Spacing instruction over time is often impractical
Traditional courses bring students and faculty together on a spaced schedule over a quarter or semester. When this format is possible, it should be pursued and optimized, not replaced with short courses.
But when isn’t it possible?
When there aren’t enough instructors. If expertise in an area is scarce, the time demand for distributed training often exceeds the FTEs available to meet that need. Until that shortage can be remedied, a large number of people are left to self-teach or go without. Under these circumstances, short-format workshops are often the only practical way to deliver training to the many more who need it. This is currently the situation with regard to training in data management and analysis, and in many cases, with foundational computing skills as well.
When learners don’t have time. A similar scenario emerges when those in need of training are fully committed to jobs or research or are otherwise unavailable for a time-distributed course. This is the case for most professional-development training. Even within academia, researchers may need training right away and can’t wait for the next semester-long course offering.
When opportunity knocks. Even within graduate school, where long-format courses are the norm, some opportunities are concentrated in time. For example, a short course may be able to attract many faculty simultaneously, allowing students to observe them engaging with and learning from each other. Some research experiences or team-building activities may also be possible only on a concentrated schedule. And where traditional course curricula can be slow to change, short courses permit the rapid inclusion of new and needed skills before they can be added elsewhere.
For those of us who work within the short course mandate, then, the question becomes: how can we optimize that format to best meet learners’ needs? When setting goals for impact, we tend to think in terms of how much and what type of impact we can have, and to focus our efforts accordingly.
One reason the paper by Feldon et al. raises concern within our community is that it frames the question as “whether”. And if the answer to “whether” we can have an impact with a short course is “no”, then we’ve clearly got a problem on our hands. However, in our experience, that simply is not the case. To the contrary, our evidence suggests that there is quite a lot you can accomplish with a workshop when you accept its constraints, focus on specific goals, and leverage the strengths of this format. In the next section, we’ll take a look at the study described in the paper, evaluate its claims, and examine its relevance to the kind of training we provide. Then we’ll circle back around to our goals, our strategies, and the kind of data that we collect to assess and inform the development of our workshops.
The study
There is a lot to love in this work! This was not a simple survey study. They graded papers – multiple times, with validation, for 294 students from 53 institutions. They also repeatedly administered tests and surveys over the course of two years. The dataset must be impressive; we assume there is a LOT of other interesting stuff there that relates to graduate student development and correlates of early success. However, it is hard to know since the data are not publicly available or displayed in the paper. We’re eager to see more publications and perhaps more extensively summarized data come out of this project in the future.
That being said, several persistent questions and concerns emerged in discussions with our community members. These are a few of the most pertinent:
1. How diverse are the program goals? This study lumps together an unknown number of programs administered at the outset of life-science PhD programs as a single treatment. We know only that 53 institutions were sampled and that, of the 294 students in the study, 48 were short-course “participants”. According to Feldon et al., the unifying goal of these programs is to “accelerate the development of doctoral students’ research skills and acculturation”, with emphasis on research design, statistics, writing, and socialization. However, specific emphasis seems likely to vary, and herein lies the concern most frequently voiced in our community: any given program might focus its efforts on any or all of the components identified (research, statistics, writing, or socialization). Indeed, the more astutely a program identifies and engages with short-format limitations, the more focused its program may be. With students drawn from 53 different institutions, it seems highly likely that the specific aims of the programs pull in different directions. If some programs are particularly good at socializing students and preparing them to cope with the hurdles ahead, while others emphasize grant writing, otherwise ‘significant’ impacts within a sub-group of similar programs are likely to be lost when combined and assessed with the group overall. This is particularly clear if we consider the sample size of 48 students as being further split (e.g. 10, 10, 15, 13) by distinct program emphases; a brief sketch of what that split does to statistical power follows this list. Lumping together successful programs with different aims is likely to show that all are ineffective in each category.
2. How generalizable is this context? The public reading of these findings seems to be, “Too bad short courses don’t work”. However, pre-PhD short courses are a highly specific and unusual context for a short course. In most other cases, short courses arise out of necessity or unique opportunity, such that there is no subsequent distributed content that re-teaches or even remotely overlaps with the content taught in the short course. In pre-PhD programs, specifically, any effects are potentially in direct competition with gains made via traditional course content. The extent to which the same or overlapping content is otherwise available in each program is also unclear. The authors of this paper might not have intended their work to generalize to other contexts, but the tendency of readers to generalize makes this question a vital one. Benefits of a short course are easily lost in a sea of positive outcomes resulting from graduate training, but that has little bearing on the impact such courses may have when they stand alone.
3. Is this the right experiment to test graduate student outcomes? While we found the methods to be impressive and worthwhile in many respects, several people expressed concern about the two-year assessment regime. This included questions as to whether a graduate student is likely to have matured and, particularly, to have written substantively in their content area within the first two years of study, as well as whether a regime of continuous surveys might itself have a sizeable impact on student development. As with any study that takes volunteers, willingness to participate – both in the short course programs and in the study itself – may bias toward more motivated or engaged students overall, and this could have an impact on the interpretation of the results. These are the sorts of problems that plague any effort at assessing students at scale, and are worth noting only as a standard “grain of salt” with which any study should be (but is not always) considered when it stands alone.
4. How do we go about making short courses more successful? This paper provides no means of evaluating variation between programs, which is really where our interests lie. This is not a criticism: it is simply not the purpose of the paper. But it is the natural next question, and the natural response to such results: if these programs really aren’t making a difference, how might we capture the opportunity, with existing funded and institutionally invested programs, to change that? Is it that short-format workshops have no impact on anything, or that we need to better understand and plan for what they can accomplish?
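To make the sample-size concern in question 1 concrete, here is a minimal, purely illustrative power calculation. The subgroup sizes come from the hypothetical split above, not from Feldon et al., and it assumes a simple two-sample comparison of each group of participants against the roughly 246 non-participants at 80% power.

```python
# Minimum detectable effect size (Cohen's d) at 80% power and alpha = 0.05
# for a two-sample t-test against the ~246 non-participants in the study.
# The subgroup sizes (15, 13, 10) are the hypothetical split discussed above,
# used only for illustration; they are not reported by Feldon et al.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
non_participants = 294 - 48  # 246 comparison students

for n in [48, 15, 13, 10]:
    d = analysis.solve_power(
        effect_size=None,
        nobs1=n,
        ratio=non_participants / n,
        alpha=0.05,
        power=0.8,
    )
    print(f"{n:2d} participants -> minimum detectable d ≈ {d:.2f}")
```

Under these assumptions, an effect just large enough to be detectable across all 48 participants would need to be roughly twice as large to surface within a subgroup of 10, which is exactly the dilution described above.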
On that last question, we have a few suggestions.
What We Do
Software and Data Carpentry offer short-course training for academics and professional researchers in software and data management skills. Many of our affiliates, who have also contributed to this response, offer other short courses in related subjects. We are all driven to the short-course format out of necessity. We recognize that this format places severe constraints on the quantity of information that can successfully be conveyed, but we design our curriculum and train our instructors specifically to maximize our effectiveness in this format. Here’s how we do it:
Streamline content. We aim to teach only the most immediately useful skills that can be taught and learned quickly. We teach our instructors to resist the urge to “get through everything” or pack extra details into their explanations.
Teach strategically. We keep learners active by using live coding (in which learners work through lessons along with the instructor) and frequent formative assessment. We teach instructors to be mindful of the limitations of short-term memory and to focus instruction and assessments to minimize cognitive load.
Meet learners where they are. Our workshops attract a diverse population of learners, from novices to experienced IT personnel. Our learners use colored sticky notes to indicate when they are stuck. We teach instructors how to use this to adjust their pacing. We also recruit workshop “helpers” who can directly coach learners who may be struggling. The absence of performance-based grades gives us added flexibility to meet diverse needs by generating diverse learning outcomes. Some may learn about the “big picture” of a new programming language by completing a lesson, while others may come away having added “tips and tricks” to their existing skills. This is one area in which workshops may have an advantage over traditional courses, particularly when it comes to confidence- and motivation-based outcomes.
Normalize error and demonstrate recovery. We know and expect that our learners will acquire the bulk of their skill independently. Willingness to make mistakes and awareness of problem-solving strategies are far more crucial to their success than any particular content. We coach our instructors to embrace and even delight in their own errors as an opportunity to model healthy and effective responses.
Explicitly address motivation and self-efficacy. One substantial advantage that we have is that our learners attend our workshops because they are motivated to learn precisely what we teach. However, preserving and nurturing that motivation is crucial. Perseverance results not only from embracing error as normal, but also from learners’ personal belief in their ability to succeed. Creating a workshop in which learners can succeed both in learning and in demonstrating to themselves that they have learned is one piece of this. We spend a good deal of time discussing motivation with our instructors. We explain why saying “it’s easy, anyone can do it” is often demotivating. We explore the differences between novice and expert perspectives and coach instructors to be mindful of and to respect the novice experience. We teach instructors to foster a growth mindset in their language and learner interactions. We emphasize that a relaxed, welcoming, and positive workshop experience is one of the most important things we can provide.
Build community. The more people at all levels are able to share what they know, the more efficiently we can distribute knowledge. As a volunteer organization, we have a strong community of instructors, lesson maintainers, and others. As learners progress, they often become involved in this community. In the long run, we hope to create a community that can provide widespread support directly to learners.
What we know about our impact
We have conducted both short-term and long-term follow-up assessments of learners. Data Carpentry post-workshop survey results have always been positive, and 85% of learners agree that they would recommend our workshops to a colleague. The Carpentries’ Long-Term Impact survey (n = 530) is designed to determine whether this positive experience and self-reported increase in confidence affect long-term outcomes. This survey (full report here) measured self-reported behaviors around good data management practices, changes in confidence in open source tools, and other specific program goals. It also explored other ways the workshop may have impacted learners, such as improved research productivity. While Feldon et al. rightly critique self-assessment with regard to performance metrics, many of our target outcomes are more conducive to self-evaluation, e.g. confidence, motivation, and daily work habits. Researchers report increased daily programming usage after attending our two-day coding workshops, and 65% of respondents report higher confidence in working with data and open source tools as a result of completing the workshop. Our long-term assessment data show a decline in the percentage of respondents who ‘have not been using these tools’ (-11.1%) and an increase in the percentage who now use the tools on a daily basis (+14.5%). Additional highlights from our long-term survey report include:
- 77% of respondents reported being more confident in the tools that were covered during their workshop compared to before the workshop.
- 54% of respondents have made their analyses more reproducible as a result of completing a workshop.
- 65% of respondents have gained confidence in working with data as a result of completing a workshop.
- 74% of respondents have recommended our workshops to a friend or colleague.
We see that short-format workshops can be effective at increasing researchers’ confidence, use of coding skills, and adoption of reproducible research perspectives. As a part of the Open Source community, we make all of our survey data and analysis code available in our assessment repository. We welcome people to work with our survey data and ask new questions. Understanding impact is important, and we will continue to keep our community informed with regular releases of survey data and reports. We also have a virtual assessment network which newcomers are welcome to be part of. Please join here if you are interested in discussing assessment efforts in the area of training in research computing.
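For readers who want to explore further, here is a minimal sketch of the kind of question the survey data can answer. It is hypothetical: the file name and column names below are placeholders rather than the actual schema of our survey exports, so check the assessment repository for the real layout.

```python
# Hypothetical example: how did reported tool usage shift after a workshop?
# "long_term_survey.csv" and the column names are illustrative placeholders;
# the real files and fields live in the Carpentries assessment repository.
import pandas as pd

responses = pd.read_csv("long_term_survey.csv")

# Percentage of respondents selecting each usage frequency before and after.
before = responses["tool_use_before_workshop"].value_counts(normalize=True) * 100
after = responses["tool_use_after_workshop"].value_counts(normalize=True) * 100

# Change in each category, e.g. a drop in "never" and a rise in "daily".
change = (after - before).round(1)
print(change.sort_values())
```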
In Closing …
Our data suggest that we are having a positive impact, and we expect that other short-format programs can be similarly effective. However, this likely requires a focused effort to optimize within the limitations of a short course, along with clear goals and targeted assessment to demonstrate such efficacy. It is not clear that this was the case for any of the programs surveyed by Feldon et al., and if it was, it is not clear to us that any such specific and variable successes would be discernible in their study. We agree, however, that under most circumstances, particularly where a large quantity of content needs to be taught, a short-format course should not be favored over any available time-distributed alternative.
We applaud, encourage, and endeavor to support those who have the access and opportunity to conduct long-format training in the subjects we teach. Many members of our community are actively involved in traditional undergraduate and graduate instruction of this kind. Traditional training opportunities will begin to catch up with the demand for training in data science generally, but there will always be limitations - concepts or tools that don’t fit cleanly into existing curricula, or new approaches that haven’t yet had a chance to be incorporated. We address these gaps through short courses, and achieving that mission requires us to be as effective as possible.
So far, we feel comfortable declaring that effort a success.
[1] While the paper refers to programs variously as “boot camps”, “bridge programs”, or “short-format training”, it has been brought to our attention that the use of “boot camp” can cause some consternation for those with experience of military training or military regimes. We will therefore use the less vivid but more accurate “short course” label for this piece.