I See, UC, We All See Carpentries: Collaboratively Scaling Workshop Instruction Across a University System

Instructors' experiences running Carpentries workshops in the University of California System

Introduction

This Carpentries workshop ran online over the course of 3 weeks.

  • Week 1: Unix shell, Git/GitHub, SQL (one lesson per day, 9am-12pm)
  • Week 2: R (3 days, 9am-12pm)
  • Week 3: Python (3 days, 9am-12pm)

Following the success of the virtual Southern California Library Carpentry workshop, Carpentries instructors from UC Los Angeles, UC San Diego, and UC Berkeley decided to run a joint Carpentries workshop open to students and researchers affiliated with these three University of California (UC) campuses. The plan was to run online sessions over the course of three weeks, and the primary goal was to further connect a Carpentries network across the UC campuses. Like the Southern California Library Carpentry workshop, this joint UC workshop came together because of the collaborative partnerships already established between the Carpentries instructors at these three campuses. Tim Dennis at UC Los Angeles proposed the idea to the other campus hosts, Reid Otsuji at UC San Diego and Scott Peterson at UC Berkeley, and the three began planning. Each host reached out to Carpentries instructors on their campus to gauge interest in teaching, and the response from the instructor community at all three campuses was strong.

Initial Planning

Once the instructors were recruited, the group decided on the curriculum and logistics for the workshop during an initial planning meeting. The instructors decided to teach Software Carpentry lessons for most of the workshop; the exception was the SQL lesson, which came from Data Carpentry. The instructors also decided to use JupyterLab for the Plotting and Programming in Python lesson. To keep track of logistics, three planning meetings were scheduled before the first week of lessons. Each meeting was devoted to planning one week of instruction, including establishing instructor and helper roles. The instructors and helpers also planned to meet half an hour before each lesson to confirm everyone’s role for the day. Most instructors served as helpers in other lessons when they were available, but instructors were not required to attend every day of the workshop series.

Workshop Scheduling

The workshop ran from 9:00am-12:00pm Pacific Standard Time (PST) three days a week (Tuesday, Wednesday, and Thursday) over three weeks. The first week grouped the Unix Shell, Git/GitHub, and SQL lessons together. The following two weeks were devoted to R and Python respectively, with two days of lessons followed by an open lab day on the third day. During these open lab days, learners could ask questions about projects they were working on, review portions of the lessons, or learn more about R and Python. Each lab day’s Etherpad had a section where learners could write down what they wanted to cover.

Since the Unix Shell, Git, and SQL lessons are half-day lessons and the R and Python lessons are full-day lessons, it made sense to divide the lessons by week. The first week, similar to how two-day Carpentries workshops are traditionally taught, introduced learners to foundational concepts that would be used with the programming languages in the following weeks. Learners were not required to attend every lesson, because the workshop was designed as a cafeteria-style offering of Carpentries lessons that learners could choose from.

We hoped learners would attend all the lessons in an individual week, especially the first two days of the R and Python weeks. This seems to have been the case for many learners. In the first week, more learners attended the Unix Shell lesson (66) than the Git/GitHub (53) and SQL (46) lessons. Demand for the R and Python lessons was high enough that we offered two concurrent sessions of each on the first two days of the following weeks. The two R sessions had 44 and 37 people on the first day; about two-thirds of the 44 returned for day two, and twelve of the 37 did not return for day two. The two Python sessions had 29 and 20 people, with few dropping off from day one to day two. Fifteen people came to the R open lab day and seventeen came to the Python open lab day.

Each Friday before a week of lessons, a two-hour “installfest” was held where workshop attendees could get help installing and running the programs for the upcoming lessons. Anyone who could not attend and needed help was connected with someone from their institution who could help them install and run what they needed. This led to minimal installation problems each day of the workshop.

Tim Dennis sent out reminder emails to learners each week and handled much of the workshop logistics using Eventbrite. The Eventbrite waitlist did not clearly show which lesson people were waitlisted for, which caused some confusion. As a result, we let everyone on the waitlist into the workshop. Tim also created a master Etherpad for the workshop that linked to the Etherpads for the individual lessons. This made all the Etherpads easy to access and manage over the course of three weeks.

Preserving Instructor/Student Ratio

One advantage of hosting a workshop series across three institutions was that we could increase the number of instructors and offer more workshop sessions based on demand. We took advantage of this capacity and kept the instructor-to-learner ratio reasonable (under 50 learners per session) by adding another session of a lesson when it was overfilled. For example, the R and Python lessons filled up and generated large waitlists beyond the 50-seat capacity, so we created a second, concurrent session of each. This was only possible because we had extra instructors available to teach, which in turn was only possible because of the UC Carpentries instructor network.

Planning Workshop Workflow

In preparation for each workshop day, the instructors and helpers implemented a few ideas to manage and run the individual sessions. The first decision was how to split up the lessons so that the necessary sections were covered while staying mindful of the strain of teaching and learning online in three-hour sessions.

Most workshop days were split between two instructors, each taking part of the lesson. This kept the teaching time commitment manageable and made room for all the instructors who wanted to participate.

Instructors and helpers decided that one person should be designated as host for the virtual classroom and as the primary person managing the breakout rooms used for help and lesson exercises. This task required estimating in advance roughly how many breakout rooms would be needed and when they would be used during the lesson. Designating a host was effective because it allowed the instructors to focus on teaching with minimal interruptions. The host also worked well as a direct contact for helpers and learners.

In addition to managing breakout rooms, we found it effective and necessary to identify the available helpers by their display names. All helpers and instructors were identified by their name, the star emoji (:star:), and the operating system they could support. Identifying helpers this way allowed the host to quickly match a learner with a helper who knew the learner’s operating system. Instructors and the host communicated via Slack, while helpers answered general questions and assisted learners through the Zoom chat.

Screen fatigue is a real issue, especially by the last day. The plan for each day was to schedule a short break every hour so participants could step away from the screen. A few times, however, only one break was given, halfway through the three-hour session during the instructor changeover, in order to get further through a lesson. Feedback made it clear that one break was not sufficient. Future workshops should stick to a five to ten minute break every hour, even though more breaks cut into the time available for teaching.

As for the length of the sessions, three hours was not nearly enough time to get to everything we would usually cover in a workshop. After accounting for the time spent each morning introducing everyone to Zoom and explaining how the workshop would run, plus breaks and time for daily feedback, the actual teaching time usually came to around two hours, which did not allow the instructors to get deep into the lessons. For instance, there was not enough time to get to loops when teaching the Shell lesson. Many learners wished the days could have been longer, and we discussed teaching future workshops from 10am to 3pm each day with lunch from 12-1pm. This would allow for at least three hours of teaching time, and the lunch break would split up the time spent on Zoom, making an online workshop more manageable for everyone.
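
The time budget described here can be checked with a little arithmetic. The sketch below is in Python and assumes that the roughly one hour of daily overhead (Zoom orientation, breaks, and feedback) implied by the figures above stays the same under the proposed schedule. The function name and the constant are illustrative only and are not part of any workshop tooling.

```python
# Rough time budget for one workshop day, in minutes.
# Assumption: daily overhead is about 60 minutes, inferred from a
# three-hour session that yielded about two hours of teaching.
OVERHEAD_MIN = 60


def teaching_minutes(start_hour, end_hour, lunch_min=0, overhead_min=OVERHEAD_MIN):
    """Return minutes left for teaching after lunch and daily overhead."""
    session_min = (end_hour - start_hour) * 60
    return session_min - lunch_min - overhead_min


# Schedule used in this workshop: 9am to 12pm, no lunch break.
current = teaching_minutes(9, 12)
# Schedule discussed for future workshops: 10am to 3pm with a one-hour lunch.
proposed = teaching_minutes(10, 15, lunch_min=60)

print(f"9am-12pm: {current} minutes of teaching")
print(f"10am-3pm with lunch: {proposed} minutes of teaching")
```

Under this assumption the longer day adds about an hour of teaching. The real figure would depend on how many hourly breaks are scheduled.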

Week 1: Shell, Git, SQL

The Shell and SQL lessons were each taught with one instructor teaching up to the break and another teaching from the break to the end of the session. When a co-taught lesson is split by time rather than at a fixed point in the material, the handoff can get tricky because it is hard to predict exactly how far the first instructor will get before the break. With this in mind, it was important for the instructor teaching after the break to be comfortable starting at different points in the lesson so the transition was smooth. Overall, co-teaching the Shell and SQL lessons worked pretty well. It let the learners see two different faces over the three hours, and it cut down on the cognitive load for each instructor. It is also a great way for a new instructor to get familiar with teaching without having to worry about some of the more advanced parts of a lesson.

The Git lesson was taught by Mark Matney, who used multiple cameras and a whiteboard to illustrate some of the complex ideas behind Git. This worked really well, and learners enjoyed how the lesson was taught. Tim Dennis spent time towards the end of the lesson introducing the group to GitHub, so although this lesson was also co-taught, it was not split the way the Shell and SQL lessons were.

The instructors for the SQL lesson, Jamie Jamison and Stephanie Labou, had a secondary data download location and an online SQL IDE ready as a backup, and though there were a few setup issues, they did not need to move anyone to the IDE. Most of the setup issues had to do with macOS, and both Jamie and Stephanie were Windows/Linux users, so they were not familiar with many of the issues that came up. The helpers were great at getting everything figured out, as they were throughout the entire workshop, but these issues highlighted how useful it is to have helpers and instructors familiar with both Windows and Mac machines.

Week 2: R Concurrent Sessions

The R sessions were based on the lessons from R for Reproducible Scientific Analysis. Because two R sessions were taught concurrently, instructors agreed beforehand on the content that would be covered. We decided, based on previous instructor experience teaching Carpentries content online, to cover a smaller subset of lessons than would be covered in a “full” R Carpentries session. The first day (3 hours) aimed to cover lessons 1 to 6, concluding with a very brief introduction to ggplot2 (lesson 8), and the second day (3 hours) devoted significant time to dplyr (lesson 13) and ggplot2 (lesson 8).

Week 3: Python Concurrent Sessions

The instructors had a preliminary meeting to discuss which episodes to cover and who would be responsible for each. We decided to use the Plotting and Programming in Python lesson; this lesson is still under development, and we felt that gathering feedback on it would be valuable. Instead of teaching the whole lesson in one day, we divided the episodes across two mornings of three hours each to reduce screen fatigue.

Since the pace of virtual sessions is often slower than that of in-person ones, we chose to cover fewer episodes: instead of the 20 suggested, we taught the first nine. With fewer episodes, we had more time to answer questions and solve issues the learners had. Even with less material, we still taught important fundamental Python concepts, such as using variables, importing libraries, and plotting data.

Something New: Open Office Hours

The third morning for both R and Python consisted of “Open Office Hours” during the same time slot as the other workshop days (9am-12pm). This was an unstructured time for learners to join, if desired, and ask any additional questions of the instructors of those lessons. In practice, this ended up being a mix of recapping concepts covered in the previous two days, upon request by learners, and demonstrations of additional concepts from the lessons instructors did not have time to cover. This was also an opportunity to mention and/or demonstrate functionality beyond what is included in the lesson material (e.g., advanced plotting, adding model results to plots, etc.).

For anyone considering adding a similar session to their own workshop, we suggest having a few concepts in mind to demonstrate (e.g., creating bar charts in ggplot2 during the R Open Office Hours). Additionally, since this was an open time slot where people came and went, we fielded the same questions a few times, which pointed to concepts we should emphasize better next time (e.g., pipes using %>% in dplyr and the difference between == and %in% for subsetting, during the R Open Office Hours).
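
One common source of the == versus %in% confusion is that R recycles the shorter vector when == compares vectors of different lengths, whereas %in% tests membership. The sketch below imitates both behaviors in plain Python (not R) purely for illustration. The function names `r_equals` and `r_in` and the sample country values are hypothetical.

```python
from itertools import cycle


def r_equals(column, values):
    """Imitate R's `column == values`: compare position by position,
    recycling `values` when it is shorter than `column`."""
    return [item == value for item, value in zip(column, cycle(values))]


def r_in(column, values):
    """Imitate R's `column %in% values`: test each item for membership."""
    return [item in values for item in column]


# Hypothetical column of country names, one row per observation.
country = ["Kenya", "Kenya", "Peru", "Peru"]
wanted = ["Kenya", "Peru"]

# Recycling compares against Kenya, Peru, Kenya, Peru in turn,
# so only rows that happen to line up are kept.
print(r_equals(country, wanted))  # [True, False, False, True]
# Membership keeps every row whose country is in the wanted set.
print(r_in(country, wanted))      # [True, True, True, True]
```

Subsetting with the first result silently drops half the matching rows, which is why learners who expect membership semantics are surprised by ==.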

Workshop Stickies

As with the first week of the workshop, the R and Python weeks suffered from not having enough time to teach the material. Even in an in-person format, three hours is barely enough to cover the basic fundamentals, let alone more advanced concepts, and a virtual format tends to move more slowly still. Having three days of three-hour sessions for R and Python, followed by a fourth day of Open Office Hours, may be a better option. While a time commitment of four mornings can be challenging for some researchers, it would allow more time to cover material and take additional screen breaks.

Green Stickies

  • Co-teaching and dividing lessons between instructors was an effective way to provide teaching opportunities for several instructors from the various UC campuses
  • Assigning a room DJ worked well for managing the breakout rooms for exercises and troubleshooting
  • Open Office Hours were a good opportunity for learners to get clarification on concepts, or demonstrations of advanced concepts, in a smaller group setting
  • Teaching fewer lesson episodes gave us time to have focused discussions and solve issues the participants had during the workshops
  • Having several helpers for each session (around four) helped spread the effort of assisting learners and monitoring the room
  • The concurrent workshop sessions eased the load on both instructors and helpers
  • Using the Carpentries Slack as a back channel let all instructors and helpers communicate during the workshop sessions
  • Using Jupyter Notebooks as a companion, with headers and text corresponding to the lesson, helped learners see what the next steps were. The instructors still typed all the code live, filling in empty code cells during the episodes
  • Learners liked the use of slides and a whiteboard to illustrate Git concepts
  • A Git instructor used two cameras with one focused on a whiteboard for the Git lesson

Red Stickies

  • Some instructors and helpers were not familiar enough with operating systems other than their own to effectively answer OS-specific questions
  • Learners had errors specific to their operating systems, mainly on macOS. One learner was unable to run Anaconda from the command line after installing it. To solve this issue, we followed the instructions on this blog post
  • It was hard to get through the bulk of the lessons in the time allotted
  • We had issues coordinating exercise breaks during the episodes. The instructors had to copy and paste the exercises into the Etherpad before the breaks, and the learners had to search for the exercises before starting. We used breaks of around 7 minutes per episode. A possible solution would be to set one exercise break per hour, dividing each hour into 45 minutes of teaching and 15 minutes of exercises. Setting up a dedicated exercise section on the Etherpad and pre-populating it before the workshop could also make exercises easier to run
  • There was not enough time for learners to take the workshop surveys
  • There were macOS installation issues for DB Browser for SQLite
  • Some learners had issues downloading, unzipping, and accessing the data file during the first episode. This could be solved by accessing the data directly on the web, or by having learners run a pre-workshop script that confirms the data we are using is downloaded and accessible on their computers
  • Episode 9, Plotting, combines pandas data cleaning and matplotlib plotting. Currently, these concepts are presented quickly, which confused the learners. A solution could be to explain the examples in that episode more thoroughly
  • Since we covered less content from the Python lesson, we were unsure how to reorder or include fundamental concepts that were not in the episodes we chose; for instance, we did not present functions and loops in this session
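
A pre-workshop data check of the kind suggested in the list above could be sketched as follows. This is a minimal Python example, not a script the instructors used. The function name `check_data_files`, the folder name, and the file names are all placeholders to be replaced with the lesson's actual data files.

```python
from pathlib import Path


def check_data_files(data_dir, expected_files):
    """Return the expected files that are missing or empty in data_dir."""
    data_dir = Path(data_dir)
    problems = []
    for name in expected_files:
        path = data_dir / name
        # A file counts as usable only if it exists and is not empty.
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(name)
    return problems


if __name__ == "__main__":
    # Placeholder folder and file names; substitute the lesson's real ones.
    missing = check_data_files("data", ["example_data_1.csv", "example_data_2.csv"])
    if missing:
        print("Missing or empty files:", ", ".join(missing))
        print("Please download and unzip the data again, or ask a helper.")
    else:
        print("All data files found. You are ready for the workshop.")
```

Learners could run a script like this during the installfest, so that download and unzip problems surface before the first episode rather than during it.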

Building on This Success Story

This was the first Carpentries workshop that leveraged Carpentries communities across multiple UC campuses. We had enough volunteer instructors, helpers, and interest from students and faculty that we were able to run concurrent sessions for high-demand lessons. This allowed us to teach approximately 100 learners at one time while maintaining a learner-to-instructor/helper ratio appropriate for the Carpentries hands-on pedagogical model. This was especially important for teaching in an online environment, where large class sizes can easily turn into more of a webinar and less of a learner-centered educational experience. We felt that this would have been extremely challenging to do on a single campus, but it worked quite well in a virtual cross-campus environment.

Additionally, we have found that teaching this way has amplified and strengthened the existing regional networks we are part of. Beyond letting us teach together, it has led us to increasingly share instructional materials, tips, and ideas on scoping and adapting lessons. This lessens the need for each virtual workshop to start from scratch in creating slides for lesson content or scaffolding challenges for breakout rooms. We have further discussed a UC-wide (or broader!) shared Google Folder of materials for each lesson that could be edited and adapted to meet local needs. We feel this promotes a core feature of the Carpentries pedagogy, lesson study, in which we improve our instruction by sharing feedback and observing how we teach. Situating this lesson study online among Carpentries instructors in the UC system is a powerful way to adapt and improve the workshops we give remotely.

Finally, we are working to expand this model to include participation from more UC campuses, especially UC campuses that do not have a strong Carpentries community or equivalent grassroots programming community. We also want to apply this model to other communities in the UC. For example, Stephanie Labou has taken this cross-institutional model to collaboratively program the UC Love Data Week with other libraries in the UC system.
