Teaching LLM Assistants in Carpentries Workshops, part 2
Following our previous post summarising a pair of community discussions on the topic of teaching the use of LLM “assistants” such as ChatGPT in Carpentries workshops, here we describe the Curriculum Team’s plans to coordinate relevant updates to lessons.
Including discussion of LLMs in workshops
One key takeaway from November’s community discussions was the observation that learners are using, or will use, LLM assistants when learning to code, whether we want them to or not. In this context, we believe that our goal as Instructors should be to help learners develop a working mental model of how these tools can be used, the kinds of task they are (and are not) suited to, and their strengths and limitations.
We consider this broadly comparable to the guidance we give learners about searching the Internet for help when they encounter an error or want to find out how to do something new: this is undoubtedly an effective way to get useful information – and is a common practice when coding/working on the command line – but it is also important that learners understand the risks of copy/pasting commands/code snippets they have found online. At minimum, there is a chance that the suggested code will not work, or will give incorrect results; at worst, they might delete valuable data or introduce security vulnerabilities to their software.
While ChatGPT will not respond with malicious intent (or any intent at all!), it is nevertheless important that learners leave the workshop understanding that they should always check the code and commands these tools provide, and be ready to adjust them as needed. Similarly, just as we might teach learners how to write more effective web searches in order to obtain more useful results*, Instructors might choose to spend time at a workshop guiding learners to write prompts that are most likely to return useful responses.
In the coming weeks, the Curriculum Team will propose new content to be added to the “Getting Help” section of one of our programming lessons to outline these points, supported by new Instructor Notes to help Instructors teach the content in workshops. (Note: the Library Carpentry lesson, Python Intro for Librarians, already contains a short section covering some of these points.) We will invite feedback from the community on GitHub (join the #lesson-dev channel on Slack to be notified when the pull request is ready) and, after the new content has been merged into one lesson, we will call for other contributors to help us add similar content to the other lessons. (This will be a great way to get started with contributing to our open source lessons, and to make a really important improvement to our workshops, so please get involved if you are interested!)
Exploring development of additional AI-related curriculum
The Carpentries Incubator includes more than 15 lesson projects on topics related to machine learning and artificial intelligence, and there is clear interest in the creation of a new set of lessons and workshops teaching these topics in more depth than the current Data Carpentry, Library Carpentry and Software Carpentry lessons allow. In the coming weeks, we will reach out to the developers of these projects and invite them to join a discussion of how we might encourage coordination between projects and combine efforts to produce a coherent curriculum, or set of curricula, from these various lessons.
Once scheduled, that discussion will be announced on the usual channels (discuss mailing list and #general channel on Slack) so that others interested in getting involved with such curriculum development can join.
Follow-up discussions
The initial community discussions held last year demonstrated an appetite for conversations on this topic within the community, but could not provide sufficient time to explore every aspect. Members of the Curriculum Team and Community Engagement Team will host monthly follow-up discussions on the theme of LLMs for Data Science in the first quarter of this year, with the potential for further sessions after that. These discussions will provide a platform for community members to share their ideas, experience, questions, and concerns about teaching generative AI methods and tools, and will further inform The Carpentries strategy in this area.
Three discussions are currently scheduled, each running twice to cover more time zones:
- Tuesday 28 January, 12:00 UTC and 21:00 UTC: LLMs for Data Science: The Ethics of Teaching LLMs in Carpentries Workshops
- Tuesday 25 February, 12:00 UTC and 21:00 UTC: LLMs for Data Science: Essential Knowledge and Common Misconceptions
- Tuesday 25 March, 12:00 UTC and 21:00 UTC: LLMs for Data Science: Case Studies to Inform Carpentries Curriculum
Please join us if you are interested! You can sign up to join these discussions on the community sessions Etherpad.
Thanks for reading
These have been very thought-provoking discussions, and we are excited to see where the community takes them next. If you would like to discuss any of the ideas and projects mentioned here, please get in touch on Slack or by email.
* Indeed, improving web search terms was one way in which participants at the community discussions mentioned having successfully used LLM assistants.