Opportunities and Challenges in Localisation: The Carpentries Perspective
In the realm of open science, the power of collective knowledge is magnified when it’s accessible to all, irrespective of geographical or linguistic boundaries. Localisation, in this context, is not just about translating scientific content, but about adapting research, methodologies, and data to resonate with diverse cultural and regional nuances. This series of blog posts, Opportunities and Challenges in Localisation, will dig into the world of localisation from the vantage point of various open science communities of practice. Representatives from these communities will share their unique efforts in championing localisation, the opportunities they’ve unearthed, the challenges they’ve grappled with, and pathways for you, the reader, to engage and collaborate. Join us in this exploration, as we underscore the importance of localisation in making open science truly inclusive, bridging knowledge gaps, and fostering a global community of collaborative inquiry. If your organisation also has localisation efforts, and would like to share your story on a future blog post in this series, please do so following these formatting and template guidelines and send your contribution to balmarzouq@turing.ac.uk.
This post in the series introduces The Carpentries' story with localisation.
— Batool Almarzouq, “Opportunities and Challenges in Localisation” series curator
Introduction
The Carpentries is an international, community-led organisation teaching foundational data science and coding skills to researchers through workshops that use openly available lessons. Between 2012 and the end of 2022, our community supported over 4,000 workshops in 65 countries for thousands of learners. Our core values drive us to engage in and support localisation efforts to ensure all members of our community can participate fully and in ways inclusive of their region and culture.
The Journey of Localisation in The Carpentries
A majority of our efforts toward localisation have focused on the translation of our lessons. Since 2017, a subset of lessons has been translated into five languages: Spanish, Korean, Portuguese, Italian, and Japanese. This translation work is driven by subcommunities from those language groups within the organisation. In 2023 all our lessons transitioned to a new lesson infrastructure, The Workbench, and the community has continued to test tools and develop procedures to support translation using this new infrastructure. Virtual events have also been hosted to provide a space for the community to share approaches to localisation with each other and to identify next steps (e.g., video of 2022 conference panel).
David Perez-Suarez presents at CarpentryCon 2022 during the session “Translation at The Carpentries: Where We’ve Been and Where We’re Going.” In this presentation, he references community contributions to a 24-hour collaboration sprint to translate the Data Carpentry R lesson for geospatial data into Spanish. Read more about the 2020 event in the blog post.
In addition, Glosario was launched in 2020 as a multilingual glossary of computing and data science terms. It has been supported by 157 community members from around the globe, who have contributed in 24 languages. This year, a process was also put in place that allows any community member to schedule and host a community session on any topic, in any format, at any time, and in any language of their choosing. There is currently limited data on the success of this approach, but there is hope that it will further support localised programming.
Opportunities Harnessed
All of the processes developed have been valuable in lowering barriers and making it easier for more community members to contribute. Over time, advances in AI translation services and the availability of tools like Transifex and Crowdin should facilitate more collaborative approaches to translation. There is strong interest in these activities among members of the community, and they present an opportunity for the organisation to engage new members interested in supporting the organisation in this way.
Challenges Faced
Ongoing conversations within our community and with other similarly-placed communities have demonstrated the level of effort and resources required to support the work needed to advance localisation successfully and equitably. Multiple challenges still remain, particularly for an infrastructure and accompanying workflows to be used across The Carpentries. These include, but are not limited to:
- The use of platforms like GitHub creates a barrier for those unfamiliar with the software.
- Several localisation efforts within the community are ongoing, and the tools and workflows used differ across subcommunities.
- Because of the various approaches being taken, there are no clear communications on how translations are being managed and what resources are or are not available to support the work.
- There are currently no pathways to support the creation of lessons in languages other than English.
- There are data science and programming terms that still need to be coined in some languages.
- There is currently no formal way to acknowledge volunteer contributions to localisation.
These challenges are being addressed, but progress has been slow because the focus has been on transitioning all lessons to The Workbench. Now that this transition has been completed, more time and attention are being placed on supporting localisation efforts through the testing of new tools.
How to Stay Connected with The Carpentries
If you are interested in being part of The Carpentries' efforts to support localisation, please join the #internationalisation channel on our Slack workspace. We also co-host fortnightly meetings with The Turing Way, from 14:00 to 15:00 UTC beginning 25 October, to provide updates and share lessons learned (add to your calendar). For additional information or to engage with us further in these efforts, please email community@carpentries.org.
Acknowledgements
This post was written following conversations and significant contributions to our localisation efforts from the following community members (alphabetised by first name): Andrea Sánchez-Tapia, Angelique Trusler, Annajiat Alim Rasel, Batool Almarzouq, David Perez-Suarez, Giacomo Peru, Joel Nitta, Kozo Nishida, Luca Di Stasio, Martino Sorbaro, Masami Yamaguchi, Melissa Black, Natalia Morandeira, Oscar Masinyana, Paola Corrales, Toby Hodges, Yanina Bellini Saibene, and Zhian Kamvar. We would also like to acknowledge all of the contributions made by other members of our community to these localisation efforts as there have been many. Thanks for all you do and continue to do to support this important work!