
Lessons, the Repository Split, and Translations

This post originally appeared on the Software Carpentry website.

Continuing the recent run of posts about the repo split, templates, and metadata, Software Carpentry now needs to consider how to handle translated lessons. The core Software Carpentry lessons have been translated by bilingual instructors into Korean by Victor KC Lee (and friends) and into Spanish by Francisco Navarro (and friends). With the upcoming repo split, I think it's a good time to examine the options for handling translations in general.

Option 1: Translations Live Within the Lesson Repo

The first method of handling translations is to introduce a translations directory to the existing lesson templates. Each translation would live in a subdirectory of it named with the ISO two-letter language code. The contents of these subdirectories would otherwise be identical to those of the host lesson. This is how Francisco has implemented his translation, on a fork of the bc repo. The ProGit Book initially started out with this implementation, later moving to Option 3.

One possible drawback to this implementation is that original and translated lessons can drift apart with no indication of where the changes happened. Since most instructors will be monolingual, they cannot compare the two versions by reading them. And since the translation would live inside the official Software Carpentry structure, it would carry a certain endorsement.

Option 2: Translations Live Within a Separate Branch

Branches named trans-ISOCODE (or similar) could be created from the existing lessons. This would make it easier to track the original-language version of the lesson and to rebase onto changes (which would then need to be translated) as the lesson is updated.

This implementation would resolve the diff issue in Option 1, allowing line-by-line comparison of changes even if the content isn't understood. These translations, again living within an official repo, would carry a certain endorsement.
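
To make the trans-ISOCODE convention concrete, here is a minimal sketch of helpers that build and recognize such branch names. The function names and the strict two-letter check are my own assumptions for illustration; nothing here is an agreed Software Carpentry convention.

```python
import re

# Hypothetical prefix, taken from the "trans-ISOCODE" form suggested above.
BRANCH_PREFIX = "trans-"
# Assumption: language codes are two lowercase ASCII letters (ISO 639-1 style).
_TWO_LETTER_CODE = re.compile(r"[a-z]{2}")


def translation_branch(iso_code):
    """Return the branch name for a language code, e.g. 'es' -> 'trans-es'."""
    code = iso_code.strip().lower()
    if not _TWO_LETTER_CODE.fullmatch(code):
        raise ValueError(f"expected a two-letter language code, got {iso_code!r}")
    return BRANCH_PREFIX + code


def language_of(branch):
    """Return the language code of a translation branch, or None otherwise."""
    if not branch.startswith(BRANCH_PREFIX):
        return None
    code = branch[len(BRANCH_PREFIX):]
    return code if _TWO_LETTER_CODE.fullmatch(code) else None
```

With a predictable naming scheme like this, tooling could list a repo's branches and work out which translations exist without any extra metadata.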

Option 3: Translations Live Within a Forked Repo

Repositories containing translations could be forked from the main lesson and be maintained separately from the original. Changes would have to be merged from upstream and then translated.

There are some examples of this implementation being used successfully, most notably the ProGit book (thanks to W. Trevor King for the pointer). The forked repos are kept in sync with the upstream master book, and translation commits are then layered on top. For a sense of what that looks like, see progit-fr.

Forked repos could exist inside or outside the Software Carpentry organization, allowing for both officially endorsed and unofficial translations of lessons. If such translation efforts are wildly successful, this method would result in a massive proliferation of repositories, multiplying the increase due to the repo split.

How Official is a Translation?

Integrating the translations into the lesson repository, as in either Option 1 or Option 2, lends a certain endorsement of the quality and completeness of the translation. This may impose a burden on the lesson maintainers either to translate the material themselves (where they have the ability) or to seek out translators to maintain it. Barring that, they will have to decide when the core lesson has diverged too much from a translation, and "deprecate" the translated material.

Handling Translations on the Site

Beyond storing translated lessons and keeping them in sync, we may want to render the lessons on the main Software Carpentry site. The existing lessons are rendered on the site under the path v5/novice/lesson. The existing path scheme can be modified slightly to add the ISOCODE for the translation, as v6/ISOCODE/novice/lesson. The lesson titles and descriptions will also need a custom page generated in the translation's language at v6/ISOCODE/ so that they're appropriately indexed by Google.
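
As a rough sketch of that path scheme, the helpers below build the lesson and index paths. The function names, the "v6" default, and the fallback to v6/novice/lesson when no language code is given are assumptions of mine, not a description of how the site is actually built.

```python
def lesson_path(level, lesson, iso_code=None, version="v6"):
    """Build a site path such as 'v6/es/novice/lesson'.

    Assumption: when no language code is given, the untranslated lesson
    lives directly under the version, e.g. 'v6/novice/lesson'.
    """
    parts = [version]
    if iso_code:
        parts.append(iso_code.lower())
    parts.extend([level, lesson])
    return "/".join(parts)


def translation_index_path(iso_code, version="v6"):
    """Path of the per-language page of lesson titles, e.g. 'v6/es/'."""
    return f"{version}/{iso_code.lower()}/"
```

For example, lesson_path("novice", "lesson", iso_code="es") gives v6/es/novice/lesson, and translation_index_path("es") gives v6/es/.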

One last thing to consider is detecting the Accept-Language header sent by the user's browser, and perhaps using it to recommend translated lessons to users who navigate to the core lessons. This could bring wider awareness and usage of the translations, which is key to attracting more multilingual contributors.
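
Here is a minimal sketch of how that recommendation might work, assuming the core lessons are in English and that the site knows which translations exist. Both function names are hypothetical, and the parsing is deliberately simplified: it reads the q weights and keeps only the primary language subtag (so es-ES is treated as es).

```python
def parse_accept_language(header):
    """Return primary language subtags, most preferred first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag or tag == "*":
            continue
        quality = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unreadable weight as "not wanted"
        if quality > 0:
            # Sort by descending quality, then by position in the header.
            weighted.append((-quality, position, tag.split("-")[0]))
    ordered = []
    for _, _, language in sorted(weighted):
        if language not in ordered:
            ordered.append(language)
    return ordered


def recommend_translation(header, available, default="en"):
    """Return a language code to suggest, or None to suggest nothing.

    Assumption: the core lessons are written in `default`, so a reader who
    prefers that language over any available translation gets no suggestion.
    """
    for language in parse_accept_language(header):
        if language == default:
            return None
        if language in available:
            return language
    return None
```

A browser sending "es-ES,es;q=0.9,en;q=0.8" would be offered the Spanish translation if one is available, while one sending "en-US,en;q=0.9,ko;q=0.5" would be left on the core lesson.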