Lessons, the Repository Split, and Translations
This post originally appeared on the Software Carpentry website.
Continuing the run of posts about the repo split, templates, and metadata, Software Carpentry now needs to consider how to handle translated lessons. The core Software Carpentry lessons have been translated by bilingual instructors into Korean by Victor KC Lee (and friends) and into Spanish by Francisco Navarro (and friends). With the upcoming repo split, I think it's a good time to examine the options for handling translations in general.
Option 1: Translations Live Within the Lesson Repo
The first method of handling translations is to introduce a
translations directory to the existing lesson templates. Under this
directory would live each translation of the original lesson, in a
subdirectory named with its ISO two-letter language code. The contents
of these directories would be otherwise identical to those of the
host lesson. This is how Francisco has implemented his translation,
on a fork of the bc repo. The ProGit Book initially started out with
this implementation, later moving to Option 3.
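As a rough illustration of this layout, the sketch below resolves and lists per-language directories under a lesson's translations directory. The function names are hypothetical, and it assumes lowercase two-letter codes such as es or ko; it is not taken from Francisco's fork or the lesson templates.

```python
import re
from pathlib import Path

# Assumption: language codes are lowercase ISO two-letter codes (e.g. "es", "ko").
_ISO_CODE = re.compile(r"[a-z]{2}")


def translation_dir(lesson_root, code):
    """Return the directory where the translation for `code` would live."""
    if not _ISO_CODE.fullmatch(code):
        raise ValueError(f"not a two-letter language code: {code!r}")
    return Path(lesson_root) / "translations" / code


def list_translations(lesson_root):
    """Return the sorted language codes found under translations/."""
    base = Path(lesson_root) / "translations"
    if not base.is_dir():
        return []
    return sorted(
        entry.name
        for entry in base.iterdir()
        if entry.is_dir() and _ISO_CODE.fullmatch(entry.name)
    )
```

A site build could call list_translations on each lesson to decide which languages to render, skipping any subdirectory that is not a two-letter code.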
One possible drawback to this implementation is that original and
translated lessons can drift apart with no indication of where the
changes happened. Since most instructors will be monolingual, they
cannot diff the content of the two versions. Because a translation
would live inside the official Software Carpentry structure, it would
also carry a certain endorsement.
Option 2: Translations Live Within a Separate Branch
Branches could be created from the existing lessons, named
trans-ISOCODE or something similar. This would improve the
ability to track the master language of the lesson and rebase
changes (which would then need to be translated) as the lesson is
updated.
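A minimal sketch of this naming convention follows, assuming the trans-ISOCODE form with a lowercase two-letter code. The helper names are hypothetical, and the branch list is passed in as plain strings rather than read from Git.

```python
import re

# Assumption: translation branches are named "trans-" plus a lowercase
# two-letter ISO code, following the form suggested in the text.
_CODE = re.compile(r"[a-z]{2}")
_BRANCH = re.compile(r"trans-([a-z]{2})")


def branch_for(code):
    """Return the translation branch name for a language code."""
    if not _CODE.fullmatch(code):
        raise ValueError(f"not a two-letter language code: {code!r}")
    return f"trans-{code}"


def translation_branches(branch_names):
    """Map language code to branch name for branches that follow the convention."""
    found = {}
    for name in branch_names:
        match = _BRANCH.fullmatch(name)
        if match:
            found[match.group(1)] = name
    return found
```

Branches that do not match the pattern exactly are ignored, so unrelated branches never show up as translations.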
This implementation would resolve the diff issue in Option 1,
allowing comparison of line-by-line changes, even if the content
isn't understood. These translations, again living within an
official repo, would carry a certain endorsement.
Option 3: Translations Live Within a Forked Repo
Repositories containing translations could be forked from the main lesson and be maintained separately from the original. Changes would have to be merged from upstream and then translated.
There are some examples of this implementation being used successfully, most notably the ProGit book (thanks to W. Trevor King for the pointer). The forked repos are kept in sync with the upstream master book, and translation commits are then layered on top. For a sense of what that looks like, see progit-fr.
Forked repos could exist inside or outside the Software Carpentry organization, allowing for both officially endorsed and unofficial translations of lessons. If such translation efforts are wildly successful, this method would result in a massive proliferation of repositories, multiplying the increase due to the repo split.
How Official is a Translation?
Integrating the translations into the lesson repository, as in either Option 1 or Option 2, lends a certain endorsement of the quality and completeness of the translation. This may impose a burden on the lesson maintainers either to translate the material themselves (where they have the ability) or to seek out translators to maintain existing material. Barring that, they will have to decide when the core lesson has diverged too much from a translation, and "deprecate" the material.
Handling Translations on the Site
Beyond storing translated lessons and keeping them in sync, we may
want to render the lessons on the main Software Carpentry site. The
existing lessons are rendered on the site under the path
v5/novice/lesson. The existing path scheme can be modified slightly
to add the ISOCODE for the translation, as v6/ISOCODE/novice/lesson.
The lesson titles and descriptions will also need a custom page
generated in the native language at v6/ISOCODE/ so that they're
appropriately indexed by Google.
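The proposed path scheme can be sketched as a pair of small helpers. The function names are hypothetical, the v6 default simply mirrors the paths above, and "shell" in the usage note is only an illustrative lesson name.

```python
import re

# Assumption: ISOCODE is a lowercase two-letter ISO language code.
_CODE = re.compile(r"[a-z]{2}")


def _check_code(code):
    if not _CODE.fullmatch(code):
        raise ValueError(f"not a two-letter language code: {code!r}")


def translated_lesson_path(level, lesson, code, version="v6"):
    """Build the site path for a translated lesson: version/ISOCODE/level/lesson."""
    _check_code(code)
    return f"{version}/{code}/{level}/{lesson}"


def translated_index_path(code, version="v6"):
    """Build the site path for a language's index page: version/ISOCODE/."""
    _check_code(code)
    return f"{version}/{code}/"
```

For instance, translated_lesson_path("novice", "shell", "es") gives v6/es/novice/shell, and translated_index_path("es") gives v6/es/.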
One last thing to consider is detecting the Accept-Language header
sent by the user's browser, and perhaps using it to recommend
translated lessons to users who navigate to the core lessons. This
could bring wider awareness and use of the translations, which is key
to attracting more multilingual contributors.
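One way such a recommendation might work is sketched below. It parses the header's quality values by hand and returns the first preferred language that has a translation. The function names, the set of available codes, and the choice to recommend nothing when the reader ranks the original language (assumed here to be "en") higher are all assumptions for illustration.

```python
def parse_accept_language(header):
    """Return the language tags from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0].lower()
        if not tag:
            continue
        quality = 1.0  # a tag without a q parameter has full weight
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not acceptable"
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def recommend_translation(header, available, original="en"):
    """Return the code of a translation to suggest, or None.

    Assumption: if the reader prefers the original language over every
    available translation, no recommendation is made.
    """
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]
        if primary == original:
            return None
        if primary in available:
            return primary
    return None
```

With translations available for es and ko, a header of ko-KR,ko;q=0.9,en;q=0.8 would suggest ko, while en-US,en;q=0.9 would suggest nothing.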