Version Control and Newline Conventions

This post originally appeared on the Software Carpentry website.

There are two widely used conventions for representing the end of a line and the start of the next. Unix-like systems have traditionally used one character, \n (ASCII 10, "line feed"), while Microsoft DOS and Windows used a sequence \r\n (ASCII 13, "carriage return", \n). Being invisible, the difference of conventions usually goes unnoticed, specially because most modern software can correctly display text written with either one. However, there can be interoperability issues when using editors that follow different conventions.

One of such issues can manifest when using a version control system, like subversion. Suppose that one team member accesses the repository to make a small change. Most text editors will try to detect the convention used in the document, but some (broken) editors will display the text correctly, but will save it using its default convention, regardless of the original. If one of these editors is used to make a change, a one line modification could potentially span the whole file: the one line in which the modification was made, plus every line in the document where the editor decided to change the end of line symbol.

If this change is committed, the diff for that revision would be nearly useless, because instead of highlighting only the change that was made, it would appear as if the whole document was removed and retyped. This revision would also be very difficult to merge (i.e, if there is a need to revert it in the future), and if another team member is also editing the same file, he will surely get a conflict, which may be hard to solve because it will be very difficult to spot the actual difference.

There are other invisible characters that can cause similar issues, like tabs and spaces. It should be agreed beforehand which convention is going to be used, and each team member should make sure that their own commits follow it. It is a good practice to review the local editions before doing a commit (svn diff from the command line, or your favourite graphical diff tool). If lines that are apparently equal are reported to be different, that may be a sign that the text editor is changing the invisible characters. Reviewing the changes before committing can also help you make sure that only the desired changes make it to the repository, that throwaway or unnecessary changes are discarded, and that the commit contains one, and only one, logical change.

Dialogue & Discussion

Comments must follow our Code of Conduct.

Edit this page on Github