As we are a cross-platform project, it is important that line endings are handled consistently. Please follow the general GitHub advice: http://help.github.com/dealing-with-lineendings/
As the page says, Mac and Linux users do not get to sit this out, and should also configure their git repo to commit Unix-style line endings.
If you are using msysgit, the installer may ask how line endings should be treated.
Choose option two ('Checkout as-is, commit Unix-style line endings').
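Option two corresponds to the core.autocrlf value input. Mac and Linux users, who see no installer prompt, can apply the same setting from the command line. A minimal sketch, assuming the setting should apply to every repository of the current user (global level):

```shell
# Convert CRLF to LF on commit; leave files as-is on checkout.
git config --global core.autocrlf input

# Print the value now stored at global (per-user) level.
git config --global --get core.autocrlf
```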
A recent checkout of both the SCAPE Tika repository and the Apache Tika repository failed to pass all unit tests. In particular, the assertion on the MIME type of the testARofText.ar file failed (see testArchiveDetection() in TestMimeTypes.java, line 283): the type was returned as "text/plain" rather than "application/x-archive". A hex-editor comparison of this file against the same file checked out on a freshly installed laptop showed a difference in line endings, suggesting a git line-ending configuration error.
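The hex-editor comparison can be approximated on the command line. A minimal sketch, assuming a POSIX-style shell with tr and wc available; count_cr is a hypothetical helper name, not part of git or Tika:

```shell
# count_cr FILE: print how many carriage-return (CR, 0x0D) bytes FILE contains.
# A file with pure LF line endings reports 0.
count_cr() {
  # Delete every byte that is not a CR, then count what is left.
  tr -cd '\r' < "$1" | wc -c
}
```

For example, running count_cr testARofText.ar on both machines and getting 0 on one and a non-zero count on the other would point to a line-ending conversion on checkout.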
A detailed look at the git configuration showed that one particular setting, core.autocrlf, differed between set-ups; the problematic PC had two different values set, one at system level and one at global level.
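To see which configuration levels define core.autocrlf, each level can be queried separately. A minimal sketch; show_autocrlf is a hypothetical helper name, and the "(not set)" text is only a placeholder to make empty levels visible:

```shell
# Print the core.autocrlf value stored at the system and global levels.
show_autocrlf() {
  for scope in system global; do
    # git config exits non-zero when the key is absent at that level.
    value=$(git config "--$scope" --get core.autocrlf 2>/dev/null) || value="(not set)"
    printf '%-7s %s\n' "$scope" "$value"
  done
}

show_autocrlf
```

If both lines report a value, the setting is defined twice and the two values should be compared.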
The core.autocrlf setting controls how git converts line endings [http://help.github.com/line-endings/]. When set to true, git converts LF to CRLF (the Windows standard) when checking out files, and back to LF when committing. When set to input, git converts CRLF to LF when committing but leaves files unchanged on checkout. With both a system and a global value set, the system setting seemed to override the global one on this PC, although git normally gives the global value precedence.
The solution is to set core.autocrlf in only one place (either system or global). In this case I opted to remove the system entry.
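One way to remove the system-level entry from the command line, shown as a sketch rather than the exact command used here. On Windows this typically needs a shell with administrator rights, since the system gitconfig sits under the Git installation directory:

```shell
# Remove core.autocrlf from the system-level configuration only.
# Exit status 5 just means the key was not set there.
git config --system --unset core.autocrlf || echo "nothing removed (status $?)"

# Check what remains at each level.
git config --system --get core.autocrlf || echo "system: not set"
git config --global --get core.autocrlf || echo "global: not set"
```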
The configuration files can also be edited manually. They can be found in the following locations:
- system: %ProgramFiles%/Git/etc/gitconfig
- global: ~/.gitconfig