Title | SO14 Fuse mounting (w)arc files |
Detailed description | One way to access the content of container files is to mount them as user-space file systems. This allows ready integration with almost any tool, and saves space compared to unpacking. |
Solution Champion |
Asger Askov Blekinge (SB) |
Corresponding Issue(s) |
|
myExperiment Link |
|
Tool Registry Link |
|
Evaluation |
Not presently usable in production, but a novel use that should not be forgotten.
The code was originally developed by BL and later modified by SB. It can be found at
https://github.com/blekinge/wap
or here
https://github.com/openplanets/wap
In order to compile, it needs the project
https://github.com/blekinge/fuse-j-2.4-prerelease
The idea of mounting (w)arc files is novel. The BL is, as far as the author knows, actually using this in a production context. SB, on the other hand, has chosen not to use Fuse on production machines.
The basic reason for this refusal was the problem of ensuring that the mounts would be unmounted if the scripts that created them failed to complete. The default limit in the kernel is 1000 simultaneous mounts. A number of malfunctioning processes could therefore block the machine, and the maintenance staff are naturally suspicious of anything that interacts with the kernel, even in user space.
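The unmount concern can be narrowed, though not eliminated, by wrapping each mount in a construct that always runs the unmount step. The following is a minimal sketch, not the approach used by SB or BL. The mount and unmount commands are passed in by the caller, and `wap-mount` in the commented usage is a hypothetical placeholder, since the actual command line of the wap tool is not given here. `fusermount -u` is the usual way to unmount a FUSE file system on Linux.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def mounted(mount_cmd, unmount_cmd):
    """Run mount_cmd, hand control to the caller, then always run unmount_cmd.

    Both arguments are argument lists as accepted by subprocess.run.
    """
    # If the mount itself fails, raise immediately; there is nothing to undo.
    subprocess.run(mount_cmd, check=True)
    try:
        yield
    finally:
        # Runs on normal exit and when the body raises an exception,
        # so a failing script does not leave the mount behind.
        subprocess.run(unmount_cmd, check=False)


# Hypothetical usage (the mount command is an assumption, not the real wap CLI):
#
# with mounted(["wap-mount", "crawl.warc.gz", "/mnt/warc"],
#              ["fusermount", "-u", "/mnt/warc"]):
#     process_files("/mnt/warc")
```

This only covers failures that Python can see. A process killed with SIGKILL, or a machine that loses power, still leaves the mount in place, so a periodic cleanup of stale mounts would be needed as well.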
Secondly, actually getting Fuse to run on a given machine turned out to be more difficult than imagined. While SB got the installations working relatively quickly, ONB, which would be the primary user of the system, had a long list of compatibility problems.
Thirdly, the performance was actually worse when using Fuse than when unpacking the files. The problem was caused by the reader (Heritrix), which required the (w)arc files to be read sequentially. Since a file system is, by definition, random access (at least to the files), the (w)arc file had to be opened and skipped to the necessary offset for each read. This caused more I/O activity than unpacking the files, since in that case the (w)arc file is read once and only once.
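The difference in I/O can be illustrated with a back-of-the-envelope cost model. It is a sketch under one stated assumption: skipping to an offset costs as much as reading the skipped bytes, which is the case when the archive can only be consumed sequentially. The function names and record sizes are illustrative and not taken from the wap code.

```python
def bytes_read_with_skip(record_sizes):
    """Bytes consumed when every record access reopens the archive
    and skips from the start to that record's offset."""
    total = 0
    offset = 0
    for size in record_sizes:
        # Assumption: the skipped bytes must be read through as well.
        total += offset + size
        offset += size
    return total


def bytes_read_single_pass(record_sizes):
    """Bytes consumed when the archive is unpacked in one sequential pass."""
    return sum(record_sizes)


# Four records of 10 bytes each:
sizes = [10, 10, 10, 10]
print(bytes_read_single_pass(sizes))  # 40
print(bytes_read_with_skip(sizes))    # 100
```

Under this model the cost of the open-and-skip approach grows roughly with the square of the number of records, while a single pass grows linearly, which is consistent with the observation that unpacking was cheaper.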