FITS profiling: [~gmcgath]
To see where the performance issues are in FITS, I ran OpenFits with JIP, a free Java profiling tool. I had to exclude a number of third-party packages from tabulation before I could get it to run without exhausting memory.
The first successful run was on a TIFF file (grayscale_8bpp_wrong_bpptag.tif from the OPF format corpus). Each thread is reported separately, but totals per method are provided in a summary.
In this case, the most expensive methods reported were:
Count Time Pct Location
===== ==== === ========
249 5070.0 203.2 org.apache.xerces.util.URI:<init>
10 3187.1 182.3 edu.harvard.hul.ois.fits.tools.ToolOutput:validateXmlOutput
4 3142.6 70.6 uk.gov.nationalarchives.droid.signatureFile.SubSequence:isFoundAfterFileMarker
6 1168.7 55.0 edu.harvard.hul.ois.fits.tools.ToolBase:transform
153246 1048.4 39.8 com.twmacinta.util.MD5:Transform
7 966.4 27.1 edu.harvard.hul.ois.fits.tools.utils.CommandLine:exec
14 858.9 1400.0 edu.harvard.hul.ois.fits.tools.utils.StreamGobbler:run
I don't know what "Pct" means, since it's often way above 100%, but the count and time totals are more useful. All times are in milliseconds.
validateXmlOutput subsumes the actual calls to the tools, and each call to it invokes a URI initialization. In spite of this, less total time is spent on it than on URI initializations. (This is possible because there are other URI initializations not called under validateXmlOutput).
isFoundAfterFileMarker is specific to Droid, and I don't know what to make of it. CommandLine.exec and StreamGobble.run are high-level functions, so the fact that the consume a lot of time doesn't tell us anything.
We can also look at the time spent for each tool. These are:
1831.4 AudioInfo
3222.4 Jhove
1790.2 ADLTool
2821.6 Exiftool
2642.2 FileUtility
4450.8 Droid
2631.4 FileInfo
2773.8 MetadataExtractor (NLNZ)
1660.7 FFIdent
1538.0 XmlMetadata
The initialization of org.apache.xerces.util.URI is surprisingly expensive. It's called a lot of times and takes a fair amount of time per call.
Next I ran FITS on a directory containing five WAVE files. There are now a lot of threads, since a thread is invoked for each use of each tool on a file. The most expensive methods now are:
Count Time Pct Location
===== ==== === ========
1080 15612.7 1549.3 org.apache.xerces.util.URI:<init>
25752 14017.0 144.6 uk.gov.nationalarchives.droid.signatureFile.SubSequence:endBytePosForSeqFrag
2815767 9155.1 92.1 uk.gov.nationalarchives.droid.signatureFile.ByteSeqSpecifier:matchesByteSequence
47 5097.9 406.5 edu.harvard.hul.ois.fits.tools.ToolOutput:validateXmlOutput
19 2527.2 182.0 edu.harvard.hul.ois.fits.tools.utils.CommandLine:exec
38 1724.4 3800.0 edu.harvard.hul.ois.fits.tools.utils.StreamGobbler:run
Again URI initialization is the biggest single factor.
This suggested that possible that some efficiency could be gained by doing more static initialization where URIs are involved. I went for an easy fix and pulled out all Namespace initializers involving fixed URIs into private static declarations.
After this, the most expensive method summary gave:
Count Time Pct Location
===== ==== === ========
25752 14460.4 146.4 uk.gov.nationalarchives.droid.signatureFile.SubSequence:endBytePosForSeqFrag
1080 13142.4 1411.5 org.apache.xerces.util.URI:<init>
2815767 9069.2 91.9 uk.gov.nationalarchives.droid.signatureFile.ByteSeqSpecifier:matchesByteSequence
47 5389.4 409.3 edu.harvard.hul.ois.fits.tools.ToolOutput:validateXmlOutput
38 3052.4 3800.0 edu.harvard.hul.ois.fits.tools.utils.StreamGobbler:run
19 2862.8 215.4 edu.harvard.hul.ois.fits.tools.utils.CommandLine:exec
This is about a 15% saving on the URI init, pushing it down to second place. All the other calls took longer; I don't think factoring out the namespaces should have slowed them down, so some external factor may just have dragged down this run by a bit.
This isn't a huge gain, but the coding effort for it was very slight.
Each tool in the directory ran five times. Here are the time totals (after optimization of initializers):
AudioInfo: 1928.4, 296.2, 373.1, 488.5, 244.0
Jhove: 2213.1, 1147.5, 118.5, 210.7, 74.2
ADLTool: 1805.4, 299.9, 586.4, 455.9, 310.8
Exiftool: 2185.7, 1208.6, 1016.7, 836.5, 853.0
FileUtility: 2137.4, 1140.9, 1033.9, 990.5, 848.0
Droid: 1929.6, 1193.8, 1184.9, 20974.4, 6727.1
FileInfo: 1876.6, 328.3, 553.2, 591.4, 504.7
MetadataExtractor: 2212.6, 841.7, 686.2, 524.0, 549.3
FFIdent: 1726.0, 466.7, 509.4, 454.6, 257.5
XmlMetadata: 1803.9, 368.3, 234.7, 370.8, 293.7
There is significant overhead for the first run of all the tools. A significant part of this is certainly due to the class loader.
Here are the corresponding totals before optimization of initializers:
AudioInfo: 2004.5, 197.5, 181.2, 358.1, 381.4
Jhove: 2673.7, 1330.0, 126.6, 90.5, 202.5
ADLTool: 2004.5, 387.3, 171.0, 200.0, 352.7
Exiftool: 2603.2, 1270.4, 903.1, 948.7, 938.8
FileUtility: 2536.5, 1231.7, 728.2, 843.2, 899.5
Droid: 2279.1, 1365.2, 930.3, 20582.1, 6782.7
FileInfo: 1779.8, 440.0, 407.3, 507.1, 478.6
MetadataExtractor: 2588.9, 743.6, 406.1, 594.2, 512.4
FFIdent: 2242.5, 492.7, 362.2, 487.8, 329.9
XmlMetadata: 1696.1, 380.3, 318.0, 409.7, 441.6
Some proper analysis should be done on these numbers, but just from a glance, it appears that the run with optimized initializers has a greater relative saving of time, suggesting that the benefits of the optimization would increase with longer runs.
To see where the performance issues are in FITS, I ran OpenFits with JIP, a free Java profiling tool. I had to exclude a number of third-party packages from tabulation before I could get it to run without exhausting memory.
The first successful run was on a TIFF file (grayscale_8bpp_wrong_bpptag.tif from the OPF format corpus). Each thread is reported separately, but totals per method are provided in a summary.
In this case, the most expensive methods reported were:
Count Time Pct Location
===== ==== === ========
249 5070.0 203.2 org.apache.xerces.util.URI:<init>
10 3187.1 182.3 edu.harvard.hul.ois.fits.tools.ToolOutput:validateXmlOutput
4 3142.6 70.6 uk.gov.nationalarchives.droid.signatureFile.SubSequence:isFoundAfterFileMarker
6 1168.7 55.0 edu.harvard.hul.ois.fits.tools.ToolBase:transform
153246 1048.4 39.8 com.twmacinta.util.MD5:Transform
7 966.4 27.1 edu.harvard.hul.ois.fits.tools.utils.CommandLine:exec
14 858.9 1400.0 edu.harvard.hul.ois.fits.tools.utils.StreamGobbler:run
I don't know what "Pct" means, since it's often way above 100%, but the count and time totals are more useful. All times are in milliseconds.
validateXmlOutput subsumes the actual calls to the tools, and each call to it invokes a URI initialization. In spite of this, less total time is spent on it than on URI initializations. (This is possible because there are other URI initializations not called under validateXmlOutput).
isFoundAfterFileMarker is specific to Droid, and I don't know what to make of it. CommandLine.exec and StreamGobble.run are high-level functions, so the fact that the consume a lot of time doesn't tell us anything.
We can also look at the time spent for each tool. These are:
1831.4 AudioInfo
3222.4 Jhove
1790.2 ADLTool
2821.6 Exiftool
2642.2 FileUtility
4450.8 Droid
2631.4 FileInfo
2773.8 MetadataExtractor (NLNZ)
1660.7 FFIdent
1538.0 XmlMetadata
The initialization of org.apache.xerces.util.URI is surprisingly expensive. It's called a lot of times and takes a fair amount of time per call.
Next I ran FITS on a directory containing five WAVE files. There are now a lot of threads, since a thread is invoked for each use of each tool on a file. The most expensive methods now are:
Count Time Pct Location
===== ==== === ========
1080 15612.7 1549.3 org.apache.xerces.util.URI:<init>
25752 14017.0 144.6 uk.gov.nationalarchives.droid.signatureFile.SubSequence:endBytePosForSeqFrag
2815767 9155.1 92.1 uk.gov.nationalarchives.droid.signatureFile.ByteSeqSpecifier:matchesByteSequence
47 5097.9 406.5 edu.harvard.hul.ois.fits.tools.ToolOutput:validateXmlOutput
19 2527.2 182.0 edu.harvard.hul.ois.fits.tools.utils.CommandLine:exec
38 1724.4 3800.0 edu.harvard.hul.ois.fits.tools.utils.StreamGobbler:run
Again URI initialization is the biggest single factor.
This suggested that possible that some efficiency could be gained by doing more static initialization where URIs are involved. I went for an easy fix and pulled out all Namespace initializers involving fixed URIs into private static declarations.
After this, the most expensive method summary gave:
Count Time Pct Location
===== ==== === ========
25752 14460.4 146.4 uk.gov.nationalarchives.droid.signatureFile.SubSequence:endBytePosForSeqFrag
1080 13142.4 1411.5 org.apache.xerces.util.URI:<init>
2815767 9069.2 91.9 uk.gov.nationalarchives.droid.signatureFile.ByteSeqSpecifier:matchesByteSequence
47 5389.4 409.3 edu.harvard.hul.ois.fits.tools.ToolOutput:validateXmlOutput
38 3052.4 3800.0 edu.harvard.hul.ois.fits.tools.utils.StreamGobbler:run
19 2862.8 215.4 edu.harvard.hul.ois.fits.tools.utils.CommandLine:exec
This is about a 15% saving on the URI init, pushing it down to second place. All the other calls took longer; I don't think factoring out the namespaces should have slowed them down, so some external factor may just have dragged down this run by a bit.
This isn't a huge gain, but the coding effort for it was very slight.
Each tool in the directory ran five times. Here are the time totals (after optimization of initializers):
AudioInfo: 1928.4, 296.2, 373.1, 488.5, 244.0
Jhove: 2213.1, 1147.5, 118.5, 210.7, 74.2
ADLTool: 1805.4, 299.9, 586.4, 455.9, 310.8
Exiftool: 2185.7, 1208.6, 1016.7, 836.5, 853.0
FileUtility: 2137.4, 1140.9, 1033.9, 990.5, 848.0
Droid: 1929.6, 1193.8, 1184.9, 20974.4, 6727.1
FileInfo: 1876.6, 328.3, 553.2, 591.4, 504.7
MetadataExtractor: 2212.6, 841.7, 686.2, 524.0, 549.3
FFIdent: 1726.0, 466.7, 509.4, 454.6, 257.5
XmlMetadata: 1803.9, 368.3, 234.7, 370.8, 293.7
There is significant overhead for the first run of all the tools. A significant part of this is certainly due to the class loader.
Here are the corresponding totals before optimization of initializers:
AudioInfo: 2004.5, 197.5, 181.2, 358.1, 381.4
Jhove: 2673.7, 1330.0, 126.6, 90.5, 202.5
ADLTool: 2004.5, 387.3, 171.0, 200.0, 352.7
Exiftool: 2603.2, 1270.4, 903.1, 948.7, 938.8
FileUtility: 2536.5, 1231.7, 728.2, 843.2, 899.5
Droid: 2279.1, 1365.2, 930.3, 20582.1, 6782.7
FileInfo: 1779.8, 440.0, 407.3, 507.1, 478.6
MetadataExtractor: 2588.9, 743.6, 406.1, 594.2, 512.4
FFIdent: 2242.5, 492.7, 362.2, 487.8, 329.9
XmlMetadata: 1696.1, 380.3, 318.0, 409.7, 441.6
Some proper analysis should be done on these numbers, but just from a glance, it appears that the run with optimized initializers has a greater relative saving of time, suggesting that the benefits of the optimization would increase with longer runs.