|Title||Structural and visual comparisons for web page archiving|
|Detailed description||In the context of Web archiving, we propose a framework that combines state-of-the-art comparison methods operating on the source code of Web pages with computer vision techniques, in order to detect whether successive versions of a Web page are similar or not (minimal illustrative sketches appear below the table).|
|Scalability Challenge||The method can be used as part of a crawler, so it must scale to the rate and volume of a web crawl|
|Issue champion||Sureda-Gutierrez Carlos (UPMC)|
|Other interested parties|| |
|Possible Solution approaches||A combination of structural (source-code) comparison and computer vision techniques for detecting significant changes between two versions of a web page, as sketched below the table|
|Solutions||SO18 Comparing two web page versions for web archiving|
|Objectives||Which SCAPE objectives do this issue and a future solution relate to? E.g. scalability, robustness, reliability, coverage, preciseness, automation|
|Success criteria||Describe the success criteria for solving this issue: what are you able to do? What does the world look like?|
|Automatic measures||What automated measures should the solution provide for evaluating it against this specific issue? Which measures are important?
If possible, specify very concrete measures and your goals, e.g.:
* process 50 documents per second
* handle 80 GB files without crashing
* identify 99.5% of the content correctly|
|Manual assessment||Apart from the automated measures, do you foresee any necessary manual assessment to evaluate the solution of this issue?
If possible, specify measures and your goals, e.g.:
* Solution installable with basic Linux system administration skills
* User interface understandable by non-developer curators|
|Actual evaluations||Links to actual evaluations of this issue/scenario|
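
As a rough illustration of the structural half of the comparison, the sketch below reduces each page version to its sequence of HTML tags and scores similarity with a Jaccard measure over tag n-grams. This is a minimal stand-in under stated assumptions, not the actual comparison method behind SO18; the function names and the shingle size are illustrative choices, and it uses only the Python standard library.

{code:python}
# Minimal structural-similarity sketch: the shingled tag-sequence
# Jaccard measure is an illustrative stand-in, not the method used
# by SO18.
from html.parser import HTMLParser


class TagSequenceExtractor(HTMLParser):
    """Collect the sequence of opening tag names in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_shingles(html, n=3):
    """Return the set of n-grams ('shingles') over the tag sequence."""
    parser = TagSequenceExtractor()
    parser.feed(html)
    seq = parser.tags
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def structural_similarity(html_a, html_b, n=3):
    """Jaccard similarity between the tag-shingle sets of two versions."""
    a, b = tag_shingles(html_a, n), tag_shingles(html_b, n)
    if not a and not b:
        return 1.0  # two empty documents are trivially identical
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    v1 = "<html><body><h1>News</h1><p>Old story</p></body></html>"
    v2 = "<html><body><h1>News</h1><p>New story</p><p>More</p></body></html>"
    print(f"structural similarity: {structural_similarity(v1, v2):.2f}")
{code}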
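
The visual half could, equally roughly, be approximated by a perceptual hash over rendered screenshots. The sketch below assumes each version has already been rendered to an image file (e.g. by a headless browser) and requires the Pillow library; the 8x8 average hash and the 0.9 thresholds in {{versions_are_similar}} are illustrative assumptions, not the computer-vision techniques of the proposed framework.

{code:python}
# Minimal visual-similarity sketch over pre-rendered screenshots.
# Requires the Pillow library (pip install Pillow); the average hash
# and thresholds are illustrative stand-ins.
from PIL import Image


def average_hash(path, hash_size=8):
    """Downscale to hash_size x hash_size grayscale, threshold at the mean."""
    img = Image.open(path).convert("L").resize(
        (hash_size, hash_size), Image.LANCZOS)
    pixels = list(img.getdata())
    mean = sum(pixels) / len(pixels)
    return [p > mean for p in pixels]


def visual_similarity(path_a, path_b, hash_size=8):
    """1.0 minus the normalised Hamming distance between the two hashes."""
    ha, hb = average_hash(path_a, hash_size), average_hash(path_b, hash_size)
    distance = sum(x != y for x, y in zip(ha, hb))
    return 1.0 - distance / len(ha)


def versions_are_similar(struct_sim, visual_sim,
                         struct_threshold=0.9, visual_threshold=0.9):
    """Combine both scores; the threshold values are illustrative guesses."""
    return struct_sim >= struct_threshold and visual_sim >= visual_threshold
{code}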