Title Comparing two web page versions for web archiving
Detailed description Our system is based on: (1) a combination of structural and visual comparison methods embedded in a statistical discriminative model, (2) a visual similarity measure designed for Web pages that improves change detection, (3) a supervised feature selection method adapted to Web archiving. We train a Support Vector Machine model with vectors of similarity scores between successive versions of pages. The trained model then determines whether two versions, defined by their vector of similarity scores, are similar or not. Experiments on real Web archives validate our approach.
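The classification step described above can be sketched as follows. This is a minimal illustration, not the Pagelyzer implementation. The description does not state the kernel, the training procedure, or the exact features, so the sketch assumes a linear kernel trained by batch sub-gradient descent on the hinge loss, and two hypothetical features per version pair (a structural and a visual similarity score). All numeric values are invented.

```python
# Illustrative sketch only, not the Pagelyzer implementation.
# Assumptions: linear kernel, two hypothetical similarity features
# (structural, visual), and invented training values.

def train_linear_svm(vectors, labels, epochs=2000, lr=0.05, reg=0.01):
    """Fit a linear SVM by batch sub-gradient descent on the
    L2-regularised hinge loss.

    labels must be +1 (versions similar) or -1 (versions dissimilar).
    Returns the pair (weights, bias).
    """
    n = len(vectors)
    dim = len(vectors[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        # Gradient of the regulariser (reg / 2) * ||w||^2.
        grad_w = [reg * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(vectors, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            # Only pairs inside the margin contribute to the hinge loss.
            if y * score < 1:
                for i in range(dim):
                    grad_w[i] -= y * x[i] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def is_similar(model, vector):
    """Return True when the model places the version pair on the 'similar' side."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, vector)) + b > 0


# Each vector holds (structural similarity, visual similarity) for one pair
# of successive page versions. Values are made up for the example.
TRAIN_VECTORS = [
    (0.95, 0.90), (0.90, 0.85), (0.85, 0.95), (0.80, 0.80),  # labelled similar
    (0.20, 0.30), (0.30, 0.10), (0.10, 0.25), (0.35, 0.40),  # labelled dissimilar
]
TRAIN_LABELS = [1, 1, 1, 1, -1, -1, -1, -1]

model = train_linear_svm(TRAIN_VECTORS, TRAIN_LABELS)
print(is_similar(model, (0.92, 0.88)))  # pair with high scores on both features
print(is_similar(model, (0.15, 0.20)))  # pair with low scores on both features
```

In practice an established SVM library would replace the hand-written trainer, and the feature vector would contain whichever similarity scores survive the supervised feature selection step.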
Solution Champion
Sureda-Gutierrez Carlos (UPMC).
Corresponding Issue(s)
IS28 Structural and visual comparisons for web page archiving
IS7 Incompleteness and inconsistency of web archive data
IS19 Migrate whole archive to new archiving system
myExperiment Link
MarcAlizer
Tool Registry Link
Pagelyzer
Evaluation
TBD
Labels:
solution
scape
characterisation