nutch-user mailing list archives

From Max S <maximillian...@googlemail.com>
Subject Customise scoring
Date Wed, 02 Sep 2009 20:33:23 GMT
Hi all,

I have installed / imported an XML parser plugin and an EXIF parser plugin
into Nutch to parse XML files and EXIF metadata from JPG images.

The idea would be to:
1. Fetch the XML file and extract data and links from it
	NB: The XML file contains geo coordinates (latitude and longitude),
a title and image links.
2. Fetch each image and extract its EXIF metadata
3. Store the extracted data from both parsers in the index.

I would like to customise search so that the results are ordered by the
following priority:
1. Proximity to location
2. Keywords from EXIF Metadata
3. Keywords from XML title
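
To make the intended ordering concrete, here is a minimal sketch of how the three signals could be folded into one ranking value. It is not Nutch or Lucene code: the class name PriorityScore, the weights, and the assumption that each signal has already been normalised to the range 0..1 are all hypothetical and only illustrate the priority.

```java
// Hypothetical helper, not part of Nutch: folds three per-document signals
// into a single ranking value. All names and weights are assumptions.
final class PriorityScore {
    // Illustrative weights, chosen so that a full match on a higher
    // priority outweighs full matches on all lower priorities combined.
    static final double W_GEO = 4.0;    // 1. proximity to location
    static final double W_EXIF = 2.0;   // 2. keywords from EXIF metadata
    static final double W_TITLE = 1.0;  // 3. keywords from XML title

    /** Each input is expected in [0, 1]; values outside that range are clamped. */
    static double combine(double proximity, double exifMatch, double titleMatch) {
        return W_GEO * clamp(proximity)
             + W_EXIF * clamp(exifMatch)
             + W_TITLE * clamp(titleMatch);
    }

    private static double clamp(double v) {
        return Math.max(0.0, Math.min(1.0, v));
    }
}
```

With these weights, a document that is close to the location but matches no keywords (4.0) still ranks above a distant document that matches both the EXIF and title keywords (3.0).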

From what I can see at the moment, I will need to:
1. Set a higher score to the fields according to the priority above
2. Repurpose the algorithm within GeoPosition plugin
(http://wiki.apache.org/nutch/GeoPosition)
3. Update ScoringFilter logic to include Geo Position algorithm?
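
For the proximity part, the kind of calculation involved might look like the sketch below. It uses the standard haversine great-circle formula, which is not necessarily what the GeoPosition plugin does, and the class GeoProximity, its method names, and the halfLifeKm parameter are hypothetical. How such a value would be wired into Nutch scoring is not shown.

```java
// Hypothetical helper, not part of Nutch or the GeoPosition plugin.
final class GeoProximity {
    // Mean Earth radius in kilometres (spherical approximation).
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle distance in km between two points given in decimal degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Guard against rounding pushing sqrt(a) slightly above 1.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /**
     * Maps a distance to a factor in (0, 1]: 1.0 at zero distance and
     * 0.5 at halfLifeKm (an assumed tuning parameter, must be > 0).
     */
    static double proximity(double distanceKm, double halfLifeKm) {
        return 1.0 / (1.0 + distanceKm / halfLifeKm);
    }
}
```

For example, GeoProximity.proximity(GeoProximity.haversineKm(docLat, docLon, userLat, userLon), 10.0) would give a document 10 km away from the user half the proximity factor of a document at the user's exact position.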


My question is: is the last item correct, or is there another approach?
Where should I start looking? I would appreciate any suggestions.

Regards
Max S



