lucene-java-user mailing list archives

From Keith Rhodes <kr_new_off...@yahoo.com>
Subject Adding and populating a field
Date Sat, 27 Oct 2007 06:11:25 GMT
I have a set of documents with filenames that give a good indication of content.

A filename of 12 digits (I think this is [0-9]{12} as a regular expression) with the extension
html is a troubleshooting guide, the number being an error code.
A filename with two or three letters, then a minus (which would be [a-z]{2,3}- I think), then
a known string means the document is about a particular subject; I have a list of the known
strings matched to subjects.

What I would like is for my indexer to create a field named "category", populated with
either the string "troubleshooting" or the known string extracted from the filename.

Examples:
For a file named 000000000111.html the indexer adds the field "category" with the value "troubleshooting".
For a file named xxx-cal-123.html the indexer adds the field "category" with the value "cal".
For a file named xx-qv-(9).html the indexer adds the field "category" with the value "qv".

Is there a way to do that?
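
One possible approach is to work out the category from the filename before the document is built, then add the result as an extra field. The following is only a minimal sketch of the filename rules described above, using java.util.regex and no Lucene calls. The class name CategoryExtractor, the method categoryFor, and the contents of the KNOWN set are hypothetical; it also assumes the known strings consist of lowercase letters only and that the real list would be loaded from wherever the string-to-subject mapping lives.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CategoryExtractor {

    // Exactly twelve digits followed by ".html": a troubleshooting guide.
    private static final Pattern ERROR_CODE =
            Pattern.compile("[0-9]{12}\\.html");

    // Two or three lowercase letters, a minus, then a run of lowercase
    // letters (the candidate known string), then anything else.
    // Assumption: known strings contain lowercase letters only.
    private static final Pattern PREFIXED =
            Pattern.compile("[a-z]{2,3}-([a-z]+).*");

    // Hypothetical list of known strings; only the two from the examples.
    private static final Set<String> KNOWN =
            new HashSet<String>(Arrays.asList("cal", "qv"));

    /** Returns the category for a filename, or null if no rule applies. */
    public static String categoryFor(String filename) {
        if (ERROR_CODE.matcher(filename).matches()) {
            return "troubleshooting";
        }
        Matcher m = PREFIXED.matcher(filename);
        if (m.matches() && KNOWN.contains(m.group(1))) {
            return m.group(1);
        }
        return null; // caller decides whether to skip the field
    }
}
```

The indexer would call categoryFor(file.getName()) and, when the result is not null, add it to the document as an untokenized "category" field using whatever Field constructor the Lucene version in use provides.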