lucene-java-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: Create and populate a field when indexing
Date Mon, 29 Oct 2007 19:16:33 GMT
When you are indexing the file and adding the Document, you will need  
to parse out your filename per your regular expression, and then  
create the appropriate field:

Document doc = new Document();
// getCategoryFromFileName is a stub you need to implement yourself
String cat = getCategoryFromFileName(inputFileName);
doc.add(new Field("category", cat, ...));
// do the rest of your adds

Just locate where in the demo the Document add is taking place (I  
forget the exact spot) and then add in the appropriate stuff from  
above.  Obviously, you need to implement the method I stubbed called  
getCategoryFromFileName.
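
For what it's worth, here is a rough sketch of how that stub might look, based only on the rules in your message. The class name CategoryExtractor, the KNOWN set, and the choice to return null for unrecognized names are all assumptions for illustration; only "cal" and "qv" come from your examples, so you would replace the set with your real list of known strings. It also assumes you pass in the bare file name rather than a full path.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is not from the Lucene demo.
final class CategoryExtractor {

    // Twelve digits followed by ".html", per the rule in the question.
    // Adjust the quantifier if your real file names differ.
    private static final Pattern TROUBLESHOOTING =
            Pattern.compile("[0-9]{12}\\.html");

    // Two or three lowercase letters, a minus, then a run of lowercase
    // letters (the candidate known string), then anything else.
    private static final Pattern PREFIXED =
            Pattern.compile("[a-z]{2,3}-([a-z]+).*");

    // Placeholder list: only "cal" and "qv" appear in the question.
    private static final Set<String> KNOWN =
            new HashSet<String>(Arrays.asList("cal", "qv"));

    // Expects a bare file name (no directory part).
    // Returns the category, or null if the name matches neither rule.
    static String getCategoryFromFileName(String fileName) {
        if (TROUBLESHOOTING.matcher(fileName).matches()) {
            return "troubleshooting";
        }
        Matcher m = PREFIXED.matcher(fileName);
        if (m.matches() && KNOWN.contains(m.group(1))) {
            return m.group(1);
        }
        return null; // assumption: the caller skips the field in this case
    }
}
```

If the method returns null, you would probably just skip adding the "category" field for that file. For the elided Field arguments, I believe the 2.1-era constructor takes a Field.Store and a Field.Index value (for example Field.Store.YES and Field.Index.UN_TOKENIZED), but whether to store or tokenize the category is your call.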

HTH,
Grant
On Oct 29, 2007, at 1:06 PM, KR wrote:

>
> I've been using the Lucene demo from
> http://lucene.apache.org/java/2_1_0/demo.html
>
> I have a set of documents
> with filenames that give a good indication of content.
>
> A filename of 12 digits (I think this is [0-9]{12} as a regular
> expression) with the extension html is a troubleshooting guide, the
> number being an error code. A filename with two or three letters, then a
> minus (which would be [a-z]{2,3}- I think), then a known string means the
> document is about a particular subject; I have a list of the known
> strings matched to subjects.
>
> What I would like to do, is have my indexer create a field named
> "category", populated with either the string "troubleshooting" or with
> the known string extracted from the filename.
>
> Examples:
> For a file named 0000000000111.html the indexer adds the field
> "category" with the value "troubleshooting". For a file named
> xxx-cal-123.html the indexer adds the field "category" with the value
> "cal". For a file named xx-qv-(9).html the indexer adds the field
> "category" with the value "qv".
>
> Is there a way to do that?
>
> Beef.
> -- 
> View this message in context: http://www.nabble.com/Create-and-populate-a-field-when-indexing-tf4713018.html#a13471852
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>

--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com

Lucene Boot Camp Training:
ApacheCon Atlanta, Nov. 12, 2007.  Sign up now!  http://www.apachecon.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ


