lucene-java-user mailing list archives

From Prakash Reddy Bande <praka...@altair.com>
Subject Customizing indexing of large files
Date Mon, 27 Feb 2012 16:55:55 GMT
Hi,

I want to customize how some specific kinds of files are indexed. I am using Lucene 2.9.3,
but upgrading is possible.
This is what the data in my files looks like:

*****************************
Data for 2010
Description: This section has a general description of the data.
DATA_BEGIN
Month       P1          P2          P3
01          3243.433    43534.324   45345.2443
02          3242.324    234234.24   323.2343
...
...
...
...
DATA_END
Data for 2011
Description: This section has a general description of the data.
DATA_BEGIN
Month       P1          P2          P3
01          3243.433    43534.324   45345.2443
02          3242.324    234234.24   323.2343
...
...
...
...
DATA_END
*****************************

I would like to use a StandardAnalyzer but do not want to index the column data,
i.e. I want to skip all those numbers. Basically, as soon as I hit the keyword DATA_BEGIN,
I want to jump to DATA_END.
What is the best approach: a custom Reader, a custom tokenizer, or some other mechanism?
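
To make the custom Reader option concrete, here is a minimal sketch of what I have in mind. It uses only java.io and is not tested against Lucene. The class name DataBlockSkippingReader is hypothetical, and it assumes that DATA_BEGIN and DATA_END always appear alone on their own lines, as in the sample above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

/**
 * Hypothetical helper (not part of Lucene): passes text through unchanged,
 * but drops every line from a DATA_BEGIN marker up to and including the
 * next DATA_END marker. Assumes each marker sits alone on its own line.
 */
class DataBlockSkippingReader extends Reader {
    private final BufferedReader in;
    private String pending = ""; // current filtered line, newline included
    private int pos = 0;         // next unread character in 'pending'

    DataBlockSkippingReader(Reader source) {
        this.in = new BufferedReader(source);
    }

    /** Loads the next line outside a data block; returns false at end of input. */
    private boolean fill() throws IOException {
        boolean inData = false;
        String line;
        while ((line = in.readLine()) != null) {
            String trimmed = line.trim();
            if (inData) {
                if (trimmed.equals("DATA_END")) {
                    inData = false; // block finished, resume passing lines through
                }
                continue;           // skip the numeric rows and the end marker
            }
            if (trimmed.equals("DATA_BEGIN")) {
                inData = true;      // skip the begin marker and what follows
                continue;
            }
            pending = line + "\n";  // line endings are normalized to \n
            pos = 0;
            return true;
        }
        return false;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (pos >= pending.length() && !fill()) {
            return -1; // no more text outside data blocks
        }
        int n = Math.min(len, pending.length() - pos);
        pending.getChars(pos, pos + n, cbuf, off);
        pos += n;
        return n;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    /** Convenience for trying the reader out: drains any Reader into a String. */
    static String readAll(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[256];
        int n;
        while ((n = r.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }
}
```

The idea would be to wrap the file's Reader in this class and hand the result to the Field(String name, Reader reader) constructor, so that StandardAnalyzer only ever sees the headings and descriptions. Whether this is preferable to a custom tokenizer is exactly what I am unsure about.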
Regards,

Prakash Bande
Altair Eng. Inc.
Troy MI
Ph: 248-614-2400 ext 489
Cell: 248-404-0292

