lucene-dev mailing list archives

From Paul Elschot
Subject Re: Polymorphic Index
Date Thu, 21 Oct 2010 21:40:00 GMT
How about splitting the 32-byte field into, for example, 16 subfields of 2 bytes each?
Any direct query on that field would then need to be transformed into a boolean
query requiring all 16 subfield terms.
Would that work?
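
A minimal sketch of the idea, in plain Python rather than against the Lucene API. The subfield names (uid_00 ... uid_15), the hex term encoding, and the TinyIndex class are hypothetical and only illustrate the shape of the lookup. Presumably the appeal is that each 2-byte subfield can hold at most 65,536 distinct terms, so the 16 subfields together contribute at most 1,048,576 terms to the term dictionary, at the cost of 16 postings entries per document instead of one.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

UID_BYTES = 32
CHUNK_BYTES = 2
NUM_SUBFIELDS = UID_BYTES // CHUNK_BYTES  # 16 subfields


def split_uid(uid: bytes) -> List[Tuple[str, str]]:
    """Split a 32-byte UID into 16 (subfield name, hex term) pairs.

    The field naming scheme is an assumption made for this sketch.
    """
    if len(uid) != UID_BYTES:
        raise ValueError(f"expected {UID_BYTES} bytes, got {len(uid)}")
    return [
        (f"uid_{i:02d}", uid[i * CHUNK_BYTES:(i + 1) * CHUNK_BYTES].hex())
        for i in range(NUM_SUBFIELDS)
    ]


def max_unique_terms() -> int:
    """Upper bound on distinct terms across all subfields."""
    return NUM_SUBFIELDS * (256 ** CHUNK_BYTES)


class TinyIndex:
    """Toy inverted index: (subfield, term) -> set of docIDs."""

    def __init__(self) -> None:
        self.postings: Dict[Tuple[str, str], Set[int]] = defaultdict(set)

    def add(self, doc_id: int, uid: bytes) -> None:
        for key in split_uid(uid):
            self.postings[key].add(doc_id)

    def lookup(self, uid: bytes) -> Set[int]:
        """Conjunctive lookup: a document must match all 16 subfield terms."""
        result = None
        for key in split_uid(uid):
            docs = self.postings.get(key, set())
            result = set(docs) if result is None else result & docs
            if not result:
                return set()  # one subfield already rules out every document
        return result
```

In Lucene terms, lookup corresponds to a boolean query in which each of the 16 subfield terms is a required clause.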

Paul Elschot

On Thursday 21 October 2010 21:44:34, eks dev wrote:
> Hi all,
> I am trying to figure out a way to implement the following use case with
> Lucene/Solr.
> To support simple incremental updates on the master, I need to index and
> store a UID field on a collection of 300 million documents (my UID is a
> 32-byte sequence). But I do not need it indexed (only stored) during
> normal searching on the slaves.
> The problem is that my term dictionary gets blown up by the sheer number of
> unique IDs. The number of unique terms in this collection, excluding the UID,
> is less than 7 million.
> I can tolerate the resource hit on the updater (big hardware, on-disk index...).
> This is a master-slave setup where the searchers run from a RAM disk, and having
> 300 million * 32 bytes (give or take prefix compression), plus pointers to
> postings and the postings themselves, is something I would really love to
> avoid, as it is significant compared to the really small documents I have.
> Cutting to the chase:
> How can I have an indexed UID field and, once indexing is done, do one of the
> following?
> 1) Load a "searchable" index into RAM from such an on-disk index, without
> that one field?
> 2) Create two indices kept in sync on docIDs, one containing only the indexed UID?
> 3) Somehow transform the index with the indexed UID by dropping the UID field
> while preserving docIDs? A kind of smart index-editing tool.
> Or is there something else already available that I do not know about?
> Preserving docIDs is crucial, as I need support for lovely incremental updates
> (like the Solr master-slave update). The stored field should also remain!
> I am not looking for "use an mmapped index and let the OS deal with it" advice...
> I do not mind doing it with the flex branch (4.0), not being in a hurry.
> Thanks in advance,
> Eks
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
