lucene-dev mailing list archives

From eks dev <eks...@yahoo.co.uk>
Subject Re: postings without position information ?
Date Thu, 07 Feb 2008 20:04:16 GMT
Yep, and without frequencies too. This should not be all that difficult (IMHO),
especially now that we have DocIdSetIterator as a superclass. As a matter of fact, you
could even today get a DocIdSetIterator from TermDocs or whatever and use it as a Filter,
as a lightweight, in-memory solution. A real solution would require something like a
postings "type flag".
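
A minimal sketch of the lightweight, in-memory idea, using only the JDK. The class name InMemoryCategoryFilter and the use of a plain PrimitiveIterator.OfInt as a stand-in for a Lucene doc-ID iterator are assumptions for illustration; this is not Lucene's Filter API.

```java
import java.util.BitSet;
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

/** Hypothetical stand-in for a filter built from one term's doc IDs. */
final class InMemoryCategoryFilter {
    private final BitSet docs = new BitSet();

    /** Drains the iterator once and remembers every doc ID it yields. */
    InMemoryCategoryFilter(PrimitiveIterator.OfInt docIds) {
        while (docIds.hasNext()) {
            docs.set(docIds.nextInt());
        }
    }

    /** True if the document belongs to the category this filter was built for. */
    boolean accepts(int docId) {
        return docs.get(docId);
    }

    public static void main(String[] args) {
        // Doc IDs here are made up; in practice they would come from the index.
        InMemoryCategoryFilter filter =
                new InMemoryCategoryFilter(IntStream.of(3, 4, 5, 42).iterator());
        System.out.println(filter.accepts(4));
        System.out.println(filter.accepts(6));
    }
}
```

The bit set costs one bit per document in the index regardless of how many documents are in the category, which is why this only counts as a stopgap compared with a dedicated postings type.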

----- Original Message ----
From: robert engels <rengels@ix.netcom.com>
To: java-dev@lucene.apache.org
Sent: Thursday, 7 February, 2008 7:43:33 PM
Subject: postings without position information ?

I think there are many uses of Lucene that would benefit from 'enum'
fields, aka categories.

When classifying documents, they are often in one or more categories.

Lucene could write these postings very efficiently using VInt and RLE
(run length encoding) if the position information were not stored
(since it is not really useful in these typical cases).

StartingDocNum|NumberOfDocuments...StartingDocNum|NumberOfDocuments
using a bit of the StartingDocNum to know if it was a series.

When a lot of documents are in the same category, and they are added
at the same time, the document numbers would be nearly sequential,
allowing very efficient compression.
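
The layout described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the format of any existing Lucene codec: the class name RlePostings is hypothetical, the start of each entry is stored as a gap from the previously written doc ID (borrowed from the way Lucene delta-codes doc numbers), the low bit of that value marks a series, and doc IDs are assumed to stay below 2^30 so the shift cannot overflow.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: doc-ID-only postings with run-length encoding. */
final class RlePostings {

    /** VInt: 7 data bits per byte, low-order bits first, high bit set when more bytes follow. */
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    static int readVInt(byte[] data, int[] pos) {
        int b = data[pos[0]++];
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = data[pos[0]++];
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    /** Encodes strictly ascending doc IDs as [(gap << 1) | isSeries] [count, only if isSeries]. */
    static byte[] encode(int[] sortedDocIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0; // last doc ID written so far
        int i = 0;
        while (i < sortedDocIds.length) {
            int start = sortedDocIds[i];
            int j = i;
            // Extend the series while the next doc ID is consecutive.
            while (j + 1 < sortedDocIds.length && sortedDocIds[j + 1] == sortedDocIds[j] + 1) {
                j++;
            }
            int count = j - i + 1;
            int gap = start - prev;
            if (count > 1) {
                writeVInt(out, (gap << 1) | 1); // flag bit set: a count follows
                writeVInt(out, count);
            } else {
                writeVInt(out, gap << 1);       // flag bit clear: single document
            }
            prev = sortedDocIds[j];
            i = j + 1;
        }
        return out.toByteArray();
    }

    /** Expands the encoded bytes back into the full list of doc IDs. */
    static List<Integer> decode(byte[] data) {
        List<Integer> docIds = new ArrayList<>();
        int[] pos = {0};
        int prev = 0;
        while (pos[0] < data.length) {
            int token = readVInt(data, pos);
            int start = prev + (token >>> 1);
            int count = (token & 1) != 0 ? readVInt(data, pos) : 1;
            for (int k = 0; k < count; k++) {
                docIds.add(start + k);
            }
            prev = start + count - 1;
        }
        return docIds;
    }
}
```

With this sketch, 1000 consecutive documents numbered 1000 to 1999 followed by a single document 2500 encode to 6 bytes: two for the flagged start, two for the count, and two for the trailing single entry.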

Has anyone worked on this? Our previous custom IndexReaderWriter
supported it, and I was wondering if this has made it into the core.
I checked the docs/email and could not find anything.

Thanks.

Robert





---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

