lucene-java-user mailing list archives

From Martin Owens <martin.ow...@merrillcorp.com>
Subject Re: Term Based Meta Data
Date Fri, 08 Aug 2008 21:00:59 GMT
Dear Lucene Users and Tricia Williams,

We operate our Lucene index by indexing all the terms without storing the
text. Tricia, your SOLR-380 patch example gave me a very good idea of how
to set things up. Historically I have used TermPositionVector instead of
TermPositions because that data is available without storing the text in
the index.

Is it possible to translate code that uses TermPositions into code that
uses TermPositionVector and still read the payloads?

Best Regards, Martin Owens
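
For reference, the payload approach discussed in this thread attaches an
opaque byte[] to each term occurrence. A minimal sketch of how an OCR
bounding box might be packed into such a byte[] and read back is given
here. The class name OcrBoxPayload and the four-int layout (x, y, width,
height) are assumptions made for illustration; they do not come from
SOLR-380 or from the Lucene API, and the Lucene wiring itself (setting
the bytes on a token at index time, reading them at search time) is not
shown.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper: packs one OCR bounding box into a fixed-size byte[]
// that could be used as the opaque payload of a term occurrence.
public class OcrBoxPayload {
    // Four 32-bit ints: x, y, width, height (assumed layout).
    public static final int BYTES = 4 * Integer.BYTES;

    // Encode the pixel coordinates as 16 big-endian bytes.
    public static byte[] encode(int x, int y, int width, int height) {
        return ByteBuffer.allocate(BYTES)
                .putInt(x)
                .putInt(y)
                .putInt(width)
                .putInt(height)
                .array();
    }

    // Decode a payload produced by encode() back into {x, y, width, height}.
    public static int[] decode(byte[] payload) {
        if (payload == null || payload.length != BYTES) {
            throw new IllegalArgumentException("expected " + BYTES + " payload bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(payload);
        return new int[] { buf.getInt(), buf.getInt(), buf.getInt(), buf.getInt() };
    }

    public static void main(String[] args) {
        byte[] payload = encode(120, 340, 58, 17);
        System.out.println(Arrays.toString(decode(payload))); // prints [120, 340, 58, 17]
    }
}
```

A fixed-width layout keeps decoding trivial and costs 16 bytes per term
occurrence; a more compact variable-length encoding may be worth
considering for large indexes.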

On Tue, 2008-08-05 at 11:14 -0600, Tricia Williams wrote:
> Hi Martin,
> 
>     Take a look at what I've done with SOLR-380 
> (https://issues.apache.org/jira/browse/SOLR-380). It might solve your 
> problem, or at least give you a good starting point.
> 
> Tricia
> 
> Michael McCandless wrote:
> >
> > I think you could use payloads (= arbitrary/opaque byte[]) for this?
> >
> > You can attach a payload to each term occurrence during tokenization 
> > (indexing), and then retrieve the payload during searching.
> >
> > Mike
> >
> > Martin Owens wrote:
> >
> >> Hello Users,
> >>
> >> I'm working on a project which attempts to store data that comes from an
> >> OCR process which describes the pixel co-ordinates of each term in the
> >> document. It's used for hit highlighting.
> >>
> >> What I would like to do is store this co-ordinate information alongside
> >> the terms. I know there is existing metadata stored per term (word
> >> offsets and char offsets). The problem is that if I create a separate
> >> index and try to use the word offsets or char offsets, not only is it
> >> slower, but the offsets don't match, because Lucene and the OCR program
> >> process the terms differently.
> >>
> >> So, is it possible to store the data alongside the terms in Lucene and
> >> then recall it when doing certain searches? And how much custom code
> >> needs to be written to do it?
> >>
> >> Best Regards, Martin Owens
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>

