lucene-java-user mailing list archives

From Jebarlin Robertson <jebar...@gmail.com>
Subject Re: Regarding Compression Tool
Date Mon, 16 Sep 2013 08:22:07 GMT
I am using Apache Lucene on Android. I have around 1 GB of text documents
(logs). When I index these documents using
*new Field(ContentIndex.KEY_TEXTCONTENT, contents, Field.Store.YES,
Field.Index.ANALYZED, TermVector.WITH_POSITIONS_OFFSETS)*, the index
directory takes up 1.59 GB of storage.
Without Field.Store.YES, the index is around 0.59 GB. If Lucene needs this
much space to build the index and also store the original text just so that
the highlight feature can be used, it will be a big problem for mobile
devices. Is there any alternative way to use the highlight feature on
Android-powered devices without taking up this much space?
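
To make the trade-off concrete, here is a minimal sketch of compressing the text before storing it and restoring it only when a hit has to be highlighted. It uses java.util.zip directly so that it runs without Lucene on the classpath; the class name StoredTextCodec and its methods are hypothetical and are not part of Lucene. Lucene's own CompressionTools exposes compressString and decompressString for the same purpose. The compressed bytes are only useful as a stored value: the searchable field still has to be indexed from the plain text.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical helper: packs text for a stored-only field and restores it for highlighting. */
public class StoredTextCodec {

    /** Deflate-compresses the UTF-8 bytes of the given text. */
    public static byte[] compress(String text) {
        byte[] input = text.getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Inflates bytes produced by compress() back into the original text. */
    public static String decompress(byte[] packed) {
        Inflater inflater = new Inflater();
        inflater.setInput(packed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                // No progress and nothing left to read means the input was cut short.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("Truncated or corrupt compressed data");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("Corrupt compressed data", e);
        } finally {
            inflater.end();
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Build a repetitive, log-like sample; real logs may compress better or worse.
        StringBuilder log = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            log.append("2013-09-16 08:22:07 INFO request ").append(i).append(" handled\n");
        }
        String text = log.toString();
        byte[] packed = compress(text);
        System.out.println("raw bytes: " + text.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("compressed bytes: " + packed.length);
        System.out.println("round trip ok: " + decompress(packed).equals(text));
    }
}
```

In a Lucene setup this would presumably mean adding the text twice: once as an indexed, unstored field for searching, and once as compressed bytes in a stored-only binary field that is decompressed just for the hits being displayed. Whether the space saved is worth the extra decompression work per hit on a phone is something to measure on real log data.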


On Sun, Sep 15, 2013 at 3:26 AM, Erick Erickson <erickerickson@gmail.com> wrote:

> bq: I thought that I can use the CompressionTool to minimize the memory
> size.
>
> This doesn't make a lot of sense. Highlighting needs the raw data to
> figure out what to highlight, so I don't see how the CompressionTool
> will help you there.
>
> And unless you have a huge document and only a very few of them, then
> the memory occupied by the uncompressed data should be trivial
> compared to the various low-level caches. This really seems like
> an XY problem. Perhaps if you backed up and explained _why_ this
> seems important to do, people could be more helpful.
>
>
> Best,
> Erick
>
>
> On Sat, Sep 14, 2013 at 12:21 PM, Jebarlin Robertson <jebarlin@gmail.com> wrote:
>
> > Thank you very much, Erick. Actually, I was using the Highlighter tool, which
> > needs the entire text to be stored in order to get the relevant matching sentence.
> > But when I use it, it consumes more memory (indexed data size +
> > Store.YES, i.e. the entire content) than the actual documents.
> > I thought that I could use the CompressionTool to minimize the memory size.
> > Please help if there is any way to store the entire
> > content and still use the highlighter feature.
> >
> > Thank you
> >
> >
> > On Fri, Sep 13, 2013 at 6:54 PM, Erick Erickson <erickerickson@gmail.com> wrote:
> >
> > > Compression is for the _stored_ data, which is not searched. Ignore
> > > the compression and ensure that you index the data.
> > >
> > > The compressing/decompressing for looking at stored
> > > values is, I believe, done at a very low level that you don't
> > > need to care about at all.
> > >
> > > If you index the data in the field, you shouldn't have to do
> > > anything special to search it.
> > >
> > > Best,
> > > Erick
> > >
> > >
> > > On Fri, Sep 13, 2013 at 1:19 AM, Jebarlin Robertson <jebarlin@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > I am trying to store all the Field values using CompressionTool, but when I
> > > > search for any content, it does not find any results.
> > > >
> > > > Can you help me with how to create the Field with CompressionTool to add to the
> > > > Document, and how to decompress it when searching for any content in it?
> > > >
> > > > --
> > > > Thanks & Regards,
> > > > Jebarlin Robertson.R
> > > >
> > >
> >
> >
> >
> > --
> > Thanks & Regards,
> > Jebarlin Robertson.R
> > GSM: 91-9538106181.
> >
>



-- 
Thanks & Regards,
Jebarlin Robertson.R
GSM: 91-9538106181.
