lucene-java-user mailing list archives

From Jebarlin Robertson <jebar...@gmail.com>
Subject Re: Regarding Compression Tool
Date Tue, 17 Sep 2013 09:37:28 GMT
Thanks Mark.

I understand the trade-offs around battery and space. For now I am only
checking feasibility.
What I originally wanted to ask is how to use CompressionTools to compress
the data and store it in the index.
Based on the Javadoc quoted below, I tried the following:

    Field field = new Field("contents", contents, Field.Store.NO,
            Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS);
    Field field1 = new Field("contents",
            CompressionTools.compressString(contents), Field.Store.YES);

With this I am able to search, but when I try to get the stored content from
the document, it returns null.
Could you please give me some sample code that uses CompressionTools?

public static final Field.Store COMPRESS
<http://lucene.apache.org/core/2_9_4/api/all/org/apache/lucene/document/Field.Store.html>

Deprecated. Please use CompressionTools
<http://lucene.apache.org/core/2_9_4/api/all/org/apache/lucene/document/CompressionTools.html>
instead.
For string fields that were previously indexed and stored using
compression, the new way to achieve this is: First add the field
indexed-only (no store) and additionally using the same field name as a
binary, stored field with
CompressionTools.compressString(java.lang.String)
<http://lucene.apache.org/core/2_9_4/api/all/org/apache/lucene/document/CompressionTools.html#compressString(java.lang.String)>.

Store the original field value in the index in a compressed form. This is
useful for long documents and for binary valued fields.


On Mon, Sep 16, 2013 at 9:56 PM, Mark Miller <developmentalmadness@gmail.com> wrote:

> Have you considered storing your indexes server-side? I haven't used
> compression but usually the trade-off of compression is CPU usage which
> will also be a drain on battery life. Or maybe consider how important the
> highlighter is to your users - is it worth the trade-off of either disk
> space or battery life? If it's more of a nice-to-have then maybe hold off
> on the feature for a later release until you've had some feedback and some
> more time to figure out the best solution. Of course I don't know much
> about your application, so take my advice with a grain of salt.
>
>
> On Mon, Sep 16, 2013 at 2:22 AM, Jebarlin Robertson <jebarlin@gmail.com> wrote:
>
> > I am using Apache Lucene on Android. I have around 1 GB of text documents
> > (logs). When I index these text documents using
> > new Field(ContentIndex.KEY_TEXTCONTENT, contents, Field.Store.YES,
> > Field.Index.ANALYZED, TermVector.WITH_POSITIONS_OFFSETS), the index
> > directory takes up 1.59 GB of storage.
> > Without Field.Store.YES the index is around 0.59 GB. If Lucene needs this
> > much space to build the index and to store the original text just so the
> > highlight feature can be used, it will be a big problem for mobile
> > devices. So I would like to know whether there is any alternative way to
> > use the highlight feature on Android devices without taking up so much
> > space.
> >
> >
> > On Sun, Sep 15, 2013 at 3:26 AM, Erick Erickson <erickerickson@gmail.com> wrote:
> >
> > > bq: I thought that I can use the CompressionTool to minimize the memory
> > > size.
> > >
> > > This doesn't make a lot of sense. Highlighting needs the raw data to
> > > figure out what to highlight, so I don't see how the CompressionTool
> > > will help you there.
> > >
> > > And unless you have a huge document and only a very few of them, then
> > > the memory occupied by the uncompressed data should be trivial
> > > compared to the various low-level caches. This really seems like
> > > an XY problem. Perhaps if you backed up and explained _why_ this
> > > seems important to do, people could be more helpful.
> > >
> > >
> > > Best,
> > > Erick
> > >
> > >
> > > On Sat, Sep 14, 2013 at 12:21 PM, Jebarlin Robertson <jebarlin@gmail.com> wrote:
> > >
> > > > Thank you very much Erick. Actually I was using the Highlighter tool,
> > > > which needs the entire data to be stored to get the relevant searched
> > > > sentence.
> > > > But when I use that, it consumes more memory (indexed data size +
> > > > Store.YES, the entire content) than the actual documents size.
> > > > I thought that I can use the CompressionTool to minimize the memory
> > > > size.
> > > > Please help if there is any possibility or way to store the entire
> > > > content and still use the highlighter feature.
> > > >
> > > > Thank you
> > > >
> > > >
> > > > On Fri, Sep 13, 2013 at 6:54 PM, Erick Erickson <erickerickson@gmail.com> wrote:
> > > >
> > > > > Compression is for the _stored_ data, which is not searched. Ignore
> > > > > the compression and ensure that you index the data.
> > > > >
> > > > > The compressing/decompressing for looking at stored
> > > > > values is, I believe, done at a very low level that you don't
> > > > > need to care about at all.
> > > > >
> > > > > If you index the data in the field, you shouldn't have to do
> > > > > anything special to search it.
> > > > >
> > > > > Best,
> > > > > Erick
> > > > >
> > > > >
> > > > > On Fri, Sep 13, 2013 at 1:19 AM, Jebarlin Robertson <jebarlin@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > > I am trying to store all the Field values using CompressionTool,
> > > > > > > but when I search for any content, it does not find any results.
> > > > > > >
> > > > > > > Can you help me with how to create the Field with CompressionTool
> > > > > > > to add to the Document, and how to decompress it when searching
> > > > > > > for any content in it?
> > > > > > >
> > > > > > --
> > > > > > Thanks & Regards,
> > > > > > Jebarlin Robertson.R
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Thanks & Regards,
> > > > Jebarlin Robertson.R
> > > > GSM: 91-9538106181.
> > > >
> > >
> >
> >
> >
> > --
> > Thanks & Regards,
> > Jebarlin Robertson.R
> > GSM: 91-9538106181.
> >
>
>
>
> --
> Mark J. Miller
> Blog: http://www.developmentalmadness.com
> LinkedIn: http://www.linkedin.com/in/developmentalmadness
>



-- 
Thanks & Regards,
Jebarlin Robertson.R
GSM: 91-9538106181.
