lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Out of memory problem in search
Date Thu, 15 Jul 2010 00:50:40 GMT
This doesn't make sense to me. Are you saying that you only have 200,000
documents in your index? Because keeping a score for 200K documents should
consume a relatively trivial amount of memory. The fact that you're sorting
by time is a red flag, but it's only a long, so 200K documents shouldn't
strain memory due to sorting either. The critical thing here isn't
necessarily the size of your index, but the number of documents in that
index and the number of unique values you're sorting by. By the way, what
happens if you don't sort?
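
As a rough illustration of why 200K documents should not strain memory when sorting on a long, here is a back-of-the-envelope sketch. It assumes the sort keeps one 8-byte long per document in the index and ignores any per-object overhead; the class and method names are made up for this example and are not part of Lucene.

```java
public class SortMemoryEstimate {

    // Assumption: sorting on a long field keeps one 8-byte value per document.
    // Object headers and other bookkeeping are deliberately ignored.
    static long bytesForLongSort(long docCount) {
        return docCount * Long.BYTES;
    }

    public static void main(String[] args) {
        long docs = 200_000L; // document count mentioned in this thread
        long bytes = bytesForLongSort(docs);
        System.out.println(docs + " docs need roughly " + bytes + " bytes for the sort values");
    }
}
```

Under that assumption the sort values come to under 2 MB, which is why the document count alone does not explain an out-of-memory error.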

Since it doesn't make sense to me, that must mean I don't understand the
problem very thoroughly. Could you provide some index characteristics?
Saying it's 40G leaves a lot open to speculation. That could be 39G of
stored text which is mostly irrelevant for searching. Or it could be
entirely indexed, tokenized data which would be a different thing. How many
documents do you have in your index? What does your query look like?

You can get an idea of how much of your index holds indexed tokens by
NOT storing any of the fields, just indexing them (Field.Store.NO).

What version of Lucene are you using? How do you start your process? If you
start the application with java's default memory, that's not very much (64M
if memory serves). You may be using nowhere near your hardware limits. Try
specifying -Xmx512M and/or the -server option.
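
To see how much heap the JVM is actually allowed to use, a minimal check like the following may help. It relies only on the standard Runtime.maxMemory() call; the class name HeapCheck is just a placeholder for this example.

```java
public class HeapCheck {

    // Returns the maximum heap the JVM will attempt to use, in megabytes.
    // Runtime.maxMemory() reports Long.MAX_VALUE when there is no inherent limit.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMegabytes() + " MB");
    }
}
```

Running it once with no options and once with something like "java -Xmx512M HeapCheck" should show whether the setting is taking effect. The reported figure can be somewhat lower than the value passed to -Xmx, depending on the JVM.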

Best
Erick

On Wed, Jul 14, 2010 at 9:27 AM, ilkay polat <ilkay_polat@yahoo.com> wrote:

> I am also confused about Lucene's memory management.
>
> Does this out-of-memory problem mainly arise from Reason-1 or Reason-2?
>
> Reason-1: The problem comes from searching a big index (nearly 40 GB). If
> a search over a 60 GB index returned only 100 records (a small number),
> the problem would still arise.
> OR
> Reason-2: The problem comes from finding so many records (nearly 200,000),
> which means 200,000 Java objects on the heap. If the index were small
> (10 GB) but many records were returned, the problem would still arise.
>
> Is there any document that describes the general memory management issues
> in Lucene searching?
>
> Thanks
>
>
> ilkay POLAT     Software Engineer   Gsm : (+90) 532 542 36 71
>   E-mail : ilkay_polat@yahoo.com
>
> --- On Wed, 7/14/10, ilkay polat <ilkay_polat@yahoo.com> wrote:
>
> From: ilkay polat <ilkay_polat@yahoo.com>
> Subject: Re: Out of memory problem in search
> To: java-user@lucene.apache.org
> Date: Wednesday, July 14, 2010, 3:54 PM
>
> Hi,
> We have hardware restrictions (max RAM is 8 GB), so unfortunately
> increasing memory is not an option for us right now.
>
> Yes, as you said, the problem appears when going to the last pages of the
> search screen, because the search method we use finds the top n records. In
> other words, it means "searching everything returns everything".
>
> I am now researching whether Lucene's search mechanism offers a way to
> trade time for memory. Any other ideas?
>
> Thanks
>
> --- On Wed, 7/14/10, findbestopensource <findbestopensource@gmail.com>
> wrote:
>
> From: findbestopensource <findbestopensource@gmail.com>
> Subject: Re: Out of memory problem in search
> To: java-user@lucene.apache.org
> Date: Wednesday, July 14, 2010, 2:59 PM
>
> Certainly it will. Either you need to increase your memory OR refine your
> query, even though you display paginated results. The first couple of pages
> will display fine, but going towards the last pages you may face problems.
> This is because 200,000 objects are created and iterated, 199,900 objects
> are skipped, and the last 100 objects are returned. The memory is consumed
> in creating these objects.
>
> Regards
> Aditya
> www.findbestopensource.com
>
>
>
> On Wed, Jul 14, 2010 at 4:14 PM, ilkay polat <ilkay_polat@yahoo.com>
> wrote:
>
> > Hello Friends;
> >
> > Recently I have had a memory problem with Lucene search because the
> > index is so big. (I have indexed several kinds of information and the
> > index is roughly 40 gigabytes.)
> >
> > I search the lucene indexed file with
> > org.apache.lucene.search.Searcher.search(query, null, offset + limit, new
> > Sort(new SortField("time", SortField.LONG, true)));
> > (This returns the top (offset + limit) records.)
> >
> > I search by range. For example, on the web page I first request the
> > records in the [0, 100] range, then on the second page [100, 200].
> > I have nearly 200,000 records in all. When I go to the last page, which
> > means the records between 199,900 and 200,000, the JVM runs out of memory
> > (out of memory error). The machine has 4 GB of RAM.
> >
> > Is there a way to overcome this memory problem?
> >
> > Thanks
> >
> > --
> > ilkay POLAT   Software Engineer
> > TURKEY
> >
> >  Gsm : (+90) 532 542 36 71
> >  E-mail : ilkay_polat@yahoo.com
> >
