lucene-dev mailing list archives

From Lyuba Romanchuk <lyuba.romanc...@gmail.com>
Subject Fwd: Adding new functionality to avoid "java.lang.OutOfMemoryError: Java heap space" exception
Date Tue, 09 Apr 2013 12:02:28 GMT
It seems the bullets didn't render properly, so I'm resending the
explanation without them.

The flow of the SolrCore.execute() function will be changed:

Change the status of the core to “USED” and call the
waitForResource(SolrRequestHandler, SolrQueryRequest) function; after that,
perform the current SolrCore.execute() flow and change the status of the
core back to “UNUSED”.

In the waitForResource(SolrRequestHandler, SolrQueryRequest) function,
first estimate the memory required for this query/handler on this core. If
there are not enough free resources to run the query, and unloading all
unused, non-permanent cores still does not free enough, throw an
"OutOfMemoryError" exception and change the status of the core to “UNUSED”;
otherwise, wait with a timeout until some resource is released and then
check again, until the required resource is available or the exception is
thrown.

Best regards,

Lyuba

---------- Forwarded message ----------
From: Lyuba Romanchuk <lyuba.romanchuk@gmail.com>
Date: Tue, Apr 9, 2013 at 11:47 AM
Subject: Adding new functionality to avoid "java.lang.OutOfMemoryError:
Java heap space" exception
To: dev@lucene.apache.org


Hi all,

We run Solr (4.2 and 5.0) in a real-time environment with big data. Each
day two Solr cores are generated that can reach ~8-10 GB each, depending on
the insertion rates and on the hardware.

Currently, all cores are loaded on solr startup.

The query rate is not high, but responses must be quick and must be
returned even for old data and over a large time frame.

There are many simple queries (facet/facet.pivot on fields with a small
value distribution), but there are also heavy queries such as facet.pivot
on widely distributed fields. We use distributed search to query the cores
and usually query over 1-2 weeks (around 7-28 cores).

After some large queries (with facet.pivot on widely distributed fields) we
sometimes encounter a "java.lang.OutOfMemoryError: Java heap space"
exception.

The software is to be deployed to customer sites, so increasing memory is
not always possible, and customers may accept slower responses for the
larger queries if we can provide them.

We looked at the LotsOfCores functionality that was added in 4.1 and 4.2.
It allows defining an upper limit on the number of loaded cores and unloads
them on an LRU basis when the cache is full. However, our case seems to
require a more general mechanism:

* Only cores that are used for updates/inserts must stay loaded at all
times. Other cores, which are only queried, should be loaded / unloaded on
demand while the query runs, until completion – according to memory demands.

* Each facet/facet.pivot query must have its memory consumption estimated.
If there is not enough memory to run the query on all cores concurrently,
it must be split into sequential queries, unloading already-queried or
irrelevant cores (but not permanent cores) and loading older cores to
complete the query.

* Occasionally, the oldest cores should be unloaded according to a
configurable policy (for example, one type of high-volume cores would be
kept loaded for 1 week, while smaller cores could remain loaded for a
month). The policy allows data that we know is queried less but is higher
volume to be kept live over shorter time periods.

We are considering adding the following functionality to Solr (optional –
turned on by new configs):

The flow of the SolrCore.execute() function will be changed:


   - Change the status of the core to “USED”
   - Call the waitForResource(SolrRequestHandler, SolrQueryRequest) function
      - estimate the required memory for this query/handler on this core
      - if there are not enough free resources to run the query then
         - if all cores are permanent and can’t be unloaded then
            - throw an "OutOfMemoryError" exception // here the status of
            the core should be changed to “UNUSED”
         - else
            - try to unload unused, non-permanent cores
            - if unloading unused cores didn’t release enough resources and
            no further core can be unloaded then
               - throw an "OutOfMemoryError" exception // here the status
               of the core should be changed to “UNUSED”
            - if unloading unused cores didn’t release enough resources but
            there are still cores that can be unloaded then
               - wait with a timeout until some resource is released
               - check again until the required resource is available or
               the exception is thrown
      - reserve the resource
   - Call the current SolrCore.execute()
   - Change the status of the core back to “UNUSED”
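
The wait/timeout/reserve part of the steps above could be sketched in
plain Java, independent of Solr's actual classes (ResourceGate, the byte
accounting, and the method names are illustrative stand-ins, not existing
Solr APIs), roughly as follows:

```java
// Standalone sketch of the proposed waitForResource() logic. A real
// implementation would also try to unload unused, non-permanent cores
// inside the retry loop; here only the memory accounting, the timed wait,
// and the OutOfMemoryError path are illustrated.
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class ResourceGate {
    private final long capacityBytes;
    private long usedBytes = 0L;
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition released = lock.newCondition();

    public ResourceGate(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /** Block until 'required' bytes can be reserved, or throw after 'timeoutMs'. */
    public void waitForResource(long required, long timeoutMs)
            throws InterruptedException {
        lock.lock();
        try {
            long deadline = System.nanoTime()
                    + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            while (usedBytes + required > capacityBytes) {
                // Real flow: first attempt to unload unused, non-permanent
                // cores here; only then fall back to waiting.
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0L) {
                    // Caller would also flip the core status back to UNUSED.
                    throw new OutOfMemoryError(
                            "cannot reserve " + required + " bytes");
                }
                released.awaitNanos(remaining);
            }
            usedBytes += required; // reserve the resource
        } finally {
            lock.unlock();
        }
    }

    /** Called when the query finishes and the core goes back to UNUSED. */
    public void release(long bytes) {
        lock.lock();
        try {
            usedBytes -= bytes;
            released.signalAll();
        } finally {
            lock.unlock();
        }
    }
}
```

The timed wait lets a queued query pick up memory released by a finishing
query instead of failing immediately, which matches the "slower responses
rather than errors" trade-off described above.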

We would like to get some initial feedback on the design / functionality
we’re proposing, as we feel this really benefits real-time, high-volume
indexing systems such as ours. We are also happy to contribute the code
back if you feel there is a need for this functionality.

Best regards,

Lyuba
