lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-10205) Evaluate and reduce BlockCache store failures
Date Thu, 02 Mar 2017 01:53:45 GMT

     [ https://issues.apache.org/jira/browse/SOLR-10205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-10205:
--------------------------------
    Attachment: SOLR-10205.patch

Here's a snapshot of the modifications and tests I'm using to reduce store failures and evaluate
the potential impact.

It's messy... yet another hacked-up version of the HDFS performance test I used to uncover
the BlockCache corruption issues.  I don't plan on committing most of this; it's just for
evaluating what should be committed to reduce store failures.

> Evaluate and reduce BlockCache store failures
> ---------------------------------------------
>
>                 Key: SOLR-10205
>                 URL: https://issues.apache.org/jira/browse/SOLR-10205
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Yonik Seeley
>            Assignee: Yonik Seeley
>         Attachments: SOLR-10205.patch
>
>
> The BlockCache is written such that requests to cache a block (a BlockCache.store call)
> can fail, making caching less effective.  We should evaluate the impact of these storage
> failures and potentially reduce their number.
> The implementation reserves a single block of memory.  In store, a block of memory is
> allocated, and then a pointer is inserted into the underlying map.  A block is only freed
> when the underlying map evicts the map entry.
> This means that when two store() operations are called concurrently (even under low load),
> one can fail.  This is made worse by the fact that concurrent maps tend to amortize the
> cost of eviction over many keys (i.e. the actual size of the map can grow beyond the
> configured maximum number of entries; both the older ConcurrentLinkedHashMap and the newer
> Caffeine do this).  When that happens, store() won't be able to find a free block of
> memory even when there are no other concurrent store operations.




