incubator-blur-dev mailing list archives

From "Rahul Challapalli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BLUR-230) Change the index merging code to not affect the Block Cache
Date Wed, 02 Oct 2013 07:57:28 GMT

    [ https://issues.apache.org/jira/browse/BLUR-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783730#comment-13783730 ]

Rahul Challapalli commented on BLUR-230:
----------------------------------------

I haven't looked into it in great detail, but does this sound like the right approach?

We can check whether the current operation is a merge with something like the code below:
  if (ioContext.context == IOContext.Context.MERGE) {
      // merge read: do not update the block cache
  }
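
If that check holds, one place it could live is in the directory code that decides whether
a read goes through the block cache. Below is a rough sketch only (MergeAwareCacheDirectory
and openCachedInput are made-up names, not Blur's actual CacheDirectory API); it just shows
where the IOContext test would sit:

  import java.io.IOException;

  import org.apache.lucene.store.Directory;
  import org.apache.lucene.store.IOContext;
  import org.apache.lucene.store.IndexInput;

  // Sketch of a cache-aware wrapper around a plain Lucene Directory.
  public class MergeAwareCacheDirectory {

    private final Directory underlying; // e.g. the HDFS-backed directory

    public MergeAwareCacheDirectory(Directory underlying) {
      this.underlying = underlying;
    }

    public IndexInput openInput(String name, IOContext ioContext) throws IOException {
      if (ioContext.context == IOContext.Context.MERGE) {
        // Merge reads go straight to the underlying store and never touch
        // the block cache, so they cannot evict blocks that live queries need.
        return underlying.openInput(name, ioContext);
      }
      // Normal reads would go through the caching path instead.
      return openCachedInput(name, ioContext);
    }

    private IndexInput openCachedInput(String name, IOContext ioContext) throws IOException {
      // Placeholder: in a real directory this would return an IndexInput
      // that reads through the block cache and populates it on misses.
      return underlying.openInput(name, ioContext);
    }
  }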

Or, if that doesn't work, we could override the merge() method in BlurIndexWriter and disable
cache updates for the duration of the merge using some flag in CacheDirectory. This doesn't
sound right, though, since it would effectively turn off cache updates for other searches that
happen while the merge is in progress.
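
To spell out why that second option bothers me, here is a minimal sketch of the flag idea
(all names below are made up, nothing from the real CacheDirectory): because the flag is a
single switch on the directory, every reader stops warming the cache while any merge runs:

  import java.util.concurrent.atomic.AtomicBoolean;

  // Hypothetical sketch of the flag-based approach; not actual Blur code.
  public class FlagControlledBlockCache {

    // One switch for the whole directory, flipped by the merge thread.
    private final AtomicBoolean cacheUpdatesEnabled = new AtomicBoolean(true);

    public void beforeMerge() {
      cacheUpdatesEnabled.set(false);
    }

    public void afterMerge() {
      cacheUpdatesEnabled.set(true);
    }

    void onBlockRead(long blockId, byte[] data) {
      if (!cacheUpdatesEnabled.get()) {
        // While any merge is running, every reader (including live searches)
        // skips cache updates -- this is the drawback mentioned above.
        return;
      }
      // ... put the block into the cache ...
    }
  }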

Let me know what you think.

> Change the index merging code to not affect the Block Cache
> -----------------------------------------------------------
>
>                 Key: BLUR-230
>                 URL: https://issues.apache.org/jira/browse/BLUR-230
>             Project: Apache Blur
>          Issue Type: New Feature
>          Components: Blur
>    Affects Versions: 0.3.0
>            Reporter: Aaron McCurry
>             Fix For: 0.3.0
>
>
> The current implementation of the index merge scheduler causes the Block Cache directory
> to read all of the data for the segments being merged into the cache while it writes the
> new segment.  This may evict more useful blocks and thus cause performance issues.
> This task will likely have to modify the Block Cache so that if the needed data is present
> in the cache it is used, but if it is missing it is read from HDFS and does NOT update the
> Block Cache.  Also, if the data was present in the cache, the fact that the scheduler read
> it should not count as a hit.  An LRU cache is basically a queue with a map: every time
> something is read through the map, the item is removed from the queue and added back to
> the front, which makes it less likely to be evicted.  During merges this behavior should
> be bypassed.
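
To make that last point concrete, here is a toy LRU sketch (not Blur's block cache; the
get(key, isMergeRead) signature is invented for illustration) in which a merge read returns
the cached value without promoting it, so serving the merge does not protect the block from
eviction:

  import java.util.LinkedHashMap;

  // Toy LRU cache illustrating "merge reads do not count as hits".
  public class MergeAwareLru<K, V> {

    private final int capacity;
    // Insertion-ordered map: the first entry is the eviction candidate.
    private final LinkedHashMap<K, V> map = new LinkedHashMap<K, V>();

    public MergeAwareLru(int capacity) {
      this.capacity = capacity;
    }

    public synchronized V get(K key, boolean isMergeRead) {
      V value = map.get(key);
      if (value != null && !isMergeRead) {
        // Normal read: promote to most-recently-used by re-inserting.
        map.remove(key);
        map.put(key, value);
      }
      // Merge read: the value is returned but its position is left untouched.
      return value;
    }

    public synchronized void put(K key, V value) {
      map.remove(key);
      map.put(key, value);
      if (map.size() > capacity) {
        // Evict the least-recently-used entry (the eldest one).
        K eldest = map.keySet().iterator().next();
        map.remove(eldest);
      }
    }
  }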



--
This message was sent by Atlassian JIRA
(v6.1#6144)
