From: Enis Söztutar <enis@apache.org>
Date: Wed, 4 Apr 2012 15:01:23 -0700
Subject: Re: keyvalue cache
To: dev@hbase.apache.org

Not sure about memcached- or coprocessor-based implementations, where you
would lose a consistent view over your data.
I think one of the lucene-over-hbase implementations uses a memory cache
(can't remember if it was memcache) over hbase index readers and writers.
You can do memcache deployments with zero code change to hbase, but I
haven't heard of anyone doing it other than those guys. Has anyone tried
it?

On Wed, Apr 4, 2012 at 2:53 PM, Matt Corgan wrote:
> in the meantime, memcached could provide all those benefits without
> adding any complexity to hbase...
>
> On Wed, Apr 4, 2012 at 2:46 PM, Matt Corgan wrote:
>> It could act like a HashSet of KeyValues keyed on the
>> rowKey+family+qualifier but not including the timestamp. As writes come
>> in it would evict or overwrite previous versions (read-through vs
>> write-through). It would only service point queries where the
>> row+fam+qualifier are specified, returning the latest version. It
>> wouldn't be able to do a typical rowKey-only Get (a scan behind the
>> scenes) because it wouldn't know whether it contained all the cells in
>> the row, but if you could specify all your row's qualifiers up front it
>> could work.
>>
>> On Wed, Apr 4, 2012 at 2:30 PM, Vladimir Rodionov
>> <vrodionov@carrieriq.com> wrote:
>>> 1. 2KB can be too large for some applications. For example, some of
>>> our k-v sizes are < 100 bytes combined.
>>> 2. These tables (from 1.) do not benefit from the block cache at all
>>> (we did not try a 100 B block size yet :)
>>> 3. And Matt is absolutely right: small block sizes are expensive.
>>>
>>> How about doing point queries on the K-V cache and bypassing the K-V
>>> cache on all Scans (when someone really needs them)?
>>> Implement the K-V cache as a coprocessor application?
>>>
>>> Invalidation of a K-V entry is not necessary if all upsert operations
>>> go through the K-V cache first, i.e. if it sits in front of the
>>> MemStore. There will be no "stale or invalid" data situation in that
>>> case. Correct?
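[Editor's note: Matt's HashSet-of-KeyValues idea above, combined with Vladimir's point that write-through makes invalidation unnecessary, could be sketched roughly as below. This is a hypothetical illustration only, not HBase's actual API; the class and method names are invented, and a real implementation would use byte-comparable composite keys rather than strings.]

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical point-query cache: the latest cell value, keyed on
 * row + family + qualifier and ignoring timestamps. Writes simply
 * overwrite (write-through), so entries never need explicit
 * invalidation as long as every upsert passes through the cache
 * before reaching the MemStore.
 */
class KvCache {
    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();

    private static String key(byte[] row, byte[] family, byte[] qualifier) {
        // Simplified composite key; illustration only.
        return Arrays.toString(row) + "/" + Arrays.toString(family)
                + "/" + Arrays.toString(qualifier);
    }

    /** Write-through: a newer version replaces the older one in place. */
    void put(byte[] row, byte[] family, byte[] qualifier, byte[] value) {
        cache.put(key(row, family, qualifier), value);
    }

    /**
     * Serves only fully-specified point queries, returning the latest
     * version, or null on a miss. A rowKey-only Get cannot be answered
     * here because the cache cannot know it holds the whole row.
     */
    byte[] get(byte[] row, byte[] family, byte[] qualifier) {
        return cache.get(key(row, family, qualifier));
    }
}
```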
>>> No need for the data to be sorted, and no need for the data to be
>>> merged into a scan (we do not use the K-V cache for Scans).
>>>
>>> Best regards,
>>> Vladimir Rodionov
>>> Principal Platform Engineer
>>> Carrier IQ, www.carrieriq.com
>>> e-mail: vrodionov@carrieriq.com
>>>
>>> ________________________________________
>>> From: Matt Corgan [mcorgan@hotpads.com]
>>> Sent: Wednesday, April 04, 2012 11:40 AM
>>> To: dev@hbase.apache.org
>>> Subject: Re: keyvalue cache
>>>
>>> I guess the benefit of the KV cache is that you are not holding entire
>>> 64K blocks in memory when you only care about 200 bytes of them. Would
>>> an alternative be to set a small block size (2KB or less)?
>>>
>>> The problems with small block sizes would be expensive block cache
>>> management overhead and inefficient scanning IO due to the lack of
>>> read-ahead. Maybe improving the cache management and read-ahead would
>>> be more general improvements that don't add as much complexity?
>>>
>>> I'm having a hard time envisioning how you would do invalidations on
>>> the KV cache, how you would merge its entries into a scan, etc. Would
>>> it basically be a memstore in front of the memstore, where KVs get
>>> individually invalidated instead of bulk-flushed? Would it be sorted
>>> or hashed?
>>>
>>> Matt
>>>
>>> On Wed, Apr 4, 2012 at 10:35 AM, Enis Söztutar wrote:
>>>> As you said, caching the entire row does not make much sense, given
>>>> that the families are by contract the access boundaries. But caching
>>>> column families might be a good trade-off for dealing with the
>>>> per-item overhead.
>>>>
>>>> Also agreed on the cache being configurable at the table or, better,
>>>> the cf level. I think we can do something like enable_block_cache =
>>>> true, enable_kv_cache = false, per column family.
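[Editor's note: Enis's enable_block_cache / enable_kv_cache suggestion could look roughly like the following per-column-family switch. The flag names mirror his proposal, but the class itself is invented for illustration and is not an HBase API; the defaults (block cache on, kv cache off) are an assumption.]

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-column-family cache switches, as suggested above. */
class FamilyCacheConfig {
    private final Map<String, Boolean> blockCacheEnabled = new HashMap<>();
    private final Map<String, Boolean> kvCacheEnabled = new HashMap<>();

    /** Set both flags for one column family. */
    void setFamily(String family, boolean enableBlockCache, boolean enableKvCache) {
        blockCacheEnabled.put(family, enableBlockCache);
        kvCacheEnabled.put(family, enableKvCache);
    }

    /** Assumed default: block cache on, matching today's behavior. */
    boolean useBlockCache(String family) {
        return blockCacheEnabled.getOrDefault(family, true);
    }

    /** Assumed default: kv cache off, since it would be the new feature. */
    boolean useKvCache(String family) {
        return kvCacheEnabled.getOrDefault(family, false);
    }
}
```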
>>>> Enis
>>>>
>>>> On Tue, Apr 3, 2012 at 11:03 PM, Vladimir Rodionov wrote:
>>>>> Usually makes sense for tables with mostly random access (point
>>>>> queries). For short-to-long scans the block cache is preferable.
>>>>> Cassandra has it (the row cache), but since they cache the whole row
>>>>> (which can be very large), in many cases it has sub-par performance.
>>>>> It makes sense to make caching configurable: a table can use the
>>>>> key-value cache and not the block cache, and vice versa.
>>>>>
>>>>> Best regards,
>>>>> Vladimir Rodionov
>>>>> Principal Platform Engineer
>>>>> Carrier IQ, www.carrieriq.com
>>>>> e-mail: vrodionov@carrieriq.com
>>>>>
>>>>> ________________________________________
>>>>> From: Enis Söztutar [enis@apache.org]
>>>>> Sent: Tuesday, April 03, 2012 3:34 PM
>>>>> To: dev@hbase.apache.org
>>>>> Subject: keyvalue cache
>>>>>
>>>>> Hi,
>>>>>
>>>>> Before opening the issue, I thought I should ask around first. What
>>>>> do you think about a keyvalue cache sitting on top of the block
>>>>> cache? It is mentioned in the Bigtable paper, and it seems that
>>>>> zipfian kv access patterns might benefit from something like this a
>>>>> lot. I could not find anybody who has proposed it before.
>>>>>
>>>>> What do you guys think? Should we pursue a kv query-cache? My gut
>>>>> feeling says that for some workloads we might gain significant
>>>>> performance improvements, but we cannot verify that until we
>>>>> implement and profile it, right?
>>>>>
>>>>> Thanks,
>>>>> Enis
>>>>>
>>>>> Confidentiality Notice: The information contained in this message,
>>>>> including any attachments hereto, may be confidential and is
>>>>> intended to be read only by the individual or entity to whom this
>>>>> message is addressed.
>>>>> If the reader of this message is not the intended recipient or an
>>>>> agent or designee of the intended recipient, please note that any
>>>>> review, use, disclosure or distribution of this message or its
>>>>> attachments, in any form, is strictly prohibited. If you have
>>>>> received this message in error, please immediately notify the sender
>>>>> and/or Notifications@carrieriq.com and delete or destroy any copy of
>>>>> this message and its attachments.
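[Editor's note: as a back-of-the-envelope illustration of Matt's point earlier in the thread, caching whole blocks to serve tiny cells wastes memory roughly in proportion to blockSize / cellSize, which is what a kv cache (or a much smaller block size) would claw back. The arithmetic below uses only the figures mentioned in the thread (64K blocks, 200-byte cells, a 2KB alternative).]

```java
/** Back-of-the-envelope memory amplification from Matt's example. */
class BlockOverhead {
    static long amplification(long blockSizeBytes, long usefulBytes) {
        // Each cached block pins the whole block in memory even if
        // only one small cell in it is ever read.
        return blockSizeBytes / usefulBytes;
    }

    public static void main(String[] args) {
        // 64 KB block serving a single 200-byte cell → 327x overhead
        System.out.println(amplification(64 * 1024, 200));
        // Matt's 2 KB alternative → 10x overhead
        System.out.println(amplification(2 * 1024, 200));
    }
}
```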