From: Eric Evans <eevans@wikimedia.org>
Date: Tue, 19 Jan 2016 11:43:26 -0600
To: user@cassandra.apache.org
Subject: Re: Using Cassandra as a BLOB store / web cache.

On Mon, Jan 18, 2016 at 8:52 PM, Kevin Burton <burton@spinn3r.com> wrote:

> Internally we have the need for a blob store for web content. It's
> MOSTLY key/value based, but we'd like to have lookups by coarse-grained
> tags.
>
> This needs to store normal web content like HTML, CSS, JPEG, SVG, etc.
>
> I highly doubt that anything over 5MB would need to be stored.
>
> We also need the ability to store older versions of the same URL for
> features like "time travel", where we can see what the web looks like
> over time.
>
> I initially wrote this for Elasticsearch (and it works well for that),
> but it looks like binaries snuck into the set of requirements.
>
> I could Base64 encode/decode them in ES I guess, but that seems ugly.
>
> I was thinking of porting this over to C*, but I'm not up to date on
> the current state of blobs in C*...
>
> Any advice?

We (Wikimedia Foundation) use Cassandra as a durable cache for HTML (with
history). A simplified version of the schema we use would look something
like:
CREATE TABLE data (
    "_domain" text,
    key text,
    rev int,
    tid timeuuid,
    value blob,
    PRIMARY KEY (("_domain", key), rev, tid)
);

In our case, a 'rev' represents a normative change to the document (read:
someone made an edit), and the 'tid' attribute allows for some arbitrary
number of HTML representations of that revision (if, say, some
transclusion would alter the final outcome). You could simplify this
further by removing the 'tid' attribute if it doesn't apply to you.
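
To make the access patterns concrete (the domain, key, and revision
values below are made up purely for illustration), reads and writes
against a table like this might look something like:

INSERT INTO data ("_domain", key, rev, tid, value)
VALUES ('en.wikipedia.org', '/wiki/Cassandra', 42, now(),
        textAsBlob('<html>...</html>'));

-- latest render of the newest stored revision
SELECT value FROM data
 WHERE "_domain" = 'en.wikipedia.org' AND key = '/wiki/Cassandra'
 ORDER BY rev DESC, tid DESC LIMIT 1;

-- "time travel": renders at or before some earlier revision
SELECT rev, tid, value FROM data
 WHERE "_domain" = 'en.wikipedia.org' AND key = '/wiki/Cassandra'
   AND rev <= 17;
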
One concern here is the size of blobs. Where exactly the threshold on
size should be is debatable, but if you are using G1GC I would be careful
about what large blobs do to humongous allocations. G1 allocates anything
over half the region size as humongous and special-cases the handling of
it, so humongous allocations should be the exception, not the rule.
Depending on your heap size and the distribution of blob sizes, you might
be able to get by with overriding the GC's choice of region size, but if
5MB values are at all common you'll need 16MB regions (which probably
won't work well without a correspondingly large max heap).
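
For reference, overriding the region size is just a JVM flag; with the
stock cassandra-env.sh that might look something like the following (a
sketch only, assuming G1 is already enabled and the max heap has been
sized to match; the values are not a recommendation):

# illustrative only: 16MB regions keep ~5MB blobs below the
# humongous threshold (half a region)
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -XX:G1HeapRegionSize=16m"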

Another concern is row width. With a data model like this, rows grow
relative to the number of versions stored. If versions are added at a low
rate, that might not pose an issue in practice; if it does, though,
you'll need to consider a different partitioning strategy.
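
If partition growth does become a problem, one common approach (just a
sketch, not something the schema above does) is to fold a coarse bucket
into the partition key, at the cost of having to know, or iterate over,
the buckets at read time:

CREATE TABLE data_bucketed (
    "_domain" text,
    key text,
    bucket int,        -- e.g. rev / 1000, or a coarse time bucket
    rev int,
    tid timeuuid,
    value blob,
    PRIMARY KEY (("_domain", key, bucket), rev, tid)
);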

TL;DR: You need to understand what your data will look like. Min and max
value sizes aren't enough; you should have some idea of the size
distribution, read/write rates, etc. Understand the implications of your
data model. And then test, test, test.


--
Eric Evans
eevans@wikimedia.org