Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: message received from 54.164.171.186
 which is an MX secondary for user@cassandra.apache.org)
Sender: "Brian O'Neill" <boneill42@gmail.com>
User-Agent: Microsoft-MacOutlook/14.4.9.150325
Date: Wed, 22 Apr 2015 08:17:18 -0400
Subject: Re: Adhoc querying in Cassandra?
From: Brian O'Neill <bone@alumni.brown.edu>
To: <user@cassandra.apache.org>
Message-ID: <D15D0922.136897%bone@alumni.brown.edu>
Thread-Topic: Adhoc querying in Cassandra?
References: <bc3f937eb5c5a8fbc49eb6eddd276e09@mail.gmail.com>
 <CAKiMtbcR01n2=KzEJkPuwhhTGuDV7hk--EKgnt61M-Vm9jwytA@mail.gmail.com>
 <D15D0461.1367F1%bone@alumni.brown.edu>
 <a953a642ce12130a56bceb10c9bf6023@mail.gmail.com>
 <CAKiMtbcn_xN48M0T-byUJMfU8h2t5vctbObtJHuqJinTz2Pi8A@mail.gmail.com>
In-Reply-To: 
 <CAKiMtbcn_xN48M0T-byUJMfU8h2t5vctbObtJHuqJinTz2Pi8A@mail.gmail.com>
Mime-version: 1.0
Content-type: multipart/alternative;
	boundary="B_3512535444_5269474"


--B_3512535444_5269474
Content-type: text/plain;
	charset="ISO-8859-1"
Content-transfer-encoding: quoted-printable

Again, agreed.

They have different usage patterns (C* is write-heavy, ES is read-heavy), so I
would separate them.
Solr should be sufficient. I believe DSE provides a tight integration between
Solr and C*.
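
A rough sketch of the separation discussed in this thread: one store acts as the
system of record, and a separate index serves the aggregate queries. Everything
named below (RecordStore, SearchIndex, write_event, the "button" and
"latency_ms" fields) is a hypothetical in-memory stand-in chosen for
illustration. None of it is the Cassandra, Solr or ES client API.

```python
from collections import defaultdict
from statistics import mean


class RecordStore:
    """Hypothetical stand-in for the system of record (Cassandra's role)."""

    def __init__(self):
        self.rows = {}

    def put(self, key, row):
        # Store a copy so later changes to the caller's dict do not leak in.
        self.rows[key] = dict(row)


class SearchIndex:
    """Hypothetical stand-in for the search side (the ES or Solr role)."""

    def __init__(self):
        self.docs = {}

    def index(self, key, doc):
        self.docs[key] = dict(doc)

    def aggregate(self, group_by, field):
        """Return count, sum and average of `field` per `group_by` value."""
        groups = defaultdict(list)
        for doc in self.docs.values():
            groups[doc[group_by]].append(doc[field])
        return {
            value: {"count": len(xs), "sum": sum(xs), "avg": mean(xs)}
            for value, xs in groups.items()
        }


def write_event(store, index, key, event):
    """Write to the system of record first, then index a copy for querying."""
    store.put(key, event)
    index.index(key, event)


store, index = RecordStore(), SearchIndex()
write_event(store, index, 1, {"button": "buy", "latency_ms": 120})
write_event(store, index, 2, {"button": "buy", "latency_ms": 80})
write_event(store, index, 3, {"button": "help", "latency_ms": 50})

# Ad hoc aggregates are answered by the index, never by the record store.
stats = index.aggregate("button", "latency_ms")
print(stats["buy"])
```

In a real deployment the two stand-ins would sit on separate servers, so the
write-heavy store and the read-heavy index do not compete for resources.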

-brian

---
Brian O'Neill
Chief Technology Officer
Health Market Science, a LexisNexis Company
215.588.6024 Mobile • @boneill42 <http://www.twitter.com/boneill42>


This information transmitted in this email message is for the intended
recipient only and may contain confidential and/or privileged material. If
you received this email in error and are not the intended recipient, or the
person responsible to deliver it to the intended recipient, please contact
the sender at the email above and delete this email and any attachments and
destroy any copies thereof. Any review, retransmission, dissemination,
copying or other use of, or taking any action in reliance upon, this
information by persons or entities other than the intended recipient is
strictly prohibited.


From:  Ali Akhtar <ali.rac200@gmail.com>
Reply-To:  <user@cassandra.apache.org>
Date:  Wednesday, April 22, 2015 at 8:10 AM
To:  <user@cassandra.apache.org>
Subject:  Re: Adhoc querying in Cassandra?

I believe ElasticSearch has better support for scaling horizontally (by
adding nodes) than Solr does. Some benchmarks that I've looked at also show
it performing better under high load.

I probably wouldn't run them both on the same node, or you might see low
performance as they compete for resources.

What type of usage do you expect - mostly read, or mostly write?

On Wed, Apr 22, 2015 at 5:06 PM, Matthew Johnson <matt.johnson@algomi.com>
wrote:
> Hi Ali, Brian,
>
> Thanks for the suggestion - we have previously used Solr (SolrCloud for
> distribution) for a lot of other products; presumably this will do the same
> job as ElasticSearch? Or does ElasticSearch have specifically better
> integration with Cassandra or better support for aggregate queries?
>
> Would it be an OK architecture to have a Cassandra node and a Solr/ES instance
> on each box, so they scale together? Or is it better to have separate servers
> for storage and search?
>
> Cheers,
> Matt
>
> From: Brian O'Neill [mailto:boneill42@gmail.com] On Behalf Of Brian O'Neill
> Sent: 22 April 2015 12:56
> To: user@cassandra.apache.org
> Subject: Re: Adhoc querying in Cassandra?
>
> +1, I think many organizations (including ours) pair Elastic Search with
> Cassandra.
>
> Use Cassandra as your system of record, then index the data with ES.
>
> -brian
>
> ---
> Brian O'Neill
> Chief Technology Officer
> Health Market Science, a LexisNexis Company
> 215.588.6024 Mobile • @boneill42 <http://www.twitter.com/boneill42>
>
> From: Ali Akhtar <ali.rac200@gmail.com>
> Reply-To: <user@cassandra.apache.org>
> Date: Wednesday, April 22, 2015 at 7:52 AM
> To: <user@cassandra.apache.org>
> Subject: Re: Adhoc querying in Cassandra?
>
> You might find it better to use Elasticsearch for your aggregate queries and
> analytics. Cassandra is more of just a data store.
>
> On Apr 22, 2015 4:42 PM, "Matthew Johnson" <matt.johnson@algomi.com> wrote:
>
> Hi all,
>
> We are currently setting up a "big" data cluster. We will only have a couple
> of servers to start with, but we need to be able to scale out quickly when
> usage ramps up. Previously we have used Hadoop/HBase for our big data
> cluster, but since we are starting this one on only two nodes I think
> Cassandra will be a much better fit, as Hadoop and HBase really need at least
> 3 to achieve any sort of resilience (ZooKeeper quorum etc.).
>
> My question is this:
>
> I have used Apache Phoenix as a JDBC layer on top of HBase, which allows me to
> issue ad hoc SQL-style queries (e.g. count the number of times users have
> clicked on a certain button after clicking a different button in the last 3
> weeks). My understanding is that CQL does not support this style of ad hoc
> aggregate querying out of the box. Is there a recommended way to do count,
> sum, average etc. without writing client code (in my case Java) every time I
> want to run one? I have been looking at projects like Drill, Spark etc. that
> could potentially sit on top of Cassandra, but without actually setting
> everything up and testing them it is difficult to figure out what they would
> give us.
>
> Does anyone else interactively issue ad hoc aggregate queries to Cassandra,
> and if so, what stack do you use?
>
> Thanks!
> Matt
>


--B_3512535444_5269474--