incubator-blur-user mailing list archives

From Aaron McCurry <amccu...@gmail.com>
Subject Re: Getting started - sharding data by customer, and hadoop version requirements.
Date Fri, 21 Dec 2012 16:08:54 GMT
I agree with Garrett.  We run ~100 tables with the shard count varying from
1 shard to over 1000 in a single table.  How many tables will you have?

Yes, Blur works on CDH3U2.  It should work on any 0.20.x (1.0.x) version of
Hadoop.  However, if HDFS doesn't support appends, then the write-ahead log
won't function correctly, meaning it won't actually preserve the data.

Aaron


On Fri, Dec 21, 2012 at 10:59 AM, Garrett Barton
<garrett.barton@gmail.com> wrote:

> If I understand you correctly, you have data from multiple customers
> (denoted by a customer_id) and you only perform a search against a single
> customer at a time?  If that's the case, the separate index route might be a
> good idea, as you can rebuild them separately and, if you have a need,
> model them differently.  Having said that, if you also
> occasionally want to search across customers, then you would want them all
> in a single index.
>
> I have Blur 1.x running on CDH3U5, and I think it will work back down to
> CDH3U2 at least; that's Hadoop 0.20 in both cases.  I have not tried 0.23,
> though I will need to soon.
>
>
> On Fri, Dec 21, 2012 at 10:51 AM, James Kebinger <jkebinger@gmail.com>
> wrote:
>
> > Hello, I'm hoping to kick the tires on Apache Blur in the near future. I
> > have a couple of quick questions before I set out.
> >
> > What version(s) of Hadoop are required/supported at present?
> >
> > We have lots of data to index, but we always search within a particular
> > customer's data set. Would the best practice be to put all of the data in
> > one table and have the customer id in all of the queries, or build separate
> > tables for each customer_id (like users-1, users-123, etc.)?
> >
> > Thanks, and happy holidays!
> >
> > -James Kebinger
> >
>
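
The two layouts discussed in the thread (one table per customer versus one
shared table with the customer id in every query) can be sketched as below.
This is only an illustration of the routing decision, not Blur client code:
the function names, the "users" base name, the customer_id field name, and
the Lucene-style "+field:value" query string are all assumptions made for
the example.

```python
from typing import Tuple


def table_for_customer(base: str, customer_id: int) -> str:
    """Separate-table layout: one table per customer, e.g. users-123."""
    return f"{base}-{customer_id}"


def scoped_query(customer_id: int, user_query: str) -> str:
    """Shared-table layout: require the customer id alongside the user's query.

    The "+field:value" form is an assumed Lucene-style syntax.
    """
    return f"+customer_id:{customer_id} +({user_query})"


def plan_search(customer_id: int, user_query: str,
                per_customer_tables: bool) -> Tuple[str, str]:
    """Return (table name, query string) for the chosen layout."""
    if per_customer_tables:
        # The table itself isolates the customer, so the query is unchanged.
        return table_for_customer("users", customer_id), user_query
    # One shared table: every query must carry the customer filter.
    return "users", scoped_query(customer_id, user_query)
```

With per-customer tables, each table can be rebuilt on its own, as noted in
the thread; with the shared table, a cross-customer search only needs the
customer filter dropped.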
