incubator-blur-dev mailing list archives

From Aaron McCurry <>
Subject Re: Possible Limitations?
Date Sat, 19 Oct 2013 13:38:53 GMT

You always have great emails.  :-)

I will attempt to add some value beyond Garrett's email.

On Fri, Oct 18, 2013 at 3:25 PM, Colton McInroy wrote:

> Hey Guys,
>     I am just wondering what kind of limitations/drawbacks I may run into.
>     Currently I am adding a row (UUID) with two records (1 and 2) for
> every syslog event. I pool together the mutates and so far it appears to be
> working, although there may be further optimizations I can do. My original
> plan was to make a table with the application and date stamp then query
> across the tables, but I was informed by Aaron that it is currently not
> possible, although that's something that is being looked into
> (any idea how long, Aaron?). Instead, as per Aaron's suggestion, I made a
> table with just the application name with the aforementioned

The code to make that happen won't be difficult; it's just a matter of
changing the API in a smart way to make it possible.

> row/records. My inquiry is about what kind of things I can possibly expect
> or do to avoid future problems... Such as...
> Is there a limit on the number of rows in a table?
> If I am going to be putting years worth of access logs for all the sites I
> am logging, it's possible that I could have billions/trillions+ of rows in
> a table. If I do have that many rows, will the queries take an extremely
> long amount of time?
> Is this based off of my shard count (shard count per table or number of
> shard servers)?
> If I know a table is going to be very large, should I set a really large
> shard count, or does that have any bearing after a certain number?
> It was my understanding that lucene only supports about 274 billion unique
> terms per index, from my experience so far, each shard (directory not shard
> server... kinda confusing between the two when trying to explain... any
> plans on changing the name of the shard server to something else?) appears
> to be an index, so if that is correct, that would be 274*shard count which
> would be about 2.7 trillion for 10 shards. Is that correct?
> If querying across tables is going to be coming sometime soon, should I
> wait and separate the tables with date stamps to keep the indexes smaller,
> or will using a single table with lots of shards mean I can plow through
> the data without a problem?
> If my understanding is correct, and the shard count does split the
> index into that number of shards, then where I previously created 1
> index for each year/month/day/hour, setting the shard count to 365*24 would
> give the same number of indexes I currently create on the fly for one year
> (actually, I do 365*24*SITECOUNT). If I set the number of shards to 1000
> would that be an optimal number for well... 1000*274billion = 274 trillion
> entries.
> Again, if this is all correct, for that scenario, I can see the
> following...
> 274,000,000,000,000 / 365 / 24 / 60 / 60 = 8,688,483 rows per second for
> one year until full
> Unless it is per record, which I insert 2 for each row, then it would
> be....
> 274,000,000,000,000 / 365 / 24 / 60 / 60 / 2 = 4,344,241 records per
> second for one year until full
> 4 million events per second is above what I am dealing with right now. But
> this math should help people be able to figure out what settings for what
> kind of scale... again, if this is all correct, I'll need someone like
> Aaron to chime in. I end up with this calculation...
> long shardCount = 10;
> long recordsPerRow = 2;
> long recordsPerSecond = 10000;
> long luceneTermIndexInterval = 128; // Default
> // Use double arithmetic: the product would overflow an int.
> double yearsUntilFull = ((double) Integer.MAX_VALUE * luceneTermIndexInterval
>     * shardCount) / 365 / 24 / 60 / 60 / recordsPerRow / recordsPerSecond;
> // ~4.36 years until full
> I just kinda threw this out there, may not work in java but I am pretty
> sure my math is right (haven't been drinking... too much), I just did it
> out of my head quickly to demonstrate. In that example, it's 4.36 years for
> 10 shards with 10,000 sustained entries per second. Add another 0 to the
> number of shards then it's 100,000 sustained entries per second for 4.36
> years until full.

Ok, so Lucene's limits are a little different than what you have described
(I think).  As far as I understand it, the term
limit is not per index but instead per segment within the index.  Plus by
default Lucene will not merge segments that are over 5GB in size, so that
provides a kind of ceiling for the size of a segment.  Overall, the biggest
limit you will likely see is the 2 billion document (Records in our case)
limit.  I think a Lucene index could exceed that, but the counts are based
on integers.  Given that, I would plan for a
top end of 500 million per index (I have run 400+ million in a single index
in the past but keep things around 100 million now) so that you have plenty
of room before you get into the unknown limits of a single Lucene index.
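
As a rough sketch of how those per-index numbers could be turned into a shard
count: the class and method names below are made up for illustration (they are
not part of Blur), the ingest rate of 10,000 events per second with 2 records
each is taken from your example, and the 100 million and 500 million targets
are just the comfort levels mentioned above, not hard limits.

```java
final class ShardSizing {
    // Seconds in a non-leap year.
    static final long SECONDS_PER_YEAR = 365L * 24 * 60 * 60;

    /** Total records written over the retention period (throws on overflow). */
    static long totalRecords(long rowsPerSecond, long recordsPerRow, long years) {
        long recordsPerSecond = Math.multiplyExact(rowsPerSecond, recordsPerRow);
        long seconds = Math.multiplyExact(SECONDS_PER_YEAR, years);
        return Math.multiplyExact(recordsPerSecond, seconds);
    }

    /** Shards needed so no shard holds more than targetRecordsPerShard (rounds up). */
    static long shardsNeeded(long totalRecords, long targetRecordsPerShard) {
        if (totalRecords < 0 || targetRecordsPerShard <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        return (totalRecords + targetRecordsPerShard - 1) / targetRecordsPerShard;
    }

    public static void main(String[] args) {
        // Assumed workload: 10,000 rows/s, 2 records per row, kept for 1 year.
        long total = totalRecords(10_000, 2, 1);
        System.out.println("records after one year: " + total);
        System.out.println("shards at 100M records each: " + shardsNeeded(total, 100_000_000L));
        System.out.println("shards at 500M records each: " + shardsNeeded(total, 500_000_000L));
    }
}
```

With those assumed inputs this works out to 630,720,000,000 records per year,
or 6308 shards at 100 million records each and 1262 shards at 500 million each.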

At this point I think the biggest question mark in the system from what you
have described is how many shard servers you may need to have.  I have
tested Blur up to 200 shard servers in a single cluster and the controllers
seem to do just fine.  But at some point there will likely need to be some
work done to expand to 1000's of shard servers within a single cluster.
Where that point is, I'm not sure, but I think there will likely need to be
a second layer of controllers to reduce network hotspots.  But that's just
a guess.
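
To get a feel for where a given table would land relative to the 200 shard
servers I have tested, here is a minimal sketch. The names are hypothetical,
and the number of shards each server can comfortably host is an assumption you
would have to measure on your own hardware; only the 200 figure comes from the
paragraph above.

```java
final class ClusterSizing {
    // Largest cluster size mentioned above as having been tested.
    static final long TESTED_SHARD_SERVERS = 200;

    /** Shard servers needed if each hosts at most shardsPerServer shards (rounds up). */
    static long serversNeeded(long shardCount, long shardsPerServer) {
        if (shardCount < 0 || shardsPerServer <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        return (shardCount + shardsPerServer - 1) / shardsPerServer;
    }

    /** True if the cluster size is no larger than what has been tested. */
    static boolean withinTestedRange(long servers) {
        return servers <= TESTED_SHARD_SERVERS;
    }

    public static void main(String[] args) {
        long shardsPerServer = 10; // assumption for illustration only
        for (long shards : new long[] {1000, 6308}) {
            long servers = serversNeeded(shards, shardsPerServer);
            System.out.println(shards + " shards -> " + servers
                    + " shard servers, within tested range: " + withinTestedRange(servers));
        }
    }
}
```

At an assumed 10 shards per server, 1000 shards would need 100 shard servers,
which is inside what I have tested, while 6308 shards would need 631, which is
not.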

> If I am correct on all of this, what kind of effect does having a large
> shard count (1000+) have? I notice that tables when creating them seem to take
> forever in comparison to the file structure made. I've seen the create
> process take 10+ seconds, but in the background I look, and see the file
> structure created, check the disk space used while the create command is
> running, and there is no change. I've tried both in hdfs and localfs. I am
> doing all of this in a virtual set of sessions right now, so it may not be
> the best testing platform. I'm in the process of requisitioning the
> hardware though to start building the cluster at our data center. Before I
> start putting all of the money into it, I want to see what kind of
> limitations I may be up against if any.

The delay is likely because all 1000 indexes are opened for writing,
their NRT threads are started up, and there is a lot of ZooKeeper logic to
get everything into a known state.  Also, creating a table is not
something I have optimized for speed; it's more of a get-it-right process
than a get-it-done-fast one.


> It's been a long night for me so far, and still not over, I got meetings
> to go to soon. Hopefully I haven't rambled on too much ;)
> I have been trying to keep close with the git code so that I can deal with
> the changes as they come, as well as help where I can, although I still got
> a lot of code reading left to do on the blur project.
> --
> Thanks,
> Colton McInroy
>  * Director of Security Engineering
> Phone
> (Toll Free)
> _US_    (888)-818-1344 Press 2
> _UK_    0-800-635-0551 Press 2
> My Extension    101
