incubator-blur-dev mailing list archives

From: Aaron McCurry <amccu...@gmail.com>
Subject: Big NRT update changes in Blur 0.2.2
Date: Mon, 06 Jan 2014 20:36:43 GMT
As some of you know, there are many issues with the NRT system in Apache
Blur 0.2.1.  The basic problem seemed to be a blocking issue between the
write ahead log, the commit process, and the merge scheduler.  When the NRT
system operated at all it was very slow, and many times it would simply
cause the cluster to become unstable and unusable.

The 0.2.1 behavior also let the visibility of data lag behind the ingestion
of data as a performance trade-off; Lucene's NRT update system was used in
this case.
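
For reference, Lucene's NRT pattern looks roughly like the sketch below:
writes go to the IndexWriter, but a searcher only sees them after a refresh,
which is why visibility can lag ingestion.  This is only an illustration
(Lucene 4.x style APIs assumed), not Blur's actual code.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.SearcherFactory;
import org.apache.lucene.search.SearcherManager;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class NrtVisibilityLagSketch {
  public static void main(String[] args) throws Exception {
    Directory dir = new RAMDirectory();
    IndexWriter writer = new IndexWriter(dir,
        new IndexWriterConfig(Version.LUCENE_43, new StandardAnalyzer(Version.LUCENE_43)));
    SearcherManager manager = new SearcherManager(writer, true, new SearcherFactory());

    Document doc = new Document();
    doc.add(new StringField("rowid", "row-1", Store.YES));
    writer.addDocument(doc);   // indexed, but not yet visible to searchers

    manager.maybeRefresh();    // only after a refresh does a new searcher see the document
  }
}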

Over the past week or so I have been working on this part of the system to
fix/improve it before the 0.2.2 release.  I have pushed the results of this
work this afternoon.

- The new implementation removes the need for a WAL (write ahead log)
because everything is committed (Lucene commit) with every call.
- The data is also immediately visible, so there is no data visibility lag.
- The updates are applied per shard in a sequential manner.
- And per mutate operation on a given shard, it will either succeed or fail
(see the sketch just after this list).
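
Here is a rough sketch of that commit-per-mutation pattern using Lucene
directly.  Lucene 4.x style APIs are assumed, the "rowid" field name is
illustrative, and this is not Blur's actual code path:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.Version;

public class CommitPerMutationSketch {

  private final IndexWriter writer;
  private DirectoryReader reader;

  public CommitPerMutationSketch(Directory dir) throws Exception {
    writer = new IndexWriter(dir,
        new IndexWriterConfig(Version.LUCENE_43, new StandardAnalyzer(Version.LUCENE_43)));
    writer.commit();                        // make sure a commit point exists
    reader = DirectoryReader.open(dir);
  }

  // One mutate on a shard: apply the change and commit before returning, so
  // there is no WAL to replay and the call either succeeds or fails as a whole.
  // The synchronization keeps updates to this shard sequential.
  public synchronized void mutate(String rowId, Document doc) throws Exception {
    writer.updateDocument(new Term("rowid", rowId), doc);
    writer.commit();                        // Lucene commit on every call
    DirectoryReader newReader = DirectoryReader.openIfChanged(reader);
    if (newReader != null) {                // reopen so the data is immediately visible
      DirectoryReader old = reader;
      reader = newReader;
      old.close();
    }
  }

  public synchronized DirectoryReader getReader() {
    return reader;
  }
}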

So I think this means that Blur is now ACID (with a couple of caveats).
There needs to be more testing before we can make any guarantees.  Below
are some numbers showing the performance improvement.

So in 0.2.1 I created a single table with a single shard stored on HDFS.
blur (default)> create -t slow-0_2_1-table -c 1 -l hdfs://127.0.0.1:9000/blur/slow-0.2.1-table

Then I ran the loadtestdata command, telling it to store everything in the
WAL and add 1,000,000 Rows, each with 1 Record containing 1 Family and 1
Column with 1 word in that column, adding them in batches of 1000 mutations
at a time.
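
Reading the arguments in the command below against that description, they
line up roughly as follows (the parameter names are just my own labels for
this message, not the shell's documented usage):

loadtestdata <table> <wal> <rowCount> <recordsPerRow> <families> <columnsPerFamily> <wordsPerColumn> <batchSize>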

blur (default)> loadtestdata slow-0_2_1-table true 1000000 1 1 1 1 1000
Rows indexed [26000] at Rows [4914.004914/s] Records [4914.004914/s]
Rows indexed [50000] at Rows [483.617459/s] Records [483.617459/s]
Could not connect to controller/shard server. All connections are bad.

The shard server hung and failed.

Next I tried the same test on the 0.2.2 code.

blur (default)> create -t fast-0_2_2-table -c 1 -l hdfs://127.0.0.1:9000/blur/fast-0.2.2-table
blur (default)> loadtestdata fast-0_2_2-table true 1000000 1 1 1 1 1000
Rows indexed [31000] at Rows [6144.697721/s] Records [6144.697721/s]
Rows indexed [102000] at Rows [14191.485109/s] Records [14191.485109/s]
Rows indexed [181000] at Rows [15749.601276/s] Records [15749.601276/s]
Rows indexed [260000] at Rows [15771.611100/s] Records [15771.611100/s]
Rows indexed [349000] at Rows [16713.615023/s] Records [16713.615023/s]
Rows indexed [435000] at Rows [17121.242285/s] Records [17121.242285/s]
Rows indexed [521000] at Rows [17111.022682/s] Records [17111.022682/s]
Rows indexed [602000] at Rows [16180.583300/s] Records [16180.583300/s]
Rows indexed [674000] at Rows [14090.019569/s] Records [14090.019569/s]
Rows indexed [737000] at Rows [12529.832936/s] Records [12529.832936/s]
Rows indexed [825000] at Rows [17443.012884/s] Records [17443.012884/s]
Rows indexed [906000] at Rows [16112.989855/s] Records [16112.989855/s]
Rows indexed [989000] at Rows [16576.792491/s] Records [16576.792491/s]
Rows indexed [1000000] at Rows [19434.628975/s] Records [19434.628975/s]

And the data was available for search as soon as the command was complete.
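
A quick way to check is to run a query from the shell.  The exact syntax of
the query command, and the family/column names that loadtestdata generates,
may differ, so treat the placeholders below as illustrative only:

blur (default)> query fast-0_2_2-table <family>.<column>:<word>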

More tests to come.  Thanks!

Aaron
