kudu-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From t...@apache.org
Subject incubator-kudu git commit: Add 4/4 weekly update
Date Tue, 05 Apr 2016 01:20:49 GMT
Repository: incubator-kudu
Updated Branches:
  refs/heads/gh-pages e6915e44d -> 456ddb44b

Add 4/4 weekly update

Change-Id: Id5a7262fd5dc1da0751e3a530e01d78eae925730
Reviewed-on: http://gerrit.cloudera.org:8080/2717
Reviewed-by: Todd Lipcon <todd@apache.org>
Tested-by: Todd Lipcon <todd@apache.org>

Project: http://git-wip-us.apache.org/repos/asf/incubator-kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-kudu/commit/456ddb44
Tree: http://git-wip-us.apache.org/repos/asf/incubator-kudu/tree/456ddb44
Diff: http://git-wip-us.apache.org/repos/asf/incubator-kudu/diff/456ddb44

Branch: refs/heads/gh-pages
Commit: 456ddb44bdecdf9dc2bd0f7371f37cab9f369cee
Parents: e6915e4
Author: Todd Lipcon <todd@cloudera.com>
Authored: Mon Apr 4 15:56:36 2016 -0700
Committer: Todd Lipcon <todd@apache.org>
Committed: Tue Apr 5 01:20:14 2016 +0000

 _posts/2016-04-04-weekly-update.md | 84 +++++++++++++++++++++++++++++++++
 community.md                       |  2 +
 2 files changed, 86 insertions(+)

diff --git a/_posts/2016-04-04-weekly-update.md b/_posts/2016-04-04-weekly-update.md
new file mode 100644
index 0000000..1942744
--- /dev/null
+++ b/_posts/2016-04-04-weekly-update.md
@@ -0,0 +1,84 @@
+layout: post
+title: Apache Kudu (incubating) Weekly Update April 4, 2016
+author: Todd Lipcon
+Welcome to the third edition of the Kudu Weekly Update. This weekly blog post
+covers ongoing development and news in the Apache Kudu (incubating) project.
+If you find this post useful, please let us know by emailing the
+[kudu-user mailing list](mailto:user@kudu.incubator.apache.org) or
+tweeting at [@ApacheKudu](https://twitter.com/ApacheKudu). Similarly, if you're
+aware of some Kudu news we missed, let us know so we can cover it in
+a future post.
+## Development discussions and code in progress
+* The 0.8.0 release train is progressing nicely. Jean-Daniel Cryans posted a first
+  release candidate [VOTE thread](http://mail-archives.apache.org/mod_mbox/incubator-kudu-dev/201604.mbox/%3CCAGpTDNfA-hsv6xkeNcvwBGarP1sri%2BvBYqYNt70YWeH44_QPSw%40mail.gmail.com%3E)
+  on the dev mailing list. Please download the release candidate and give it a try.
+* Ara Ebrahimi's implementation of a [Kudu sink for Apache Flume](https://github.com/apache/incubator-kudu/blob/master/java/kudu-flume-sink/)
+  was completed this week. Mike Percy worked on some follow-up around documentation and a
+  _jar-with-dependencies_ for easier consumption.
+* Mike also worked on and committed an [improvement to the Java API](http://gerrit.cloudera.org:8080/#/c/2640/)
+  so that error results for failed write operations are easier to handle. Previously,
+  clients had to perform string comparisons to know whether an insert failed due
+  to something like a duplicate key constraint rather than some other less expected
+  error.
+* After some early cluster testing of the new implementation of scanner predicates,
+  a few tricky bugs were discovered. The first was around proper handling of inequality
+  predicates on floating point columns. The second involved handling predicates like
+  'my_int8_col <= 127': for non-nullable columns, this predicate is a tautology
+  and can be eliminated. However, for nullable columns, such a predicate is equivalent
+  to 'my_int8_col IS NOT NULL'. In order to fix this, Dan Burkert added an internal
+  [implementation of 'IS NOT NULL'](http://gerrit.cloudera.org:8080/#/c/2671/).
+  These bugs also exposed a few gaps in test coverage around predicate handling. Dan
+  added a couple thousand lines worth of new test coverage both from [C++](http://gerrit.cloudera.org:8080/2677)
+  and [Java](http://gerrit.cloudera.org:8080/#/c/2591/).
+  These exhaustive tests also identified a [gap in Kudu's handling of NaN
+  float values](https://issues.apache.org/jira/browse/KUDU-1386). The team
+  elected to leave this as a known issue for now, since usage of NaN is relatively
+  rare.
+* Todd Lipcon fixed a [bug in Kudu's implementation of Raft configuration
+  change](http://gerrit.cloudera.org:8080/#/c/2483/).
+  This bug could cause tablet replicas to become "stuck" after certain types of network
+  partitions. The fix will be included in the upcoming 0.8 release.
+* Mike Percy has been working on a [bug](http://gerrit.cloudera.org:8080/#/c/2595/)
+  where tablet servers fail to start up after their write-ahead logs (WALs) have been
+  truncated. This can happen on certain types of machine crashes, or if the disks
+  fill up under a write workload.
+* Todd Lipcon spent time this week testing Kudu on a small cluster with a 3TB
+  TPC-H dataset. In particular, he was focusing on concurrent query workloads,
+  including scenarios with multiple read-only users in addition to combining
+  a query workload with a write workload. As a result, he identified a few
+  issues around Kudu's handling of RPCs in overload conditions.
+  In order to improve the behavior on the server side, Todd changed the RPC
+  scheduling algorithm to use an [earliest-deadline-first](https://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling)
+  policy. This has the effect of preventing query timeouts: a query which
+  is closest to experiencing a timeout will be scheduled with higher priority
+  over those which have plenty of time left.
+  In addition, this work identified a few bugs in the Kudu C++ client.
+  In particular, in the case where the server was overloaded, the client
+  would sometimes [incorrectly rewind to the start of the current
+  tablet](http://gerrit.cloudera.org:8080/#/c/2654/) resulting in incorrect
+  results. In other cases, the client would end up in a [tight loop sending
+  RPCs to the master](http://gerrit.cloudera.org:8080/#/c/2709/). Fixes
+  for both of these issues will be in the upcoming 0.8 release.
+## Upcoming talks and meetups
+* Todd Lipcon will be presenting an introductory Kudu talk at [DataEngConf](http://www.dataengconf.com/schedule/)
+  on Friday, April 8th.

diff --git a/community.md b/community.md
index a32fc74..5cb8de8 100644
--- a/community.md
+++ b/community.md
@@ -109,6 +109,8 @@ your city, get in touch by emailing the
    _Using Kafka and Kudu for fast, low-latency SQL analytics on streaming data_. Presented
by Mike Percy and Ashish Singh.
 1. Mon, May 9, 2016. [Apache: Big Data 2016](http://apachebigdata2016.sched.org/event/6M0B/introduction-to-apache-kudu-incubating-for-timeseries-storage-dan-burkert-cloudera).
Vancouver, BC, Canada.<br/>
    _Introduction to Apache Kudu (incubating) for timeseries storage_. Presented by Dan Burkert.
+1. Fri, Apr 8, 2016. [DataEngConf](http://www.dataengconf.com/hadoop-vs-kudu). San Francisco,
CA, USA.<br/>
+  _Resolving Transactional Access/Analytic Performance Trade-offs in Apache Hadoop with Apache
 1. Thu, Mar 31, 2016. [Strata Hadoop World](http://conferences.oreilly.com/strata/hadoop-big-data-ca/public/schedule/detail/47055).
San Jose, CA, USA.<br/>
    _Fast data made easy with Apache Kafka and Apache Kudu (incubating)_. Presented by Ted
Malaska and Jeff Holoman.
 1. Sat, Mar 19, 2016. [China Hadoop Summit 2016](http://chinahadoop.com/archives/1313). Beijing,

View raw message