kudu-user mailing list archives

From Matteo Durighetto <m.durighe...@miriade.it>
Subject Re: [ANNOUNCE] Apache Kudu 1.0.0 release
Date Wed, 21 Sep 2016 06:23:22 GMT
2016-09-20 9:11 GMT+02:00 Todd Lipcon <todd@apache.org>:

> The Apache Kudu team is happy to announce the release of Kudu 1.0.0!
>
> Kudu is an open source storage engine for structured data which supports
> low-latency random access together with efficient analytical access
> patterns. It is designed within the context of the Apache Hadoop ecosystem
> and supports many integrations with other data analytics projects both
> inside and outside of the Apache Software Foundation.
>
> This latest version adds several new features, including:
>
> - Removal of multiversion concurrency control (MVCC) history is now
> supported. This allows Kudu to reclaim disk space, where previously Kudu
> would keep a full history of all changes made to a given table since the
> beginning of time.
>
> - Most of Kudu’s command line tools have been consolidated under a new
> top-level "kudu" tool. This reduces the number of large binaries
> distributed with Kudu and also includes much-improved help output.
>
> - Administrative tools including "kudu cluster ksck" now support running
> against multi-master Kudu clusters.
>
> - The C++ client API now supports writing data in AUTO_FLUSH_BACKGROUND
> mode. This can provide higher throughput for ingest workloads.
>
> This release also includes many bug fixes, optimizations, and other
> improvements, detailed in the release notes available at:
> http://kudu.apache.org/releases/1.0.0/docs/release_notes.html
>
> Download the source release here:
> http://kudu.apache.org/releases/1.0.0/
>
> Convenience binary artifacts for the Java client and various Java
> integrations (eg Spark, Flume) are also now available via the ASF Maven
> repository.
>
> Enjoy the new release!
>
> - The Apache Kudu team
>


Really great. There is also a new producer in the flume-kudu sink,
the regexp Kudu producer:

https://github.com/cloudera/kudu/blob/master/java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/RegexpKuduOperationsProducer.java

With the regexp Kudu producer it is simple to parse events with a regular
expression, cast the values, and write the records into Kudu tables:

 * <p>A regular expression serializer that generates one {@link Insert} or
 * {@link Upsert} per {@link Event} by parsing the payload into values using a
 * regular expression. Values are coerced to the proper column types.
 *
 * Example: if the Kudu table has the schema
 *
 * key INT32
 * name STRING
 *
 * and producer.pattern is '(?<key>\\d+),(?<name>\w+)', then the
 * RegexpKuduOperationsProducer will parse the string
 *
 * |12345,Mike||54321,Todd|
 *
 * into the rows (key=12345, name=Mike) and (key=54321, name=Todd).
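
As a rough illustration of the matching step only, here is a small standalone
sketch using plain java.util.regex. It is not the producer's actual code: the
class name RegexpRowSketch and the parse helper are hypothetical, and rows are
modelled as simple maps instead of Kudu Insert/Upsert operations. The pattern
is the one from the Javadoc example, written as a Java string literal (so
every backslash is doubled).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: shows how named groups map a payload to rows.
// It does not use the Kudu or Flume APIs.
class RegexpRowSketch {
    // Named groups correspond to the column names "key" and "name".
    static final Pattern PATTERN =
        Pattern.compile("(?<key>\\d+),(?<name>\\w+)");

    // Returns one map per regex match found in the payload.
    static List<Map<String, Object>> parse(String payload) {
        List<Map<String, Object>> rows = new ArrayList<>();
        Matcher m = PATTERN.matcher(payload);
        while (m.find()) {
            Map<String, Object> row = new LinkedHashMap<>();
            // Coerce the captured text to the column type (INT32 -> int).
            row.put("key", Integer.parseInt(m.group("key")));
            // STRING columns keep the captured text as-is.
            row.put("name", m.group("name"));
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        for (Map<String, Object> row : parse("|12345,Mike||54321,Todd|")) {
            System.out.println(row);
        }
    }
}
```

Running main on the example string prints one line per match, with key as an
integer and name as a string.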

We are testing it right now, and it is working.

Kind Regards

Matteo Durighetto
e-mail: m.durighetto@miriade.it
