hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Lily 0.3 is released
Date Mon, 14 Feb 2011 17:28:04 GMT
Congrats lads.  Keep the releases coming.
St.Ack

On Mon, Feb 14, 2011 at 7:23 AM, Steven Noels <stevenn@outerthought.org> wrote:
> Hi all,
>
> Lily is a data/content repository that integrates HBase with SOLR: flexible
> content storage and automatic index maintenance - at scale. It's available
> under the Apache license.
>
> This release is the result of 3 months of hard work since Lily 0.2 last
> October. Our focus was stabilization, performance and robustness, providing
> a platform we can continue building upon. More than 50 tickets were resolved
> during this development sprint, and we're slowly readying ourselves for the
> 1.0 release. Lily 0.3 brings many gradual improvements over Lily 0.2. It has
> a more solid implementation of the blob fields, automatic retry of
> operations that fail due to I/O exceptions (between Lily client and Lily
> server), and other miscellaneous improvements, all listed underneath.
>
> Everything Lily can be found at www.lilyproject.org. We're now also sharing
> details of our commercial software subscription service with select
> prospects, let us know if you're interested!
>
> Here's a concise list of improvements since Lily 0.2:
>
>   - Repository
>      - Performance / space improvements
>         - Shorter column key encoding (field IDs)
>         - Reduction of number of column families used
>         - Avoid duplicate values in the table: make use of sparseness of
>         the table
>         - Drop the use of HBase rowlocks, which do not survive region
>         splits/moves.
>         - Use byte[] as keys in RecordType FieldType cache
>      - API
>         - Added a new method createOrUpdate which creates or updates a
>         record depending on whether it already exists. This new method has the
>         advantage over the create method that it can be retried in case of IO
>         exceptions, i.e. it is idempotent, similar to PUT in HTTP/REST.
>         - Allow updating versioned-mutable fields without specifying the
>         record type.
>         - Throw a RecordLockedException instead of a generic exception when
>         a record is locked; this allows Lily clients to retry the operation
>         in that case.
>      - Clear historical data when deleting a record and remove any
>      referenced blobs.
>      - The link index stores record IDs and field IDs as bytes instead of
>      strings.
>      - The record ID string representation was changed to use comma instead
>      of semicolon to separate variant properties, since the use of
>      semicolons was problematic in the JAX-RS based REST interface
>      implementation.
>   - Upgrade to Apache HBase 0.90
>   - Blobs
>      - Rework blobstore functionality
>         - Blobs can only be accessed through the record they are used in,
>         not directly by using their blob key. This is to allow for future
>         record-level access control.
>         - Introduce a Repository.getBlob() method, which returns a
>         BlobAccess object, which provides access to the blob metadata
>         (Blob object) and the blob input stream. This avoids the need to
>         read the record in case you need the blob metadata.
>         - Uploaded blobs which are never used in a record are cleaned up.
>      - The HDFS-stored blobs are stored in a hierarchical structure.
>   - RowLog improvements
>      - Performance improvements
>         - The RowLog processor uses a ZooKeeper-based notification system
>         instead of a Netty-based one.
>         - Optimize queue scanning: avoid scanning over deleted rows in the
>         table, fix too-frequent scanning, fix endless scanning loop on
>         startup in case of no repository activity.
>         - The RowLog processor only processes messages that have reached a
>         minimum age (to avoid conflicts with direct processing of WAL
>         messages).
>      - Extended RowLogConfigurationManager to add/update rowlog
>      configuration information.
>      - Avoid and remove stale messages in the queue.
>      - Allow the rowlog to either use row-level locks (WAL use case) or
>      executionstate-level locks per subscription (MQ use case) when
>      processing messages.
>      - Added a WAL processor which handles open WAL messages.
>   - REST interface
>      - Adapted blob-support to new blobstore functionality. Content-Length
>      header is now set when downloading blobs. Multi-value or hierarchical
>      blobs are now accessible.
>      - Support updating versioned-mutable fields.
>      - Fixed various smaller bugs reported by users.
>   - HBase index library
>      - Allow adding/removing multiple entries in one call.
>      - Performance
>         - Fixed important performance issue whereby row scanning always ran
>         to the end of the index table.
>         - Enable scan caching.
>         - Added a performance testing tool.
>   - Indexer
>      - Upgrade to Tika 0.8
>      - Performance
>         - Avoid FieldNotFoundException when evaluating field values
>      - Make the SOLR request-writer and response-parser implementation
>      configurable. This allows using the XML format instead of the javabin
>      format.
>   - LilyClient
>      - Automatically retry operations on IOExceptions; this allows
>      operations to survive node failures.
>      - Automatic balancing over all Lily nodes. Each method called on the
>      Repository object will automatically be performed on an arbitrarily
>      selected Lily node.
>      - Avro: switch from HTTP to Netty transport. For this, upgraded to an
>      Avro 1.5 snapshot with patch AVRO-747.
>   - Tester tool
>      - Allows configuring test scenarios as well as the indexer and SOLR
>      configuration.
>      - Has extended logging, metrics and metrics plotting (gnuplot
>      integration) capabilities allowing for performance evaluations.
>      - Introduces a general performance testing library.
>   - Lily server process
>      - Ability to create tables with multiple initial regions at first
>      cluster startup (record table, linkindex, blobincubator, ...). Also
>      allows setting the max file size and the memstore flush size.
>      - The initial Lily startup can now be performed on multiple nodes
>      concurrently; previously this failed because the table creation code
>      did not handle failures in case of concurrent table creation.
>      - Configuration files changed so that they allow for inheritance (=
>      fallback from one conf dir to another, to the built-in conf). Include
>      default configuration in Kauri-module jars. All this will help in
>      maintaining Lily configuration across Lily versions.
>
> We hope you'll enjoy this new Lily as much as we did making it. Let us know
> how we're doing!
>
> The Outerthought Lily team.
>
> --
> Steven Noels
> http://outerthought.org/
> Scalable Smart Data
> Makers of Kauri, Daisy CMS and Lily
>
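
For readers following the createOrUpdate and client-retry items in the quoted notes: below is a minimal sketch, in plain Java, of why an idempotent write can be retried blindly after an I/O error. None of this is Lily's API. FlakyStore, IoOperation and withRetry are hypothetical names made up for illustration, and the "lost response" failure is simulated.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: hypothetical names, not Lily classes.
class IdempotentRetrySketch {

    /** Hypothetical operation that may fail with an I/O error. */
    public interface IoOperation<T> {
        T run() throws IOException;
    }

    /** Hypothetical in-memory store standing in for a remote repository. */
    public static class FlakyStore {
        private final Map<String, String> records = new HashMap<>();
        private int failuresLeft;

        public FlakyStore(int failuresToInject) {
            this.failuresLeft = failuresToInject;
        }

        /**
         * Applies the write first, then may fail, simulating a response that
         * is lost after the server already did the work. Because the write
         * is "set record id to value", repeating it leaves the same state.
         */
        public String createOrUpdate(String id, String value) throws IOException {
            records.put(id, value);
            if (failuresLeft > 0) {
                failuresLeft--;
                throw new IOException("simulated lost response");
            }
            return value;
        }

        public int size() {
            return records.size();
        }

        public String get(String id) {
            return records.get(id);
        }
    }

    /** Runs op up to maxAttempts times, rethrowing the last IOException. */
    public static <T> T withRetry(IoOperation<T> op, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.run();
            } catch (IOException e) {
                last = e; // safe to retry only because the operation is idempotent
            }
        }
        throw last;
    }

    public static void main(String[] args) throws IOException {
        FlakyStore store = new FlakyStore(2); // first two calls "lose" their response
        String result = withRetry(() -> store.createOrUpdate("rec-1", "v1"), 5);
        System.out.println(result + " stored, record count = " + store.size());
    }
}
```

A plain create that fails when the record already exists could not be retried this way: after a lost response the second attempt would report a conflict even though the first one succeeded.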

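On the configuration inheritance item in the quoted notes (fallback from one conf dir to another, then to the built-in conf): a rough sketch of that lookup order, assuming each configuration source can be modelled as a key/value map. LayeredConfigSketch and lookup are hypothetical names, the keys are invented, and Lily's real configuration loading is not shown here.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Illustrative only: models "conf dir A, then conf dir B, then built-in".
class LayeredConfigSketch {

    /**
     * Returns the value from the first layer that defines the key.
     * Layers are ordered from most specific to most general.
     */
    public static Optional<String> lookup(String key, List<Map<String, String>> layers) {
        for (Map<String, String> layer : layers) {
            String value = layer.get(key);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical keys and values, invented for this example.
        Map<String, String> siteConf = Map.of("index.batchSize", "500");
        Map<String, String> sharedConf = Map.of("index.batchSize", "200", "blob.store", "hdfs");
        Map<String, String> builtIn = Map.of("blob.store", "inline", "retry.max", "3");

        List<Map<String, String>> layers = List.of(siteConf, sharedConf, builtIn);
        System.out.println(lookup("index.batchSize", layers)); // found in siteConf
        System.out.println(lookup("blob.store", layers));      // falls back to sharedConf
        System.out.println(lookup("retry.max", layers));       // falls back to builtIn
    }
}
```

With this ordering, a site-specific directory only needs to hold the settings it overrides, which is presumably what makes configuration easier to carry across versions.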