incubator-accumulo-user mailing list archives

From Jesse Yates <jesse.k.ya...@gmail.com>
Subject Re: (Re)Introducing Culvert - A secondary indexing framework for BigTable like systems
Date Thu, 22 Dec 2011 22:46:54 GMT
I just updated trunk so that we don't build the accumulo package by default.

If you want to build with accumulo, right now we support the
"accumulo-1.3.5-incubating" branch, which targets the currently released
version of accumulo
(accumulo-1.3.5 <http://incubator.apache.org/accumulo/downloads/downloads.html>).


Hopefully, in the near future, we can start hosting the accumulo snapshots
in a publicly accessible maven repository, and we can merge the accumulo
branch back into trunk.

On Thu, Dec 22, 2011 at 2:35 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Thanks for the hint. That works.
>
> I had to modify culvert-accumulo/pom.xml so that it looks for
> 1.5.0-incubating-SNAPSHOT which was built by accumulo TRUNK.
>
> On Thu, Dec 22, 2011 at 2:22 PM, Jesse Yates <jesse.k.yates@gmail.com> wrote:
>
> > Wow, that's embarrassing - project not building...
> >
> > It's because accumulo's release is no longer deployed into the standard
> > apache maven repository. Maybe one of the accumulo committers can shed
> > some light on where to find it?
> >
> > I'll make some changes and have it at least compiling from the raw
> > tonight :)
> >
> > The alternative is to download the accumulo source
> > (https://github.com/apache/accumulo) and run "mvn clean install" to get it
> > working on your local machine.
> >
> > Thanks Ted!
> >
> > -Jesse
> >
> > On Thu, Dec 22, 2011 at 1:54 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > Thanks for the update, Jesse.
> > > Let us know of any feature Culvert needs from HBase.
> > >
> > > After cloning Culvert, I got:
> > >
> > > [INFO] Culvert - Accumulo Integration .................... FAILURE [0.431s]
> > > [INFO] ------------------------------------------------------------------------
> > > [INFO] BUILD FAILURE
> > > [INFO] ------------------------------------------------------------------------
> > > [INFO] Total time: 1:06.638s
> > > [INFO] Finished at: Thu Dec 22 13:51:34 PST 2011
> > > [INFO] Final Memory: 20M/81M
> > > [INFO] ------------------------------------------------------------------------
> > > [ERROR] Failed to execute goal on project culvert-accumulo: Could not resolve dependencies for project com.bah.culvert:culvert-accumulo:jar:0.4.0-SNAPSHOT: Could not find artifact org.apache.accumulo:accumulo-core:jar:1.4.0-incubating-SNAPSHOT in apache-snapshots (http://repository.apache.org/snapshots/) -> [Help 1]
> > >
> > > Can someone provide a hint?
> > >
> > > On Thu, Dec 22, 2011 at 11:44 AM, Jesse Yates <jesse.k.yates@gmail.com> wrote:
> > >
> > > > Culvert was originally introduced at Hadoop Summit 2011, but recent
> > > > updates have made it very applicable to current systems. Recently, we
> > > > added support for Accumulo and upgraded HBase support to 0.92. Since
> > > > Hadoop Summit, there has also been significant code cleanup, and we
> > > > have added some small features. However, we found that most people
> > > > hadn't heard of Culvert, so we wanted to re-release the framework.
> > > >
> > > > For an introduction to using Culvert, check out the blog post here:
> > > > http://jyates.github.com/2011/11/17/intro-to-culvert.html
> > > >
> > > > Also, the original presentation (where we discuss the internals) is
> > > > available on slideshare:
> > > > http://www.slideshare.net/jesse_yates/culvert-a-robust-framework-for-secondary-indexing-of-structured-and-unstructured-data
> > > >
> > > > There is a Culvert hackathon in the middle of January:
> > > > http://culverthackathon2012.eventbrite.com/
> > > >
> > > > Oh, and you can find the code on github:
> > > > https://github.com/booz-allen-hamilton/culvert
> > > >
> > > > Below is an overview of why we wrote Culvert and what it does.
> > > >
> > > > Secondary indexing is a common design pattern in BigTable-like
> > > > databases that allows users to index one or more columns in a table.
> > > > This technique enables fast search of records in a database based on a
> > > > particular column instead of the row id, thus enabling relational-style
> > > > semantics in a NoSQL environment. Frequently, the index is stored either
> > > > in a reserved namespace in the table or in a separate index table.
> > > >
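
For readers new to the pattern, the index-table variant described in the paragraph above can be sketched as a toy example. This is only an in-memory illustration in Python, not Culvert's API: the class `IndexedTable` and its `put`/`lookup` methods are hypothetical names made up for this sketch, and a plain dict and a sorted list stand in for the primary table and the index table.

```python
from bisect import bisect_left, insort


class IndexedTable:
    """Toy stand-in for a BigTable-like table plus one index table (hypothetical)."""

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.rows = {}    # primary table: row id -> {column: value}
        self.index = []   # index table: sorted (column value, row id) keys

    def put(self, row_id, columns):
        """Write a whole row and keep the index table in step with it."""
        old = self.rows.get(row_id, {}).get(self.indexed_column)
        if old is not None:
            # Drop the stale index entry before writing the new one.
            self.index.remove((old, row_id))
        self.rows[row_id] = dict(columns)
        new = columns.get(self.indexed_column)
        if new is not None:
            insort(self.index, (new, row_id))

    def lookup(self, value):
        """Return ids of rows whose indexed column equals value, using only the index."""
        start = bisect_left(self.index, (value,))
        hits = []
        for indexed_value, row_id in self.index[start:]:
            if indexed_value != value:
                break
            hits.append(row_id)
        return hits


table = IndexedTable("city")
table.put("r1", {"name": "ann", "city": "nyc"})
table.put("r2", {"name": "bob", "city": "sf"})
table.put("r3", {"name": "cat", "city": "nyc"})
print(table.lookup("nyc"))
```

Because the index keys are sorted by column value first, a lookup is a short scan over adjacent keys rather than a pass over every row, which is the same reason an index table helps in a store that only sorts by row id. The sketch assumes the indexed values and row ids are strings; a real system also has to deal with distributed, non-atomic updates to the two tables, which this ignores.
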
> > > > Despite the fact that this is a common design pattern in BigTable-based
> > > > applications, most implementations of this practice to date have been
> > > > tightly coupled with a particular application. As a result, few
> > > > general-purpose frameworks for secondary indexing on BigTable-like
> > > > databases exist, and those that do are tied to a particular
> > > > implementation of the BigTable model.
> > > >
> > > > There are several existing tools (Solr, Lily), but these focus on
> > > > text-based search and are restricted to indexes created through their
> > > > own frameworks. What if you want to use your existing indexes? Or
> > > > leverage the indexes to do complex queries?
> > > >
> > > > We developed a solution to this problem called Culvert that supports
> > > > online index updates as well as a variation of the Hive query language.
> > > > In designing Culvert, we sought to make the solution pluggable so that
> > > > it can be used on any of the many BigTable-like databases (HBase,
> > > > Cassandra, etc.). Furthermore, it is also easily extensible to existing,
> > > > hand-rolled indexes.
> > > >
> > > > As well as being a secondary indexing framework, Culvert is also a query
> > > > execution mechanism - think Pig/Hive minus the fancy command line. We
> > > > support a subset of SQL, but are able to take full advantage of
> > > > home-rolled and built-in indexes, leading to query execution times
> > > > potentially orders of magnitude smaller than existing approaches, and
> > > > achieved with far less effort.
> > > >
> > > > -- Jesse
> > > > -------------------
> > > > Jesse Yates
> > > > 240-888-2200
> > > > @jesse_yates
> > > >
> > >
> >
> >
> >
> > --
> > -------------------
> > Jesse Yates
> > 240-888-2200
> > @jesse_yates
> >
>



-- 
-------------------
Jesse Yates
240-888-2200
@jesse_yates
