accumulo-dev mailing list archives

From Christopher <>
Subject Re: [VOTE] [BLOG] New blog draft "Apache Accumulo 1.6.0 Quickstart"
Date Wed, 21 May 2014 19:21:43 GMT
I still think the minimal commands would be best, perhaps with fewer
words to explain what each command does.

Perhaps the commands, with an abbreviated description, would be best in
a summary section?

# Make sure Hadoop and ZooKeeper are running.
export HADOOP_PREFIX=/path/to/hadoop
export ZOOKEEPER_HOME=/path/to/zookeeper
curl <url> -o <file>                # download the release
tar xf <file>                          # unpack the release
cd <dir>                              # navigate to the unpacked directory
bin/         # copy configuration, based on menu-driven user choices
bin/     # build the native implementation of the in-memory maps
bin/accumulo init                  # initialize accumulo in HDFS and ZooKeeper
bin/                      # start the Accumulo services
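
A minimal sketch of a check that could precede the commands above, assuming
a POSIX shell. The helper name check_dir_var is hypothetical and not part of
Accumulo. It only confirms that a variable names an existing directory; it
does not verify that Hadoop or ZooKeeper are actually running.

```shell
# Hypothetical helper: report whether a variable names an existing directory.
# $1 = variable name (for the message), $2 = its current value
check_dir_var() {
    name=$1
    value=$2
    if [ -z "$value" ]; then
        echo "$name is not set"
        return 1
    fi
    if [ ! -d "$value" ]; then
        echo "$name does not point to a directory: $value"
        return 1
    fi
    echo "$name ok: $value"
}

# Usage, before running the commands above:
#   check_dir_var HADOOP_PREFIX "$HADOOP_PREFIX" &&
#   check_dir_var ZOOKEEPER_HOME "$ZOOKEEPER_HOME"
```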

Additionally, I would emphasize the file to
populate the config area... rather than manually copying
configuration. I think that's been in since 1.5.x, and simply makes it
easier to select which config to copy. Additionally, in 1.6.0, it
helps produce better configs than the examples for some circumstances
(Hadoop 1, for instance), and you can simplify the discussion about
editing the files after copying to get Hadoop 1 support.
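
For comparison, the manual copy that the draft describes amounts to roughly
the following sketch. The helper name copy_example_config is hypothetical
(it is not shipped with Accumulo); the conf/examples layout and the
3GB/native-standalone example name are taken from the draft.

```shell
# Hypothetical helper: copy one example configuration into conf/.
# $1 = Accumulo install directory
# $2 = example name under conf/examples, such as 3GB/native-standalone
copy_example_config() {
    src="$1/conf/examples/$2"
    if [ ! -d "$src" ]; then
        echo "no such example configuration: $src" >&2
        return 1
    fi
    # Copy every file of the chosen example into the live conf directory.
    cp "$src"/* "$1/conf/"
}

# Usage:
#   copy_example_config "$ACCUMULO_HOME" 3GB/native-standalone
```

The copied files still need the Hadoop 1 edits by hand, which is the step
the script-based approach is meant to avoid.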

Christopher L Tubbs II

On Wed, May 21, 2014 at 2:44 PM, Josh Elser <> wrote:
> Keith and Christopher,
> I think I need to come up with a better title, because the intent of the
> post wasn't "how do I start 1.6.0", it's "what's changed in how I start
> 1.6.0 compared to older versions".
> Since this is a blog, I want to actually spend the time describing the what
> and why as to these changes. I don't want to just throw some shell commands
> up on the page because there's no context at all. That doesn't help someone
> understand why they're running their commands.
> I will add a code block to the "copy example configs" section to match the
> others, and perhaps mention as an alternative too
> (although I'll have to learn how it works too).
> On 5/21/14, 2:26 PM, Christopher wrote:
>> This is a bit wordy and could probably be reduced to:
>> # Make sure Hadoop and ZooKeeper are running.
>> export HADOOP_PREFIX=/path/to/hadoop
>> export ZOOKEEPER_HOME=/path/to/zookeeper
>> curl <url> -o <file>
>> tar xf <file>
>> cd <dir>
>> bin/
>> bin/
>> bin/accumulo init
>> bin/
>> --
>> Christopher L Tubbs II
>> On Tue, May 20, 2014 at 12:56 PM, Josh Elser <> wrote:
>>> Including plaintext for those who don't have a blog account (yet).
>>> On 5/20/14, 12:40 PM, Josh Elser wrote:
>>>> All,
>>>> I took a few moments to write up some of the details surrounding changes
>>>> that took place in 1.6.0. It covers downloading a release from us, the
>>>> changes with native maps and how to build them, how to choose example
>>>> configurations and then init'ing and starting Accumulo.
>>>> Feedback on grammar and inaccuracies would be humbly accepted. This
>>>> will be open for feedback for 3 days (until 2014/05/23 1700 UTC), after
>>>> which I'll promote it to the main blog.
>>>> Thanks!
>>> --- Plaintext draft content
>>> Getting Started with Apache Accumulo 1.6.0
>>> On May 12th, 2014, the Apache Accumulo project happily announced version
>>> 1.6.0 to the community. This is a new major release for the project which
>>> contains numerous new features and fixes. For the full list of notable
>>> changes, I'd recommend that you check out the release notes that were
>>> published alongside the release itself. For this post, I'd like to cover
>>> some of the installation-level changes that will be new to users who are
>>> already familiar with the project.
>>> Download the release
>>> As always, you can find our releases on our downloads page at
>>>  You have the choice of downloading
>>> the source and building it yourself, or choosing the binary tarball which
>>> already contains pre-built jars for use.
>>> Native Maps
>>> One of the major components of the original BigTable design was an
>>> "In-Memory Map" which provided fast insert and read operations. Accumulo
>>> implements this using a C++ sorted map with a custom allocator which is
>>> invoked by the TabletServer using JNI. Each TabletServer uses its own
>>> "native" map. It is highly desirable to use this native map as it comes
>>> with a notable performance increase over a Java map (which is the fallback
>>> when the Accumulo shared library is not found) in addition to greatly
>>> reducing the TabletServer's JVM garbage collector stress when ingesting
>>> data.
>>> In previous versions, the binary tarball contained a pre-compiled version
>>> of the native library (under lib/native/). Shipping a compiled binary was a
>>> convenience but also caused much confusion when it didn't work on systems
>>> which had different, incompatible versions of GCC toolchains installed than
>>> what the binary was built against. As such, we have stopped bundling the
>>> pre-built shared library in favor of users building this library on their
>>> own, and instead include an accumulo-native.tar.gz file within the lib
>>> directory which contains the necessary files to build the library yourself.
>>> To reduce the burden on users, we've also introduced a new script inside of
>>> the bin directory:
>>> Invoking this script will automatically unpack, build and install the native
>>> map in $ACCUMULO_HOME/lib/native. If you've used older versions of Accumulo,
>>> you will also notice that the library name is different in an attempt to
>>> better follow standard conventions: on Linux and
>>> libaccumulo.dylib on Mac OS X.
>>> Example Configurations
>>> Apache Accumulo still bundles a set of example configuration files in
>>> conf/examples. Each sub-directory contains the complete set of files to run
>>> on a single node with the named memory limitations. For example, the files
>>> contained in conf/examples/3GB/native-standalone will run Accumulo on a
>>> single node, with native maps (don't forget to build them first!), within a
>>> total memory footprint of 3GB. Copy the contents of one of these directories
>>> into conf/ and make sure that your relevant installation details (e.g.
>>> HADOOP_PREFIX, JAVA_HOME, etc) are properly set in
>>> The change in these scripts is that they default to using Apache Hadoop 2
>>> packaging details, such as the Hadoop conf directory and jar locations. It
>>> is highly recommended by the community that you use Apache Accumulo 1.6.0
>>> with at least Apache Hadoop 2.2.0, most notably to ensure that you will not
>>> lose data in the face of power failure. If you are still running on a Hadoop
>>> 1 release (1.2.1), you will need to edit both and
>>> accumulo-site.xml. There are comments in each file which instruct you what
>>> needs to be changed.
>>> Starting Accumulo
>>> Initializing and starting Accumulo hasn't changed at all! After you have
>>> created the configuration files and, if you're using them, built the native
>>> maps, run:
>>>    accumulo init
>>> This will prompt you to name your Accumulo instance and set the Accumulo
>>> root user's password, then start Accumulo using
>>>    $ACCUMULO_HOME/bin/
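
Since the draft notes that the TabletServer falls back to a Java map when the
shared library is not found, a quick check before starting might look like
the sketch below. The helper name has_native_map is hypothetical, and the
libaccumulo.* file pattern is an assumption based on the library name given
in the draft; adjust it if your build produces a different name.

```shell
# Hypothetical helper: succeed if a built native map library is present.
# $1 = Accumulo install directory
# Assumption: the built library matches lib/native/libaccumulo.*
has_native_map() {
    for lib in "$1"/lib/native/libaccumulo.*; do
        # If nothing matches, the pattern stays literal and -e is false.
        if [ -e "$lib" ]; then
            return 0
        fi
    done
    return 1
}

# Usage:
#   has_native_map "$ACCUMULO_HOME" || echo "native map not built"
```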
