accumulo-user mailing list archives

From Tim Piety <>
Subject Re: problem installing accumulo
Date Thu, 03 Jan 2013 19:26:34 GMT

I guess a little of both. In the Enron email set I have a bunch of folders
representing people. Each folder has subfolders that equate to mailboxes
(inbox, sent_mail, etc...). Each mailbox simply contains text files named
1, 2, 3, 4 that equate to an individual email.
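
As a rough sketch of walking that layout, assuming the tree is exactly root/person/mailbox/file with no deeper nesting (the function name `iter_emails` is hypothetical, not part of any dataset tooling):

```python
from pathlib import Path


def iter_emails(root):
    """Yield (owner, mailbox, path) for files laid out as root/owner/mailbox/file.

    Assumes exactly two directory levels above each email file; deeper
    nesting would need a recursive walk instead.
    """
    root = Path(root)
    for path in sorted(root.glob("*/*/*")):
        if path.is_file():
            # The first two path components below root are the person
            # folder and the mailbox folder.
            owner, mailbox = path.relative_to(root).parts[:2]
            yield owner, mailbox, path
```

Each yielded tuple carries the owner and mailbox alongside the file path, so both can be stored with the email later.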

Each email is a text file that is easy to parse into specific fields.  I
want to place those emails in accumulo and run some simple MapReduce for
the demo, similar to what I saw in some Cloudbase training last year.
What I didn't remember is how the tables were arranged.

I was just going to make each email, regardless of mailbox, a row in
accumulo and make the mailbox and owner separate columns (or column
qualifiers, to be more specific). My issue is the To and CC fields. Each can
be a list. I was thinking of making the column family "to" and the column
qualifier 1, 2, 3, ....  I could also make the column qualifier for the "to"
family the actual value "". I wasn't exactly sure of the best

Each email has a Message_ID and so far I think they are unique. If not I
can generate a unique ID.
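
A minimal sketch of the second option, with the recipient address as the column qualifier. It only builds plain (row, family, qualifier, value) tuples and does not call any Accumulo API. The family names "meta", "from", "to", "cc" and "body", and the function name `email_to_entries`, are assumptions made for illustration, not an established schema:

```python
from email import message_from_string
from email.utils import getaddresses


def email_to_entries(raw, owner, mailbox):
    """Turn one raw email into (row, family, qualifier, value) tuples.

    Row ID is the Message-ID header. Each To/Cc recipient becomes its own
    column qualifier with an empty value, so list length does not matter.
    """
    msg = message_from_string(raw)
    message_id = msg["Message-ID"]
    if message_id is None:
        # The caller would have to generate a unique ID in this case.
        raise ValueError("email has no Message-ID header")
    row = message_id.strip()

    entries = [
        (row, "meta", "owner", owner),
        (row, "meta", "mailbox", mailbox),
        (row, "from", (msg["From"] or "").strip(), ""),
    ]
    for header, family in (("To", "to"), ("Cc", "cc")):
        # getaddresses splits comma-separated address lists into pairs.
        for _name, addr in getaddresses(msg.get_all(header, [])):
            if addr:
                entries.append((row, family, addr, ""))
    # Assumes a plain-text, non-multipart message body.
    entries.append((row, "body", "", msg.get_payload()))
    return entries
```

With the address in the qualifier, finding mail to a given person within a row means looking for one specific column rather than reading every numbered column. The numbered-qualifier alternative would instead store the address in the value.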

Again this will be for a simple demo where people may want to search from
some person, to some person and maybe for specific terms in the body of the

Hope this gives a good idea of what I am trying to do. Feel free to ask any
other questions you may have if I wasn't clear enough. Again I have more
experience working with existing structures. I am trying to use this
experience to learn a little about how to organize the data.

thanks in advance,


On Thu, Jan 3, 2013 at 12:28 PM, John Vines <> wrote:

> Are you looking for generic pointers for it or do you have specific
> questions? Feel free to ask away and someone will be able to help.
> John
> On Thu, Jan 3, 2013 at 12:23 PM, Tim Piety <> wrote:
>> John,
>> No, I hadn't. Thank you, that was it. I took another look at the install doc
>> and didn't see this step in there.  I then looked at the README file on the
>> ACCUMULO website and it is in there.
>> I was able to start accumulo and then start an accumulo shell and execute
>> the tables command and it listed !METADATA. I presume that this means I am
>> up and running.
>> I am going to use the enron dataset for my demo. I do have a few
>> questions regarding how to structure it if you don't mind a few more
>> questions.
>> thanks again.
>> Tim
>> On Thu, Jan 3, 2013 at 12:07 PM, John Vines <> wrote:
>>> Did you initialize accumulo by running bin/accumulo init?
>>> On Thu, Jan 3, 2013 at 12:02 PM, Tim Piety <> wrote:
>>>> Hi,
>>>> I posted a message to the dev list before Xmas and got no response. I
>>>> figured I'd try this list. If this is not the correct forum can someone
>>>> please let me know what the correct forum is. I am trying to install
>>>> accumulo for a simple demo. I have hadoop installed and running. I verified
>>>> by testing a mapreduce program and I can look at the HDFS system.
>>>> When I try to start accumulo I get an INFO message saying attempting to
>>>> talk to zookeeper. I verified zookeeper is running and I can access it
>>>> using the The next line to display is INFO :Waiting for accumulo
>>>> to be initialized. That line repeats infinitely.
>>>> I looked at the logs and get a message in the tserver_localhost.out
>>>> saying unable obtain instance id at /accumulo/instance_id. A quick web
>>>> search found a message (
>>>> saying I needed to put the HADOOP/conf directory in  my CLASSPATH. I tried
>>>> that, but that did not work.
>>>> I have looked and didn't find any other groups where I could post a
>>>> question.
>>>> thanks,
>>>> Tim
