From: Keys Botzum
To: user@accumulo.apache.org
Subject: Re: Accumulo on MapR Continued
Date: Fri, 20 Apr 2012 09:35:15 -0400

Keith,

I was able to get Accumulo to use an Accumulo-specific configuration of Hadoop. It was a bit of a hack. Basically I created a fake Hadoop installation tree that is almost entirely symbolic links to the real tree under /opt/mapr/hadoop. The only real file in the tree is core-site.xml, where I set the two MapR properties (fs.mapr.readbuffering and fs.mapr.aggregate.writes).

The essential steps were:

cd /opt/accumulo
mkdir hadoop
mkdir hadoop/hadoop-0.20.2
cd hadoop/hadoop-0.20.2
ln -s /opt/mapr/hadoop/hadoop-0.20.2/* .
rm conf
mkdir conf
cd conf
ln -s /opt/mapr/hadoop/hadoop-0.20.2/conf/* .
cp core-site.xml t
mv t core-site.xml
edit core-site.xml as needed
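
For anyone who wants to repeat this, here is a rough sketch of the same steps as a shell function. The function name make_shadow_hadoop and its two arguments are my own invention for illustration, not anything shipped with Accumulo or MapR, and it assumes the real tree is given as an absolute path so the links resolve from anywhere.

```shell
#!/bin/sh
# Sketch only: build a "shadow" Hadoop tree made of symlinks, with a private
# conf/core-site.xml. make_shadow_hadoop is a hypothetical helper name.
make_shadow_hadoop() {
    real=$1      # absolute path to the real tree, e.g. /opt/mapr/hadoop/hadoop-0.20.2
    shadow=$2    # new tree to create, e.g. /opt/accumulo/hadoop/hadoop-0.20.2

    mkdir -p "$shadow" || return 1
    # Link every top-level entry of the real tree into the shadow tree.
    ln -s "$real"/* "$shadow"/ || return 1

    # Replace the conf symlink with a real directory of per-file links.
    rm "$shadow/conf" || return 1
    mkdir "$shadow/conf" || return 1
    ln -s "$real"/conf/* "$shadow/conf/" || return 1

    # Swap the core-site.xml link for a private copy (cp follows the link,
    # mv then replaces the link itself with the copied file).
    cp "$shadow/conf/core-site.xml" "$shadow/conf/core-site.xml.tmp" || return 1
    mv "$shadow/conf/core-site.xml.tmp" "$shadow/conf/core-site.xml" || return 1
}
```

Called as make_shadow_hadoop /opt/mapr/hadoop/hadoop-0.20.2 /opt/accumulo/hadoop/hadoop-0.20.2, it should leave a private core-site.xml that still has to be edited by hand.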

Then I set HADOOP_HOME in accumulo-env.sh to the new hadoop-0.20.2 directory and everything worked fine.
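
In case it helps, this is roughly what that setting looks like, together with a small sanity check I would run before starting Accumulo. The path is assumed from the steps above. check_private_conf is a hypothetical helper of mine, and it only checks that the two property names appear on a single line each, not that their values are false.

```shell
#!/bin/sh
# Assumed location of the symlink tree built in the steps above.
HADOOP_HOME=/opt/accumulo/hadoop/hadoop-0.20.2
export HADOOP_HOME

# check_private_conf (hypothetical helper): succeeds only if FILE is a regular,
# non-symlink file that names both MapR properties.
check_private_conf() {
    file=$1
    if [ -L "$file" ] || [ ! -f "$file" ]; then
        echo "$file is missing or still a symlink" >&2
        return 1
    fi
    for prop in fs.mapr.readbuffering fs.mapr.aggregate.writes; do
        if ! grep -q "<name>$prop</name>" "$file"; then
            echo "$file does not set $prop" >&2
            return 1
        fi
    done
    return 0
}
```

Running check_private_conf "$HADOOP_HOME/conf/core-site.xml" should catch the case where the file is still a link into the shared MapR tree.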

By the way, I tried setting HADOOP_CONF_DIR and that had no effect.

Since I plan to document these steps, I want to make sure I understood your intent and that I haven't missed something. Typically in Hadoop components the effective configuration is a combination of each component's *-site.xml file. As a result I can set things in, for example, hbase-site.xml that are really Hadoop properties. Assuming I understood what you and Eric were saying, this is not true in Accumulo: settings in accumulo-site.xml do not end up in the Hadoop config object. That's fine by me, but I just want to make sure I'm not saying things that aren't true.

Thanks again for all of your help,
Keys

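p.s. I'm running the random walk and continuous ingest tests you and Eric suggested as we speak. The random test completed successfully.

________________________________
Keys Botzum
Senior Principal Technologist
WW Systems Engineering
kbotzum@maprtech.com
443-718-0098
MapR Technologies
http://www.mapr.com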



On Apr 18, 2012, at 3:11 PM, Keith Turner wrote:

> I suppose accumulo could be pointed to a different hadoop config dir.

> On Wed, Apr 18, 2012 at 1:58 PM, Keys Botzum <kbotzum@maprtech.com> wrote:
>> Eric and Keith,
>>
>> I will attempt the additional tests you have suggested.
>>
>> Any ideas on what to do regarding those configuration properties? With hbase
>> in hbase-site.xml, we set those properties and they work fine. Is there some
>> incantation I'm missing here? I really don't want those properties to be
>> global as they will negatively impact performance and are only relevant to
>> hbase and Accumulo.
>>
>> Thanks,
>> Keys

>> On Apr 18, 2012, at 1:42 PM, Keith Turner wrote:
>>
>> Settings in accumulo-site.xml do not end up in the hadoop config
>> object, so setting them will probably have no effect.
>>
>> I would suggest running the continuous ingest test and random walk test if
>> you really want to stress it. These are the tests we use prior to an
>> accumulo release. You would need to exclude the random walk security
>> test; it triggers known bugs in 1.4 that are not fixed.
>>
>> Running the tests on a cluster overnight would be good.
>>
>> Keith

>> On Wed, Apr 18, 2012 at 1:17 PM, Keys Botzum <kbotzum@maprtech.com> wrote:
>>
>> Thanks to the help of Keith, Todd, and Eric, as well as MapR engineering,
>> all of the Accumulo tests in test/system/auto are now passing. Note that the
>> latelastcontact test only passes if I actually install zookeeper on the
>> host. This is because of the dependency on zkCli.sh that I mentioned
>> earlier.
>>
>> The final piece of the puzzle was that MapR does aggressive read-ahead
>> caching of data as well as aggregation of writes to improve performance. As
>> with HBase, we don't think this type of behavior is helpful with something
>> like Accumulo. In our specific case, the interaction between Accumulo and
>> MapR's behavior results in the large row test failing.
>>
>> So now I have one more question. To disable the caching and aggregation
>> behavior, we need to set these properties:

>> <property>
>>   <name>fs.mapr.readbuffering</name>
>>   <value>false</value>
>> </property>
>>
>> <property>
>>   <name>fs.mapr.aggregate.writes</name>
>>   <value>false</value>
>> </property>


>> If I set them in core-site.xml they of course work but that's a global
>> setting. I want to only affect Accumulo. If I set them in accumulo-site.xml,
>> I presume they take effect for normal Accumulo usage, but I'm nearly certain
>> that settings in accumulo-site.xml do not affect the tests as I posted
>> earlier. How can I set those two properties in a way that will cause the
>> tests' temporary configuration to take them into account? I tried editing
>> the TestUtils.py TestUtilsMixin settings, which did work for the Accumulo
>> property table.file.compress.type, but the MapR related properties don't
>> seem to take. Ideas?
>>
>> Also, if all of the auto tests pass successfully do you feel comfortable
>> that the testing was sufficient or do you recommend running additional
>> tests?
>>
>> Thanks!
>>
>> Keys
