accumulo-user mailing list archives

From Keys Botzum <kbot...@maprtech.com>
Subject Re: Accumulo on MapR Continued
Date Fri, 20 Apr 2012 13:35:15 GMT
Keith,

I was able to get Accumulo to use an Accumulo-specific Hadoop configuration. It was a bit
of a hack. Basically I created a fake Hadoop installation tree that is almost entirely symbolic
links to the real tree under /opt/mapr/hadoop. The only real file in the tree is core-site.xml,
where I set the two properties. The essential steps were:
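	cd /opt/accumulo
	mkdir hadoop
	mkdir hadoop/hadoop-0.20.2
	cd hadoop/hadoop-0.20.2
	# link every top-level entry of the real tree
	ln -s /opt/mapr/hadoop/hadoop-0.20.2/* .
	# replace the conf symlink with a real directory of per-file links
	rm conf
	mkdir conf
	cd conf
	ln -s /opt/mapr/hadoop/hadoop-0.20.2/conf/* .
	# turn the core-site.xml symlink into a private copy
	cp core-site.xml t
	mv t core-site.xml
	# then edit core-site.xml as needed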

Then I set HADOOP_HOME in accumulo-env.sh to that directory and everything worked fine.
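
For anyone repeating this, the steps above could be wrapped in a small script along these lines. This is only a sketch: make_shadow_hadoop is a made-up helper name, not something shipped with Accumulo or MapR, and it assumes both paths are absolute and that the real tree contains a conf/core-site.xml.

```shell
#!/bin/sh
# Sketch: build a "shadow" Hadoop tree made of symlinks, with a private
# core-site.xml. make_shadow_hadoop is a hypothetical helper name.
# Both arguments are assumed to be absolute paths, so the links resolve.
make_shadow_hadoop() {
    real_home=$1      # e.g. /opt/mapr/hadoop/hadoop-0.20.2
    shadow_home=$2    # e.g. /opt/accumulo/hadoop/hadoop-0.20.2

    mkdir -p "$shadow_home" || return 1
    # Link every top-level entry of the real tree into the shadow tree.
    ln -s "$real_home"/* "$shadow_home"/ || return 1

    # Replace the conf symlink with a real directory of per-file symlinks.
    rm "$shadow_home/conf" || return 1
    mkdir "$shadow_home/conf" || return 1
    ln -s "$real_home"/conf/* "$shadow_home/conf/" || return 1

    # Swap the core-site.xml symlink for a private, editable copy.
    cp "$real_home/conf/core-site.xml" "$shadow_home/conf/core-site.xml.tmp" || return 1
    mv "$shadow_home/conf/core-site.xml.tmp" "$shadow_home/conf/core-site.xml"
}
```

With the paths from this message, it would be called as `make_shadow_hadoop /opt/mapr/hadoop/hadoop-0.20.2 /opt/accumulo/hadoop/hadoop-0.20.2`, and accumulo-env.sh would then presumably contain `export HADOOP_HOME=/opt/accumulo/hadoop/hadoop-0.20.2`.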

By the way, I tried setting HADOOP_CONF_DIR and that had no effect.

Since I plan to document these steps, I want to make sure I understood your intent and that
I haven't missed something. Typically in Hadoop components the ultimate configuration is a
combination of each component's *-site.xml files. As a result I can set things in, for example,
hbase-site.xml that are really Hadoop properties. Assuming I understood what you and Eric
were saying, this is not true in Accumulo. That's fine by me, but I just want to make sure
I'm not saying things that aren't true.
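
Since Hadoop properties have to live in a Hadoop config file rather than accumulo-site.xml, the "edit core-site.xml as needed" step amounts to adding fs.mapr.readbuffering and fs.mapr.aggregate.writes (both false) to the private copy. A rough sketch of doing that from a script follows; add_hadoop_property is a hypothetical helper, and it assumes the closing </configuration> tag sits on its own line.

```shell
#!/bin/sh
# Sketch: append a <property> element to a Hadoop *-site.xml file.
# add_hadoop_property is a hypothetical helper, not a Hadoop or MapR tool.
# Assumes </configuration> is on a line of its own.
add_hadoop_property() {
    file=$1
    name=$2
    value=$3

    # Refuse to touch a symlink: editing through it would change the
    # shared, cluster-wide file instead of the private copy.
    if [ -L "$file" ]; then
        echo "$file is a symlink, not editing" >&2
        return 1
    fi

    # Do nothing if the property is already present.
    if grep -qF "<name>$name</name>" "$file"; then
        return 0
    fi

    # Print the new property just before the closing tag.
    awk -v n="$name" -v v="$value" '
        /<\/configuration>/ && !done {
            print "  <property>"
            print "    <name>" n "</name>"
            print "    <value>" v "</value>"
            print "  </property>"
            done = 1
        }
        { print }
    ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

For example, `add_hadoop_property conf/core-site.xml fs.mapr.readbuffering false` followed by the same call for fs.mapr.aggregate.writes, run from inside the Accumulo-specific Hadoop tree.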

Thanks again for all of your help,
Keys

p.s. I'm running the random and ingest tests you and Eric suggested as we speak. The random
test completed successfully.
________________________________
Keys Botzum
Senior Principal Technologist
WW Systems Engineering
kbotzum@maprtech.com
443-718-0098
MapR Technologies
http://www.mapr.com



On Apr 18, 2012, at 3:11 PM, Keith Turner wrote:

> I suppose accumulo could be pointed to a different hadoop config dir.
> 
> On Wed, Apr 18, 2012 at 1:58 PM, Keys Botzum <kbotzum@maprtech.com> wrote:
>> Eric and Keith,
>> 
>> I will attempt the additional tests you have suggested.
>> 
>> Any ideas on what to do regarding those configuration properties? With hbase
>> in hbase-site.xml, we set those properties and they work fine. Is there some
>> incantation I'm missing here? I really don't want those properties to be
>> global as they will negatively impact performance and are only relevant to
>> hbase and Accumulo.
>> 
>> Thanks,
>> Keys
>> 
>> On Apr 18, 2012, at 1:42 PM, Keith Turner wrote:
>> 
>> Settings in accumulo-site.xml do not end up in the hadoop config
>> object, so setting them will probably have no effect.
>> 
>> I would suggest running the continuous ingest test and the random walk test if
>> you really want to stress it.  These are the tests we use prior to an
>> accumulo release.  You would need to exclude the random walk security
>> test; it triggers known bugs in 1.4 that are not fixed.
>> 
>> Running the test on a cluster overnight would be good.
>> 
>> Keith
>> 
>> On Wed, Apr 18, 2012 at 1:17 PM, Keys Botzum <kbotzum@maprtech.com> wrote:
>> 
>> Thanks to the help of Keith, Todd, and Eric, as well as MapR engineering,
>> all of the Accumulo tests in test/system/auto are now passing. Note that the
>> latelastcontact test only passes if I actually install zookeeper on the
>> host. This is because of the dependency on zkCli.sh that I mentioned
>> earlier.
>> 
>> 
>> The final piece of the puzzle was that MapR does aggressive read-ahead
>> caching of data as well as aggregation of writes to improve performance. As
>> with HBase, we don't think this type of behavior is helpful with something
>> like Accumulo. In our specific case, the interaction between Accumulo and
>> MapR's behavior results in the large row test failing.
>> 
>> 
>> So now I have one more question. To disable the caching and aggregation
>> behavior, we need to set these properties:
>> 
>> <property>
>>   <name>fs.mapr.readbuffering</name>
>>   <value>false</value>
>> </property>
>> 
>> <property>
>>   <name>fs.mapr.aggregate.writes</name>
>>   <value>false</value>
>> </property>
>> 
>> 
>> If I set them in core-site.xml they of course work, but that's a global
>> setting. I want to only affect Accumulo. If I set them in accumulo-site.xml,
>> I presume they take effect for normal Accumulo usage, but I'm nearly certain
>> that settings in accumulo-site.xml do not affect the tests, as I posted
>> earlier. How can I set those two properties in a way that will cause the
>> tests' temporary configuration to take them into account? I tried editing
>> the TestUtilsMixin settings in TestUtils.py, which did work for the Accumulo
>> property table.file.compress.type, but the MapR related properties don't seem
>> to take. Ideas?
>> 
>> 
>> Also, if all of the auto tests pass successfully do you feel comfortable
>> that the testing was sufficient or do you recommend running additional
>> tests?
>> 
>> 
>> Thanks!
>> 
>> Keys
>> 

