incubator-chukwa-user mailing list archives

From Corbin Hoenes <cor...@tynt.com>
Subject 0.3.0 config file
Date Sun, 02 May 2010 13:51:15 GMT
I'm reprocessing a bunch of data: about 45 days' worth, at roughly 70GB per day.  It's taking a
while. Is there some configuration that might help demux perform better when it's fed a lot of
files?  I've noticed that sort takes a long time when it has too many maps.  Can I lower the
number of maps, or tune anything else along those lines?

I saw this in the config but noticed the values are still TODO placeholders.  Is there anything here I should configure?

<!-- Chukwa Job parameters -->
	<property>
	  <name>io.sort.mb</name>
	  <value>@TODO-DEMUX-IO-SORT-MB@</value>
	  <description>The total amount of buffer memory to use while sorting
	  files, in megabytes.  By default, gives each merge stream 1MB, which
	  should minimize seeks.</description>
	</property>

	<property>
	  <name>fs.inmemory.size.mb</name>
	  <value>@TODO-DEMUX-FS-INMEMORY-SIZE_MB@</value>
	  <description>The size of the in-memory filesystem instance in MB</description>
	</property>

	<property>
	  <name>io.sort.factor</name>
	  <value>@TODO-DEMUX-IO-SORT-FACTOR@</value>
	  <description>The number of streams to merge at once while sorting
	  files.  This determines the number of open file handles.</description>
	</property>
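
For illustration, a filled-in version of two of these properties might look like the sketch below.  The
values are only the stock Hadoop 0.20 defaults for io.sort.mb and io.sort.factor (assuming they carry
over to this setup), not tuned recommendations for demux.

	<!-- Sketch only: placeholder values, not tuned for this workload -->
	<property>
	  <name>io.sort.mb</name>
	  <!-- assumption: stock Hadoop default, in megabytes -->
	  <value>100</value>
	</property>

	<property>
	  <name>io.sort.factor</name>
	  <!-- assumption: stock Hadoop default, number of streams merged at once -->
	  <value>10</value>
	</property>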

