flume-user mailing list archives

From Jagadish Bihani <jagadish.bih...@pubmatic.com>
Subject Flume throughput correlation with RAM
Date Tue, 09 Oct 2012 07:46:00 GMT
Hi

My Flume setup is:

Source Agent : cat source - File Channel - Avro Sink
Dest Agent :     avro source - File Channel - HDFS Sink.

There is only 1 source agent and 1 destination agent.

I measure throughput as the amount of data written to HDFS per second.
(The rolling interval is 30 sec, so if a 60 MB file is generated in
30 sec, the throughput is 2 MB/sec.)
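
The throughput figure above is just file size divided by roll interval.
A minimal sketch of that arithmetic follows; the function name
throughput_mb_per_sec is hypothetical and is not part of Flume:

```python
def throughput_mb_per_sec(file_size_mb: float, roll_interval_sec: float) -> float:
    """Average write rate implied by one rolled HDFS file.

    Assumes exactly one file is rolled per interval and that the sink
    was writing for the whole interval.
    """
    if roll_interval_sec <= 0:
        raise ValueError("roll interval must be positive")
    return file_size_mb / roll_interval_sec


# Figures from the message: a 60 MB file rolled after 30 seconds.
print(throughput_mb_per_sec(60, 30))
```

This is an average over the roll interval, so it would hide short
stalls or bursts inside those 30 seconds.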

I have run the source agent on various machines with different hardware
configurations.
(In all cases I run the Flume agent with the Java options
-DJAVA_OPTS="-Xms500m -Xmx1g -Dcom.sun.management.jmxremote
-XX:MaxDirectMemorySize=2g")

The JDK is 32-bit.

Experiment 1:
=====
RAM : 16 GB
Processor: Intel Xeon E5620 @ 2.40 GHz (16 cores).
64 bit Processor with 64 bit Kernel.
Throughput: 2 MB/sec

Experiment 2:
======
RAM : 4 GB
Processor: Intel Xeon E5504 @ 2.00 GHz (4 cores).
64 bit Processor with 32 bit Kernel.
Throughput : 30 KB/sec

Experiment 3:
======
RAM : 8 GB
Processor: Intel Xeon E5520 @ 2.27 GHz (16 cores).
64 bit Processor with 32 bit Kernel.
Throughput : 80 KB/sec

-- As can be seen, there is a huge difference in throughput with the
same Flume configuration but different hardware.
-- In the first case, where throughput is highest, RES (resident
memory) is around 160 MB; in the other cases it is in the range of
40 MB - 50 MB.
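
To put the gap in numbers, here is a rough sketch that compares the
three measurements against Experiment 1. It assumes 1 MB = 1024 KB,
and the names used (throughput_kb_per_sec, slowdown_factors) are
purely illustrative:

```python
KB_PER_MB = 1024  # assumption: binary units; use 1000 for decimal units

# Throughput reported for each experiment, converted to KB/sec.
throughput_kb_per_sec = {
    "Experiment 1": 2 * KB_PER_MB,  # 2 MB/sec
    "Experiment 2": 30,             # 30 KB/sec
    "Experiment 3": 80,             # 80 KB/sec
}


def slowdown_factors(measurements: dict, baseline: str) -> dict:
    """Return how many times slower each run is than the baseline run."""
    base = measurements[baseline]
    return {name: base / value for name, value in measurements.items()}


factors = slowdown_factors(throughput_kb_per_sec, "Experiment 1")
for name, factor in factors.items():
    print(f"{name}: {factor:.1f}x slower than Experiment 1")
```

Under that unit assumption, the two 32-bit-kernel machines come out
roughly 68 and 26 times slower than the 64-bit-kernel machine.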

Can anybody please give some insight into why there is such a huge
difference in throughput?
What is the correlation between RAM and File Channel/HDFS sink
performance, and also with a 32-bit vs. 64-bit kernel?

Regards,
Jagadish
