directory-dev mailing list archives

From "Alex Karasulu" <aok...@bellsouth.net>
Subject Unusual file sizes
Date Fri, 16 Apr 2004 18:49:57 GMT
Hi,

I was running some tests to see just how much of a performance boost 
is gained by using primitive types over Object wrappers for 
primitives.  I performed an experiment using an int decoder/encoder 
which is attached.  The encoder/decoder behaves just like JE's 
Integer binding - I took the code straight out of TupleInput and 
TupleOutput.  It turns out that for 10K keys, each with 5 
duplicates (that's 50K records total), there was less than a 7% 
improvement in run time for some tests.  I've attached the 
tests as well.  Take a look at testIntStress() and
testIntegerStress() specifically.
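
For readers without the attachment, here is a minimal sketch of a 4-byte int encoder/decoder of this kind. It is not the attached code: the class name IntCodec is hypothetical, and the layout (sign bit flipped, big-endian) is an assumption about how TupleOutput.writeInt keeps encoded ints sorted in numeric order.

```java
import java.util.Arrays;

// Hypothetical stand-in for the attached int encoder/decoder.
// Assumed layout: flip the sign bit, then write 4 bytes big-endian,
// so that unsigned byte-wise comparison matches signed int ordering.
public final class IntCodec {
    private IntCodec() {}

    /** Encodes a signed int into exactly 4 bytes. */
    public static byte[] encode(int value) {
        int u = value ^ 0x80000000; // flip the sign bit
        return new byte[] {
            (byte) (u >>> 24), (byte) (u >>> 16), (byte) (u >>> 8), (byte) u
        };
    }

    /** Decodes 4 bytes produced by encode() back into a signed int. */
    public static int decode(byte[] bytes) {
        if (bytes.length != 4) {
            throw new IllegalArgumentException("expected 4 bytes, got " + bytes.length);
        }
        int u = ((bytes[0] & 0xFF) << 24)
              | ((bytes[1] & 0xFF) << 16)
              | ((bytes[2] & 0xFF) << 8)
              |  (bytes[3] & 0xFF);
        return u ^ 0x80000000; // undo the sign-bit flip
    }

    public static void main(String[] args) {
        int[] samples = {Integer.MIN_VALUE, -1, 0, 1, 42, Integer.MAX_VALUE};
        for (int s : samples) {
            byte[] b = encode(s);
            System.out.println(s + " -> " + Arrays.toString(b) + " -> " + decode(b));
        }
    }
}
```

Working with a primitive int this way avoids allocating an Integer per key or value, which is the cost the two stress tests are meant to compare.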

It's interesting that going the primitive route has not really added all 
that much to the performance.  I guess Java is pretty well optimized 
for Object creation - at least the Sun 1.4.2-b28 JVM is.  Also the binding 
APIs are pretty efficient.

Now the most interesting aspect is the massive 7 MB file that 
is generated for both: both tests produce a file of exactly the same size.  

Listing the test environment directory I get the following:

$ ls -ls ../testEnv/
total 7153
7153 -rwxr-xr-x    1 akarasul None      7323907 Apr 16 14:27 00000000.jdb
   0 -rwxr-xr-x    1 akarasul None            0 Apr 16 14:27 je.lck

Notice the 7,323,907 bytes, which is almost 7 MB.  Now the total 
bytes including keys should be around 400 KB.  I estimated the 
space generously like so:

   (10K keys) * (5 dups per key)   = 50K records
   (50K records) * (4-byte values) = 200 KB

Now double it to account for the 4-byte keys and we get 400 KB.  

So basically the file is about 18 times the size of the data I'm 
putting in.  I realize the ratio is higher with small records, but it 
leads me to the question: how much overhead is generated per database 
tuple?  Also, is the overhead constant?
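
To put rough numbers on that question, the arithmetic above can be sketched as follows. The class name OverheadEstimate is hypothetical, K is taken as 1,000, and the per-record figure is only an average that treats the overhead as evenly spread over all 50K records, which is exactly what is not yet known.

```java
// Back-of-the-envelope overhead estimate from the figures in this message.
// Assumptions: K = 1,000; every record carries a 4-byte key and a 4-byte
// value; the average spreads all non-payload bytes evenly across records.
public final class OverheadEstimate {
    private OverheadEstimate() {}

    /** Raw payload size: records * (key bytes + value bytes). */
    public static long payloadBytes(long keys, long dupsPerKey, long keyBytes, long valueBytes) {
        return keys * dupsPerKey * (keyBytes + valueBytes);
    }

    /** How many times larger the file is than the raw payload. */
    public static double ratio(long fileBytes, long payloadBytes) {
        return (double) fileBytes / payloadBytes;
    }

    /** Average non-payload bytes per record. */
    public static double overheadPerRecord(long fileBytes, long payloadBytes, long records) {
        return (double) (fileBytes - payloadBytes) / records;
    }

    public static void main(String[] args) {
        long keys = 10_000, dups = 5, records = keys * dups;
        long fileBytes = 7_323_907L; // size of 00000000.jdb from the listing
        long payload = payloadBytes(keys, dups, 4, 4);

        System.out.println("payload bytes       = " + payload);
        System.out.printf("file / payload      = %.1f%n", ratio(fileBytes, payload));
        System.out.printf("overhead per record = %.1f bytes%n",
                overheadPerRecord(fileBytes, payload, records));
    }
}
```

With these inputs the payload is 400,000 bytes, the ratio is about 18.3, and the average overhead comes to roughly 138 bytes per record.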

Alex

