jackrabbit-users mailing list archives

From "Thomas Müller" <thomas.muel...@day.com>
Subject Re: Efficient but simple hash for storing lot of unidentifiable contents
Date Thu, 08 Jan 2009 09:15:43 GMT
Hi,

What about:

1) Start with the node of your choice (let's say /data/)
2) Count the child nodes of that node
3) If there are more than 50: pick a child node at random from the range
00-50 and continue at step 2) with that child
4) Otherwise, create a node with a random name in the current node, done
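
To make the steps concrete, here is a rough sketch in Python. It does not use the JCR API; `Node` is a hypothetical in-memory stand-in for a repository node, and I am assuming that the child nodes 00-50 are created on demand the first time they are picked and that leaves get a UUID-based name. Treat it as an illustration of the descent, not as tested repository code.

```python
import random
import uuid

MAX_CHILDREN = 50  # threshold from step 3


class Node:
    """Hypothetical in-memory stand-in for a repository node."""

    def __init__(self, name):
        self.name = name
        self.children = {}  # child name -> Node


def store(root, rng=random):
    """Descend from root until a node has room, add a leaf, return its path."""
    node, path = root, [root.name]
    # Steps 2 and 3: while the current node is full, move to a random child 00-50.
    while len(node.children) > MAX_CHILDREN:
        bucket = "%02d" % rng.randint(0, 50)
        # Assumption: a missing bucket node is created the first time it is picked.
        node = node.children.setdefault(bucket, Node(bucket))
        path.append(bucket)
    # Step 4: create the new node under a random name.
    leaf = uuid.uuid4().hex
    node.children[leaf] = Node(leaf)
    path.append(leaf)
    return "/".join(path)


if __name__ == "__main__":
    data = Node("/data")
    for _ in range(200):
        print(store(data))
```

In this sketch a node ends up holding at most 51 leaves plus the 51 bucket nodes, and the tree only grows deeper once the levels above are full.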

I would make sure there are no same-name siblings (for multiple reasons).
There are at least two solutions:

A) Disallow same name siblings, use node names n00-n50, and let the
algorithm re-try if there is a clash. I'm not sure if that works well with
clustering.

B) Use a name generated by a cryptographically secure pseudo-random number
generator (for example a random UUID) as the node name. This works well with
clustering; the node names will be quite long, however.
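
A small sketch of both naming options, again in plain Python with a dict standing in for the child nodes of one parent (the function names are made up for illustration). Option A retries on a clash; note that the check-then-insert here is not atomic, which is the kind of thing that could cause trouble in a cluster. Option B relies on `uuid.uuid4()`, which gives a 36-character name.

```python
import random
import uuid


def add_with_retry(children, rng=random, max_attempts=1000):
    """Option A: pick a name in n00-n50 and retry if it is already taken."""
    for _ in range(max_attempts):
        name = "n%02d" % rng.randint(0, 50)
        if name not in children:  # clash check; not atomic across cluster nodes
            children[name] = {}
            return name
    raise RuntimeError("no free name found in n00-n50")


def add_with_uuid(children):
    """Option B: use a random UUID as the node name; clashes are not expected."""
    name = str(uuid.uuid4())
    children[name] = {}
    return name


if __name__ == "__main__":
    siblings = {}
    print(add_with_retry(siblings))
    print(add_with_uuid(siblings))
```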

Disadvantage:
- Possibly performance. To improve that, you could keep the information 'root
is full' in memory and start directly at a random node in /data/00 - /data/50
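
The 'root is full' shortcut could look roughly like this. `StartChooser` and the `count_children` callable are hypothetical; the callable stands for whatever lookup returns the number of child nodes at a path. One caveat of this sketch: once the flag is set it is never cleared, so holes in the root itself would no longer be refilled unless you reset it from time to time.

```python
import random

MAX_CHILDREN = 50


class StartChooser:
    """Remembers in memory that /data is full, to skip the check at the root."""

    def __init__(self, count_children, rng=random):
        self.count_children = count_children  # hypothetical: path -> child count
        self.rng = rng
        self.root_full = False

    def start(self):
        """Return the path at which the descent should begin."""
        if not self.root_full:
            # Only hit the repository while the root is not known to be full.
            self.root_full = self.count_children("/data") > MAX_CHILDREN
        if self.root_full:
            return "/data/%02d" % self.rng.randint(0, 50)
        return "/data"


if __name__ == "__main__":
    chooser = StartChooser(lambda path: 51)
    print(chooser.start())
```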

Advantages:
- No counter required
- No synchronization required
- Holes are automatically filled

Regards,
Thomas
