jackrabbit-users mailing list archives

From Alexander Klimetschek <aklim...@day.com>
Subject Re: A generic question about JackRabbit
Date Sun, 28 Mar 2010 17:28:06 GMT
On Sat, Mar 27, 2010 at 00:16, Rami Ojares <rami.ojares@gmail.com> wrote:
> But Alex made a few radical comments so I just can't refrain from
> commenting on them a little.
>
>> I think normalization is often thought to be absolutely fundamental
>> for any data schema, because it is a central part of RDBMS and this is
>> basically the only thing that was taught the last 20-30 years (went
>> this route myself). But a major reason for normalization was simply
>> the space constraint, it's not fundamental at all.
>>
>
> Data normalization is concerned with issues of how the relational model
> should be used.
> If you don't have the data in 1st normal form then the relational operators
> and other tools
> that the relational model provides are unable to query the data.

Yes, and all I was saying is that, because of space constraints, a
storage model built around normalization won. I didn't say that
normalization is generally bad, and I didn't question the obvious fact
that normalization is a central concept of an RDBMS.

>> The same story with the ACID constraints... banking accounts are not
>> the only software application nowadays ;-)
>>
>
> Wow, that came out of the bush and hit me right there in the back between my
> shoulder blades ;-)
>
> Basically what you are saying is that when you update data in your storage
> system (that could be any data model)
> you don't really care if only part of the data you sent there got updated.
> (Atomicity)
> You don't care whether the rules you have set for your data are
> respected. (Consistency)
> If many threads are accessing the data, it does not concern you if they see
> each other's updates
> partially and modify each other's data while their operations are underway.
> (Isolation)
> And certainly you don't give a toss about whether your data really stays in
> the storage once you have put it there. (Durability)
>
> The reason why the ACID properties are associated with the banking industry
> is that people demand correctness when banks deal with their money.

Yes, I didn't question that, I just noted that banking is a very
special case. And it comes at a cost: most banking applications are
blocked for an entire night to be able to safely process transactions.
It's just that banking is the number one sample and teaching schema
for RDBMS, because an RDBMS is very good at it. However, this affects
how people think about data storage, as if ACID were a given
requirement for everything. It isn't. I think this is opening up quite
a bit nowadays in light of the NoSQL movement. See the CAP theorem [1].

Most applications don't need all those strict constraints, e.g. they
can live perfectly well with eventual consistency. The real world is
not perfect either.
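
To make "eventual consistency" a bit more concrete, here is a minimal
Python sketch. It is not Jackrabbit code and not any particular store's
API: the Replica class, its methods, and the integer timestamps are all
hypothetical and chosen only for illustration. Each replica accepts
writes without coordinating with the other, so a reader can briefly see
a stale value; once the replicas exchange their state, with the newer
timestamp winning, they agree again.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical node holding key -> (timestamp, value)."""
    name: str
    store: dict = field(default_factory=dict)

    def write(self, key, value, timestamp):
        # Accept the write locally, without asking any other replica.
        # An entry is replaced only by a strictly newer timestamp.
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]

    def sync_from(self, other):
        # Pull the other replica's entries; the newer one wins
        # (a simple last-write-wins merge, assumed for this sketch).
        for key, (timestamp, value) in other.store.items():
            self.write(key, value, timestamp)


a = Replica("a")
b = Replica("b")
a.write("title", "Draft", timestamp=1)
b.write("title", "Final", timestamp=2)

stale = a.read("title")  # replica a has not yet seen the newer write

a.sync_from(b)
b.sync_from(a)
converged = (a.read("title"), b.read("title"))
```

Between the second write and the sync, a reader of replica a gets the
older value. Whether that window is acceptable is exactly the kind of
question that depends on the application.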

All I am saying is that you have to detach yourself from certain
principles of the RDBMS world if you want to see the big picture.
Whatever you choose in the end depends on the needs of your
application. And there are many applications that don't benefit from
the (strict) RDBMS model; rather, they lose flexibility, performance,
etc.

[1] http://en.wikipedia.org/wiki/CAP_theorem

Regards,
Alex

-- 
Alexander Klimetschek
alexander.klimetschek@day.com
