jackrabbit-users mailing list archives

From Joseph Ottinger <dreamr...@gmail.com>
Subject Re: A generic question about JackRabbit
Date Fri, 26 Mar 2010 15:27:00 GMT
On Fri, Mar 26, 2010 at 11:07 AM, Rami Ojares <rami.ojares@gmail.com> wrote:

> On 26.3.2010 16:46, Joseph Ottinger wrote:
>> It was E.F.Codd, for one thing, and not Ted. :)
> Ted was his nickname.
> I was trying to be casual :-)
> (But I see you realized this while I was writing this email)

I don't know him casually (or formally, so...) My mistake. :)

>> But... relational models work very well for *reporting* but not so much for
>> other things. They're kinda slow, for example. Hierarchical and set-related
>> models (document-based, ODBMS, and CODASYL dbs, respectively) are much, much
>> faster for some operations, and slower for others.
> Yes the speed issue.
> It is true that if your data is structured hierarchically and you access it
> in your application then you can create
> the fastest application by storing it in the fashion that is good for the
> first application.
> It just so happens that other applications come along that want to use the
> same data also (to avoid redundancy).
> And for them the hierarchical model might prove to be a nightmare.

Sure, it might. That's a big word to slip in there; I *might* get hit by a
bus tomorrow, I *might* win the lottery, I *might* grow a third eye. But
it's academic until I actually have those things happen.

> The key idea of relational model is that it is a generic way to model any
> kind of data.
> New data structures can be derived from the relations the data has.
> Ad hoc data models do not have this property.
> But of course are completely valid if only one application uses them.

No reason you can't combine the two models, either. That's where the NoSQL
camp gets things wrong, often; they forget that you can use an
RDBMS for the things for which it's appropriate, and a NoSQL datastore for
the things for which IT is appropriate.
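
As a rough illustration of combining the two (a sketch, not something posted in this thread): the Python below keeps reporting-style data in a relational table via the standard sqlite3 module and keeps hierarchical content in a nested dict standing in for a document store. The table, column, and function names are all hypothetical.

```python
import sqlite3

# Relational side: flat, constrained rows that are easy to aggregate.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_views ("
    " section TEXT NOT NULL,"
    " page TEXT NOT NULL,"
    " views INTEGER NOT NULL CHECK (views >= 0))"
)
conn.executemany(
    "INSERT INTO page_views (section, page, views) VALUES (?, ?, ?)",
    [("docs", "intro", 120), ("docs", "setup", 45), ("blog", "hello", 30)],
)

# Hierarchical side: a nested dict standing in for a content tree.
content = {
    "docs": {"intro": {"title": "Introduction"}, "setup": {"title": "Setup"}},
    "blog": {"hello": {"title": "Hello"}},
}


def get_node(tree, path):
    """Walk the content tree one path segment at a time."""
    node = tree
    for segment in path.strip("/").split("/"):
        node = node[segment]
    return node


def views_by_section(connection):
    """Report total views per section using SQL aggregation."""
    rows = connection.execute(
        "SELECT section, SUM(views) FROM page_views"
        " GROUP BY section ORDER BY section"
    ).fetchall()
    return dict(rows)
```

Here the tree lookup follows the shape of the content, while the report is a one-line GROUP BY; each store handles the job it suits.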

>> Relational models excel in reporting and warehousing. The rest... meh,
>> they're kinda slow on updates and set relationships. We're just so used to
>> the lack of speed that we think it's the norm.
> The lack of speed comes out of users wish to have constraints on the data
> model.
> If you allow any kind of data into your relations then I don't see why
> relational model would be slow.
> The only reason would be bad implementations.

Sure, of course constraints are part of the reason you use an RDBMS in the
first place...

> Further relational model is a MODEL not implementation.
> So implementation can be anything as long as it produces the relational
> model.

Sure. How many literal implementations have there been?

>> However, if you're doing a lot of updates or your set modifications are
>> common, you'll typically find the other models faster, and - depending on
>> your taste - easier.
> I would like Thomas Mueller (developer of H2) to comment on this.
> Is the speed somehow related to the relational model?
> As far as I understand the speed suffers because of ACID requirements, not
> so much because of the relational model.
> And the poor usability comes from standardization (SQL).

The poor usability comes from how people typically design relational
schemas, and yes, ACID factors in. But if you're willing to explain away the
flaws by saying "the flaws are there only if you use the things that expose
the flaws," you're not saying much of anything at all.

For most people, because of how they use RDBMSes, the databases are slower
than they could be. That's because they use various normalization levels.
That's because they need transactions that work. They use SQL.

That's the norm. So that's what defines how well an RDBMS works for most
people. Certainly you can use alternatives that work better based on a
specific problem domain or an alternative solution from the hand of a master
DBA for the specific database you use... but that's *not* the norm.

> Standards are good because of compatibility but they rarely if ever produce
> the optimal user experience.

Sure... but when generalizing you pretty much have to go with the standards.

>> Certainly transaction processing is easier with nonrelational.
> Why?

Because the locking is (usually) more limited to only those records
affected, and relationship-checking is usually scoped better.
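
To make "locking limited to only those records affected" concrete, here is a minimal Python sketch of a store that takes one lock per record key, so updates to different records do not wait on each other. It only illustrates the idea; it does not describe how any particular database, relational or otherwise, implements locking, and the class and method names are hypothetical.

```python
import threading
from collections import defaultdict


class RecordStore:
    """Toy key/value store with one lock per record (hypothetical example)."""

    def __init__(self):
        self._records = {}
        self._locks = defaultdict(threading.Lock)
        # Guards creation and lookup of the per-record locks themselves.
        self._locks_guard = threading.Lock()

    def _lock_for(self, key):
        with self._locks_guard:
            return self._locks[key]

    def update(self, key, fn, default=0):
        # Only writers touching the same key contend for this lock.
        with self._lock_for(key):
            self._records[key] = fn(self._records.get(key, default))

    def get(self, key):
        return self._records.get(key)
```

Two threads updating different keys each hold a different lock, whereas a single store-wide lock would serialize them.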

>> That's not to say that relational databases are *bad* -- they're just very
>> good at structured data, used for reporting, which is what they were used
>> for most often. Now they're seen as part of the background noise of data
>> storage, so everyone converts their problem into something the relational
>> database can use - as opposed to finding the solution that fits the problem
>> well.
> I think people use the relational model because projects who produce data
> that is never reused in other applications are rare.
> Also I don't see non-relational models as "bad" :-)

*nod* If your data is going to be consumed by a wide variety of applications
in a general form, a general-purpose data structure is good. Of course, REST
was designed to help normalize this... but REST wouldn't be grand for
generating grist for the reporting mill, either.

>> We've decided to let our tools dictate our solutions, instead of picking our
>> solutions to fit our problems.
> Tools no policy.
> Well I am all for this one.


Joseph B. Ottinger
