jackrabbit-users mailing list archives

From "Mark Waschkowski" <mwaschkow...@gmail.com>
Subject Re: 3.1.3.1 Removing Items
Date Wed, 15 Aug 2007 16:17:48 GMT
> I'm sure Florent knows what Ivan (and you?) want.


I'm not. Why don't you let Florent respond to my comments hmmm?


> > As in SQL, you can't NOT have a column there, its defined ahead of time. The
> > column *always* exists, regardless of the value stored in the column for a
> > particular row.
>
> Exactly. There is no way to remove a column from just some rows in SQL
> (and not others in the same table). That's why SQL supports NULL
> (meaning 'unknown' or 'missing information' and 'inapplicable
> information'). In JCR, you don't need NULL as you actually can remove
> a property from a node.
>

Yes, you don't need null in JCR, but that is also the point. In Java, null
is supported, and JCR is not a filesystem etc.; it's a Java API. With
databases, the Java database API (JDBC) supports both setting nulls in and
retrieving nulls from a database, so I believe that the JCR API not
having support for a similar semantic is not ideal. It can be worked
around, but it's still not ideal. I really don't care how it's handled
internally (à la Roy) because there are many different ways of implementing
it; it's the standard API that I'm concerned with.

Our company is using JCR in a manner more like a traditional database, and
that's the perspective I'm coming from. Comparing the two persistence
mechanisms is something we do regularly. Keeping the property (rather than
removing it when its value is set to null) is actually closer to a SQL system
IMO. In SQL, a row is defined both by the data it contains and by the columns
it's associated with, and all columns *always* exist. JCR is, of course, quite
different, and in some ways less efficient: it defines the property names
(columns) for every node (record) in the system. In an SQL system, the
definition is done once, and each row relates to the original definition. In
JCR, many contacts have to redundantly repeat the definition information
for each node! Obviously, this is because a node (not considering node
types for now) is unstructured and could potentially contain any number of
properties. But, and here is the but, the definition of a record in a table,
or the properties of a node in the repository, are critical pieces of
information.
   i.e. in certain cases, the metadata about a record or node is as
important as the data itself!

The unfortunate part, from my point of view, is that some metadata about
the data is lost when a property is removed by JCR because its value is set
to null. I do have situations where I need that property (regardless of the
value), and we have had to implement various workarounds to handle this
(see below). Compare this to SQL: if you set the value of a column in a row
to null, the value is null, but you don't lose the column. More importantly,
this is NOT at all similar to the way Java itself operates. If you set the
value of a property of an object to null, then the property is null. You
don't lose that property just because it's null. The Hibernate API and JPA
are both Java object-based, so setting a null doesn't delete the property; it
does what you would expect and sets the value to null. How it actually
gets persisted isn't that interesting, as the API will handle it.
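
To make the contrast above concrete, here is a minimal sketch in plain Java. It does not use the javax.jcr API; `RemovingNode` and `KeepingNode` are hypothetical stand-ins, backed by a map, that differ only in what "set to null" means. The first drops the property (the behaviour being discussed in this thread), the second keeps the name and stores a null value (as a SQL column or a Java field would).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, NOT the javax.jcr API: two tiny property stores that
// differ only in how they treat a null value.
class NullSemanticsSketch {

    // Setting a property to null removes the property entirely.
    static class RemovingNode {
        private final Map<String, String> props = new HashMap<>();

        void setProperty(String name, String value) {
            if (value == null) {
                props.remove(name);      // the property name is forgotten
            } else {
                props.put(name, value);
            }
        }

        boolean hasProperty(String name) {
            return props.containsKey(name);
        }
    }

    // Null is just another value; the property name survives.
    static class KeepingNode {
        private final Map<String, String> props = new HashMap<>();

        void setProperty(String name, String value) {
            props.put(name, value);      // HashMap permits null values
        }

        boolean hasProperty(String name) {
            return props.containsKey(name);
        }
    }

    public static void main(String[] args) {
        RemovingNode removing = new RemovingNode();
        removing.setProperty("phone", "555-0100");
        removing.setProperty("phone", null);

        KeepingNode keeping = new KeepingNode();
        keeping.setProperty("phone", "555-0100");
        keeping.setProperty("phone", null);

        System.out.println("removing has phone: " + removing.hasProperty("phone"));
        System.out.println("keeping has phone:  " + keeping.hasProperty("phone"));
    }
}
```

After the null assignment, the removing store no longer knows that "phone" was ever a property, while the keeping store still reports it. That lost name is the metadata the paragraph above is concerned with.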

Workaround - the main workaround that we have done is to store the structure
of a node separately. Then, if we need to, we can map a node back to the
structure, determine all the possible attributes of that particular
node, and go forward from there. Please note, using node types to
specify the structure ahead of time is NOT an option.
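
A rough sketch of that kind of workaround, again in plain Java rather than against the JCR API. Everything here is an assumption made for illustration: the `StructureRegistrySketch` class, its method names, and the idea of keying structures by a name are hypothetical, and the actual storage used is not described in this message.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: the set of possible property names is kept apart from
// the node, so a property that was removed can still be reported as null.
class StructureRegistrySketch {

    // structure name -> every property name a node of that structure may carry
    private final Map<String, Set<String>> structures = new HashMap<>();

    void define(String structureName, String... propertyNames) {
        structures.put(structureName,
                new LinkedHashSet<>(Arrays.asList(propertyNames)));
    }

    // Rebuilds the full view of a node: names missing from nodeProps are
    // listed with a null value instead of being silently absent.
    Map<String, String> fullView(String structureName, Map<String, String> nodeProps) {
        Map<String, String> view = new LinkedHashMap<>();
        Set<String> names = structures.getOrDefault(structureName, Collections.emptySet());
        for (String name : names) {
            view.put(name, nodeProps.get(name));
        }
        return view;
    }

    public static void main(String[] args) {
        StructureRegistrySketch registry = new StructureRegistrySketch();
        registry.define("contact", "name", "phone", "email");

        // Stand-in for a node whose phone and email properties were removed.
        Map<String, String> node = new HashMap<>();
        node.put("name", "Ann");

        System.out.println(registry.fullView("contact", node));
    }
}
```

The cost is that the structure has to be kept in sync with the nodes by the application, which is the extra work the workaround implies.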

I really find it ironic that the repo doesn't keep all the properties,
because each node is like its own little bundle of data, which includes meta
information (the node name), data (the node value) and type information (the
node type). By having a property be removed when it's set to null, that
little bundle of data loses critical information about what exactly defines
it and, in our case, requires workarounds that shouldn't be necessary.

Best,

Mark Waschkowski
