jackrabbit-dev mailing list archives

From Julian Reschke <julian.resc...@gmx.de>
Subject Re: node naming
Date Wed, 09 Oct 2013 07:41:47 GMT
On 2013-10-08 22:13, Jukka Zitting wrote:
> Hi,
>
> On Tue, Oct 8, 2013 at 11:45 AM, Julian Reschke <julian.reschke@gmx.de> wrote:
>> And these arbitrary keys really require that two different normalization
>> forms remain different?
>
> I'm afraid they probably do.
>
> While it's unlikely for unnormalized data to be used too frequently in
> practice, someone could still easily craft a request or a piece of
> content that could confuse code that doesn't expect the repository to
> do auto-normalization. Another potentially troublesome example is
> in-memory caches and other data structures that use paths as keys and
> could thus be circumvented or potentially polluted with invalid data
> if we relax path semantics. And yet another is the practice of
> avoiding a too flat content hierarchy by distributing content across
> subtrees based on the first few characters of a node name, which could
> lead to lost, misplaced or duplicated content depending on how the
> hierarchy is accessed.
>
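
To make the cache-key and bucketing concerns concrete, here is a minimal Java sketch. It is not Jackrabbit code: the class name, the sample names and the one-character bucketing rule are made up for illustration. It shows that the NFC and NFD spellings of the same visible name are different String keys and can land in different buckets, and that java.text.Normalizer maps both to one form.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not part of Jackrabbit.
class PathKeyDemo {
    // The same visible name, precomposed (NFC) and decomposed (NFD).
    static final String NFC_NAME = "\u00e9t\u00e9";
    static final String NFD_NAME = "e\u0301te\u0301";

    // Normalizes a name or path to NFC so both spellings share one key.
    static String key(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Assumed bucketing rule: distribute by the first char of the name.
    static String bucket(String name) {
        return name.substring(0, 1);
    }

    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>();
        cache.put("/content/" + NFC_NAME, "node-1");

        // A raw lookup with the NFD spelling misses the cached entry.
        System.out.println(cache.containsKey("/content/" + NFD_NAME));      // false
        // Normalizing the key first finds it.
        System.out.println(cache.containsKey(key("/content/" + NFD_NAME))); // true
        // The two spellings also fall into different buckets.
        System.out.println(bucket(NFC_NAME).equals(bucket(NFD_NAME)));      // false
    }
}
```
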
>> The use case is real-world users who mix platforms (Windows, Mac) and
>> browsers (Webkit vs the rest) and end up with two nodes where there should
>> be only one.
>>
>> And no, it would need to be done consistently (file upload through browser,
>> WebDAV access, other HTTP based APIs, etc), and thus would be very hard to
>> do all over the place.
>
> Right, but it still would be doable on that level without potentially
> compromising clients that use JCR directly. Combined with a
> repository-level validation mechanism that rejects non-normalized
> content (or content that after normalization would conflict with
> existing content), we could still catch cases where such higher-level
> processing hasn't been applied and prevent those from causing trouble.
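
The validation idea quoted above could look roughly like the following Java sketch. Everything here is an assumption for illustration: the class and method names are hypothetical, NFC is picked arbitrarily as the required form (the thread does not name one), and sibling names are modelled as a plain collection rather than through the JCR API.

```java
import java.text.Normalizer;
import java.util.Arrays;
import java.util.Collection;

// Hypothetical validator sketch; not an existing Jackrabbit component.
class NameValidator {
    // Assumption: NFC is the form the repository requires.
    static final Normalizer.Form FORM = Normalizer.Form.NFC;

    static boolean isNormalized(String name) {
        return Normalizer.isNormalized(name, FORM);
    }

    // True if some existing sibling is equal to the new name once both
    // are normalized (covers siblings stored before validation existed).
    static boolean conflicts(String name, Collection<String> siblings) {
        String normalized = Normalizer.normalize(name, FORM);
        for (String sibling : siblings) {
            if (Normalizer.normalize(sibling, FORM).equals(normalized)) {
                return true;
            }
        }
        return false;
    }

    // Rejects names that are not normalized or that would collide.
    static void validate(String name, Collection<String> siblings) {
        if (!isNormalized(name)) {
            throw new IllegalArgumentException("name is not normalized");
        }
        if (conflicts(name, siblings)) {
            throw new IllegalArgumentException("name conflicts with a sibling");
        }
    }

    public static void main(String[] args) {
        Collection<String> siblings = Arrays.asList("caf\u00e9");
        validate("tea", siblings); // accepted: normalized, no conflict
        try {
            validate("cafe\u0301", siblings); // NFD spelling of a sibling
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Note that the conflict check scans and normalizes every sibling, so its cost grows with the number of child nodes unless a normalized index is kept.
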

That sounds like you are proposing to do the normalization-on-lookup one
layer above the JCR API. Won't that be extremely expensive?

>> I wonder whether we could make normalization (or lack of it) depend on a mixin type?
>
> Another potential solution might be to make such behavior
> session-specific. An extra session attribute could be used to enable
> auto-normalization just for that session. Clients that expect
> filesystem semantics could use that option, while existing
> database-oriented clients wouldn't have to worry about such things
> (apart from the potential validation errors).

That's an interesting suggestion...
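
As a rough model of the session-specific idea, the following Java sketch uses a hypothetical SketchSession class with an autoNormalize flag. It is not the JCR Session API, and the flag name, the flat name-to-value store and the choice of NFC are all assumptions made for illustration.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a per-session option; not the JCR Session API.
class SketchSession {
    private final Map<String, String> store; // shared "repository": name -> value
    private final boolean autoNormalize;     // assumed per-session attribute

    SketchSession(Map<String, String> store, boolean autoNormalize) {
        this.store = store;
        this.autoNormalize = autoNormalize;
    }

    // Normalizes names to NFC only when this session opted in.
    private String resolve(String name) {
        return autoNormalize
                ? Normalizer.normalize(name, Normalizer.Form.NFC)
                : name;
    }

    String get(String name) {
        return store.get(resolve(name));
    }

    void put(String name, String value) {
        store.put(resolve(name), value);
    }

    public static void main(String[] args) {
        Map<String, String> repository = new HashMap<>();
        SketchSession fileLike = new SketchSession(repository, true);
        SketchSession databaseLike = new SketchSession(repository, false);

        fileLike.put("cafe\u0301", "menu"); // stored under the NFC name

        System.out.println(fileLike.get("cafe\u0301"));     // menu
        System.out.println(databaseLike.get("cafe\u0301")); // null
        System.out.println(databaseLike.get("caf\u00e9"));  // menu
    }
}
```

In this model the database-style session keeps exact-match semantics, while the filesystem-style session sees both spellings as one name.
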


