Mailing-List: contact oak-dev-help@jackrabbit.apache.org; run by ezmlm
Precedence: bulk
Reply-To: oak-dev@jackrabbit.apache.org
Received-SPF: pass (athena.apache.org: domain of julian.reschke@gmx.de
 designates 213.165.64.22 as permitted sender)
Message-ID: <4FB2CA16.5050601@gmx.de>
Date: Tue, 15 May 2012 23:26:46 +0200
From: Julian Reschke <julian.reschke@gmx.de>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64;
 rv:12.0) Gecko/20120428 Thunderbird/12.0.1
MIME-Version: 1.0
To: oak-dev@jackrabbit.apache.org
CC: Jukka Zitting <jukka.zitting@gmail.com>
Subject: Re: index handling
References: <4FB2993C.2030701@gmx.de>
 <CAOFYJNaM+tEw+Uqrk7btOxd=JmmdbPbROqTHq=gygzwjUrBvYQ@mail.gmail.com>
 <4FB2AF53.9000808@gmx.de>
 <CAOFYJNZsX3u0b0W0Q2EP1+pMFQv5Q4EAPpz8AOKoUXmfpTJimA@mail.gmail.com>
In-Reply-To: 
 <CAOFYJNZsX3u0b0W0Q2EP1+pMFQv5Q4EAPpz8AOKoUXmfpTJimA@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

On 2012-05-15 22:51, Jukka Zitting wrote:
> Hi,
>
> On Tue, May 15, 2012 at 9:32 PM, Julian Reschke<julian.reschke@gmx.de>  wrote:
>> On 2012-05-15 20:39, Jukka Zitting wrote:
>>> Couldn't we just pass them as is down to oak-core, where such names
>>> can be rejected as invalid by a commit hook? Or if an exception is
>>> required before save(), I'd rather have such checks explicitly in
>>> NodeImpl&    friends instead of overloading the path mapper with extra
>>> responsibilities.
>>
>> So you want it either higher in the stack or lower but not here? :-)
>
> :-) "here" is also fine if there's a good case for it.
>
> My main worry is that we end up overloading the path and name mappers
> with responsibilities that don't really belong there. See for example

Understood.

> the recent discussion in OAK-89 where I argued that the name mapper
> shouldn't be responsible for throwing exceptions that are relevant or
> interpreted only higher up the stack. Instead the mapper should only
> be responsible for a tightly scoped API that simply returns the mapped
> name or null if such a mapping is not possible. The interpretation of
> what a missing mapping means should be left to the higher level since
> there's at least three different outcomes. Similarly I'm concerned
> about special case rules like an index not being allowed on the last
> element of a path creeping into the path mapper.

OK.

I would argue we shouldn't let it creep into oak-core, because so far 
oak-core doesn't know anything about "[" and "]", and I thought we 
wanted to keep this out of it.

The reason to move it into the PathMapper is because that one currently 
throws away information we need. If it did return a sequence of path 
segments (as we do in SPI) instead of a reconstructed string, the upper 
layer could inspect the parsing result and proceed from there.

So maybe we need to separate parsing from reconstructing, and let those 
parts of the code that need more information inspect the parsing result 
directly?

>> Yes, we can make this work, but it essentially means that the path is parsed
>> twice.
>
> I don't see the duplicate parsing as a problem as the common case can
> be heavily optimized (essentially to a no-op) and the second parsing
> step is much simpler since it doesn't need to worry about name mapping
> anymore. The path resolver is pretty simple, apart from identifier
> resolution it only needs a fraction of the amount of code we already
> have in JcrPathParser. A quick and dirty version of such a resolver is
> included below:
> ...

I'm not sure what kind of layering you are suggesting. If the idea is to 
break the current functionality of the jcr-to-oak-path-mapper into two 
steps, where the first step returns sufficient information for the 
checks we need, that would work for me as well.

Right now we have:

1) parsing into segments

2) mapping the individual parts (prefix mapping, expanded name mapping)

3) reconstruction

in a monolithic call; and to do the validation we need we need to 
inspect the result of steps 1 & 2 before proceeding.

(Note that identifier mapping currently happens above, because it turned 
out we don't need path parsing for that at all).

Best regards, Julian