jackrabbit-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alexander Klimetschek <aklim...@adobe.com>
Subject Re: [jr3] Orderable child nodes: required (to be the default)?
Date Mon, 23 Jan 2012 09:29:26 GMT
I think there is a use case for large flat unordered lists (using a node
type with orderable = false). For example, dictionaries, where the key and
value are in the properties and the node name is not really relevant, and
where you don't want to create nested structures just for the sake of
scalability - which also increases the total number of nodes.

Cheers,
Alex

On 21.01.12 23:37, "Tobias Bocanegra" <tripod@adobe.com> wrote:

>I agree,
>so for large childnode lists, a stable but uncontrollable order would
>be ok, although violating the spec?
>
>--
>
>toby
>On Sat, Jan 21, 2012 at 5:04 AM, Stefan Guggisberg
><stefan.guggisberg@gmail.com> wrote:
>> On Sat, Jan 21, 2012 at 3:00 AM, Tobias Bocanegra <tripod@adobe.com>
>>wrote:
>>> nt:unstructured has orderable childnodes, and we also preach to use
>>> nt:unstructured whereever you can. Also, i assume that a lot of
>>> applications benefit from node ordering (like a CMS). So I think that
>>> the majority of the repositories have a lot of nodes with ordered
>>> childnodes.
>>
>>
>> agreed, at least for small child node lists (e.g. < 1k child nodes).
>> but i doubt there's a need for large (e.g. 1M entries)  orderable
>> child node lists.
>>
>>> thus i would not put too much effort in having an
>>> efficient non-ordered format, but making the ordered storage as fast
>>> as possible.
>>
>> i am thinking of large child node collections with stable order.
>> however, they wouldn't be insertion-ordered nor client-orderable.
>> that's the compromise.
>>
>>> using lexicographical ordering for unordered child nodes
>>> would certainly be a good option,
>>
>> that would be an implementation detail. as long as the order
>> is stable (repeated reads of the same node revision should
>> provide the same order of the child nodes) we shouldn't make
>> any promises about how the order is defined.
>>
>> cheers
>> stefan
>>
>>> [...] as i can reduce the lookup (but not
>>> insert). also, i think that re-orderings happen relatively rarely.
>>>
>>> regards, toby
>>>
>>> On Fri, Jan 20, 2012 at 9:13 AM, Stefan Guggisberg
>>> <stefan.guggisberg@gmail.com> wrote:
>>>> On Fri, Jan 20, 2012 at 5:43 PM, Jukka Zitting
>>>><jukka.zitting@gmail.com> wrote:
>>>>> Hi,
>>>>>
>>>>> Thinking about this more generally, it would definitely be useful to
>>>>> be able to use a more efficient underlying storage for unorderable
>>>>> nodes. However, there still are hard use cases that require us to
>>>>> support orderable nodes as indicated by the type of the parent node.
>>>>> Thus the implementation basically has two options:
>>>>>
>>>>> 1) Use a single data structure for all nodes, like Jackrabbit
>>>>> currently does. This simplifies the implementation but prevents us
>>>>> from enjoying the performance and scalability benefits of unorderable
>>>>> nodes.
>>>>>
>>>>> 2) Use two data structures, one for orderable and another for
>>>>> unorderable nodes. This is more complex (not least because it links
>>>>> node types to the underlying storage model), but is probably required
>>>>> if we want to efficiently support up to millions of child nodes per a
>>>>> single parent.
>>>>
>>>> i agree that it's probably required for efficiently handling both
>>>>large
>>>> and small child node lists.
>>>>
>>>> i am not sure that it necessarily means linking node types to the
>>>> underlying storage modal. the microkernel prototype currently has
>>>> no notion of node types and that's IMO a good thing.
>>>>
>>>> node types have a way to dominant role in the current jackrabbit
>>>> core implementation. jackrabbit started out as the official reference
>>>> implementation for jsr-170, the focus was on supporting every
>>>> feature of the spec.
>>>>
>>>> in jr3 i guess we can and should trade some 'note type correctness'
>>>> for improved efficiency.
>>>>
>>>>>
>>>>> It might also be possible to have the underlying storage model
>>>>> unorderable, and just include extra ordering metadata at a higher
>>>>> layer where also node types are handled. Option 2b, if you like, with
>>>>> probably fairly significant overhead when iterating over nodes (the
>>>>> implementation would probably need to pre-fetch all child nodes in
>>>>> advance to access their ordering metadata).
>>>>>
>>>>> If we do have native support for unorderable nodes, then I wouldn't
>>>>> promise any particular ordering (alphabetic, insertion order, etc.)
>>>>> but rather leave it undefined like in Java's Map interface. The
>>>>> underlying implementation is then free to use whatever ordering it
>>>>> thinks is best.
>>>>
>>>> i agree.
>>>>
>>>> cheers
>>>> stefan
>>>>
>>>>>
>>>>> PS. Note that ordering affects not just node traversal, but also the
>>>>> document ordering used in search results. Though in practice we
>>>>> already now disable document ordering of search results by default.
>>>>>
>>>>> BR,
>>>>>
>>>>> Jukka Zitting



-- 
Alexander Klimetschek
Developer // Adobe (Day) // Berlin - Basel





Mime
View raw message