jackrabbit-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Thomas Mueller <muel...@adobe.com>
Subject Re: About the missing COUNT() feature in jackrabbit
Date Wed, 04 Apr 2012 09:39:18 GMT

It's true that COUNT() queries are slow in Jackrabbit. Databases have the
same problem: if you have no index (and in many cases even if you have
one), the database internally still needs to read all rows that match, at
least in a lot of cases.


On 4/4/12 11:08 AM, "Bart van der Schans" <b.vanderschans@onehippo.com>

>The problem with providing a COUNT feature is authorization. The get
>to a correct count you need to fetch all the nodes from the PM and
>authorize them one-by-one. This works fine for a few hundred nodes,
>but will take forever when you scale up.For example "SELECT COUNT(*)
>FROM nt:base" would directly translate to fetch every single bundle
>from the persistence layer (database) and do that with a single query
>for every node and then authorize each of those nodes.
>The only way I can see a well performing count feature is when the
>authorization tokens are indexed as well, aka running authorized
>queries. For JR 2 this is probably not an option.
>IMO providing a count feature in the query language would give the
>wrong impression to developers. It would look like something you can
>"just use", but in fact it will not work once you start putting
>content in your production repository.
>On Wed, Apr 4, 2012 at 8:35 AM, Christian Stocker
><christian.stocker@liip.ch> wrote:
>> Hi
>> Any opinions on this? I'm mainly wondering if there's a chance that
>> something like this goes into jackrabbit or if there's opposition to
>> that. Not the actual implementation, just the idea of it. I'll work more
>> on it, if you think that could be something for jackrabbit-jcr-server
>> and propose something more decent.
>> If not, I'll pursue other challenges ;)
>> Greetings
>> chregu
>> On 31.03.12 08:18, Christian Stocker wrote:
>>> Hi
>>> As we all know, there's no COUNT feature in SQL2. This is not that bad,
>>> if you're not trying to get a count via remoting, since you can get it
>>> with .getSize() locally.  But if you're working via the webdav layer,
>>> it's not really efficient right now, especially if your resultsets are
>>> larger.
>>> There are in my opinion 2 opinions to solve this.
>>> 1) add a count() feature to the SQL2 parser, there's already a ticket
>>> for that (among other things)
>>> https://issues.apache.org/jira/browse/JCR-2605
>>> That of course would be a great solutions, it solves the problem for
>>> cases who need that
>>> 2) add an option to the SEARCH webdav interface to get the count if
>>> results instead of the whole result set.
>>> For 2) I made a little prototype (which is really just that ;)) to see
>>> if that would be feasible. That patch can be found here
>>> https://github.com/chregu/jackrabbit/compare/count-in-webdav
>>> It adds a D:count option to a SEARCH query like this
>>> <D:searchrequest xmlns:D="DAV:">
>>>       <JCR-SQL2>
>>>               <![CDATA[SELECT * FROM [nt:unstructured] ]]>
>>>       </JCR-SQL2>
>>>       <D:count>1</D:count>
>>> </D:searchrequest>
>>> and if that is set it returns only the count of results. Currently like
>>> below, but that format has certainly to be changed (it was just the
>>> shortest way to get something back):
>>> <D:multistatus xmlns:D="DAV:">
>>>       <D:response>
>>>               <D:href>#count</D:href>
>>>               <D:responsedescription>433</D:responsedescription>
>>>       </D:response>
>>> </D:multistatus>
>>> My questions:
>>> - Is there any chance that something like this goes into jackrabbit and
>>> we can build on that feature?
>>> - Or is there a better way to just get the count of a query?
>>> - How should the response look like for something like this? The above
>>> is certainly not right ;)
>>> - And last but not least, are there any plans to implement COUNT() some
>>> time in the future directly in SQL2?
>>> Thanks for any feedback.
>>> chregu
>> --
>> Liip AG  //  Feldstrasse 133 //  CH-8004 Zurich
>> Tel +41 43 500 39 81 // Mobile +41 76 561 88 60
>> www.liip.ch // blog.liip.ch // GnuPG 0x0748D5FE
>Amsterdam - Oosteinde 11, 1017 WT Amsterdam
>Boston - 1 Broadway, Cambridge, MA 02142
>US +1 877 414 4776 (toll free)
>Europe +31(0)20 522 4466

View raw message