jackrabbit-users mailing list archives

From Chetan Mehrotra <chetan.mehro...@gmail.com>
Subject Re: How to get total nodes in Oak repo
Date Wed, 13 Aug 2014 15:18:16 GMT
Hi Andrew,

Your script would work.

----
total_count = 1
countNodes = { n ->
    n.getChildNodeNames()?.each {
        def child = n.getChildNode(it);
        total_count += 1;
        //println it
        countNodes(child)
    }

}

countNodes(session.workingNode)
println "Total nodes in tree ${session.workingPath}: ${total_count}"
----

However, as noted in the javadoc for getChildNodeEntries, it is more
performant to use getChildNodeEntries than getChildNodeNames followed by
getChildNode, i.e. O(n) vs. O(n log n). So the following script [1],
although a bit more complex, might perform better:

----
import com.google.common.base.Function
import com.google.common.collect.TreeTraverser
import org.apache.jackrabbit.oak.spi.state.NodeState

import static com.google.common.collect.Iterables.transform

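// Returns the number of nodes in the subtree rooted at ns,
// including ns itself.
def getChildCount(NodeState ns){
    // children(node) = the node states of its child node entries
    def traverser = { ns2 ->
        transform(ns2.childNodeEntries, { cne -> cne.nodeState } as Function)
    } as TreeTraverser
    return traverser.preOrderTraversal(ns).size()
}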
----

From within the oak-run console, run the following:

----
Apache Jackrabbit Oak 1.1-SNAPSHOT
Jackrabbit Oak Shell (Apache Jackrabbit Oak 1.1-SNAPSHOT, JVM: 1.7.0_55)
Type ':help' or ':h' for help.
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
/> :load https://gist.githubusercontent.com/chetanmeh/2138c188a1bcc135eeb3/raw/getChildCount.groovy
/> getChildCount(session.workingNode)
----
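
For illustration only, the difference between the two access patterns
can be sketched outside Oak. The Java snippet below does not use the Oak
API: Node is a made-up stand-in whose children live in a sorted map, so
fetching a child by name costs a lookup, while iterating the entries
hands over each child directly. NodeCountSketch, Node, child and
sampleTree are all hypothetical names chosen for this sketch. It also
uses an explicit stack instead of recursion, which is an assumption of
mine and not something the scripts above do.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.TreeMap;

final class NodeCountSketch {

    /** Hypothetical stand-in for a node state; NOT the Oak NodeState API. */
    static final class Node {
        final Map<String, Node> children = new TreeMap<String, Node>();

        /** Returns the named child, creating it if it does not exist yet. */
        Node child(String name) {
            Node c = children.get(name);
            if (c == null) {
                c = new Node();
                children.put(name, c);
            }
            return c;
        }
    }

    /** Names-then-lookup pattern: one extra map lookup per child name. */
    static long countViaNames(Node root) {
        long total = 0;
        Deque<Node> stack = new ArrayDeque<Node>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            total++;
            for (String name : n.children.keySet()) {
                stack.push(n.children.get(name)); // O(log n) lookup in a TreeMap
            }
        }
        return total;
    }

    /** Entries pattern: each entry already carries the child, no lookup. */
    static long countViaEntries(Node root) {
        long total = 0;
        Deque<Node> stack = new ArrayDeque<Node>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            total++;
            for (Map.Entry<String, Node> e : n.children.entrySet()) {
                stack.push(e.getValue());
            }
        }
        return total;
    }

    /** Builds a tiny tree: root, content, content/page, oak:index, oak:index/uuid. */
    static Node sampleTree() {
        Node root = new Node();
        root.child("content").child("page");
        root.child("oak:index").child("uuid");
        return root;
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println(countViaNames(root));   // prints 5
        System.out.println(countViaEntries(root)); // prints 5
    }
}
```

Both methods count the root itself, like the two scripts above, so they
return the same total; only the per-child access cost differs.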

Chetan Mehrotra
[1] https://gist.github.com/chetanmeh/2138c188a1bcc135eeb3


On Wed, Aug 13, 2014 at 8:06 PM, Andrew Khoury <akhoury@adobe.com> wrote:
> Thanks Chetan, this really helped.
>
> This was for a tar based deployment.  I want to count the total nodes
> including hidden ones under /oak:index branch and all.  So I wrote an
> oak-run groovy console script that counts all nodes under the current
> working node:
> https://gist.github.com/andrewmkhoury/c5588a6a4b57e7e0e593
>
>
> Please let me know if you see any issues with this.
> -Andrew
>
> On 8/8/14, 4:39 PM, "Andrew Khoury" <akhoury@adobe.com> wrote:
>
>>Hi Chetan,
>>How about for TarMK?  What is the quickest way to calculate total nodes?
>>Thanks,
>>Andrew
>>
>>On 8/7/14, 10:37 PM, "Chetan Mehrotra" <chetan.mehrotra@gmail.com> wrote:
>>
>>>At the JCR level, traversal is the only option. For a Mongo-based
>>>deployment you can get a rough estimate via the db.nodes.stats() command.
>>>
>>>- count - This property provides an estimate of the number of nodes
>>>- It also includes the nodes which store the index data. Note that
>>>these indexes are Oak indexes and are different from Mongo indexes
>>>- It also includes nodes which are marked deleted but not yet garbage
>>>collected
>>>
>>>$ mongo <server>:<port>/<db>
>>>$ db.nodes.stats()
>>>{
>>>        "ns" : "aem-author.nodes",
>>>        "count" : 593688,
>>>        "size" : 453287536,
>>>        "avgObjSize" : 763,
>>>        "storageSize" : 629633024,
>>>        "numExtents" : 16,
>>>        "nindexes" : 5,
>>>        "lastExtentSize" : 168742912,
>>>        "paddingFactor" : 1,
>>>        "systemFlags" : 0,
>>>        "userFlags" : 1,
>>>        "totalIndexSize" : 102437104,
>>>        "indexSizes" : {
>>>                "_id_" : 86902704,
>>>                "_modified_-1" : 15027488,
>>>                "_bin_1" : 449680,
>>>                "_deletedOnce_1" : 24528,
>>>                "_sdType_1" : 32704
>>>        },
>>>        "ok" : 1
>>>}
>>>Chetan Mehrotra
>>>
>>>
>>>On Fri, Aug 8, 2014 at 1:15 AM, Andrew Khoury <akhoury@adobe.com> wrote:
>>>> Hi,
>>>> What is the quickest and most efficient way to get the total number of
>>>>nodes in an Oak repository?  Is there a built in way or do I need to do
>>>>a full traversal or query?
>>>> Thanks,
>>>> Andrew Khoury
>>
>
