jackrabbit-users mailing list archives

From "Alessandro Bologna" <alessandro.bolo...@gmail.com>
Subject Re: Performance as compared to simple sql db query is quite bad
Date Fri, 01 Feb 2008 08:42:01 GMT
Sateesh,

just my two cents here.

I think that Jackrabbit is not really meant to be compared in this fashion.
What you are benchmarking is a sequential scan of a table, and that is
really not what Jackrabbit is meant to be doing. For this task, I wonder
whether a CSV file scan would outperform MySQL, and honestly I believe it
would, but that would not be a reason to say that CSV is "faster" than an
RDBMS. An RDBMS will outperform any homegrown CSV-based data structure when
it comes to performing complex relational queries, and will make your life as
a programmer simpler. In a similar way, Jackrabbit could (or... eventually
could) outperform an RDBMS when you are using it for complex content
searches on unstructured, hierarchical repositories. And certainly, it will
make your life simpler as a programmer.

It's a tradeoff between the overhead that is introduced and the flexibility
that you gain. I am not really surprised that creating all the transient
storage, node structures, and persistence manager abstractions would affect
the performance of sequential access.

With all that said, your XPath query does not match the structure that you
described: "//data/componentIds/*/*" vs. /content/data/Ids/type1/*. Which one
is it really? And are you sure that you need a query in this case? You can
navigate the nodes directly with node.getNodes() and avoid at least one
level of overhead.
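
To make the direct-navigation idea concrete, here is a minimal sketch. It
assumes the /content/data/Ids/typeN layout from your message. Note that
TreeNode below is a hypothetical in-memory stand-in so that the example runs
without a repository; it is not the javax.jcr API. Its method names only
mirror the real calls you would use against a session: Session.getRootNode(),
Node.getNode(relPath) and Node.getNodes().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a content tree; NOT the javax.jcr API.
class DirectNavigation {

    /** Minimal tree node; method names mirror javax.jcr.Node for readability only. */
    static final class TreeNode {
        final String name;
        final Map<String, TreeNode> children = new LinkedHashMap<>();

        TreeNode(String name) {
            this.name = name;
        }

        /** Returns the child with this name, creating it if it does not exist yet. */
        TreeNode addNode(String childName) {
            return children.computeIfAbsent(childName, TreeNode::new);
        }

        /** Resolves a relative path such as "content/data/Ids/type1". */
        TreeNode getNode(String relPath) {
            TreeNode current = this;
            for (String segment : relPath.split("/")) {
                current = current.children.get(segment);
                if (current == null) {
                    throw new IllegalArgumentException("No such node: " + relPath);
                }
            }
            return current;
        }

        /** Returns the direct children in insertion order. */
        List<TreeNode> getNodes() {
            return new ArrayList<>(children.values());
        }
    }

    /** Direct navigation: resolve one known parent and list only its children. */
    static List<String> listIds(TreeNode root, String typePath) {
        List<String> ids = new ArrayList<>();
        for (TreeNode id : root.getNode(typePath).getNodes()) {
            ids.add(id.name);
        }
        return ids;
    }

    /** Builds content/data/Ids/type1..typeN with idsPerType children under each type. */
    static TreeNode buildSample(int types, int idsPerType) {
        TreeNode root = new TreeNode("");
        TreeNode ids = root.addNode("content").addNode("data").addNode("Ids");
        for (int t = 1; t <= types; t++) {
            TreeNode type = ids.addNode("type" + t);
            for (int i = 1; i <= idsPerType; i++) {
                type.addNode("id" + t + "-" + i);
            }
        }
        return root;
    }

    public static void main(String[] args) {
        TreeNode root = buildSample(5, 3);
        System.out.println(listIds(root, "content/data/Ids/type1"));
    }
}
```

The point of the sketch is only that the lookup touches one known parent and
its children, with no query parsing or index lookup involved. Whether that is
actually faster in your setup is something you would have to measure.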

Alessandro





On Jan 31, 2008 7:13 PM, zevon <sateeshl@expedia.com> wrote:

>
> The project I am looking into needs to store friendly ids to a file, and
> there could be multiple types of file. I want to show a result that lists
> all the ids. Below are the JR structure, the SQL table, and their
> performance. As for configuration, I am using MSSqlPersistenceManager and
> a DataStore for files. Otherwise the rest of the config is the defaults.
>
> JR structure:
>
> /content/data/Ids/type1/*
> /content/data/Ids/type2/*
> /content/data/Ids/type3/*
> /content/data/Ids/type4/*
> /content/data/Ids/type5/*
>
> All the ids are equally distributed under the specific type.
>
> SQL structure:
>
> A table with columns: Name, Type
>
> Performance numbers in ms (items are spread equally among types), when
> using the query below. As the numbers show, it's pretty bad. Is this
> expected? Any way to better this?
>
> Items    JR (ms)    SQL (ms)
> 150      551        15
> 1000     2969       78
> 2000     6470       94
> 4000     16816      94
> 8000     58966      125
>
>     Workspace workSpace = session.getWorkspace();
>     QueryManager queryManager = workSpace.getQueryManager();
>
>     StringBuffer queryStr = new StringBuffer("//data/componentIds/*/*");
>     Query query = queryManager.createQuery(queryStr.toString(), Query.XPATH);
>
>     long begin = System.currentTimeMillis();
>     QueryResult queryResult = query.execute();
>     int iSize = 0;
>     NodeIterator queryResultNodeIterator = queryResult.getNodes();
>     while (queryResultNodeIterator.hasNext()) {
>         Node componentIdNode = queryResultNodeIterator.nextNode();
>         iSize++;
>         // System.out.println(componentIdNode.getName());
>     }
>     long end = System.currentTimeMillis();
>     System.out.println("**** time for: " + iSize + " : " + (end - begin));
>
> For SQL, it's a simple JDBC call:
>
>     long begin = System.currentTimeMillis();
>
>     Statement stmt = con.createStatement();
>     ResultSet rs = stmt.executeQuery("SELECT * FROM ArtifactFriendlyName");
>     int iSize = 0;
>     while (rs.next()) {
>         iSize++;
>         String s = rs.getString("FriendlyName");
>     }
>
>     long end = System.currentTimeMillis();
>
>     System.out.println("Time taken for: " + iSize + " : " + (end - begin));
>
>
> Thanks,
> Sateesh.
>
> --
> View this message in context:
> http://www.nabble.com/Performance-as-compared-to-simple-sql-db-query-is-quite-bad-tp15218031p15218031.html
>
>
