oodt-dev mailing list archives

From Cameron Goodale <good...@apache.org>
Subject Re: Query Tool Bugs?
Date Fri, 11 May 2012 20:31:15 GMT
Mike,

As my Lucene Catalog has started to explode in volume (it now contains
over 290,000 docs), I have also needed to query the catalog in a more
complicated manner.  (I am going to assume you are using the Lucene
Catalog; if not, read no further.)

If you use the Luke Tool from Lucene[1] you can query the index in almost
any manner you can imagine.  This tool is also extremely fast since it was
designed to connect directly to the index.  I used the lukeall-1.0.1.jar
and it worked great. Hope this helps.

Cameron

[1] - http://bit.ly/JjRzu5
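
A note on the date/time splitting discussed in the quoted thread below. The
following is only a minimal sketch in plain Python, not OODT code: the helper
names (parse, split_filter, combined_filter) and the second record are
hypothetical, and it assumes the catalog compares the values as strings.
It shows two things: fixed-width ISO 8601 strings order the same way as the
timestamps they represent, and two separate conditions on date and time joined
with AND do not select the same records as one comparison on the combined
timestamp.

```python
from datetime import datetime, timezone


def parse(date_str, time_str):
    """Combine split date and time fields into one timezone-aware datetime."""
    stamp = datetime.strptime(f"{date_str}T{time_str}", "%Y-%m-%dT%H:%M:%S.%fZ")
    return stamp.replace(tzinfo=timezone.utc)


def split_filter(date_str, time_str, min_date, min_time):
    """Mimic 'Date > x AND Time > y' using plain string comparison."""
    return date_str > min_date and time_str > min_time


def combined_filter(date_str, time_str, min_date, min_time):
    """Compare the combined timestamp against the combined threshold."""
    return parse(date_str, time_str) > parse(min_date, min_time)


MIN_DATE, MIN_TIME = "2007-01-01", "12:00:00.000Z"

records = [
    ("2007-03-01", "23:30:25.000Z"),  # values taken from the thread's query output
    ("2007-03-01", "08:00:00.000Z"),  # hypothetical record, same day, earlier time
]

for date_str, time_str in records:
    print(
        date_str,
        time_str,
        "split:", split_filter(date_str, time_str, MIN_DATE, MIN_TIME),
        "combined:", combined_filter(date_str, time_str, MIN_DATE, MIN_TIME),
    )
```

The hypothetical morning record is later than the combined threshold, yet the
split form rejects it because its time field alone sorts before the time
threshold. Whether that matters depends on what the query is meant to select.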

On Fri, May 11, 2012 at 11:29 AM, Verma, Rishi (388J) <
Rishi.Verma@jpl.nasa.gov> wrote:

> Hey Mike,
>
> Wanted to add some thoughts:
>
> >I talked with Rishi regarding this and he recommended that the date and
> >time be split when performing a query. The reason is that the query tool
> >blows up when trying to compare datetime values. He mentioned that he
> >tried querying against ISO 8601 date/time values before and it didn't work
> >for him, and the way around it was to split them up. I think behind the
> >scenes, the query tool is actually doing an ASCII comparison, which might
> >be why the tool is having performance issues?
>
>
> I was using the XML-RPC FileManager API directly to issue my queries (see
> [1] for RangeQuery API). The problem wasn't so much syntax errors, but
> performance issues. In other words, queries would hang and then eventually
> fail due to the large number of comparison checks being performed.
>
> +1 on the idea to split up your date for better search performance, but
> from your stacktrace, it looks like you have a syntax issue. Have you been
> able to test comparison queries (i.e., use of '<', '>', etc.) for non-time
> related metadata elements? That might be a good place to start, to see if
> you've got your syntax right.
>
> Thanks!
> Rishi
>
> --
> [1]
> http://svn.apache.org/repos/asf/oodt/tags/0.3/filemgr/src/main/java/org/apache/oodt/cas/filemgr/structs/RangeQueryCriteria.java
>
> On 5/11/12 9:07 AM, "Cayanan, Michael D (388J)"
> <michael.d.cayanan@jpl.nasa.gov> wrote:
>
> >Hi Chris,
> >
> >Comments are below....
> >
> >On 5/10/12 6:22 PM, "Mattmann, Chris A (388J)"
> ><chris.a.mattmann@jpl.nasa.gov> wrote:
> >
> >>Hi Mike,
> >>
> >>On May 10, 2012, at 1:28 PM, Cayanan, Michael D (388J) wrote:
> >>
> >>> Hi All,
> >>>
> >>> I'm having several issues with the Query Tool and wondering if anyone
> >>>has run into these issues before:
> >>>
> >>> First, I'm having an issue when giving the Query Tool a query
> >>>containing multiple conditions:
> >>>
> >>> Below is a command-line run of my query:
> >>>
> >>> ./query_tool --url http://localhost:9000 --sql -query "SELECT * FROM
> >>>L0a_Radar WHERE RangeBeginningDate>'2007-01-01' AND
> >>>RangeBeginningTime>'12:00:00.000Z'"
> >>> log4j:WARN No appenders could be found for logger
> >>>(org.apache.commons.httpclient.HttpClient).
> >>> log4j:WARN Please initialize the log4j system properly.
> >>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig
> >>>for more info.
> >>> org.apache.xmlrpc.XmlRpcException: java.lang.Exception:
> >>>org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException: Failed
> >>>to perform complex query : You have an error in your SQL syntax; check
> >>>the manual that corresponds to your MySQL server version for the right
> >>>syntax to use near 'INTERSECT (SELECT DISTINCT product_id FROM
> >>>L0a_Radar_metadata WHERE element_id =' at line 1
> >>> at org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeException(XmlRpcClientResponseProcessor.java:104)
> >>> at org.apache.xmlrpc.XmlRpcClientResponseProcessor.decodeResponse(XmlRpcClientResponseProcessor.java:71)
> >>> at org.apache.xmlrpc.XmlRpcClientWorker.execute(XmlRpcClientWorker.java:73)
> >>> at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:194)
> >>> at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:185)
> >>> at org.apache.xmlrpc.XmlRpcClient.execute(XmlRpcClient.java:178)
> >>> at org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient.complexQuery(XmlRpcFileManagerClient.java:952)
> >>> at org.apache.oodt.cas.filemgr.tools.QueryTool.performSqlQuery(QueryTool.java:251)
> >>> at org.apache.oodt.cas.filemgr.tools.QueryTool.main(QueryTool.java:241)
> >>> Exception in thread "main"
> >>>org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException:
> >>>java.lang.Exception:
> >>>org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException: Failed
> >>>to perform complex query : You have an error in your SQL syntax; check
> >>>the manual that corresponds to your MySQL server version for the right
> >>>syntax to use near 'INTERSECT (SELECT DISTINCT product_id FROM
> >>>L0a_Radar_metadata WHERE element_id =' at line 1
> >>> at org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient.complexQuery(XmlRpcFileManagerClient.java:958)
> >>> at org.apache.oodt.cas.filemgr.tools.QueryTool.performSqlQuery(QueryTool.java:251)
> >>> at org.apache.oodt.cas.filemgr.tools.QueryTool.main(QueryTool.java:241)
> >>
> >>Just out of curiosity, is that correct ISO 8601 date/time format? Looks
> >>like a partial one, missing the timezone. Do you think that might
> >>affect it?
> >
> >I talked with Rishi regarding this and he recommended that the date and
> >time be split when performing a query. The reason is that the query tool
> >blows up when trying to compare datetime values. He mentioned that he
> >tried querying against ISO 8601 date/time values before and it didn't work
> >for him, and the way around it was to split them up. I think behind the
> >scenes, the query tool is actually doing an ASCII comparison, which might
> >be why the tool is having performance issues?
> >
> >>
> >>>
> >>> I tried surrounding the entire condition with quotes, but still no
> >>>luck:
> >>>
> >>> ./query_tool --url http://localhost:9000 --sql -query "SELECT * FROM
> >>>L0a_Radar WHERE "RangeBeginningDate>'2007-01-01' AND
> >>>RangeBeginningTime>'12:00:00.000Z'""
> >>> Ambiguous output redirect.
> >>>
> >>> I'm assuming this is a syntax thing, although I don't know what the
> >>>tool is expecting.
> >>
> >>Did you check the code in SVN?
> >
> >I'm running 0.3 of the code. Does the trunk fix this? I have the code
> >checked out onto my local machine. I can certainly build the trunk and see
> >if I get the same results.
> >
> >>
> >>>
> >>> My second issue that I'm running into is in regards to querying of
> >>>dates. I tried the following query below and got the following output:
> >>>
> >>> ./query_tool --url http://localhost:9000 --sql -query "SELECT * FROM
> >>>L0a_Radar WHERE RangeBeginningDate>'2007-03-02'"
> >>> log4j:WARN No appenders could be found for logger
> >>>(org.apache.commons.httpclient.HttpClient).
> >>> log4j:WARN Please initialize the log4j system properly.
> >>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig
> >>>for more info.
> >>> Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
> >>>String index out of range: -1
> >>> at java.lang.AbstractStringBuilder.substring(AbstractStringBuilder.java:881)
> >>> at java.lang.StringBuffer.substring(StringBuffer.java:416)
> >>> at org.apache.oodt.cas.filemgr.tools.QueryTool.performSqlQuery(QueryTool.java:255)
> >>> at org.apache.oodt.cas.filemgr.tools.QueryTool.main(QueryTool.java:241)
> >>>
> >>> For this particular product, I have 1 product in my catalog where the
> >>>RangeBeginningDate is equal to '2007-03-01'. Not sure if that factors
> >>>into why an exception is being thrown here. When I use an earlier date
> >>>on my query, the tool returns a result as expected:
> >>>
> >>> ./query_tool --url http://localhost:9000 --sql -query "SELECT * FROM
> >>>L0a_Radar WHERE RangeBeginningDate>'2007-01-01'"
> >>> log4j:WARN No appenders could be found for logger
> >>>(org.apache.commons.httpclient.HttpClient).
> >>> log4j:WARN Please initialize the log4j system properly.
> >>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig
> >>>for more info.
> >>>
> >>> /Users/mcayanan/smap/staging,2007-03-01,23:30:25.000Z,314,L0a_Radar,V20517SGS0706023302501.VCD,V20517SGS0706023302501.VCD,2012-05-08T14:27:59.385-07:00,L0a_Radar,23:30:25.000Z,2007-03-01
> >>
> >>Interesting! Did you scope the code to see if there's a RangeQuery issue?
> >>
> >>Feel free to file a bug and would love you to investigate!
> >
> >I haven't dived into the code, but will certainly do this as SMAP will
> >need these capabilities. I will file a bug if it turns out that this is
> >indeed a bug.
> >
> >Thanks,
> >Mike
> >
> >>
> >>Cheers,
> >>Chris
> >>
> >>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>Chris Mattmann, Ph.D.
> >>Senior Computer Scientist
> >>NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >>Office: 171-266B, Mailstop: 171-246
> >>Email: chris.a.mattmann@nasa.gov
> >>WWW:   http://sunset.usc.edu/~mattmann/
> >>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>Adjunct Assistant Professor, Computer Science Department
> >>University of Southern California, Los Angeles, CA 90089 USA
> >>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>
> >
>
>
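
On the "Ambiguous output redirect." message quoted above: that wording is what
csh/tcsh print when a command line ends up with more than one output
redirection, which suggests the extra inner double quotes closed the outer
string early and left the two '>' characters exposed to the shell. The sketch
below is an approximation under stated assumptions, not a reproduction of
csh: it uses Python's shlex module (POSIX-style rules, Python 3.8 or later)
only to show where the quoted string ends in each variant. The tokenize
helper is a hypothetical name introduced for illustration.

```python
import shlex


def tokenize(command):
    """Split a command line with POSIX-like rules, keeping '>' as its own token."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    return list(lexer)


BASE = "./query_tool --url http://localhost:9000 --sql -query "
CONDITION = "RangeBeginningDate>'2007-01-01' AND RangeBeginningTime>'12:00:00.000Z'"

# First attempt in the thread: one pair of double quotes around the whole query.
single_pair = BASE + '"SELECT * FROM L0a_Radar WHERE ' + CONDITION + '"'

# Second attempt: extra double quotes nested inside the outer pair.
nested_pair = BASE + '"SELECT * FROM L0a_Radar WHERE "' + CONDITION + '""'

for label, command in (("single pair", single_pair), ("nested pair", nested_pair)):
    tokens = tokenize(command)
    print(label, "->", len(tokens), "tokens,", tokens.count(">"), "bare '>' operators")
    for token in tokens:
        print("   ", repr(token))
```

With a single pair of double quotes the whole SELECT statement stays one
argument, which matches the thread: that form reached the server and failed
there with a MySQL error rather than in the shell.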
