Return-Path:
X-Original-To: apmail-incubator-jena-dev-archive@minotaur.apache.org
Delivered-To: apmail-incubator-jena-dev-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
	by minotaur.apache.org (Postfix) with SMTP id 9D083B94F
	for ; Fri, 20 Jan 2012 19:13:06 +0000 (UTC)
Received: (qmail 85268 invoked by uid 500); 20 Jan 2012 19:13:06 -0000
Delivered-To: apmail-incubator-jena-dev-archive@incubator.apache.org
Received: (qmail 85218 invoked by uid 500); 20 Jan 2012 19:13:06 -0000
Mailing-List: contact jena-dev-help@incubator.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: jena-dev@incubator.apache.org
Delivered-To: mailing list jena-dev@incubator.apache.org
Received: (qmail 85206 invoked by uid 99); 20 Jan 2012 19:13:06 -0000
Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230)
	by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 20 Jan 2012 19:13:06 +0000
X-ASF-Spam-Status: No, hits=-2000.0 required=5.0
	tests=ALL_TRUSTED,T_RP_MATCHES_RCVD
X-Spam-Check-By: apache.org
Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116)
	by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 20 Jan 2012 19:13:02 +0000
Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116])
	by hel.zones.apache.org (Postfix) with ESMTP id AE1D8158BA7
	for ; Fri, 20 Jan 2012 19:12:41 +0000 (UTC)
Date: Fri, 20 Jan 2012 19:12:41 +0000 (UTC)
From: "Rob Vesse (Commented) (JIRA)"
To: jena-dev@incubator.apache.org
Message-ID: <1037896272.61104.1327086761714.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <252097925.57977.1327011159896.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (JENA-199) BindingBase can hit a null pointer exception on certain queries against a TDB dataset
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
X-Virus-Checked: Checked by ClamAV
 on apache.org

    [ https://issues.apache.org/jira/browse/JENA-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190011#comment-13190011 ]

Rob Vesse commented on JENA-199:
--------------------------------

So I dug into this a lot more and was able to reproduce the issue in code and isolate it from Fuseki.

It appears that something in this data causes TDB to return a null for one of the variables, which is bizarre because there are no optional variables and it shouldn't be possible to set a null on a Binding AFAIK.

The stack trace from the code I will shortly attach is as follows:

null
java.lang.NullPointerException
	at com.hp.hpl.jena.sparql.engine.binding.BindingBase.hashCode(BindingBase.java:204)
	at com.hp.hpl.jena.sparql.engine.binding.BindingBase.hashCode(BindingBase.java:185)
	at java.util.HashMap.put(HashMap.java:372)
	at java.util.HashSet.add(HashSet.java:200)
	at org.openjena.atlas.data.SortedDataBag.add(SortedDataBag.java:109)
	at org.openjena.atlas.data.DistinctDataNet.netAdd(DistinctDataNet.java:58)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIterDistinct.fill(QueryIterDistinct.java:87)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIterDistinct.moveToNextBinding(QueryIterDistinct.java:118)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.moveToNextBinding(QueryIteratorWrapper.java:43)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.moveToNextBinding(QueryIteratorWrapper.java:43)
	at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.nextBinding(QueryIteratorBase.java:152)
	at com.hp.hpl.jena.sparql.engine.ResultSetStream.nextBinding(ResultSetStream.java:84)
	at com.hp.hpl.jena.sparql.engine.ResultSetStream.nextSolution(ResultSetStream.java:102)
	at com.hp.hpl.jena.sparql.engine.ResultSetStream.next(ResultSetStream.java:111)
	at com.hp.hpl.jena.sparql.resultset.ResultSetApply.apply(ResultSetApply.java:43)
	at com.hp.hpl.jena.sparql.resultset.XMLOutput.format(XMLOutput.java:52)
	at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:481)
	at com.hp.hpl.jena.query.ResultSetFormatter.outputAsXML(ResultSetFormatter.java:459)
	at bugs.tdb.TDBEmptyOutput.test(TDBEmptyOutput.java:44)
	at bugs.tdb.TDBEmptyOutput.main(TDBEmptyOutput.java:29)

So the lack of Text, CSV and TSV output appears to be because the query hits this error partway through, and all those writers either do a single flush at the end of output or, in the case of the Text output, have to do a complete iteration over the ResultSet before they can output anything.

Hence the lack of output in those cases and the presence of output in the XML/JSON cases. What I failed to notice previously was that the XML/JSON output was incomplete: it was valid, but it doesn't include as many results as it should. Something in the Fuseki layer appears to catch and clean up the incomplete output, whereas when evaluated via the code you get larger but still incomplete output.

> BindingBase can hit a null pointer exception on certain queries against a TDB dataset
> -------------------------------------------------------------------------------------
>
>                 Key: JENA-199
>                 URL: https://issues.apache.org/jira/browse/JENA-199
>             Project: Jena
>          Issue Type: Bug
>          Components: TDB
>            Reporter: Rob Vesse
>              Labels: csv, results, sparql, tdb, tsv
>         Attachments: 5b.txt, 8.txt, sp2b10k.nt
>
>
> This is a strange bug which I have been unable to reduce to a more minimal example than the files I will attach, so I apologize for that.
> Essentially the problem manifests as follows: when using a TDB dataset with Fuseki, some queries will return blank output if the user requests Text, CSV or TSV. When using XML/JSON the output is fine.
> The test data used is SP2B 10k; two of the SP2B queries that exhibit this issue are as follows:
> PREFIX rdf:
> PREFIX foaf:
> PREFIX bench:
> PREFIX dc:
> SELECT DISTINCT ?person ?name
> WHERE {
>   ?article rdf:type bench:Article .
>   ?article dc:creator ?person .
>   ?inproc rdf:type bench:Inproceedings .
>   ?inproc dc:creator ?person .
>   ?person foaf:name ?name
> }
> And:
> PREFIX xsd:
> PREFIX rdf:
> PREFIX foaf:
> PREFIX dc:
> SELECT DISTINCT ?name
> WHERE {
>   ?erdoes rdf:type foaf:Person .
>   ?erdoes foaf:name "Paul Erdoes"^^xsd:string .
>   {
>     ?document dc:creator ?erdoes .
>     ?document dc:creator ?author .
>     ?document2 dc:creator ?author .
>     ?document2 dc:creator ?author2 .
>     ?author2 foaf:name ?name
>     FILTER (?author!=?erdoes &&
>             ?document2!=?document &&
>             ?author2!=?erdoes &&
>             ?author2!=?author)
>   } UNION {
>     ?document dc:creator ?erdoes.
>     ?document dc:creator ?author.
>     ?author foaf:name ?name
>     FILTER (?author!=?erdoes)
>   }
> }
> I will attach these as files as well for convenience.
> If you run Fuseki with a memory dataset using the --mem option, load this data and run the same queries, the Text, CSV and TSV output works fine. This implies that there is something in the TDB code related to its return of results or iterators which somehow causes the Text, CSV and TSV formatters either to error or to believe that they have no results to format.
> I'm completely unfamiliar with the TDB codebase, so I haven't attempted to discover what the cause of the issue is, though I may poke around anyway.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
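
For anyone reading this in the archive, here is a minimal, self-contained sketch of the behaviour described in the comment: a DISTINCT step that hashes each row, fed into a writer that either streams rows out or flushes once at the end. Nothing here is Jena code. `Row`, `distinct`, `writeStreaming`, `writeBuffered` and `run` are hypothetical stand-ins, the sample rows are made up, and the sketch does not model why TDB produces a null in the first place.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical stand-ins only; none of these types are Jena APIs.
class PartialOutputSketch {

    /** A row whose hashCode dereferences every value, so a null value throws. */
    static final class Row {
        final String[] values;
        Row(String... values) { this.values = values; }

        @Override public int hashCode() {
            int h = 0;
            for (String v : values) h ^= v.hashCode(); // NullPointerException if v is null
            return h;
        }

        @Override public boolean equals(Object o) {
            return o instanceof Row && Arrays.equals(values, ((Row) o).values);
        }
    }

    /** Lazily de-duplicates rows using a HashSet, so hashing happens during iteration. */
    static Iterator<Row> distinct(Iterator<Row> input) {
        Set<Row> seen = new HashSet<>();
        return new Iterator<Row>() {
            private Row pending;

            @Override public boolean hasNext() {
                while (pending == null && input.hasNext()) {
                    Row r = input.next();
                    if (seen.add(r)) pending = r; // Row.hashCode() is invoked here
                }
                return pending != null;
            }

            @Override public Row next() {
                if (!hasNext()) throw new NoSuchElementException();
                Row r = pending;
                pending = null;
                return r;
            }
        };
    }

    /** Writes each row to the sink as soon as it is produced. */
    static void writeStreaming(Iterator<Row> rows, StringBuilder sink) {
        while (rows.hasNext()) sink.append(String.join(",", rows.next().values)).append('\n');
    }

    /** Collects everything first and copies it to the sink in a single final flush. */
    static void writeBuffered(Iterator<Row> rows, StringBuilder sink) {
        StringBuilder buffer = new StringBuilder();
        while (rows.hasNext()) buffer.append(String.join(",", rows.next().values)).append('\n');
        sink.append(buffer); // never reached if iteration throws
    }

    /** Runs one writer over sample data whose third row contains a null. */
    static String run(boolean streaming) {
        List<Row> data = Arrays.asList(
                new Row("a", "1"), new Row("b", "2"), new Row("c", null), new Row("d", "4"));
        StringBuilder sink = new StringBuilder();
        try {
            if (streaming) writeStreaming(distinct(data.iterator()), sink);
            else writeBuffered(distinct(data.iterator()), sink);
        } catch (NullPointerException e) {
            // Swallowed on purpose, to mimic a layer that hides the failure from the client.
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.print("streaming:\n" + run(true));
        System.out.print("buffered:\n" + run(false));
    }
}
```

With the null in the third row, the streaming writer leaves the first two rows in the sink and the buffered writer leaves nothing. That would be consistent with well-formed but truncated XML/JSON results alongside empty Text, CSV and TSV results.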