From: Erick Erickson
To: solr-user@lucene.apache.org
Reply-To: solr-user@lucene.apache.org
Date: Thu, 16 Dec 2010 19:38:41 -0500
Subject: Re: Query Problem
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=0016363b8efc5c30da0497906419

--0016363b8efc5c30da0497906419
Content-Type: text/plain; charset=ISO-8859-1

OK, it works perfectly for me on a 1.4.1 instance. I've looked over your
files a couple of times and see nothing obvious (but you'll never find
anyone better at overlooking the obvious than me!).

Tokenizing and stemming are irrelevant in this case because your type is
"string", which is an untokenized type, so you don't need to go there.
The way your query parses and analyzes backs this up, so you're getting
to the right schema definition.

Which may bring us to whether what's in the index is what you *think* is
in there. I'm betting not. Either you changed the schema and didn't
re-index (say, changed indexed="false" to indexed="true"), you didn't
commit the documents after indexing or some such, or you changed the
field type and didn't reindex.

So go into ..../solr/admin. Click on "schema browser", then click on
"fields". Along the left you should see "SectionName"; click on that.
That will show you the *indexed* terms, and you should see exactly
"Programas_Home" in there, just like in your returned documents. Let us
know if that's in fact what you do see.
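
The schema-browser check described above can also be scripted. The sketch below is only an illustration and is not taken from this thread: it assumes a Solr instance at the hypothetical base URL http://localhost:8983/solr, assumes the Luke request handler is mapped at /admin/luke, and assumes the JSON response lists a field's top terms under fields -> SectionName -> topTerms. The helper names are made up for this example.

```python
from urllib.parse import urlencode


def luke_terms_url(base_url, field, num_terms=50):
    """Build a Luke request URL asking for the top indexed terms of one field.

    Assumption: the Luke handler is registered at <base_url>/admin/luke.
    """
    params = urlencode({"fl": field, "numTerms": num_terms, "wt": "json"})
    return f"{base_url.rstrip('/')}/admin/luke?{params}"


def indexed_terms(luke_response, field):
    """Return {term: doc_count} from an already-parsed Luke JSON response.

    Assumption: topTerms is either a flat [term, count, term, count, ...]
    list or a plain {term: count} mapping. Anything else needs adapting.
    """
    top = luke_response.get("fields", {}).get(field, {}).get("topTerms", [])
    if isinstance(top, dict):
        return dict(top)
    # Pair up alternating term/count entries.
    return dict(zip(top[0::2], top[1::2]))


def has_exact_term(luke_response, field, value):
    """True only if `value` appears verbatim (case-sensitive) among the terms."""
    return value in indexed_terms(luke_response, field)


if __name__ == "__main__":
    # Print the URL to fetch; load the JSON it returns and pass the parsed
    # dict to has_exact_term(response, "SectionName", "Programas_Home").
    print(luke_terms_url("http://localhost:8983/solr", "SectionName"))
```

An exact, case-sensitive hit for Programas_Home is what an untokenized "string" field should produce. A miss would point at the index having been built before the current schema, or at documents that were never committed.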

It's possible you're being misled by the difference between seeing the
value in a returned document (the stored value) and what's searched on
(the indexed token(s)). And I'm assuming that some asterisks in your
mails were really there for bolding and you are NOT doing wildcard
searches for, for instance, *SectionName:Programas_Home*.

But we're at a point where my 1.4.1 instance produces the results you're
expecting, at least as I understand them, so I don't think it's a problem
with Solr. More likely, some change you've made is producing results that
you don't expect but that are correct. Like I said, look at the indexed
terms. If you see "Programas_Home" in the admin console after following
the steps above, then I don't know what to suggest....

Best
Erick

On Thu, Dec 16, 2010 at 5:12 PM, Ezequiel Calderara wrote:

> The jars are named like *1.4.1* . So i suppose its the version 1.4.1
>
> Thanks!
>
> On Thu, Dec 16, 2010 at 6:54 PM, Erick Erickson wrote:
>
> > OK, what version of Solr are you using? I can take a quick check to
> > see what behavior I get....
> >
> > Erick
> >
> > On Thu, Dec 16, 2010 at 4:44 PM, Ezequiel Calderara wrote:
> >
> > > I'll check the Tokenizer to see if that's the problem.
> > > The results of Analysis Page for "SectionName:Programas_Home"
> > >
> > > Query Analyzer
> > > org.apache.solr.schema.FieldType$DefaultAnalyzer {}
> > > term position     1
> > > term text         Programas_Home
> > > term type         word
> > > source start,end  0,14
> > > payload
> > >
> > > So it's not having problems with that... Also in the debug you can
> > > see that the parsed query is correct...
> > > So i don't know where to look...
> > >
> > > I know nothing about "Stemming" or tokenizing, but i will look if
> > > that has anything to do.
> > >
> > > If anyone can help me out, please do :D
> > >
> > > On Thu, Dec 16, 2010 at 5:55 PM, Erick Erickson
> > > <erickerickson@gmail.com> wrote:
> > >
> > > > Ezequiel:
> > > >
> > > > Nice job of including relevant details, by the way.
> > > > Unfortunately I'm puzzled too. Your SectionName is a "string"
> > > > type, so it should be placed in the index as-is. Be a bit
> > > > cautious about looking at returned results (as I see in one of
> > > > your xml files) because the returned values are the verbatim,
> > > > stored field, NOT what's tokenized, and the tokenized data is
> > > > what's searched.
> > > >
> > > > That said, your SectionName should not be tokenized at all
> > > > because it's a string type. Take a look at the admin page,
> > > > "schema browser", and see what the values for "SectionName" look
> > > > like (these will be the tokenized values). They should be exactly
> > > > Programas_Home, complete with underscore, case changes, etc. Is
> > > > that the case?
> > > >
> > > > Another place that might help is the admin/analysis page. Check
> > > > the debug boxes and input your steps and it'll show you what
> > > > transformations are applied. But a quick look leaves me
> > > > completely baffled.
> > > >
> > > > Sorry I can't be more help
> > > > Erick
> > > >
> > > > On Thu, Dec 16, 2010 at 2:07 PM, Ezequiel Calderara
> > > > <ezechico@gmail.com> wrote:
> > > >
> > > > > Hi all, I have the following problems.
> > > > > I have this set of data (View data (Pastebin)
> > > > > <http://pastebin.com/jKbUhjVS>)
> > > > > If i do a search for: *SectionName:Programas_Home* i have no
> > > > > results: Returned Data (PasteBin)
> > > > > If i do a search for: *Programas_Home* i have only 1 result:
> > > > > Result Returned (Pastebin)
> > > > > if i do a search for: SectionName:Programa* i have 1 result:
> > > > > Result Returned (Pastebin)
> > > > >
> > > > > This is my *schema* (Pastebin) and this is my *solrconfig*
> > > > > (PasteBin)
> > > > >
> > > > > I don't understand why when searching for
> > > > > "SectionName:Programas_Home" isn't returning any results at
> > > > > all...
> > > > >
> > > > > Can someone shed some light on this?
> > > > > --
> > > > > ______
> > > > > Ezequiel.
> > > > >
> > > > > Http://www.ironicnet.com <http://www.ironicnet.com/>
> > > > >
> > > >
> > >
> > >
> > > --
> > > ______
> > > Ezequiel.
> > >
> > > Http://www.ironicnet.com
> > >
> >
>
>
> --
> ______
> Ezequiel.
>
> Http://www.ironicnet.com
>

--0016363b8efc5c30da0497906419--