From: Mark Fenbers
Subject: Re: query parsing
To: solr-user@lucene.apache.org
Organization: DOC/NOAA/NWS/OHRFC
Date: Wed, 23 Sep 2015 13:16:41 -0400

On 9/23/2015 12:30 PM, Erick Erickson wrote:
> Then my next guess is you're not pointing at the index you think you are
> when you 'rm -rf data'
>
> Just ignore the Elall field for now I should think, although get rid of it
> if you don't think you need it.
>
> DIH should be irrelevant here.
>
> So let's back up.
> 1> go ahead and "rm -fr data" (with Solr stopped).

I have no "data" dir. Did you mean "index" dir? I removed 3 index
directories (2 for spelling):

cd /localapps/dev/eventLog; rm -rfv index solr/spFile solr/spIndex

> 2> start Solr
> 3> do NOT re-index.
> 4> look at your index via the schema-browser. Of course there should be
> nothing there!

Correct! It said "there is no term info :("

> 5> now kick off the DIH job and look again.

Now it shows a histogram, but most of the "terms" are long -- the full
texts of (the table.column) eventlogtext.logtext, including the
whitespace (with %0A used for newline characters)... So, it appears it
is not being tokenized properly, correct?

> Your logtext field should have only single tokens. The fact that you have
> some very long tokens presumably with whitespace) indicates that you
> aren't really blowing the index away between indexing.

Well, I did this time for sure. I verified that initially, because it
showed there was no term info until I DIH'd again.

> Are you perhaps in Solr Cloud with more than one replica?

Not that I know of, but being new to Solr, there could be things going
on that I'm not aware of. How can I tell? I certainly didn't set
anything up for solrCloud deliberately.

> In that case you might be getting the index replicated on startup
> assuming you didn't blow away all replicas. If you are in SolrCloud,
> I'd just delete the collection and start over, after insuring that
> you'd pushed the configset up to Zookeeper.
>
> BTW, I always look at the schema.xml file from the Solr admin window
> just as a sanity check in these situations.

Good idea! But the one shown in the browser is identical to the one
I've been editing! So that's not an issue.
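
The check described above (looking at the top terms for logtext and
seeing whole log texts instead of single words) can also be scripted.
What follows is only a rough sketch, not something from this thread:
the core name "eventLog", the host and port, and the sample terms are
all assumptions for illustration. It assumes the Luke request handler
is reachable at /admin/luke with its "fl" and "numTerms" parameters,
and it does not fetch anything itself; it only builds the URL and
checks a list of term strings you have already pulled out of the
response.

```python
from urllib.parse import urlencode


def luke_url(base_url, core, field, num_terms=20):
    """Build a Luke request handler URL that asks for the top terms of one field.

    base_url and core are assumptions; adjust them to the local install.
    """
    params = urlencode({"fl": field, "numTerms": num_terms, "wt": "json"})
    return f"{base_url}/{core}/admin/luke?{params}"


def untokenized_terms(terms):
    """Return the terms that contain whitespace (spaces, tabs, newlines).

    A tokenized text field should normally yield single-word terms, so any
    term with embedded whitespace suggests the field was indexed as a string.
    """
    return [t for t in terms if any(ch.isspace() for ch in t)]


# Hypothetical sample, standing in for terms copied from the schema browser.
sample_terms = ["river", "flood", "Gauge reading\nat 0600 was 12.3 ft", "stage"]

print(luke_url("http://localhost:8983/solr", "eventLog", "logtext"))
print(untokenized_terms(sample_terms))
```

If untokenized_terms() returns anything for a freshly rebuilt index, the
field type assigned to logtext in schema.xml is the first thing worth
rechecking.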