lucene-solr-user mailing list archives

From Mark Fenbers <mark.fenb...@noaa.gov>
Subject Re: query parsing
Date Wed, 23 Sep 2015 17:16:41 GMT
On 9/23/2015 12:30 PM, Erick Erickson wrote:
> Then my next guess is you're not pointing at the index you think you are
> when you 'rm -rf data'
>
> Just ignore the Elall field for now I should think, although get rid of it
> if you don't think you need it.
>
> DIH should be irrelevant here.
>
> So let's back up.
> 1> go ahead and "rm -fr data" (with Solr stopped).
I have no "data" dir.  Did you mean "index" dir?  I removed 3 index 
directories (2 for spelling):
cd /localapps/dev/eventLog; rm -rfv index solr/spFile solr/spIndex
> 2> start Solr
> 3> do NOT re-index.
> 4> look at your index via the schema-browser. Of course there should be
> nothing there!
Correct!  It said "there is no term info :("
> 5> now kick off the DIH job and look again.
Now it shows a histogram, but most of the "terms" are long -- the full 
texts of (the table.column) eventlogtext.logtext, including the 
whitespace (with %0A used for newline characters)...  So, it appears it 
is not being tokenized properly, correct?
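
To illustrate what I mean, here is a minimal sketch (plain Python, not Solr code) of the difference between the terms I expected and the terms I am seeing. The sample log texts and both function names are made up for illustration, and the whitespace-and-lowercase split is only a rough stand-in for whatever analyzer the field type really uses.

```python
from collections import Counter

# Hypothetical sample values standing in for eventlogtext.logtext rows.
sample_logtext = [
    "River gauge offline\nsince 0600Z",
    "River gauge back online",
]

def untokenized_terms(values):
    """Treat each whole field value as one term (what I am seeing now)."""
    return Counter(values)

def tokenized_terms(values):
    """Split on whitespace and lowercase: a rough stand-in for a text analyzer."""
    terms = Counter()
    for value in values:
        terms.update(token.lower() for token in value.split())
    return terms

if __name__ == "__main__":
    # Long terms that still contain spaces and newlines.
    print(untokenized_terms(sample_logtext).most_common(3))
    # Short single-word terms, with shared words counted across documents.
    print(tokenized_terms(sample_logtext).most_common(3))
```

In the first case every term is an entire log message, whitespace included, which matches the histogram I get. In the second case the terms are single words, which is what I expected to see.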
> Your logtext field should have only single tokens. The fact that you have
> some very long tokens (presumably with whitespace) indicates that you aren't
> really blowing the index away between indexing runs.
Well, I did this time for sure.  I verified that initially, because it 
showed there was no term info until I DIH'd again.
> Are you perhaps in SolrCloud with more than one replica?
Not that I know of, but being new to Solr, there could be things going 
on that I'm not aware of.  How can I tell?  I certainly didn't set 
anything up for SolrCloud deliberately.
> In that case you
> might be getting the index replicated on startup assuming you didn't
> blow away all replicas. If you are in SolrCloud, I'd just delete the
> collection and start over, after ensuring that you'd pushed the configset
> up to ZooKeeper.
>
> BTW, I always look at the schema.xml file from the Solr admin window just as
> a sanity check in these situations.
Good idea!  But the one shown in the browser is identical to the one 
I've been editing!  So that's not an issue.

