lucene-solr-user mailing list archives

From Brian Johnson <brianmjohn...@yahoo.com>
Subject SOLR-470 & default value in schema with NOW
Date Wed, 23 Apr 2008 18:23:54 GMT
So I just ran into this bug:
    https://issues.apache.org/jira/browse/SOLR-470

and read about this related one:
    https://issues.apache.org/jira/browse/SOLR-544

Here is the relevant trace:

Apr 22, 2008 10:59:01 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: java.text.ParseException: Unparseable date: "2008-04-03T22:42:13Z"
        at org.apache.solr.schema.DateField.toObject(DateField.java:173)
        at org.apache.solr.schema.DateField.toObject(DateField.java:83)
        at org.apache.solr.update.DocumentBuilder.loadStoredFields(DocumentBuilder.java:285)
...
Caused by: java.text.ParseException: Unparseable date: "2008-04-03T22:42:1
        at java.text.DateFormat.parse(Unknown Source)

The root cause (I believe; I'm going to confirm tonight) is that I have multiple index files
that I'm uploading into this field in the schema:
   <field name="timestamp_created" type="date" indexed="true" stored="true" required="true" multiValued="false" default="NOW" />

Here is my typedef for 'date':
    <fieldType name="date" class="solr.DateField" sortMissingLast="true" omitNorms="true"/>


What I came to realize is that most of my index files specify this field consistently,
but one of them does not contain the field at all. Because I declared a default
value, I am relying on the SOLR default for NOW being in the same format (no millis, .0, .00,
.000, etc.) as the values I pass in my feed. As you can see from the exception, my feed does not
contain any millis, which is a valid format according to SOLR-544 and the documentation I've read.
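
To illustrate the kind of mismatch I suspect, here is a minimal Java sketch. It assumes a
parser that insists on a single pattern with exactly three millisecond digits. That pattern
and the class name StrictDateParseDemo are hypothetical, made up for this example; I have
not confirmed that DateField actually works this way.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

public class StrictDateParseDemo {
    // Hypothetical pattern: assumes the parser requires exactly three millisecond digits.
    static final String MILLIS_PATTERN = "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'";

    // Returns true if the value parses under the given pattern, false otherwise.
    static boolean parses(String pattern, String value) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        fmt.setLenient(false);
        try {
            fmt.parse(value);
            return true;
        } catch (ParseException e) {
            // Same exception type as in the trace above ("Unparseable date: ...").
            return false;
        }
    }

    public static void main(String[] args) {
        // The no-millis value from my feed is rejected by the millis-only pattern.
        System.out.println(parses(MILLIS_PATTERN, "2008-04-03T22:42:13Z"));
        // The same instant written with explicit millis is accepted.
        System.out.println(parses(MILLIS_PATTERN, "2008-04-03T22:42:13.000Z"));
    }
}
```

If something like this is happening, values written in one form and read back under a
pattern expecting the other form would fail in the way the trace shows.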


Now finally, my problem. The format for NOW doesn't seem to be documented, so I have no idea
what I need to 'match' (or even that matching is necessary, going by the documentation outside
these 2 bugs) in order to take advantage of the default value feature and mix that with data
from my streams. I can see from here that it isn't the 'no millis' form, since a discrepancy
is triggering this bug.

Solutions?

A) Should I create a format normalizer and configure that into my typedef for 'date' so that
I am agnostic of these differences in terms of input and ensure the indexed format is consistent?
I believe this would be a <analyzer type="index"><filter .../></analyzer>.
I'm not concerned about the presence or absence of millis on the output. Would this approach
work? Based on the presence of the filter in the fieldType, it feels like a hack.

B) Should I remove the default value and just ensure all my streams have this value specified
consistently and not trigger the bug? It seems to me that SOLR should be robust in this respect,
but reading SOLR-544 I can see that this isn't an opinion that is held by all.

C) Should I apply one of the existing SOLR-470 patch files and move on?

D) Should I take a stab at https://issues.apache.org/jira/browse/SOLR-440 as an alternative
'class' for my 'date' type?
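
For option B, making the streams consistent could be done on the client side before upload.
Below is a rough Java sketch under stated assumptions: the class TimestampNormalizer and its
normalize method are hypothetical names of my own, not anything from SOLR, and it only handles
UTC timestamps ending in 'Z'. Which target form (with or without millis) would actually match
NOW is exactly what I don't know, so the sketch leaves that as a flag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TimestampNormalizer {
    // Matches e.g. 2008-04-03T22:42:13Z, optionally with fractional seconds (.0, .00, .000, ...).
    private static final Pattern ISO_UTC =
        Pattern.compile("(\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2})(?:\\.(\\d+))?Z");

    // Rewrites a timestamp into one consistent form.
    // keepMillis = true  -> always exactly three fractional digits (padded or truncated)
    // keepMillis = false -> fractional part dropped entirely (loses sub-second precision)
    static String normalize(String value, boolean keepMillis) {
        Matcher m = ISO_UTC.matcher(value.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected timestamp: " + value);
        }
        String seconds = m.group(1);
        if (!keepMillis) {
            return seconds + "Z";
        }
        String fraction = m.group(2) == null ? "" : m.group(2);
        String millis = (fraction + "000").substring(0, 3);
        return seconds + "." + millis + "Z";
    }

    public static void main(String[] args) {
        System.out.println(normalize("2008-04-03T22:42:13Z", true));
        System.out.println(normalize("2008-04-03T22:42:13.000Z", false));
    }
}
```

This only makes my own feeds agree with each other; it does not help for the file that omits
the field unless the chosen form also matches whatever the default produces.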

Thanks,

Brian



