lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "David Smiley (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-8904) Switch from SimpleDateFormat to Java 8 DateTimeFormatter.ISO_INSTANT
Date Tue, 29 Mar 2016 12:46:25 GMT

     [ https://issues.apache.org/jira/browse/SOLR-8904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

David Smiley updated SOLR-8904:
-------------------------------
    Attachment: SOLR_8904.patch

This new patch adds parsing leniency in accordance with the [Robustness Principle|https://en.wikipedia.org/wiki/Robustness_principle]
-- "Be conservative in what you send, be liberal in what you accept".  In the dev list I mentioned
making this leniency settable via a System property but I am not making it toggle-able here.
 Perhaps it could be added later or we can let it be this way forever.

Patch summary:
* {{DateMathParser.parseMath}} now parses the date literal part using a new method, {{parseNoMath()}}
that uses a static instance of DTF created like this: {{new DateTimeFormatterBuilder().  
    .parseCaseInsensitive().parseLenient().appendInstant().toFormatter(Locale.ROOT)}}.  
** I have this method private because, in general, Solr should be parsing with "date math"
from user input.  When it's from Solr (e.g. SolrJ parsing dates), it can use {{Integer.parse}}
which is strict. Solr is strict about formatting output.
** parsing leniency is now explicitly tested.  A leading "+" is always optional no matter
how many digits are in the year.  And numbers need not have leading 0 pads. However it's an
error to have values out of bounds (e.g. >= 60 secs).
* I changed {{ValueSourceAugmenter}} to call {{DateMathParser.parseMath}}  (instead of without).
* I changed all 3 spots in the analytics contrib module that parse dates to parse with math
(instead of without).
* I moved most testing in {{DateFieldTest}} into {{DateMathParserTest}}, doing a bit of consolidating/refactoring
along the way.  The only test that remains is testing createField.  I made that test a bit
more clear that TrieDateField parses with date math.

New suggested CHANGES.txt:
{noformat}
 * Formatted date-times from Solr have some differences.  If the year is more
than 4 digits, there is a leading '+'.  When there is a non-zero number of 
milliseconds, it is padded with zeros to 3 digits.  Negative year (BC) dates are
now possible.  It is now an error to supply a portion of the date out 
of its, range, like 67 seconds.
{noformat}

I think this is ready.

> Switch from SimpleDateFormat to Java 8 DateTimeFormatter.ISO_INSTANT
> --------------------------------------------------------------------
>
>                 Key: SOLR-8904
>                 URL: https://issues.apache.org/jira/browse/SOLR-8904
>             Project: Solr
>          Issue Type: Task
>            Reporter: David Smiley
>            Assignee: David Smiley
>             Fix For: 6.0
>
>         Attachments: SOLR_8904.patch, SOLR_8904_switch_from_SimpleDateFormat_to_Instant_parse_and_format.patch
>
>
> I'd like to move Solr away from SimpleDateFormat to Java 8's java.time.formatter.DateTimeFormatter
API, particularly using simply ISO_INSTANT without any custom rules.  This especially involves
our DateFormatUtil class in Solr core, but also involves DateUtil (I filed SOLR-8903 to deal
with additional delete/move/deprecations for that one).
> In particular, there's {{new Date(Instant.parse(d).toEpochMilli())}} for parsing and
{{DateTimeFormatter.ISO_INSTANT.format(d.toInstant())}} for formatting.  Simple & thread-safe!
> I want to simply cut over completely without having special custom rules.  There are
differences in how ISO_INSTANT does things:
> * Formatting: Milliseconds are 0 padded to 3 digits if the milliseconds is non-zero.
 Thus 30 milliseconds will have ".030" added on.  Our current formatting code emits ".03".
> * Dates with years after '9999' (i.e. 10000 and beyond, >= 5 digit years):  ISO_INSTANT
strictly demands a leading '\+' -- it is formatted with a "\+" and if such a year is parsed
it *must* have a "\+" or there is an exception.  SimpleDateFormatter requires the opposite
-- no '+' and and if you tried to give it one, it would throw an exception.  
> * Currently we don't support negative years (resulting in invisible errors mostly!).
 ISO_INSTANT supports this!
> In addition, DateFormatUtil.parseDate currently allows the trailing 'Z' to be optional,
but the only caller that could exploit this is the analytics module.  I'd like to remove the
optional-ness of 'Z' and inline this method away to {{new Date(Instant.parse(d).toEpochMilli())}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message