lucene-solr-user mailing list archives

From Christopher Schultz <ch...@christopherschultz.net>
Subject Re: Appropriate field type for date-without-time
Date Mon, 16 Apr 2018 13:08:26 GMT

Shawn,

On 4/15/18 4:49 PM, Shawn Heisey wrote:
> On 4/15/2018 2:31 PM, Christopher Schultz wrote:
>> I'd usually call this a "date", but Solr's documentation says
>> that a "date" is what I would call a timestamp (including time
>> zone).
> 
> That is correct.  Lucene dates are accurate to the millisecond.
> They don't actually handle timezones the way you might be thinking
> -- the information is UTC.  When using date rounding (NOW/WEEK,
> NOW/DAY, etc) you can tell Solr what the timezone is so that the
> boundaries are correct, but the information in the index is UTC.
> 
>> https://lucene.apache.org/solr/guide/7_3/field-types-included-with-solr.html
>>
>> [ I remember reading but cannot currently seem to find a
>> reference page with the actual pre-defined field types Solr ships
>> with. That page above lists the class names, but not the aliases
>> used by a real Solr installation.
> 
> That info is what you need to define the fieldType in the schema.
> So you would put something like "solr.DatePointField" as the
> class.

What about the "standard" aliases for existing fieldTypes? I remember
reading a page that compared "int" with "pint", but I can't seem to
find it now.

>> Is there an existing appropriate field type for
>> "date-without-time"?
> 
> The answer to this question is not yes, but it's also not no.  All
> date types in Solr have millisecond precision.

Okay, so if I want a date-without-time, I'll need to either set the
time portion of every value to 00:00:00 or invent something like a
pint-encoded date, right?
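
For illustration, here is a minimal Python sketch of those two
encodings. It is only my own sketch, not anything Solr ships: the
function names are hypothetical, and the YYYYMMDD integer layout is
just one assumed way to pack a date into a pint field.

```python
from datetime import date


def to_solr_midnight(d: date) -> str:
    """Format a calendar date as a Solr date string at 00:00:00 UTC."""
    # Solr date fields expect ISO-8601 in UTC, e.g. 2018-04-16T00:00:00Z.
    return f"{d.year:04d}-{d.month:02d}-{d.day:02d}T00:00:00Z"


def to_int_date(d: date) -> int:
    """Pack a calendar date into a sortable integer (assumed YYYYMMDD)."""
    return d.year * 10000 + d.month * 100 + d.day


def from_int_date(n: int) -> date:
    """Unpack an integer produced by to_int_date back into a date."""
    return date(n // 10000, n // 100 % 100, n % 100)
```

Either form sorts in calendar order. The integer form loses Solr's
date math and rounding, which is one argument for the midnight
timestamp.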

> But if you use DateRangeField, you can deal with larger time
> periods.  A query like "2018" actually works.  At both query and
> index time, the less precise syntax is translated internally to a
> *range* before the query or indexing happens.
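
To make the "translated to a range" idea concrete, here is a rough
Python sketch of expanding a truncated date such as "2018" or
"2018-04" into a half-open UTC interval. It only illustrates the
concept; it is not how DateRangeField is implemented, and the
function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def expand_partial_date(text: str):
    """Expand 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' into (start, end) in UTC.

    The interval is half-open: start is included, end is excluded.
    """
    parts = [int(p) for p in text.split("-")]
    utc = timezone.utc
    if len(parts) == 1:
        (year,) = parts
        start = datetime(year, 1, 1, tzinfo=utc)
        end = datetime(year + 1, 1, 1, tzinfo=utc)
    elif len(parts) == 2:
        year, month = parts
        start = datetime(year, month, 1, tzinfo=utc)
        # December rolls over into January of the following year.
        end = datetime(year + month // 12, month % 12 + 1, 1, tzinfo=utc)
    elif len(parts) == 3:
        year, month, day = parts
        start = datetime(year, month, day, tzinfo=utc)
        end = start + timedelta(days=1)
    else:
        raise ValueError(f"unsupported date precision: {text!r}")
    return start, end
```

Under this reading, a day-precision value such as "2018-04-16" simply
covers the 24 hours starting at midnight UTC.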

Sounds like wasting a little space with 00:00:00 timestamps is
probably the way to go. Even if using pint would be equivalent (and
perhaps even a little more efficient), I think using a "real" date
field is more appropriate.

-chris
