lucene-java-user mailing list archives

From findbestopensource <findbestopensou...@gmail.com>
Subject Re: date issues
Date Thu, 23 Feb 2012 08:19:32 GMT
Yes. If you store the date as a String, you should still be able to do range
searches. I am not sure which is better, storing it as a String or as an Integer.
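
A minimal sketch of why range search still works on such strings, in plain Python rather than Lucene or Solr code: with a fixed-width "YYYYMMDD" value, lexicographic order matches chronological order, so a string range selects the same days as the equivalent integer range. The helper names `to_key` and `in_range` are hypothetical and exist only for this illustration.

```python
from datetime import date, timedelta


def to_key(d: date) -> str:
    """Format a date as a fixed-width YYYYMMDD string."""
    return d.strftime("%Y%m%d")


def in_range(key: str, start: str, end: str) -> bool:
    """Inclusive range check using plain string comparison."""
    return start <= key <= end


# Every day from 2011-12-01 through 2012-03-31, in chronological order.
days = [date(2011, 12, 1) + timedelta(days=i) for i in range(122)]
keys = [to_key(d) for d in days]

# Sorting the strings leaves the chronological order unchanged,
# and so does sorting the same values as integers.
strings_keep_order = sorted(keys) == keys
ints_keep_order = sorted(int(k) for k in keys) == [int(k) for k in keys]

# A string range picks out exactly the days of February 2012.
february = [k for k in keys if in_range(k, "20120201", "20120229")]
```

This only shows that both representations order the same way. It says nothing about which one is faster or smaller inside the index.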

 Regards
 Aditya
 www.findbestopensource.com


On Thu, Feb 23, 2012 at 1:25 PM, Jason Toy <jasontoy@gmail.com> wrote:

> Can I still do range searches on a string? It seems like it would be more
> efficient to store as an integer.
> > Hi,
> >
> > You could consider storing the date field as a String in "YYYYMMDD" format.
> > This will save space and it will perform better.
> >
> > Regards
> > Aditya
> > www.findbestopensource.com
> >
> >
> > On Thu, Feb 23, 2012 at 11:55 AM, Jason Toy <jasontoy@gmail.com> wrote:
> >
> >> I have a solr instance with about 400m docs. For text searches it is
> >> perfectly fine. When I do searches that calculate the number of times a
> >> word appeared in the doc set for every day of a month, it usually causes
> >> solr to crash with out of memory errors.
> >> I calculate this by running ~30 queries, one for each day, to see the
> >> count for that day.
> >> Is there a better way I could do this?
> >>
> >> Currently the date fields are stored as:
> >> <fieldType name="date" class="solr.TrieDateField" omitNorms="true"
> >> precisionStep="0" positionIncrementGap="0"/>
> >>
> >> and the timestamps are stored in the format of:
> >> 2012-02-22T21:11:14Z
> >>
> >> We have no need to store anything beyond the date. Will just changing the
> >> time portion to zeros make things faster:
> >> 2012-02-22T00:00:00Z
> >>
> >> I thought that to optimize this, there would be an actual date type that
> >> doesn't store the time component, but looking through the solr docs, I don't
> >> see anything specifically for a date as opposed to a timestamp. Would it
> >> be faster for me to store dates in an sint format? What is the optimal
> >> format I should use? If the format is to continue to use TrieDateField, is
> >> it not a waste to store the hour/minute/seconds even if they are not being
> >> used?
> >>
> >> Is there anything else I can do to make this more efficient?
> >>
> >> I have looked around on the mailing list and on google and am not sure
> >> what to use. I would appreciate any pointers. Thanks.
> >>
> >> Jason
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
