lucene-dev mailing list archives

From "Peter Sturge (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-1709) Distributed Date Faceting
Date Sun, 17 Apr 2011 14:29:05 GMT

    [ https://issues.apache.org/jira/browse/SOLR-1709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020802#comment-13020802 ]

Peter Sturge commented on SOLR-1709:
------------------------------------

Updating ResponseBuilder rather than FacetInfo really just came from tracing the references
through the hierarchy, so I don't think anything is lost by moving this to FacetInfo props,
and doing so should provide better encapsulation.
Deprecating date faceting in favour of generic range faceting should be fine, as long as there
is a clear path to move easily from 'the way we were' with date facets to 'the way it
will be' (range faceting). It would be a shame to break clients that rely on the existing
date facet parameters/syntax, so if they're mapped to range (I think some of this
is in 3.x already?), that would be good.
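
To illustrate the kind of mapping meant here, below is a minimal sketch that renames legacy
facet.date.* request parameters to their facet.range.* counterparts, including the per-field
f.<field>.facet.date.* form. The class and method names are hypothetical and this is not how
Solr itself handles the transition; it only renames keys and assumes the values (start, end,
gap and so on) can be passed through unchanged.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr or of the SOLR-1709 patch.
public class DateToRangeParams {

    // Returns a copy of the request parameters with date-facet keys renamed
    // to range-facet keys; all other parameters are copied as-is.
    public static Map<String, String> toRangeParams(Map<String, String> params) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : params.entrySet()) {
            out.put(rename(e.getKey()), e.getValue());
        }
        return out;
    }

    static String rename(String key) {
        // Global form: facet.date, facet.date.start, facet.date.gap, ...
        if (key.equals("facet.date") || key.startsWith("facet.date.")) {
            return "facet.range" + key.substring("facet.date".length());
        }
        // Per-field override: f.<field>.facet.date.gap and similar.
        String marker = ".facet.date.";
        int i = key.indexOf(marker);
        if (key.startsWith("f.") && i > 0) {
            return key.substring(0, i) + ".facet.range." + key.substring(i + marker.length());
        }
        return key;
    }
}
```

A request rewritten this way would presumably report its counts in the range-facet section of
the response rather than under facet_dates, so clients that parse the response would still
need a migration path of their own.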

Thanks


> Distributed Date Faceting
> -------------------------
>
>                 Key: SOLR-1709
>                 URL: https://issues.apache.org/jira/browse/SOLR-1709
>             Project: Solr
>          Issue Type: Improvement
>          Components: SearchComponents - other
>    Affects Versions: 1.4
>            Reporter: Peter Sturge
>            Assignee: Hoss Man
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: FacetComponent.java, FacetComponent.java, ResponseBuilder.java,
>                      SOLR-1709.patch, SOLR-1709_distributed_date_faceting_v3x.patch,
>                      solr-1.4.0-solr-1709.patch
>
>
> This patch is for adding support for date facets when using distributed searches.
> Date faceting across multiple machines exposes some time-based issues that anyone interested
> in this behaviour should be aware of:
> Any time and/or time-zone differences are not accounted for in the patch (i.e. merged
> date facets are at a time-of-day, not necessarily at a universal 'instant-in-time', unless
> all shards are time-synced to the exact same time).
> The implementation uses the first encountered shard's facet_dates as the basis for subsequent
> shards' data to be merged in.
> This means that if subsequent shards' facet_dates are skewed in relation to the first
> by >1 'gap', these 'earlier' or 'later' facets will not be merged in.
> There are several reasons for this:
>   * Performance: It's faster to check facet_date lists against a single map's data, rather
>     than against each other, particularly if there are many shards
>   * If 'earlier' and/or 'later' facet_dates are added in, this will make the time range
>     larger than that which was requested
>     (e.g. a request for one hour's worth of facets could bring back 2, 3 or more
>     hours of data)
>     This could be dealt with if timezone and skew information was added, and the dates
>     were normalized.
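
The merge behaviour described above can be sketched roughly as follows. This is an
illustration only, not the code in the attached FacetComponent.java: the class name and the
use of plain maps keyed by bucket start are assumptions, whereas the real patch works with
Solr's own response structures. The first shard's buckets form the base, and later shards
only contribute counts to buckets that already exist there.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of "first shard as basis" merging.
public class FirstShardMerge {

    // Each map holds one shard's date facet counts, keyed by bucket start.
    public static Map<String, Integer> merge(List<Map<String, Integer>> shardFacets) {
        Map<String, Integer> merged = new LinkedHashMap<>();
        if (shardFacets.isEmpty()) {
            return merged;
        }
        // The first encountered shard defines which buckets exist.
        merged.putAll(shardFacets.get(0));
        for (Map<String, Integer> shard : shardFacets.subList(1, shardFacets.size())) {
            for (Map.Entry<String, Integer> e : shard.entrySet()) {
                // Add the count only if the base already has this bucket;
                // 'earlier' or 'later' buckets from skewed shards are dropped.
                merged.computeIfPresent(e.getKey(), (k, v) -> v + e.getValue());
            }
        }
        return merged;
    }
}
```

Each shard's buckets are checked against a single map rather than against every other shard,
and the merged result never covers a wider time range than the first shard returned.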
> One possibility for adding such support is to [optionally] add 'timezone' and 'now' parameters
> to the 'facet_dates' map. This would tell requesters what time and TZ the remote server thinks
> it is, and so multiple shards' time data can be normalized.
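
As a rough sketch of what a requester could do with a shard-reported 'now', the snippet below
computes a shard's clock skew against a reference time and checks whether it exceeds one gap,
which is the point at which buckets stop lining up. Everything here is an assumption: the
class and method names are made up, the patch exposes no such value, the gap is treated as a
fixed-length duration (Solr gaps are date-math strings such as +1MONTH and need not be), and
the 'timezone' half of the proposal is not modelled.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper; nothing like this exists in the patch.
public class ShardClockSkew {

    // Signed difference between the shard's clock and the reference clock;
    // positive when the shard is ahead.
    public static Duration skew(Instant referenceNow, Instant shardNow) {
        return Duration.between(referenceNow, shardNow);
    }

    // True when the shard is off by more than one gap in either direction.
    public static boolean exceedsGap(Instant referenceNow, Instant shardNow, Duration gap) {
        return skew(referenceNow, shardNow).abs().compareTo(gap) > 0;
    }
}
```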
> The patch affects 2 files in the Solr core:
>   org.apache.solr.handler.component.FacetComponent.java
>   org.apache.solr.handler.component.ResponseBuilder.java
> The main changes are in FacetComponent - ResponseBuilder is just to hold the completed
> SimpleOrderedMap until the finishStage.
> One possible enhancement is to perhaps make this an optional parameter, but really, if
> facet.date parameters are specified, it is assumed they are desired.
> Comments & suggestions welcome.
> As a favour to ask, if anyone could take my 2 source files and create a PATCH file from
> it, it would be greatly appreciated, as I'm having a bit of trouble with svn (don't shoot
> me, but my environment is a Redmond-based os company).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

