incubator-blur-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Blur Wiki] Update of "BlurPlatform" by TimWilliams
Date Tue, 22 Jul 2014 16:53:17 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Blur Wiki" for change notification.

The "BlurPlatform" page has been changed by TimWilliams:
https://wiki.apache.org/blur/BlurPlatform

Comment:
 

New page:
== Overview ==
This page proposes deriving from Apache Blur a distributed search/indexing platform
on which Blur "classic" could be implemented.

In modern open source search platforms, we find Lucene at the very core and a monolithic application
stack implemented on top of it handling the distributed indexing, searching, failures, features,
etc.  We suppose here that it would be helpful if an intermediate abstraction could be introduced
providing the primitives for a distributed Lucene server on which specific search applications
could be built. This document describes an approach for separating those concerns in Blur
and re-implementing Blur classic on top of this new platform.


== Motivation ==
We have a nice, incredibly scalable search system, so why such a big change?  It's a fair
question; here are some thoughts:

 * Allow indexing/searching based on other/new data models (e.g. more than just the
Row/Record constructs).
 * Allow implementations to build whole new APIs given direct access to the Lucene primitives.
 * Allow the flexibility to build totally custom applications.

== Approach ==
The key to the approach is building a command execution framework, then transitioning the
implementation of the Thrift server classes to use that framework.  For example, the IndexServer
might transition to running generic IndexCommands across its shards.  This framework may
provide:
 * Command preemption.
 * Command cancelling.
 * Full status information.
 * Distributed traceability.
 * Metering (time, memory, etc.).
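
As a rough illustration of the list above, the following is a minimal sketch of fanning one command out across shards while exposing status, metering and cancelling. It is not Blur code: `Shard`, `ShardCommand`, `Execution` and `submit` are hypothetical names invented for this example, and a plain `ExecutorService` stands in for whatever the real framework would use. Preemption and distributed tracing are not covered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class CommandFrameworkSketch {

  /** Hypothetical stand-in for one shard's index; it only knows its document count. */
  static final class Shard {
    final String name;
    final int docCount;

    Shard(String name, int docCount) {
      this.name = name;
      this.docCount = docCount;
    }
  }

  /** Hypothetical command: the work to run against a single shard. */
  interface ShardCommand<T> {
    T process(Shard shard) throws Exception;
  }

  /** Handle for one submitted command: status, metering and cancelling. */
  static final class Execution<T> {
    private final List<Future<T>> futures = new ArrayList<>();
    private final long startNanos = System.nanoTime();

    /** Status: number of shard tasks that are done (completed, failed or cancelled). */
    int finishedShards() {
      int done = 0;
      for (Future<T> f : futures) {
        if (f.isDone()) done++;
      }
      return done;
    }

    /** Metering: wall-clock time since the command was submitted. */
    long elapsedNanos() {
      return System.nanoTime() - startNanos;
    }

    /** Cancelling: interrupts running shard tasks and prevents queued ones from starting. */
    void cancel() {
      for (Future<T> f : futures) f.cancel(true);
    }

    /** Blocks until every shard has answered and returns the per-shard results. */
    List<T> await() {
      List<T> results = new ArrayList<>();
      try {
        for (Future<T> f : futures) results.add(f.get());
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting", e);
      } catch (ExecutionException e) {
        throw new IllegalStateException("shard task failed", e.getCause());
      }
      return results;
    }
  }

  /** Fans the command out: one task per shard on the given pool. */
  static <T> Execution<T> submit(ExecutorService pool, List<Shard> shards, ShardCommand<T> command) {
    Execution<T> execution = new Execution<>();
    for (Shard shard : shards) {
      Callable<T> task = () -> command.process(shard);
      execution.futures.add(pool.submit(task));
    }
    return execution;
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      List<Shard> shards = List.of(new Shard("shard-0", 10), new Shard("shard-1", 32));
      Execution<Integer> execution = submit(pool, shards, shard -> shard.docCount);
      int total = 0;
      for (int partial : execution.await()) total += partial;
      System.out.println("total docs: " + total);                            // total docs: 42
      System.out.println("finished shards: " + execution.finishedShards());  // finished shards: 2
    } finally {
      pool.shutdown();
    }
  }
}
```

Returning a handle from `submit` rather than blocking is one possible design: it gives the caller a single place to ask for status, read timings, or cancel while the shards are still working.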

This would allow someone to implement new features on top of the platform by implementing
some sort of Command class, something like (should be read as pseudo-code really):

 {{{
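  public abstract class IndexCommand<T> {
    private String _table;
  
    // Processes one BlurIndex and returns a partial result of the command's type T.
    public abstract T process(BlurIndex index);
    // Defines how a partial result is merged with the others.
    public abstract void merge(T partial);
    // Defines how the merged result is finally returned.
    public abstract IndexResponse<T> terminate();
  
    public String getTable() {
      return _table;
    }
  }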
  }}}

The point is that as a command implementor, you process a BlurIndex (which gives you full
access to IndexReader/Searcher/Writers), define how results should be merged together, and
define how they should finally be returned.
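
To make the process/merge/terminate flow concrete, here is a minimal, self-contained sketch of a command that counts documents across shards. Everything in it is an assumption made for illustration: `ShardIndex` is a hypothetical stand-in for BlurIndex, `Command` is a simplified mirror of the IndexCommand shape above (no table, and `terminate` returns the value directly instead of an IndexResponse), and `run` is a sequential driver rather than the proposed framework.

```java
import java.util.List;

final class DocCountCommandSketch {

  /** Hypothetical stand-in for BlurIndex: a shard that can report its document count. */
  interface ShardIndex {
    int numDocs();
  }

  /** Simplified mirror of the IndexCommand shape from the proposal. */
  abstract static class Command<T> {
    abstract T process(ShardIndex index); // produce a partial result for one shard
    abstract void merge(T partial);       // fold a partial result into the running state
    abstract T terminate();               // produce the final answer
  }

  /** Counts documents across all shards. */
  static final class DocCountCommand extends Command<Long> {
    private long total;

    @Override
    Long process(ShardIndex index) {
      return (long) index.numDocs();
    }

    @Override
    void merge(Long partial) {
      total += partial;
    }

    @Override
    Long terminate() {
      return total;
    }
  }

  /** Sequential driver, used here only to show the order of the three calls. */
  static <T> T run(Command<T> command, List<ShardIndex> shards) {
    for (ShardIndex shard : shards) {
      command.merge(command.process(shard));
    }
    return command.terminate();
  }

  public static void main(String[] args) {
    ShardIndex first = () -> 10;
    ShardIndex second = () -> 32;
    System.out.println(run(new DocCountCommand(), List.of(first, second))); // 42
  }
}
```

The command itself holds the merge state (`total`), so the driver never needs to know what a partial result means; it only decides where and when `process` runs.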
