incubator-blur-dev mailing list archives

From "Tim Williams (JIRA)" <>
Subject [jira] [Created] (BLUR-344) Expose a Scanner capability that allows various implementations (e.g. ExportScanner)
Date Fri, 18 Jul 2014 00:35:05 GMT
Tim Williams created BLUR-344:

             Summary: Expose a Scanner capability that allows various implementations (e.g. ExportScanner)
                 Key: BLUR-344
             Project: Apache Blur
          Issue Type: New Feature
          Components: Blur Console
            Reporter: Tim Williams
            Assignee: Tim Williams

Blur should support "scanner" plugins that, given a query, are handed all of the records
matching that query.  From the Thrift API perspective, these would be asynchronous,
long-running calls.

The scanner would essentially be given a collector of the hits, with each hit carrying the
fields defined by the passed-in selector.
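The collector idea above can be sketched roughly as follows. This is an illustration in Python, not Blur code: `Scanner`, `ListScanner`, `project`, and `run_scan` are hypothetical names, records are modeled as plain dicts, and the selector is reduced to a set of column names.

```python
from typing import Dict, Iterable, List, Set

Record = Dict[str, str]  # assumption: a record is a flat column -> value mapping


class Scanner:
    """Hypothetical plugin interface: receives every matching record."""

    def collect(self, record: Record) -> None:
        raise NotImplementedError

    def finish(self) -> None:
        """Called once after the last hit; the default does nothing."""


class ListScanner(Scanner):
    """Toy implementation that simply keeps the hits in memory."""

    def __init__(self) -> None:
        self.hits: List[Record] = []

    def collect(self, record: Record) -> None:
        self.hits.append(record)


def project(record: Record, columns: Set[str]) -> Record:
    """Keep only the columns named by the (simplified) selector."""
    return {k: v for k, v in record.items() if k in columns}


def run_scan(matches: Iterable[Record], columns: Set[str], scanner: Scanner) -> int:
    """Feed each matching record, trimmed to the selected columns, to the scanner."""
    count = 0
    for record in matches:
        scanner.collect(project(record, columns))
        count += 1
    scanner.finish()
    return count
```

In this sketch the server side would own `run_scan`, and a plugin author would only implement `collect` (and optionally `finish`).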

The client would ask for a scan, then poll for the status periodically and - depending on
the Scanner implementation - pick up the results in whatever form they were requested.
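The ask-then-poll flow might look something like the sketch below. It assumes a callable `status_of` that stands in for the proposed `statusScan` call and returns a status name as a string. The terminal status names are placeholders, since this message does not list the `ScanStatus` values, and `wait_for_scan` is a hypothetical helper.

```python
import time
from typing import Callable

# Assumption: placeholder names; the proposal does not list the ScanStatus values.
TERMINAL = {"DONE", "FAILED", "CANCELLED"}


def wait_for_scan(status_of: Callable[[str], str], scan_id: str,
                  interval_s: float = 5.0, timeout_s: float = 3600.0) -> str:
    """Poll status_of(scan_id) until it reports a terminal status or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = status_of(scan_id)
        if status in TERMINAL:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"scan {scan_id} still {status} after {timeout_s}s")
        time.sleep(interval_s)
```

A client would start the scan first, then call something like `wait_for_scan(client_status_fn, scan_id)` and, on success, pick up the results wherever the chosen Scanner put them.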

For a concrete implementation, think of export.  The ExportScanner would be given a location
in HDFS, scan over all the results, and drop them in that directory - maybe in a particular
requested form.  The Scanner pattern could have many other useful implementations though - for
example, inserting a subset of the data into a new Blur table.
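As a rough illustration of the export case, the sketch below writes each hit as one JSON line into a target directory. It makes several assumptions: a local directory stands in for the HDFS location, JSON lines is only one possible "requested form", and the class shape (`collect`, `finish`, the `part-00000.json` file name) is hypothetical rather than anything defined by Blur.

```python
import json
import os
from typing import Dict


class ExportScanner:
    """Hypothetical export scanner: writes each hit as one JSON line."""

    def __init__(self, export_dir: str, file_name: str = "part-00000.json") -> None:
        # Assumption: a local path stands in for the HDFS export location.
        os.makedirs(export_dir, exist_ok=True)
        self.path = os.path.join(export_dir, file_name)
        self._out = open(self.path, "w", encoding="utf-8")
        self.count = 0

    def collect(self, record: Dict[str, str]) -> None:
        """Append one record to the export file."""
        self._out.write(json.dumps(record, sort_keys=True) + "\n")
        self.count += 1

    def finish(self) -> None:
        """Close the export file once the last hit has been written."""
        self._out.close()
```

A scanner that copies a subset of the data into another table would keep the same two methods and swap the file write for a table mutation.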

Here are some client API thoughts:
struct ScannerQuery {
  1:Query query,
  2:Selector selector,
  3:string id,
  4:string userContext,
  5:string scannerName,
  6:i64 startTime = 0,
  7:map<string,string> properties
}

enum ScanStatus {
  // status values are missing from this archived copy
}

  void scan(
    1:ScannerQuery scannerQuery
  ) throws (1:BlurException ex)

  list<string> scanList(
  ) throws (1:BlurException ex)

  ScanStatus statusScan(
    1:string scanId
  ) throws (1:BlurException ex)

  void cancelScan(
    1:string scanId
  ) throws (1:BlurException ex)

