hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Update of "Hbase/HbaseRest" by BryanDuxbury
Date Thu, 15 Nov 2007 19:19:00 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by BryanDuxbury:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest

------------------------------------------------------------------------------
  This is a provisional spec for the Hbase-REST API done under the aegis of [https://issues.apache.org/jira/browse/HADOOP-2068 HADOOP-2068].
- 
- ~-''St.Ack comment: Bryan, I added comments inline.  Remove them when you are done.-~
  
  == System Information ==
  
@@ -13, +11 @@

          XML entity body that contains a list of the tables like so:
  {{{
  <tables>
-   <table name="first_table" uri="/first_table" />
+   <table name="first_table" uri="/first_table" arbitrary-key1="value" ... />
-   <table name="second_table" uri="/second_table" />        
+   <table name="second_table" uri="/second_table" arbitrary-key1="value" ... />   
    
  </tables>
  }}}
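
A minimal sketch of how a client might read the table list above, assuming the response body is well-formed XML in the shape shown. The sample body, the `parse_tables` helper, and the extra attribute values are illustrative assumptions, not part of the spec.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample response body modeled on the <tables> example above.
# The "arbitrary-key1" attribute stands in for any extra key/value pair.
SAMPLE = """<tables>
  <table name="first_table" uri="/first_table" arbitrary-key1="value" />
  <table name="second_table" uri="/second_table" arbitrary-key1="value" />
</tables>"""


def parse_tables(xml_text):
    """Return one dict of attributes per <table> element."""
    root = ET.fromstring(xml_text)
    # Copy every attribute, so arbitrary keys are kept alongside name and uri.
    return [dict(table.attrib) for table in root.findall("table")]


for table in parse_tables(SAMPLE):
    print(table["name"], table["uri"])
```

Keeping every attribute in a plain dict means a client does not need to change when new arbitrary keys appear on a table descriptor.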
  
@@ -29, +27 @@

  <table>
    <columnFamilies>
       <columnFamily name="meta" />
-        <columnFamily name="content" max-versions=3 compression="NONE" in-memory="false" max-length=2147483647 bloom-filter="none" />
+        <columnFamily name="content" max-versions=3 compression="NONE" in-memory="false" max-length=2147483647 bloom-filter="none" arbitrary-key1="value" ... />
-        <columnFamily name="stats" max-versions=3 compression="NONE" in-memory="false" max-length=2147483647 bloom-filter="none" />
+        <columnFamily name="stats" max-versions=3 compression="NONE" in-memory="false" max-length=2147483647 bloom-filter="none" arbitrary-key1="value" ... />
       </columnFamilies>
  </table>
  }}}
  
- ~-''St.Ack comment: FYI, here is an example column descriptor: {name: triples, max versions: 3, compression: NONE, in memory: false, max length: 2147483647, bloom filter: none}.  We're also about to add the ability to attach arbitrary key/value pairs to both table and column descriptors-~
   
  
  '''GET /[table_name]/regions'''
      Retrieve a list of the regions for this table so that you can efficiently split up the work (a la MapReduce).
@@ -82, +79 @@

          Accept:
              application/xml:    The client is expecting an XML entity body that contains the
                                  columns and data together.
+             
-             octet-stream:       The client is expecting raw binary data. This implies that
-                                 there is only a single column being retrieved.
              Multipart/related:  The client is expecting raw binary data, but organized into a
                                  multipart response. The client must be prepared to parse the
                                  column values out of the data.
      Options: 
          columns: A semicolon-delimited list of column names. If omitted, the result will contain all columns in the row.
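
A minimal sketch of how a client might build a row request with the Accept header and the columns option described above. It assumes the row GET uses the same /[table_name]/row/[row_key]/ path shown for POST/PUT and DELETE, and that columns travels as a query-string parameter. The `build_row_request` helper and the base URL are hypothetical, and no request is actually sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_row_request(base_url, table, row_key, columns=None,
                      accept="application/xml"):
    """Build (but do not send) a GET request for one row.

    Assumption: the path is /[table_name]/row/[row_key]/ and the
    columns option is a semicolon-delimited query-string value.
    """
    path = "/%s/row/%s/" % (quote(table, safe=""), quote(row_key, safe=""))
    url = base_url.rstrip("/") + path
    if columns:
        # Omitting columns asks for every column in the row.
        url += "?" + urlencode({"columns": ";".join(columns)})
    return Request(url, headers={"Accept": accept})


# Hypothetical host and names, for illustration only.
req = build_row_request("http://localhost:8080", "first_table", "row1",
                        columns=["meta", "content"])
print(req.get_method(), req.full_url)
print(req.get_header("Accept"))
```

Passing `accept="multipart/related"` instead would ask for the raw binary form, in which case the client has to parse the column values out of the multipart body itself.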
  
- ~-''St.Ack comment: +1 that MIME is the way to return rows.  -1 that octet-stream would be an option.  Just expect XML or MIME if a full row is specified -~
      
  '''POST/PUT /[table_name]/row/[row_key]/'''
  '''POST/PUT /[table_name]/row/[row_key]/[timestamp]'''
@@ -99, +94 @@

      Headers:
          Content-type:
              application/xml:    The client is sending one or more columns of data in an XML entity.
-             octet-stream:       The client is sending EXACTLY ONE column value as raw binary
-                                 as specified in the columns attribute.
              Multipart/related:  The client is sending multiple columns of data encoded with boundaries.
              
      Options:
@@ -111, +104 @@

          the query string column options do not match the Content-type header, or if the binary data of either
          octet-stream or Multipart/related is unreadable.
  
- ~-''St.Ack comment: -1 again on octet-stream.  It messes up your nice clean API.  Might consider adding the column name as a MIME header if multipart, rather than having columns as an option IF multipart (ignored if XML).  Might not make sense if this is the only time it's done (since everywhere else needs to be able to handle the column option) -~
+ ~-''St.Ack comment: Might consider adding the column name as a MIME header if multipart, rather than having columns as an option IF multipart (ignored if XML).  Might not make sense if this is the only time it's done (since everywhere else needs to be able to handle the column option) -~
+ 
+ ~-''Bryan comment: While we certainly could use headers, I'd prefer not to. Headers seem like an ugly way to say what you're sending. In REST, you're supposed to specify '''what''' you're acting on in the URI, not headers, and which columns to save to qualifies to me. It may turn out to be an implementation question, but we'll see. ''-~
+ 
      
  '''DELETE /[table_name]/row/[row_key]/'''
  '''DELETE /[table_name]/row/[row_key]/[timestamp]'''
@@ -130, +126 @@

      Request that a scanner be created with the specified options. Returns a scanner ID that can be used to iterate over the results of the scanner.
      Options: 
          columns: A semicolon-delimited list of column names. If omitted, each result will contain all columns in the row.
+         
          start_key, end_key: Starting and ending keys that enclose the region that should be scanned.
  
      Returns:
@@ -144, +141 @@

          Accept:
              application/xml:    The client is expecting an XML entity body that contains the
                                  columns and data together.
+ 
-             octet-stream:       The client is expecting raw binary data. This implies that
-                                 there is only a single column being retrieved.
              Multipart/related:  The client is expecting raw binary data, but organized into a
                                  multipart response. The client must be prepared to parse the
                                  column values out of the data.
@@ -162, +158 @@

          Accept:
              application/xml:    The client is expecting an XML entity body that contains the
                                  columns and data together.
+ 
-             octet-stream:       The client is expecting raw binary data. This implies that
-                                 there is only a single column being retrieved.
              Multipart/related:  The client is expecting raw binary data, but organized into a
                                  multipart response. The client must be prepared to parse the
                                  column values out of the data.
@@ -174, +169 @@

          If the scanner is used up, HTTP 404 (Not Found).
  
  ~- Stack comment: DELETE to increment strikes me as wrong.  What about a POST/PUT to the URL /[table_name]/scanner/[scanner_id]/next?  It would return the current item and move the scanner to the next one? -~
+ 
+ ~- Bryan comment: Unfortunately I don't think there is any good HTTP verb for this operation. DELETEing /current is about as good as POST/PUTing /next. With the DELETE approach, there is one less resource, though. -~
      
  '''DELETE /[table_name]/scanner/[scanner_id]'''
      Close a scanner. You must call this when you are done using a scanner to deallocate it.
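
A rough sketch of the scanner lifecycle discussed above, under several assumptions that this excerpt does not confirm: that the current item is read with GET on /[table_name]/scanner/[scanner_id]/current, that DELETE on the same /current resource advances the scanner, and that a used-up scanner answers HTTP 404. The `send` callable is a hypothetical transport that takes a method and a path and returns a (status, body) pair, so no real HTTP client is assumed.

```python
def drain_scanner(send, table, scanner_id):
    """Yield each item from a scanner, then close the scanner.

    `send(method, path)` is a hypothetical transport returning
    (status, body). The /current paths are assumptions inferred from
    the discussion, not confirmed by the spec excerpt.
    """
    scanner = "/%s/scanner/%s" % (table, scanner_id)
    try:
        while True:
            status, body = send("GET", scanner + "/current")
            if status == 404:
                # Scanner is used up.
                break
            yield body
            # Assumed advance step: DELETE the current item.
            send("DELETE", scanner + "/current")
    finally:
        # Always deallocate the scanner, even if iteration stops early.
        send("DELETE", scanner)


# Tiny in-memory stand-in for a server, for illustration only.
_rows = ["row-a", "row-b"]
_pos = [0]


def demo_send(method, path):
    if path.endswith("/current"):
        if method == "GET":
            return (200, _rows[_pos[0]]) if _pos[0] < len(_rows) else (404, "")
        _pos[0] += 1  # DELETE /current advances
    return (200, "")


print(list(drain_scanner(demo_send, "first_table", "1")))
```

Putting the close call in a `finally` block reflects the requirement that the scanner must be deallocated once the client is done with it, whichever verb ends up being chosen for the advance step.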
