hadoop-common-commits mailing list archives

From: Apache Wiki <wikidi...@apache.org>
Subject: [Lucene-hadoop Wiki] Trivial Update of "Hbase/HbaseRest" by stack
Date: Thu, 15 Nov 2007 06:55:22 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by stack:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest

The comment on the change is:
Added comments

------------------------------------------------------------------------------
- This is a provisional spec for the Hbase-REST api.
+ This is a provisional spec for the Hbase-REST API done under the aegis of [https://issues.apache.org/jira/browse/HADOOP-2068 HADOOP-2068].
  
+ ~-''St.Ack comment: Bryan, I added comments inline.  Remove them when you are done.''-~
  
  == System Information ==
  
- GET /
+ '''GET /'''
      Retrieve a list of all the tables in HBase.
      
      Returns: 
@@ -14, +15 @@

              <table name="first_table" uri="/first_table" />
              <table name="second_table" uri="/second_table" />        
          </tables>
+ ~-''St.Ack comment: FYI, there is an xhtml formatter in hbase under the shell package.  If we use that for outputting metadata-type pages such as this one, we'll have a leg up on the implementation.  It uses xmlenc, which is bundled w/ hadoop.  xmlenc is fast and dumb (like me).  IIRC, it doesn't do entities; it adds the entity to the closer element too.  This is dumb.  On the other hand, it makes it so we don't have to have the entities vs. elements argument (Smile).''-~
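
For illustration, a minimal Java client sketch against this endpoint (the host and port are hypothetical; the spec does not say where the REST servlet listens):

{{{
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class ListTables {
  public static void main(String[] args) throws Exception {
    // Hypothetical host/port for the REST servlet.
    URL url = new URL("http://localhost:8080/");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");
    conn.setRequestProperty("Accept", "text/xml");
    BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream()));
    for (String line; (line = in.readLine()) != null; ) {
      System.out.println(line);   // the <tables> document shown above
    }
    in.close();
    conn.disconnect();
  }
}
}}}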
      
- GET /[table_name]
+ '''GET /[table_name]'''
      Retrieve metadata about the table. This includes column family descriptors.
  
      Returns: 
@@ -27, +29 @@

                  <columnFamily name="stats" />
              </columnFamilies>
          </table>
-     
+ ~-''St.Ack comment: FYI, here is an example column descriptor: {name: triples, max versions: 3, compression: NONE, in memory: false, max length: 2147483647, bloom filter: none}.  We're also about to add the ability to attach arbitrary key/value pairs to both table and column descriptors.''-~
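
Purely illustrative: if those descriptor fields surfaced as attributes in the metadata document above, a column family element might look like the following (the attribute names are guesses, not part of the spec):

{{{
<columnFamily name="triples" maxVersions="3" compression="NONE"
              inMemory="false" maxLength="2147483647" bloomFilter="none" />
}}}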
   
+ 
- GET /[table_name]/regions
+ '''GET /[table_name]/regions'''
-     Retrieve a list of the regions for this table so that you can efficiently split up work (a la MapReduce).
+     Retrieve a list of the regions for this table so that you can efficiently split up the work (a la MapReduce).
      
      Options: 
          start_key, end_key: Only return the list of regions that contain the range start_key...end_key
@@ -41, +44 @@

              <region start_key="0201" server="region_server_3" />
          </regions>
  
+ ~-''St.Ack comment: This won't be needed if you use TableInputFormat in your mapper -- but no harm in having it in place.''-~
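
For example, to fetch only the regions covering a key range (the keys here are hypothetical, modelled on the example above, and passed in the query string):

{{{
GET /first_table/regions?start_key=0100&end_key=0201
}}}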
  
  == Row Interaction ==
  
- GET /[table_name]/row/[row_key]/timestamps
+ '''GET /[table_name]/row/[row_key]/timestamps'''
      Retrieve a list of all the timestamps available for this row key.
  
      Returns: 
@@ -54, +58 @@

              <timestamp value="20071115T000800" uri="/first_table/row/0001/20071115T000800"
/>
              <timestamp value="20071115T001200" uri="/first_table/row/0001/20071115T001200"
/>
          </timestamps>
+ 
+ ~-''St.Ack comment: Currently not supported in the native hbase client, but we should add it.''-~
-     
+    
- GET /[table_name]/row/[row_key]/
+ '''GET /[table_name]/row/[row_key]/'''
- GET /[table_name]/row/[row_key]/[timestamp]
+ '''GET /[table_name]/row/[row_key]/[timestamp]'''
      Retrieve data from a row, constrained by an optional timestamp value.
  
      Headers:
@@ -70, +76 @@

                                  column values out of the data.
      Options: 
          columns: A semicolon-delimited list of column names. If omitted, the result will contain all columns in the row.
+ 
+ ~-''St.Ack comment: +1 that MIME is the way to return rows.  -1 that octet-stream would be an option.  Just expect XML or MIME if a full row is specified.''-~
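
A hedged Java sketch of a row fetch using the columns option (host, port, table, row key, and column names are all hypothetical):

{{{
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class GetRow {
  public static void main(String[] args) throws Exception {
    // Hypothetical host/port, table, row key, and column names.
    URL url = new URL("http://localhost:8080/first_table/row/0001/"
        + "?columns=stats:views;stats:clicks");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");
    // Ask for XML; per the comment above, a full-row fetch should
    // come back as XML or MIME multipart.
    conn.setRequestProperty("Accept", "text/xml");
    System.out.println("HTTP " + conn.getResponseCode());
    BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream()));
    for (String line; (line = in.readLine()) != null; ) {
      System.out.println(line);
    }
    in.close();
  }
}
}}}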
      
- POST/PUT /[table_name]/row/[row_key]/
+ '''POST/PUT /[table_name]/row/[row_key]/'''
- POST/PUT /[table_name]/row/[row_key]/[timestamp]
+ '''POST/PUT /[table_name]/row/[row_key]/[timestamp]'''
      Set the value of one or more columns for a given row key with an optional timestamp.
  
      Headers:
@@ -89, +97 @@

          HTTP 201 (Created) if the column(s) could successfully be saved. HTTP 415 (Unsupported Media Type) if the query string column options do not match the Content-type header, or if the binary data of either octet-stream or Multipart/related is unreadable.
+ 
+ ~-''St.Ack comment: -1 again on octet-stream.  It messes up your nice clean API.  Might consider adding the column name as a MIME header if multipart, rather than having columns as an option IF multipart (ignored if XML).  Might not make sense if this is the only time it's done (since everywhere else we need to be able to handle the columns option).''-~
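
A sketch of a single-column PUT. The exact XML body schema falls in a hunk elided from this diff, so the payload below is a guess modelled on the row format; host, port, table, and column are hypothetical:

{{{
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class PutRow {
  public static void main(String[] args) throws Exception {
    // Hypothetical payload; the real schema is specified elsewhere.
    String body = "<row><column name=\"stats:views\">42</column></row>";
    URL url = new URL("http://localhost:8080/first_table/row/0001/");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT");
    conn.setDoOutput(true);
    conn.setRequestProperty("Content-Type", "text/xml");
    OutputStream out = conn.getOutputStream();
    out.write(body.getBytes("UTF-8"));
    out.close();
    // Expect 201 (Created) on success, 415 if the body and the
    // Content-type header (or column options) disagree.
    System.out.println("HTTP " + conn.getResponseCode());
  }
}
}}}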
      
- DELETE /[table_name]/row/[row_key]/
+ '''DELETE /[table_name]/row/[row_key]/'''
- DELETE /[table_name]/row/[row_key]/[timestamp]
+ '''DELETE /[table_name]/row/[row_key]/[timestamp]'''
      Delete the specified columns from the row. If there are no columns specified, then it will delete ALL columns. Optionally, specify a timestamp.
      
      Options:
@@ -103, +113 @@

      
  == Scanning ==    
  
- POST/PUT /[table_name]/scanner
+ '''POST/PUT /[table_name]/scanner'''
      Request that a scanner be created with the specified options. Returns a scanner ID that can be used to iterate over the results of the scanner.
      Options: 
          columns: A semicolon-delimited list of column names. If omitted, each result will contain all columns in the row.
@@ -113, +123 @@

          HTTP 201 (Created) with a Location header that references the scanner URI. Example:
          /first_table/scanner/1234348890231890
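
On the wire, scanner creation would look roughly like this (the scanner id is copied from the example above):

{{{
POST /first_table/scanner HTTP/1.1

HTTP/1.1 201 Created
Location: /first_table/scanner/1234348890231890
}}}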
          
- GET /[table_name]/scanner/[scanner_id]/current
+ '''GET /[table_name]/scanner/[scanner_id]/current'''
      Get the row and columns for the current item in the scanner without advancing the scanner. Equivalent to a queue peek operation. Multiple requests to this URI will return the same result.
      
@@ -132, +142 @@

          
          If the scanner is used up, HTTP 404 (Not Found).
      
- DELETE /[table_name]/scanner/[scanner_id]/current
+ '''DELETE /[table_name]/scanner/[scanner_id]/current'''
      Return the current item in the scanner and advance to the next one. Think of it as a queue dequeue operation.
      
      Headers:
@@ -149, +159 @@

          depends on the Accept header. See the documentation for getting an individual row for the data format.
          
          If the scanner is used up, HTTP 404 (Not Found).
+ 
+ ~- Stack comment: DELETE to increment strikes me as wrong.  What about a POST/PUT to the URL /[table_name]/scanner/[scanner_id]/next?  Would return current and move scanner to next item? -~
      
- DELETE /[table_name]/scanner/[scanner_id]
+ '''DELETE /[table_name]/scanner/[scanner_id]'''
      Close a scanner. You must call this when you are done using a scanner to deallocate it.
      
      Returns:
          HTTP 202 (Accepted) if it can be closed. HTTP 404 (Not Found) if the scanner id is invalid. HTTP 410 (Gone) if the scanner is already closed or the lease time has expired.
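
Pulling the scanner calls together, a hedged end-to-end sketch (hypothetical host/port; the Location value is treated as a path, matching the example above):

{{{
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class ScanTable {
  // Hypothetical host/port for the REST servlet.
  static final String BASE = "http://localhost:8080";

  public static void main(String[] args) throws Exception {
    // 1. Create the scanner; its URI comes back in the Location header
    //    (a path in the spec's example, hence the BASE prefix below).
    HttpURLConnection create = (HttpURLConnection)
        new URL(BASE + "/first_table/scanner").openConnection();
    create.setRequestMethod("POST");
    create.setDoOutput(true);
    create.getOutputStream().close();          // no options, empty body
    String scannerUri = create.getHeaderField("Location");

    // 2. Dequeue items until the scanner is used up (404).
    while (true) {
      HttpURLConnection next = (HttpURLConnection)
          new URL(BASE + scannerUri + "/current").openConnection();
      next.setRequestMethod("DELETE");
      next.setRequestProperty("Accept", "text/xml");
      if (next.getResponseCode() == 404) break; // scanner exhausted
      BufferedReader in = new BufferedReader(
          new InputStreamReader(next.getInputStream()));
      for (String line; (line = in.readLine()) != null; ) {
        System.out.println(line);
      }
      in.close();
    }

    // 3. Close the scanner to release it; expect 202 (Accepted).
    HttpURLConnection close = (HttpURLConnection)
        new URL(BASE + scannerUri).openConnection();
    close.setRequestMethod("DELETE");
    System.out.println("close: HTTP " + close.getResponseCode());
  }
}
}}}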
  
+ 
+ == Exception Handling ==
+ Generally, exceptions will show up on the REST client side as 40Xs with a descriptive message, and possibly a body containing the Java stack trace.  TODO: Table of the types of exceptions a client could get and how they should react.
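
A small helper sketch for surfacing that message client side (assumes java.net.HttpURLConnection, as in the examples above):

{{{
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;

public class RestErrors {
  /** Dump the descriptive message (and stack trace, if the body
   *  carries one) that accompanies a 40x response. */
  static void dumpError(HttpURLConnection conn) throws Exception {
    if (conn.getResponseCode() < 400 || conn.getErrorStream() == null) {
      return;                       // not an error, or no body to show
    }
    BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getErrorStream()));
    for (String line; (line = in.readLine()) != null; ) {
      System.err.println(line);
    }
    in.close();
  }
}
}}}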
+ 
