From: Apache Wiki
To: hadoop-commits@lucene.apache.org
Reply-To: hadoop-dev@lucene.apache.org
Date: Thu, 15 Nov 2007 19:19:00 -0000
Message-ID: <20071115191900.2158.41236@eos.apache.org>
Subject: [Lucene-hadoop Wiki] Update of "Hbase/HbaseRest" by BryanDuxbury

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by BryanDuxbury:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest

------------------------------------------------------------------------------
  This is a provisional spec for the Hbase-REST API done under the aegis of [https://issues.apache.org/jira/browse/HADOOP-2068 HADOOP-2068].
-
- ~-''St.Ack comment: Bryan I added comments inline. Remove them once them when you are done.-~
  == System Information ==
@@ -13, +11 @@
  XML entity body that contains a list of the tables like so:
  {{{
- +
-
+
  }}}
@@ -29, +27 @@
- + - +
  }}}
- ~-''St.Ack comment: FYI, here is an example column descriptor: {name: triples, max versions: 3, compression: NONE, in memory: false, max length: 2147483647, bloom filter: none}. We're also about to add being able to add arbitrary key/value pairs to both table and column descriptors-~
  '''GET /[table_name]/regions'''
  Retrieve a list of the regions for this table so that you can efficiently split up the work (a la MapReduce).
@@ -82, +79 @@
  Accept:
  application/xml: The client is expecting an XML entity body that contains the columns and data together.
+
- octet-stream: The client is expecting raw binary data. This implies that
- there is only a single column being retrieved.
  Multipart/related: The client is expecting raw binary data, but organized into a multipart response. The client must be prepared to parse the column values out of the data.
  Options:
  columns: A semicolon-delimited list of column names. If omitted, the result will contain all columns in the row.
- ~-''St.Ack comment: +1 that MIME is way to return rows. -1 that octet-stream would be an option. Just expect xml or MIME if full row specified -~
  '''POST/PUT /[table_name]/row/[row_key]/'''
  '''POST/PUT /[table_name]/row/[row_key]/[timestamp]'''
@@ -99, +94 @@
  Headers:
  Content-type:
  application/xml: The client is sending one or more columns of data in an XML entity.
- octet-stream: The client is sending EXACTLY ONE column value as raw binary
- as specified in the columns attribute.
  Multipart/related: The client is sending multiple columns of data encoded with boundaries.
  Options:
@@ -111, +104 @@
  the query string column options do not match the Content-type header, or if the binary data of either octet-stream or Multipart/related is unreadable.
- ~-''St.Ack comment: -1 again on octet-stream. It messes up your nice clean API. Might consider adding column name as MIME header if multipart rather than have columns as option IF multipart (ignored if XML).
Might not make sense if this only time its done (since every where else need to be able to handle the column option) -~
+ ~-''St.Ack comment: Might consider adding column name as MIME header if multipart rather than have columns as option IF multipart (ignored if XML). Might not make sense if this only time its done (since every where else need to be able to handle the column option) -~
+
+ ~-''Bryan comment: While we certainly could use headers, I'd prefer not to. Headers seem like an ugly way to say what you're sending. In REST, you're supposed to specify '''what''' you're acting on in the URI, not headers, and which columns to save to qualifies to me. It may turn out to be an implementation question, but we'll see. ''-~
+
  '''DELETE /[table_name]/row/[row_key]/'''
  '''DELETE /[table_name]/row/[row_key]/[timestamp]'''
@@ -130, +126 @@
  Request that a scanner be created with the specified options. Returns a scanner ID that can be used to iterate over the results of the scanner.
  Options:
  columns: A semicolon-delimited list of column names. If omitted, each result will contain all columns in the row.
+ start_key, end_key: Starting and ending keys that enclose the region that should be scanned.
  Returns:
@@ -144, +141 @@
  Accept:
  application/xml: The client is expecting an XML entity body that contains the columns and data together.
+
- octet-stream: The client is expecting raw binary data. This implies that
- there is only a single column being retrieved.
  Multipart/related: The client is expecting raw binary data, but organized into a multipart response. The client must be prepared to parse the column values out of the data.
@@ -162, +158 @@
  Accept:
  application/xml: The client is expecting an XML entity body that contains the columns and data together.
+
- octet-stream: The client is expecting raw binary data. This implies that
- there is only a single column being retrieved.
  Multipart/related: The client is expecting raw binary data, but organized into a multipart response. The client must be prepared to parse the column values out of the data.
@@ -174, +169 @@
  If the scanner is used up, HTTP 404 (Not Found).
  ~- Stack comment: DELETE to increment strikes me as wrong. What about a POST/PUT to the URL /[table_name]/scanner/[scanner_id]/next? Would return current and move scanner to next item? -~
+
+ ~- Bryan comment: Unfortunately I don't think there is any good HTTP verb for this operation. DELETEing /current is about as good as POST/PUTing /next. With the DELETE approach, there is one less resource, though. -~
  '''DELETE /[table_name]/scanner/[scanner_id]'''
  Close a scanner. You must call this when you are done using a scanner to deallocate it.
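
For readers following the spec from the client side, here is a minimal Python sketch of two things described in the diff above: passing the semicolon-delimited columns option in the query string of a row URL, and walking a scanner until it reports HTTP 404, then closing it. Everything named here is an assumption made for illustration and is not part of the spec: the helper names `row_url` and `scan`, the injected `send(method, path)` callable that returns a `(status, body)` pair, and the `/current` sub-resource, which is inferred only from Bryan's comment about "DELETEing /current".

```python
from urllib.parse import quote, urlencode


def row_url(table, row_key, columns=None, timestamp=None):
    """Build /[table_name]/row/[row_key]/[timestamp] plus an optional columns option."""
    path = "/{}/row/{}/".format(quote(table, safe=""), quote(row_key, safe=""))
    if timestamp is not None:
        path += str(timestamp)
    if columns:
        # The spec describes columns as a semicolon-delimited list of names.
        path += "?" + urlencode({"columns": ";".join(columns)})
    return path


def scan(send, table, scanner_id):
    """Yield result bodies until the scanner is used up, then close the scanner.

    send is a hypothetical transport: send(method, path) -> (status, body).
    """
    base = "/{}/scanner/{}".format(quote(table, safe=""), quote(str(scanner_id), safe=""))
    try:
        while True:
            # Assumed from the discussion: DELETE on /current returns the
            # current item and advances the scanner.
            status, body = send("DELETE", base + "/current")
            if status == 404:
                # Per the spec text, 404 (Not Found) means the scanner is used up.
                break
            yield body
    finally:
        # The spec asks clients to close the scanner when done so it is deallocated.
        send("DELETE", base)
```

The `finally` clause is a design choice of this sketch: it issues the closing DELETE even when the caller stops iterating early or an error is raised, which matches the spec's requirement that a scanner always be closed.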