hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Trivial Update of "Hbase/JSONRest" by Michael Gottesman
Date Sun, 10 Aug 2008 22:48:28 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by Michael Gottesman:

  JSON Rest is at its core a Jetty Http Java Servlet that gives end users complete access
to the HBase Client API through a combination of URLs, JSON, and Query Strings. It supports
Filters, Scanners, and Transactions (Transactions are available with the correct patches),
something which no other HBase client framework does at the moment. But most importantly to
my altruistic side, it is a very modular framework so that it is easy for any end user to
modify it to his/her needs. This was a fun project for me. I hope it is useful to you.
  == Why JSON? ==
  This is a question that I get asked a lot. Why JSON? Why not Thrift? Why not Protocol Buffers?
The answer I give is simple: The Internet. JSON is native to the web and native to the browser.
By using JSON as your message format, you allow for easy interoperability with any/all platforms
currently in the marketplace. Thus you maximize your potential user base through familiarity,
ease of use, and parser availability for the end user.
+ == What about the Old XML Rest Interface? ==
+ It will be deprecated once JSONRest is completely finished. Its current wikipage is here:
{}. If you would like to contribute to the construction of JSONRest, please visit HBase Issue
XXXX, located at the following url: {}
+ == Important Usage Notes *IMPORTANT READ THIS* ==
+ Please read this before you go further to the usage section. I envision two sorts of users
of JSONRest at the moment: the normal database user, and the timestamp database user.
- ''Michael -- would suggest that you reference the old REST interface to hbase; explain that
this implementation supersedes and that the old is deprecated.  I can help.  Otherwise, looks
excellent.  Keep going.  St.Ack 08/09/2008 (Remove this comment when you've read it)''
+ === For Normal Database Users ===
+ I imagine that many people just want to use HBase as a normal database that can scale to
large loads. This does not necessarily require the timestamp side of HBase. In this user's
mind, when he/she overwrites data, it is destroyed. REMEMBER: IT IS NOT DESTROYED. When you
overwrite data in HBase through the non-timestamp part of the Client API, all that you are
doing is writing another Cell at a newer timestamp. This covers up the older versions of the
Cell. They are still there, just hidden if you use the non-timestamp methods, and they remain
accessible via the timestamp methods until enough old timestamped Cells accumulate that HBase
cleans them up. This is controlled by a configuration setting that sets how many timestamped
versions of a Cell can exist before HBase deletes the oldest value.
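The overwrite behaviour described above can be pictured with a toy in-memory model. This is only a sketch and not HBase code: the class `VersionedCell`, its method names, and the `max_versions=3` value are all made up for illustration, and real HBase cleans up old versions on its own schedule rather than immediately on write.

```python
class VersionedCell:
    """Toy model of one cell that keeps several timestamped versions."""

    def __init__(self, max_versions=3):
        # max_versions stands in for the configuration setting mentioned in
        # the text; the value 3 is an assumption made for this sketch.
        self.max_versions = max_versions
        self.versions = {}  # timestamp -> value

    def put(self, value, timestamp):
        # An "overwrite" just adds another version at a newer timestamp.
        self.versions[timestamp] = value
        # Once too many versions accumulate, drop the oldest ones.
        while len(self.versions) > self.max_versions:
            del self.versions[min(self.versions)]

    def get_latest(self):
        # Non-timestamp read: only the newest version is visible.
        return self.versions[max(self.versions)]

    def get_at(self, timestamp):
        # Timestamp read: older versions stay reachable until they are pruned.
        return self.versions.get(timestamp)


cell = VersionedCell(max_versions=3)
cell.put("4500 Orange Drive", timestamp=1)
cell.put("4600 Clementine Blvd", timestamp=2)
latest = cell.get_latest()   # the newer value covers up the older one
hidden = cell.get_at(1)      # the older value is hidden, not destroyed
```

The point of the model is that a plain read and a timestamped read can disagree about what "the value" is, which is exactly what surprises the normal database user.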
+ === For Timestamp Database Users ===
+ All the methods without timestamps use HConstants.LATEST_TIMESTAMP, so you can still
use them if you want to create a new entry with the latest timestamp or delete the latest
  = Usage =
  == Get ==
@@ -58, +66 @@

  === Cell ===
  === Timestamp ===
  == Post ==
+ === Database ===
+ There are no currently supported Database POST use cases.
+ === Table ===
+ === Row ===
+ The currently supported Row POST use cases are:
+  * Single Row Mutation
+ ==== UseCase: Single Row Mutation ====
+ A mutation is a group of batch operations which transform a Row (including a 'null' row)
from one state to another. To perform a Single Row Mutation one submits a request with the
following form to JSONRest:
+ and attaches to the post an argument of the following form:
+ {{{
+ JSON ARRAY =>  [
+                  {"column_name":"STRING", "value":"STRING"},
+                  {"column_name":"STRING", "value":"STRING"},
+                  {"column_name":"STRING", "value":"STRING"}...
+                ]
+ }}}
+ If your query is successful, JSONRest will respond with a "created": true JSON string that
looks like so:
+ {{{
+ {"created":true}
+ }}}
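As a minimal sketch of the client side, the JSON array argument and the success check could be handled as below. The helper names `build_row_mutation` and `was_created` are hypothetical and not part of JSONRest; only the `column_name`/`value` keys and the `created` field come from the formats shown above.

```python
import json


def build_row_mutation(columns):
    """Turn a {"column_name": "value"} dict into the JSON array body."""
    return json.dumps(
        [{"column_name": name, "value": value} for name, value in columns.items()]
    )


def was_created(response_text):
    """Return True only if the response body is a JSON object with created == true."""
    return json.loads(response_text).get("created") is True


body = build_row_mutation({"name:first": "Johnny", "name:last": "Appleseed"})
ok = was_created('{"created":true}')
```

Building the body from a dict keeps one entry per column, which matches the one-object-per-column layout of the array.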
+ So, let's say you have a table named "people" with Column Families "name", "address", and
"phone" and an existing row named "person-12345", and you want to create a new Row for a new
friend, John, to hold his personal information in HBase. You would then send the following
request to JSONRest:
+ {{{
+ POST /people/johnny_appleseed
+ }}}
+ with the following attached content:
+ {{{
+ [
+    {"column_name":"name:first", "value":"Johnny"},
+    {"column_name":"name:last", "value":"Appleseed"},
+    {"column_name":"address:street", "value":"4500 Orange Drive"},
+    {"column_name":"address:city", "value":"Pear City"},
+    {"column_name":"address:state", "value":"Tangerine State"},
+    {"column_name":"phone:home", "value":"1111111111"},
+    {"column_name":"phone:work", "value":"6666666666"}
+ ]
+ }}}
+ Later, let's say that John moves from 4500 Orange Drive to 4600 Clementine Blvd in Banana
State (Pear City happens to exist in both states). You would then send the following request
to JSONRest:
+ {{{
+ POST /people/johnny_appleseed
+ }}}
+ with the following attached content:
+ {{{
+ [
+    {"column_name":"address:street", "value":"4600 Clementine Blvd"},
+    {"column_name":"address:state", "value":"Banana State"}
+ ]
+ }}}
+ Since what you are really doing is writing new timestamped Cells at the latest timestamp
in the columns "address:street" and "address:state", JSONRest will return the same message
as above:
+ {{{
+ {"created":true}
+ }}}
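For readers who want to script the address update, here is a rough sketch that assembles the POST with Python's standard library. The base URL, the `Content-Type` header, and the helper name `make_row_post` are assumptions made for illustration; the page does not say where JSONRest listens or which headers it expects. The request is only built, not sent, so the sketch runs without a server.

```python
import json
import urllib.request

# Hypothetical address; substitute wherever your JSONRest servlet is running.
BASE_URL = "http://localhost:8080"


def make_row_post(table, row, columns):
    """Build (but do not send) a Single Row Mutation POST for /table/row."""
    body = json.dumps(
        [{"column_name": name, "value": value} for name, value in columns.items()]
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{table}/{row}",
        data=body,
        method="POST",
        # Assumed header; the source does not specify a required content type.
        headers={"Content-Type": "application/json"},
    )


request = make_row_post(
    "people",
    "johnny_appleseed",
    {"address:street": "4600 Clementine Blvd", "address:state": "Banana State"},
)
# urllib.request.urlopen(request) would send it and return the response.
```

Only the two changed columns are listed in the body; the other columns of the row are left alone.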
+ === Cell ===
+ === Timestamp ===
  == Put ==
  == Delete ==
  == Transactions ==
