hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/HbaseShell/HQL" by udanax
Date Thu, 14 Feb 2008 08:53:34 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by udanax:
http://wiki.apache.org/hadoop/Hbase/HbaseShell/HQL

The comment on the change is:
Deleteing

------------------------------------------------------------------------------
- = HQL plan for Hbase 0.2 =
+ deleted
  
- == HQLClient API ==
- 
- A !ResultSet object is used to return the results instead of a direct call to the scanner.[[BR]]
- The following code shows how to perform a query against Hbase.
-   
- {{{
-   /* Initializes an HQL client object */
-   HQLClient hql = new HQLClient(...);
-   
-   /* ResultSet object to hold the result of the query */
-   ResultSet rs = null;
- 
-   /* execute the hql and put the results in the ResultSet object*/
-   rs = hql.executeQuery("query");
- 
-   while(rs.next())
-   {
-     // Iterate through the ResultSet
-     // and return the actual row/column/timestamp attribute names and cell value
- 
-     row = rs.getRowName();
-     column = rs.getColumnName();
-     timestamp = rs.getTimeStamp(); /* or rs.getDate(); */
-     value = rs.getValue(/* Data Type */); 
-   }
- }}}
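
The iteration contract in the loop above (call next() before reading, stop when it returns false) can be sketched with a small in-memory cursor. This is only an illustration under stated assumptions: `Cell` and `CellCursor` are hypothetical names invented here, not part of HQLClient or of any Hbase API, and the getters simply mirror the ones used in the example.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for one Hbase cell; not a real Hbase class.
final class Cell {
    final String row;
    final String column;
    final long timestamp;
    final String value;

    Cell(String row, String column, long timestamp, String value) {
        this.row = row;
        this.column = column;
        this.timestamp = timestamp;
        this.value = value;
    }
}

// Hypothetical forward-only cursor mirroring the getters in the loop above.
final class CellCursor {
    private final Iterator<Cell> cells;
    private Cell current; // null until next() has succeeded at least once

    CellCursor(List<Cell> cells) {
        this.cells = cells.iterator();
    }

    // Advances to the next cell; returns false once the cells are exhausted.
    boolean next() {
        if (!cells.hasNext()) {
            current = null;
            return false;
        }
        current = cells.next();
        return true;
    }

    String getRowName() { return require().row; }
    String getColumnName() { return require().column; }
    long getTimeStamp() { return require().timestamp; }
    String getValue() { return require().value; }

    // Guards against reading before next() or after the end of the results.
    private Cell require() {
        if (current == null) {
            throw new IllegalStateException("call next() before reading a cell");
        }
        return current;
    }
}
```

A caller would loop with `while (rs.next()) { ... }`, exactly as in the example; reading a getter before the first next() fails fast instead of returning stale data.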
- 
- ''Why not go the whole way and implement the [http://java.sun.com/j2se/1.4.2/docs/api/java/sql/ResultSet.html java.sql.ResultSet] interface, etc.?  Would it fit? St.Ack''
- 
- ''Your proposal is being actively considered. Ed''
- ----
- 
- == External HQL command file ==
- === Purpose ===
- To run the specified command file.
- === Syntax ===
- {{{
-   hql > @ filename [arg...];
- }}}
- 
-  * The arguments are used to replace some characters in a query string with other characters.
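
The page does not say how arguments are matched to characters in the query string, so the following is only a sketch of one possible scheme. It assumes positional placeholders written `$1`, `$2`, and so on; that placeholder syntax, the class name `CommandFileArgs`, and the method `substitute` are all hypothetical and not taken from HQL.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: replaces $1, $2, ... in a query with positional arguments.
final class CommandFileArgs {
    // Assumed placeholder syntax: a dollar sign followed by 1 to 9 digits.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$(\\d{1,9})");

    static String substitute(String query, String... args) {
        Matcher m = PLACEHOLDER.matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int index = Integer.parseInt(m.group(1)) - 1; // $1 maps to args[0]
            // Placeholders without a matching argument are left untouched.
            String replacement =
                (index >= 0 && index < args.length) ? args[index] : m.group();
            // quoteReplacement keeps '$' and '\' in the argument literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Under these assumptions, a command file line such as `create table $1 ($2);` invoked with the arguments `logs` and `info:` would expand to `create table logs (info:);`.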
- 
- === Applications ===
- {{{
-   rs = hql.executeQuery("@ './install_tables.txt';");
- }}}
- 
- ''A file of data definition shouldn't be called a 'query file' -- and why can't the file just as easily load data?  Can you think of something else to call it?  And isn't it possible to do "$ cat DATA_DEFINITION_FILE|./bin/hbase shell" to get the same effect? St.Ack''
- 
- ----
- == Parallel Execution Features ==
- 
- === Query ===
- Parallel execution can significantly reduce the elapsed time for large queries, but it doesn't
apply to every query. 
- 
- {{{
- hql > alter table tbl_name parallel(map 4 reduce 1);
- hql > select count(*) from tbl_name;
- hql > alter table tbl_name noparallel;
- }}}
- 
- ''This sets a 'parallel' property on the whole table (could it be just a column?)?  What does this do? When you do selects, does it run an MR job?  St.Ack''
- 
- ''Why is the parallel option an attribute of the table? Isn't it an attribute of the query?
Setting the size of the m/r job for the whole table seems inefficient. bryanduxbury''
- 
- === Data Loading ===
- The HQL Loader utility loads data into Hbase tables from external files. If you have a large amount of data to load, HQL Loader's parallel support can dramatically reduce the elapsed time needed to perform that load.
- ==== Syntax ====
- {{{
- hql > LOAD DATA FILE file_name
-   --> INTO TABLE tbl_name
-   --> [FIELDS TERMINATED BY '\t']
-   --> [LINES TERMINATED BY '\n']
-   --> [(column1[, column2, ...])];
- }}}
- ==== Applications ====
- {{{
-   map {
-     //row is server host name
-     rs = hql.executeQuery("select filePath: from server_data where row='" + row + "'" 
-          + " and column='" + COLUMNFAMILY + ":" + QUALIFIER + "';");
- 
-     hql.executeQuery("load data file '" + rs.getValue() + "' into table " + OUTPUT + ";");
-   }
- 
-   main(String[] args) {
-     COLUMNFAMILY = "webserver";           /* or "hadoop" */
-     QUALIFIER = "access_log";             /* or "task_tracker_log" */
-     OUTPUT = "access_log_table";          /* or "task_tracker_log_table" */
-   }
- }}}
- 
- ''Why not do something like mysql where it loads and dumps a near-binary format?  Problem
w/ above is what to do if cell has tab or new-line in it?  St.Ack''
- 
- ''Is this best implemented as part of the shell? Why not have a bin/hbase loader that takes
the options on the command line? bryanduxbury''
- 
- ''bryanduxbury: Yes. I think it should be provided at the language level of HQL, but I have no objection to './bin/hbase loader'. Ed''
- === Data Dumping ===
- The HQL dumper utility dumps data from Hbase tables into external files.
- ==== Syntax ====
- {{{
- hql > DUMP TABLE tbl_name
-   --> INTO FILE file_name
-   --> [FIELDS TERMINATED BY '\t']
-   --> [LINES TERMINATED BY '\n']
-   --> [(column1[, column2, ...])];
- }}}
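
Both LOAD DATA and DUMP TABLE rely on field and line terminators, and the comment above asks what happens when a cell itself contains a tab or a newline. One common answer is backslash escaping, sketched below. This is an assumption for illustration only: the page does not specify any escaping rule, and `DelimitedCodec`, `escape`, and `unescape` are hypothetical names.

```java
// Hypothetical codec: keeps tabs and newlines inside a cell from being
// mistaken for the '\t' field terminator or the '\n' line terminator.
final class DelimitedCodec {

    // Rewrites backslash, tab, and newline as two-character escape sequences.
    static String escape(String cell) {
        StringBuilder sb = new StringBuilder(cell.length());
        for (int i = 0; i < cell.length(); i++) {
            char c = cell.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '\t': sb.append("\\t"); break;
                case '\n': sb.append("\\n"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Reverses escape(); an unknown escape yields the escaped character itself.
    static String unescape(String field) {
        StringBuilder sb = new StringBuilder(field.length());
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == '\\' && i + 1 < field.length()) {
                char next = field.charAt(++i);
                switch (next) {
                    case 't': sb.append('\t'); break;
                    case 'n': sb.append('\n'); break;
                    default:  sb.append(next); // covers the escaped backslash
                }
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

With this scheme a dumper would call `escape` on each cell before joining fields with the terminator, and a loader would split on the terminator first and then call `unescape` on each field, so the round trip preserves the original cell value.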
- ----
- == Addition of Example Applications for Hbase + HQL ==
- 
-  * Apache Log Analyzer
-  * Server Log Analyzer
- 
