hadoop-common-dev mailing list archives

From James Kennedy <james.kenn...@troove.net>
Subject Re: Opening HRegionServer/HMaster/HClient for extension
Date Wed, 20 Jun 2007 17:54:32 GMT
Hi Michael,

I'll create a Jira task and fix the patch spacing.  I can't really say 
much about the HRegionServer/HClient extension I'm developing, but I 
do think there could be a general-purpose need.  For example, HBase 
lets you filter a scan by row key and column key. But what about actual 
data values?  An extension could be an HRegionServer with a scanner that 
filters rows by column values given some WHERE criteria.  Or maybe 
that's a bad example because that should be built directly into HBase?  
Another would be implementing distributed joins between tables...
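
A minimal sketch of the value-filter idea, in plain Java rather than against the HBase classes. Everything here is an assumption made for illustration: Row, ValueFilteringScanner, ValueFilterSketch and the "info:age" column are hypothetical names, the "WHERE criteria" are modelled as a java.util.function.Predicate, and the real scanner interfaces in HBase look different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical stand-in for one scanned row; NOT an HBase type.
final class Row {
  final String key;
  final Map<String, String> columns = new HashMap<>();

  // Columns are passed as alternating column-name / value strings.
  Row(String key, String... columnValuePairs) {
    this.key = key;
    for (int i = 0; i + 1 < columnValuePairs.length; i += 2) {
      columns.put(columnValuePairs[i], columnValuePairs[i + 1]);
    }
  }
}

// Wraps any row iterator and passes through only the rows whose value
// in one column satisfies the predicate (the "WHERE" condition).
final class ValueFilteringScanner implements Iterator<Row> {
  private final Iterator<Row> source;
  private final String column;
  private final Predicate<String> where;
  private Row lookahead;

  ValueFilteringScanner(Iterator<Row> source, String column, Predicate<String> where) {
    this.source = source;
    this.column = column;
    this.where = where;
    advance();
  }

  // Skip ahead to the next row that has the column and matches the predicate.
  private void advance() {
    lookahead = null;
    while (source.hasNext()) {
      Row candidate = source.next();
      String value = candidate.columns.get(column);
      if (value != null && where.test(value)) {
        lookahead = candidate;
        return;
      }
    }
  }

  @Override
  public boolean hasNext() {
    return lookahead != null;
  }

  @Override
  public Row next() {
    if (lookahead == null) {
      throw new NoSuchElementException();
    }
    Row result = lookahead;
    advance();
    return result;
  }
}

final class ValueFilterSketch {
  // Collects the keys of all rows that survive the value filter.
  static List<String> matchingKeys(List<Row> rows, String column, Predicate<String> where) {
    List<String> keys = new ArrayList<>();
    Iterator<Row> scanner = new ValueFilteringScanner(rows.iterator(), column, where);
    while (scanner.hasNext()) {
      keys.add(scanner.next().key);
    }
    return keys;
  }

  public static void main(String[] args) {
    List<Row> rows = Arrays.asList(
        new Row("r1", "info:age", "42"),
        new Row("r2", "info:age", "17"),
        new Row("r3", "info:name", "no age column"));
    // Keep only rows whose info:age value is at least 18; prints [r1]
    System.out.println(matchingKeys(rows, "info:age", v -> Integer.parseInt(v) >= 18));
  }
}
```

Doing the filtering in a wrapper around the scanner means non-matching rows are dropped where the data lives instead of being shipped to the client first, which is the appeal of putting it in a region server extension.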

I haven't had a chance to re-profile yet. I had modified the HBase code so 
I could extend it, so part of the motivation for this patch was to let me 
revert, update, add the patch you suggested, and then re-apply the 
extension patch.
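
To illustrate why the visibility change matters for this workflow, here is a small sketch using hypothetical stand-in classes (BaseClient and TracingClient are invented for the example and are not HClient). Once the lookup method is protected rather than private, the extension can live in its own subclass and be re-applied without editing the base class.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for a client class; NOT the real HClient.
class BaseClient {
  // Maps each region's start row to the (made-up) server that hosts it.
  private final SortedMap<String, String> serversByStartRow = new TreeMap<>();

  BaseClient() {
    serversByStartRow.put("", "server-a");
    serversByStartRow.put("m", "server-b");
  }

  // Protected rather than private, so a subclass can reuse or wrap the lookup.
  protected String locateServer(String row) {
    // headMap(row + "\0") holds every start row <= row; its last key owns the row.
    SortedMap<String, String> head = serversByStartRow.headMap(row + "\0");
    return head.get(head.lastKey());
  }
}

// The extension lives entirely outside the base class.
class TracingClient extends BaseClient {
  final StringBuilder trace = new StringBuilder();

  @Override
  protected String locateServer(String row) {
    String server = super.locateServer(row);
    trace.append(row).append("->").append(server).append(';');
    return server;
  }

  String describe(String row) {
    return "row " + row + " is served by " + locateServer(row);
  }
}
```

With a private locateServer the subclass above would not compile, and the only option would be to patch the base class directly, which is what makes reverting and updating painful.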

I've done that and will hopefully get back to profiling this afternoon.

Thanks,
James

Michael Stack wrote:
> The patch looks like an improvement to me.  What's the rationale for 
> needing to extend client/server?  Do you think it is of general 
> applicability?
> I'd suggest making an issue and attaching a patch (file against the hbase 
> component; it also looks like your tabs are not the hadoop two-space 
> convention, going by the below).  We can continue discussion therein.  
> I offer to vote for it after review and trying it locally.
>
> St.Ack
> P.S. Did HADOOP-1498, applied yesterday, change the profiling 
> characteristics you wrote about a few days ago?
>
>
> James Kennedy wrote:
>> For what I'm doing I found it necessary to extend 
>> HRegionServer/HRegion/HClient for some custom functionality.
>>
>> Following good Java practice, I see that the HBase code has been 
>> programmed defensively, keeping stuff private as much as possible.
>>
>> However, for extensibility it would be nice if the servers/client 
>> were easy to extend.
>>
>> Attached is a patch that makes several methods protected instead of 
>> private, adds getters to fields of inner classes, and makes some other 
>> modifications I found useful for some simple extension code.
>>
>> I didn't make this a Jira task because I wasn't sure if you guys 
>> approved of opening up the code like this but hopefully someone will 
>> find this useful.
>>
>> - James K
>> ------------------------------------------------------------------------
>>
>> Index: /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HClient.java
>> ===================================================================
>> --- /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HClient.java	(revision 549130)
>> +++ /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HClient.java	(working copy)
>> @@ -62,7 +62,7 @@
>>    /*
>>     * Data structure that holds current location for a region and its 
>> info.
>>     */
>> -  static class RegionLocation {
>> +  protected static class RegionLocation {
>>      HRegionInfo regionInfo;
>>      HServerAddress serverAddress;
>>  
>> @@ -76,6 +76,22 @@
>>        return "address: " + this.serverAddress.toString() + ", 
>> regioninfo: " +
>>          this.regionInfo;
>>      }
>> +
>> +    public HRegionInfo getRegionInfo() {
>> +        return regionInfo;
>> +    }
>> +
>> +    public void setRegionInfo(HRegionInfo regionInfo) {
>> +        this.regionInfo = regionInfo;
>> +    }
>> +
>> +    public HServerAddress getServerAddress() {
>> +        return serverAddress;
>> +    }
>> +
>> +    public void setServerAddress(HServerAddress serverAddress) {
>> +        this.serverAddress = serverAddress;
>> +    }
>>    }
>>  
>>    // Map tableName -> (Map startRow -> (HRegionInfo, HServerAddress)
>> @@ -116,7 +132,7 @@
>>      this.rand = new Random();
>>    }
>>  
>> -  private void handleRemoteException(RemoteException e) throws IOException {
>> +  protected void handleRemoteException(RemoteException e) throws IOException {
>>      String msg = e.getMessage();
>>      if(e.getClassName().equals("org.apache.hadoop.hbase.InvalidColumnNameException")) {
>>        throw new InvalidColumnNameException(msg);
>> @@ -143,7 +159,7 @@
>>  
>>    /* Find the address of the master and connect to it
>>     */
>> -  private void checkMaster() throws MasterNotRunningException {
>> +  protected void checkMaster() throws MasterNotRunningException {
>>      if (this.master != null) {
>>        return;
>>      }
>> @@ -531,7 +547,7 @@
>>     * @param tableName - the table name to be checked
>>     * @throws IllegalArgumentException - if the table name is reserved
>>     */
>> -  private void checkReservedTableName(Text tableName) {
>> +  protected void checkReservedTableName(Text tableName) {
>>      if(tableName.equals(ROOT_TABLE_NAME)
>>          || tableName.equals(META_TABLE_NAME)) {
>>  
>> @@ -547,7 +563,7 @@
>>    //////////////////////////////////////////////////////////////////////////////
>>    // Client API
>>    //////////////////////////////////////////////////////////////////////////////
>> -
>> +  
>>    /**
>>     * Loads information so that a table can be manipulated.
>>     * 
>> @@ -558,8 +574,21 @@
>>      if(tableName == null || tableName.getLength() == 0) {
>>        throw new IllegalArgumentException("table name cannot be null or zero length");
>>      }
>> -    this.tableServers = tablesToServers.get(tableName);
>> -    if (this.tableServers == null ) {
>> +    this.tableServers = getTableServers(tableName);
>> +  }
>> +  
>> +  /**
>> +   * Gets the servers of the given table.
>> +   * 
>> +   * @param tableName - the table to be located
>> +   * @throws IOException - if the table can not be located after retrying
>> +   */
>> +  protected synchronized SortedMap<Text, RegionLocation> getTableServers(Text tableName) throws IOException {
>> +    if(tableName == null || tableName.getLength() == 0) {
>> +      throw new IllegalArgumentException("table name cannot be null or zero length");
>> +    }
>> +    SortedMap<Text, RegionLocation> serverResult  = tablesToServers.get(tableName);
>> +    if (serverResult == null ) {
>>        if (LOG.isDebugEnabled()) {
>>          LOG.debug("No servers for " + tableName + ". Doing a find...");
>>        }
>> @@ -565,8 +594,9 @@
>>        }
>>        // We don't know where the table is.
>>        // Load the information from meta.
>> -      this.tableServers = findServersForTable(tableName);
>> +      serverResult = findServersForTable(tableName);
>>      }
>> +    return serverResult;
>>    }
>>  
>>    /*
>> @@ -836,7 +866,7 @@
>>     * @param regionServer - the server to connect to
>>     * @throws IOException
>>     */
>> -  synchronized HRegionInterface getHRegionConnection(HServerAddress regionServer)
>> +  protected synchronized HRegionInterface getHRegionConnection(HServerAddress regionServer)
>>        throws IOException {
>>  
>>        // See if we already have a connection
>> @@ -916,7 +946,7 @@
>>     * @param row Row to find.
>>     * @return Location of row.
>>     */
>> -  synchronized RegionLocation getRegionLocation(Text row) {
>> +  protected synchronized RegionLocation getRegionLocation(Text row) {
>>      if(row == null || row.getLength() == 0) {
>>        throw new IllegalArgumentException("row key cannot be null or zero length");
>>      }
>> @@ -1554,6 +1584,20 @@
>>      }
>>  
>>      return errCode;
>> +  }
>> +
>> +  /**
>> +   * @return the map of opened servers
>> +   */
>> +  protected TreeMap<String, HRegionInterface> getOpenServers(){
>> +    return servers;
>> +  }
>> +
>> +  /**
>> +   * @return the configuration for this server
>> +   */
>> +  public Configuration getConf(){
>> +    return conf;
>>    }
>>  
>>    /**
>> @@ -1565,4 +1609,5 @@
>>      int errCode = (new HClient(c)).doCommandLine(args);
>>      System.exit(errCode);
>>    }
>> +
>>  }
>> Index: /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java
>> ===================================================================
>> --- /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java	(revision 549130)
>> +++ /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java	(working copy)
>> @@ -55,7 +55,7 @@
>>   * regionName is a unique identifier for this HRegion. (startKey, endKey]
>>   * defines the keyspace for this HRegion.
>>   */
>> -class HRegion implements HConstants {
>> +public class HRegion implements HConstants {
>>    static String SPLITDIR = "splits";
>>    static String MERGEDIR = "merges";
>>    static String TMPREGION_PREFIX = "tmpregion_";
>> @@ -298,7 +298,7 @@
>>     * 
>>     * @throws IOException
>>     */
>> -  HRegion(Path rootDir, HLog log, FileSystem fs, Configuration conf, 
>> +  public HRegion(Path rootDir, HLog log, FileSystem fs, Configuration conf, 
>>        HRegionInfo regionInfo, Path initialFiles)
>>    throws IOException {
>>  
>> @@ -386,7 +386,7 @@
>>     * This method could take some time to execute, so don't call it from a 
>>     * time-sensitive thread.
>>     */
>> -  Vector<HStoreFile> close() throws IOException {
>> +  public Vector<HStoreFile> close() throws IOException {
>>      lock.obtainWriteLock();
>>      try {
>>        boolean shouldClose = false;
>> @@ -548,43 +548,43 @@
>>    // HRegion accessors
>>    //////////////////////////////////////////////////////////////////////////////
>>  
>> -  Text getStartKey() {
>> +  public Text getStartKey() {
>>      return regionInfo.startKey;
>>    }
>>  
>> -  Text getEndKey() {
>> +  public Text getEndKey() {
>>      return regionInfo.endKey;
>>    }
>>  
>> -  long getRegionId() {
>> +  public long getRegionId() {
>>      return regionInfo.regionId;
>>    }
>>  
>> -  Text getRegionName() {
>> +  public Text getRegionName() {
>>      return regionInfo.regionName;
>>    }
>>  
>> -  Path getRootDir() {
>> +  public Path getRootDir() {
>>      return rootDir;
>>    }
>>  
>> -  HTableDescriptor getTableDesc() {
>> +  public HTableDescriptor getTableDesc() {
>>      return regionInfo.tableDesc;
>>    }
>>  
>> -  HLog getLog() {
>> +  public HLog getLog() {
>>      return log;
>>    }
>>  
>> -  Configuration getConf() {
>> +  public Configuration getConf() {
>>      return conf;
>>    }
>>  
>> -  Path getRegionDir() {
>> +  public Path getRegionDir() {
>>      return regiondir;
>>    }
>>  
>> -  FileSystem getFilesystem() {
>> +  public FileSystem getFilesystem() {
>>      return fs;
>>    }
>>  
>> @@ -973,7 +973,7 @@
>>     * Return an iterator that scans over the HRegion, returning the indicated 
>>     * columns.  This Iterator must be closed by the caller.
>>     */
>> -  HInternalScannerInterface getScanner(Text[] cols, Text firstRow)
>> +  public HInternalScannerInterface getScanner(Text[] cols, Text firstRow)
>>    throws IOException {
>>      lock.obtainReadLock();
>>      try {
>> @@ -1011,7 +1011,7 @@
>>     * @return lockid
>>     * @see #put(long, Text, BytesWritable)
>>     */
>> -  long startUpdate(Text row) throws IOException {
>> +  public long startUpdate(Text row) throws IOException {
>>      // We obtain a per-row lock, so other clients will block while one client
>>      // performs an update.  The read lock is released by the client calling
>>      // #commit or #abort or if the HRegionServer lease on the lock expires.
>> @@ -1029,7 +1029,7 @@
>>     * This method really just tests the input, then calls an internal localput() 
>>     * method.
>>     */
>> -  void put(long lockid, Text targetCol, byte [] val) throws IOException {
>> +  public void put(long lockid, Text targetCol, byte [] val) throws IOException {
>>      if (DELETE_BYTES.compareTo(val) == 0) {
>>        throw new IOException("Cannot insert value: " + val);
>>      }
>> @@ -1039,7 +1039,7 @@
>>    /**
>>     * Delete a value or write a value. This is a just a convenience method for put().
>>     */
>> -  void delete(long lockid, Text targetCol) throws IOException {
>> +  public void delete(long lockid, Text targetCol) throws IOException {
>>      localput(lockid, targetCol, DELETE_BYTES.get());
>>    }
>>  
>> @@ -1055,7 +1055,7 @@
>>     * @param val Value to enter into cell
>>     * @throws IOException
>>     */
>> -  void localput(final long lockid, final Text targetCol,
>> +  public void localput(final long lockid, final Text targetCol,
>>      final byte [] val)
>>    throws IOException {
>>      checkColumn(targetCol);
>> @@ -1090,7 +1090,7 @@
>>     * writes associated with the given row-lock.  These values have not yet
>>     * been placed in memcache or written to the log.
>>     */
>> -  void abort(long lockid) throws IOException {
>> +  public void abort(long lockid) throws IOException {
>>      Text row = getRowFromLock(lockid);
>>      if(row == null) {
>>        throw new LockException("No write lock for lockid " + lockid);
>> @@ -1124,7 +1124,7 @@
>>     * @param lockid Lock for row we're to commit.
>>     * @throws IOException
>>     */
>> -  void commit(final long lockid) throws IOException {
>> +  public void commit(final long lockid) throws IOException {
>>      // Remove the row from the pendingWrites list so 
>>      // that repeated executions won't screw this up.
>>      Text row = getRowFromLock(lockid);
>> Index: /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionInfo.java
>> ===================================================================
>> --- /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionInfo.java	(revision 549130)
>> +++ /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionInfo.java	(working copy)
>> @@ -139,6 +139,76 @@
>>      this.regionName.readFields(in);
>>      this.offLine = in.readBoolean();
>>    }
>> +  
>> +  /**
>> +   * @return the endKey
>> +   */
>> +  public Text getEndKey(){
>> +    return endKey;
>> +  }
>> +
>> +  /**
>> +   * @param endKey the endKey to set
>> +   */
>> +  public void setEndKey(Text endKey){
>> +    this.endKey = endKey;
>> +  }
>> +
>> +  /**
>> +   * @return the regionId
>> +   */
>> +  public long getRegionId(){
>> +    return regionId;
>> +  }
>> +
>> +  /**
>> +   * @param regionId the regionId to set
>> +   */
>> +  public void setRegionId(long regionId){
>> +    this.regionId = regionId;
>> +  }
>> +
>> +  /**
>> +   * @return the regionName
>> +   */
>> +  public Text getRegionName(){
>> +    return regionName;
>> +  }
>> +
>> +  /**
>> +   * @param regionName the regionName to set
>> +   */
>> +  public void setRegionName(Text regionName){
>> +    this.regionName = regionName;
>> +  }
>> +
>> +  /**
>> +   * @return the startKey
>> +   */
>> +  public Text getStartKey(){
>> +    return startKey;
>> +  }
>> +
>> +  /**
>> +   * @param startKey the startKey to set
>> +   */
>> +  public void setStartKey(Text startKey){
>> +    this.startKey = startKey;
>> +  }
>> +
>> +  /**
>> +   * @return the tableDesc
>> +   */
>> +  public HTableDescriptor getTableDesc(){
>> +    return tableDesc;
>> +  }
>> +
>> +  /**
>> +   * @param tableDesc the tableDesc to set
>> +   */
>> +  public void setTableDesc(HTableDescriptor tableDesc){
>> +    this.tableDesc = tableDesc;
>> +  }
>>  
>>    //////////////////////////////////////////////////////////////////////////////
>>    // Comparable
>> @@ -162,4 +232,6 @@
>>      // Compare end keys.
>>      return this.endKey.compareTo(other.endKey);
>>    }
>> +
>> +  
>>  }
>> Index: /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionServer.java
>> ===================================================================
>> --- /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionServer.java	(revision 549130)
>> +++ /opt/eclipse/sandbox2/Hadoop/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionServer.java	(working copy)
>> @@ -468,7 +468,7 @@
>>     * Sets a flag that will cause all the HRegionServer threads to shut down
>>     * in an orderly fashion.
>>     */
>> -  synchronized void stop() {
>> +  public synchronized void stop() {
>>      stopRequested = true;
>>      notifyAll();                        // Wakes run() if it is sleeping
>>    }
>> @@ -1079,7 +1079,7 @@
>>    }
>>  
>>    /** 
>> -   * Private utility method for safely obtaining an HRegion handle.
>> +   * Protected utility method for safely obtaining an HRegion handle.
>>     * @param regionName Name of online {@link HRegion} to return
>>     * @return {@link HRegion} for <code>regionName</code>
>>     * @throws NotServingRegionException
>> @@ -1084,7 +1084,7 @@
>>     * @return {@link HRegion} for <code>regionName</code>
>>     * @throws NotServingRegionException
>>     */
>> -  private HRegion getRegion(final Text regionName)
>> +  protected HRegion getRegion(final Text regionName)
>>    throws NotServingRegionException {
>>      return getRegion(regionName, false);
>>    }
>> @@ -1090,7 +1090,7 @@
>>    }
>>  
>>    /** 
>> -   * Private utility method for safely obtaining an HRegion handle.
>> +   * Protected utility method for safely obtaining an HRegion handle.
>>     * @param regionName Name of online {@link HRegion} to return
>>     * @param checkRetiringRegions Set true if we're to check retiring regions
>>     * as well as online regions.
>> @@ -1097,7 +1097,7 @@
>>     * @return {@link HRegion} for <code>regionName</code>
>>     * @throws NotServingRegionException
>>     */
>> -  private HRegion getRegion(final Text regionName,
>> +  protected HRegion getRegion(final Text regionName,
>>        final boolean checkRetiringRegions)
>>    throws NotServingRegionException {
>>      HRegion region = null;
>>   
>

