hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-495) No server address listed in .META.
Date Fri, 07 Mar 2008 06:24:58 GMT

     [ https://issues.apache.org/jira/browse/HBASE-495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-495:
------------------------

    Attachment: 495-0.1.patch

Here is a patch against 0.1.  Will make others if this passes muster.

My thought on this issue is that the cluster is so messy, w/ millions of log lines, that it's
hard to debug.  I suggest we commit this patch against this issue and open another the next
time we see duplicate regions.

What seems to be happening is that regions are failing to open out on the regionservers because
DFS is corrupt.  I was thinking we could shut down on an IOE out of HDFS, but looking at where
the exception comes up, we actually do run a filesystem check there, and it must be succeeding.
Also, a failed compaction may not always be worthy of our shutting down the regionserver: in
this case, on region startup, it probably is, but later, as part of normal operation, it
probably is not.  Checking DFS health seems to be a tad more involved.

HBASE-495 No server address listed in .META.
M src/java/org/apache/hadoop/hbase/HMaster.java
  (regionServerStartup): Refactor.  Create lease BEFORE scheduling shutdown
  process.  We used to do things the other way round, which meant that we'd
  schedule a shutdown process for every report the regionserver made.  Could
  be many if an old lease was hanging around.
  (registerRegionServer): Added.  This is the body of what used to be in
  regionServerStartup, moved here so it is easy to have a finally in the
  calling method (there should never be an exception out of this method, so
  the finally should never have to run).

  Removed some useless DEBUG-level logs.  With thousands of rows in .META.,
  that was at least one DEBUG line per row, multiplied by the number of
  shutdown processes queued.
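
To illustrate why the order matters in regionServerStartup, here is a toy model. It is not the patch and not the real HMaster code: the class LeaseOrderingSketch, its methods, and the rule that taking a lease fails while an old lease is still held are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

/** Toy model of the ordering change; hypothetical, not the real HMaster code. */
public class LeaseOrderingSketch {
    // Servers that currently hold a lease (assumption: one lease per server name).
    private final Set<String> leases = new HashSet<>();
    // Shutdown processes waiting to run.
    private final Queue<String> shutdownQueue = new ArrayDeque<>();

    /** Simulates an old lease still hanging around for this server. */
    public void addStaleLease(String server) {
        leases.add(server);
    }

    /** Simulates the stale lease finally expiring. */
    public void expireLease(String server) {
        leases.remove(server);
    }

    public int queuedShutdowns() {
        return shutdownQueue.size();
    }

    /** Old order: queue shutdown work first, then try to take the lease. */
    public boolean startupOldOrder(String server) {
        shutdownQueue.add(server);  // queued on every report, even if the lease fails
        return leases.add(server);  // false while the stale lease is still held
    }

    /** New order: take the lease first; queue shutdown work only on success. */
    public boolean startupNewOrder(String server) {
        if (!leases.add(server)) {
            return false;           // stale lease: nothing is queued for this report
        }
        shutdownQueue.add(server);
        return true;
    }

    public static void main(String[] args) {
        LeaseOrderingSketch oldWay = new LeaseOrderingSketch();
        LeaseOrderingSketch newWay = new LeaseOrderingSketch();
        oldWay.addStaleLease("rs1");
        newWay.addStaleLease("rs1");
        // The same regionserver reports three times while its old lease lingers.
        for (int i = 0; i < 3; i++) {
            oldWay.startupOldOrder("rs1");
            newWay.startupNewOrder("rs1");
        }
        System.out.println("old order queued: " + oldWay.queuedShutdowns());
        System.out.println("new order queued: " + newWay.queuedShutdowns());
    }
}
```

In this model the old order queues one shutdown process per report, while the new order queues nothing until a lease is actually obtained.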

> No server address listed in .META.
> ----------------------------------
>
>                 Key: HBASE-495
>                 URL: https://issues.apache.org/jira/browse/HBASE-495
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.16.0
>            Reporter: stack
>             Fix For: 0.1.0, 0.2.0
>
>         Attachments: 495-0.1.patch
>
>
> Michael Bieniosek manufactured the following in a 0.16.0 install:
> {code}
> 08/03/06 17:52:02 DEBUG hbase.HTable: Advancing internal scanner to startKey g80Fi5WZHlzLqGzErrAd7V==
> 08/03/06 17:52:02 DEBUG hbase.HConnectionManager$TableServers: reloading table servers because: No server address listed in .META. for region enwiki_080103,g80Fi5WZHlzLqGzErrAd7V==,1204768636421
> 08/03/06 17:52:12 DEBUG hbase.HConnectionManager$TableServers: reloading table servers because: No server address listed in .META. for region enwiki_080103,g80Fi5WZHlzLqGzErrAd7V==,1204768636421
> 08/03/06 17:52:22 DEBUG hbase.HConnectionManager$TableServers: reloading table servers because: No server address listed in .META. for region enwiki_080103,g80Fi5WZHlzLqGzErrAd7V==,1204768636421
> org.apache.hadoop.hbase.NoServerForRegionException: No server address listed in .META. for region enwiki_080103,g80Fi5WZHlzLqGzErrAd7V==,1204768636421
>         at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:449)
>         at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:346)
>         at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:309)
>         at org.apache.hadoop.hbase.HTable.getRegionLocation(HTable.java:103)
>         at org.apache.hadoop.hbase.HTable$ClientScanner.nextScanner(HTable.java:854)
>         at org.apache.hadoop.hbase.HTable$ClientScanner.next(HTable.java:915)
>         at org.apache.hadoop.hbase.hql.SelectCommand.scanPrint(SelectCommand.java:233)
>         at org.apache.hadoop.hbase.hql.SelectCommand.execute(SelectCommand.java:100)
>         at org.apache.hadoop.hbase.hql.HQLClient.executeQuery(HQLClient.java:50)
>         at org.apache.hadoop.hbase.Shell.main(Shell.java:114)
> {code}
> When I look in the .META., I see that the above region range has multiple mentions:
> one offlined, two that have startcodes and servers associated, and about 5 others that are
> just HRIs.  The table is broken.  At the least, we need the merge-of-overlapping-regions
> tool to fix it.  Digging more....
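
As a rough aid for spotting the "multiple mentions" described above, the sketch below groups region names by table and start key and reports any group with more than one entry. It assumes region names have the shape table,startKey,regionId, as in the log above. The class DuplicateRegionSketch and its methods are hypothetical, and the second and third region ids in main are made up for the example. It only catches entries that share a start key, not overlapping ranges in general.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: finds region names that share a table and start key. */
public class DuplicateRegionSketch {

    /** Returns "table,startKey" for a name shaped like "table,startKey,regionId". */
    static String rangeKey(String regionName) {
        int lastComma = regionName.lastIndexOf(',');
        if (lastComma < 0) {
            throw new IllegalArgumentException("Not a region name: " + regionName);
        }
        return regionName.substring(0, lastComma);
    }

    /** Groups names by range key and keeps only groups with two or more entries. */
    static Map<String, List<String>> findDuplicates(List<String> regionNames) {
        Map<String, List<String>> byRange = new LinkedHashMap<>();
        for (String name : regionNames) {
            byRange.computeIfAbsent(rangeKey(name), k -> new ArrayList<>()).add(name);
        }
        byRange.values().removeIf(group -> group.size() < 2);
        return byRange;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
            "enwiki_080103,g80Fi5WZHlzLqGzErrAd7V==,1204768636421", // from the log above
            "enwiki_080103,g80Fi5WZHlzLqGzErrAd7V==,1204768636999", // made-up region id
            "enwiki_080103,zzzz,1204768637000");                    // made-up region
        for (Map.Entry<String, List<String>> e : findDuplicates(names).entrySet()) {
            System.out.println(e.getKey() + " appears " + e.getValue().size() + " times");
        }
    }
}
```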

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

