hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Errors after major compaction
Date Tue, 05 Jul 2011 23:43:58 GMT
Eran:
You didn't run hbck while the gs_raw_events table was being enabled, right?

I saw:
2011-06-29 16:43:50,395 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction (major) requested for gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309. because User-triggered major compaction; priority=1, compaction queue size=1248

The above might be related to:
>> 2011-06-29 16:43:57,880 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_OPEN for too long, reassigning region=gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
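
For what it's worth, the ts=1309380052723 value that the master log further down prints next to state=PENDING_OPEN can be compared with the time of the reassignment message above. A minimal sketch, assuming the log timestamps are local time at UTC-4 (a guess based on the farm-ny host names); the class and method names are made up for illustration:

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class PendingOpenAge {
    // Timestamp layout used by the log lines quoted in this thread.
    private static final DateTimeFormatter LOG_TIME =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // ASSUMPTION: the cluster logs in local time at UTC-4.
    private static final ZoneOffset LOG_OFFSET = ZoneOffset.ofHours(-4);

    /** Milliseconds between a region-state "ts" value and a log line's timestamp. */
    static long ageMillis(long stateTsMillis, String logTimestamp) {
        Instant stateChanged = Instant.ofEpochMilli(stateTsMillis);
        Instant logged = LocalDateTime.parse(logTimestamp, LOG_TIME).toInstant(LOG_OFFSET);
        return Duration.between(stateChanged, logged).toMillis();
    }

    public static void main(String[] args) {
        long age = ageMillis(1309380052723L, "2011-06-29 16:43:57,880");
        System.out.println("PENDING_OPEN age: " + age + " ms"); // prints: PENDING_OPEN age: 185157 ms
    }
}
```

Under that assumption the region had been in PENDING_OPEN for roughly 185 seconds when the master decided to reassign it.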

Thanks

On Tue, Jul 5, 2011 at 7:19 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> Eran:
> I logged https://issues.apache.org/jira/browse/HBASE-4060 for you.
>
>
> On Mon, Jul 4, 2011 at 2:30 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> Thanks for the understanding.
>>
>> Can you log a JIRA and put your ideas below in it ?
>>
>>
>>
>> On Jul 4, 2011, at 12:42 AM, Eran Kutner <eran@gigya.com> wrote:
>>
>> > Thanks for the explanation Ted,
>> >
>> > I will try to apply HBASE-3789 and hope for the best, but my
>> > understanding is that it doesn't really solve the problem; it only
>> > reduces the probability of it happening, at least in one particular
>> > scenario. I would hope for a more robust solution.
>> > My concern is that the region allocation process seems to rely too much
>> > on timing considerations and doesn't seem to take enough measures to
>> > guarantee conflicts do not occur. I understand that in a distributed
>> > environment, when you don't get a timely response from a remote machine
>> > you can't know for sure if it did or did not receive the request.
>> > However, there are things that can be done to mitigate this and reduce
>> > the conflict time significantly. For example, when I run hbck it knows
>> > that some regions are multiply assigned; the master could do the same
>> > and try to resolve the conflict. Another approach would be to handle
>> > late responses: even if the response from the remote machine arrives
>> > after it was assumed to be dead, the master should have enough
>> > information to know it had created a conflict by assigning the region
>> > to another server. An even better solution, I think, is for the RS to
>> > periodically test that it is indeed the rightful owner of every region
>> > it holds and relinquish control over the region if it's not.
>> > Obviously a state where two RSs hold the same region is pathological
>> > and can lead to data loss, as demonstrated in my case. The system should
>> > be able to actively protect itself against such a scenario. It probably
>> > doesn't need saying, but there is really nothing worse for a data
>> > storage system than data loss.
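
The last idea above (a region server periodically checking that it still owns what it serves) might look roughly like the following. This is only a sketch: the `assignments` map stands in for whatever authoritative record would be consulted (for example the server recorded in .META.), and none of these names are actual HBase APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class OwnershipCheck {
    /**
     * Returns, in sorted order, the regions this server holds but is not
     * recorded as owning. `assignments` is a hypothetical stand-in for the
     * authoritative record (region name -> owning server name).
     */
    static List<String> regionsToRelinquish(String self, Set<String> heldRegions,
                                            Map<String, String> assignments) {
        List<String> stale = new ArrayList<>();
        for (String region : new TreeSet<>(heldRegions)) {
            // A missing entry also counts as "not the owner".
            if (!self.equals(assignments.get(region))) {
                stale.add(region);
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        Map<String, String> assignments = new HashMap<>();
        assignments.put("regionA", "hadoop1-s05");
        assignments.put("regionB", "hadoop1-s02"); // reassigned elsewhere

        Set<String> heldBySelf = new TreeSet<>();
        heldBySelf.add("regionA");
        heldBySelf.add("regionB");

        // The server would close (without flushing new edits) whatever is returned here.
        System.out.println(regionsToRelinquish("hadoop1-s05", heldBySelf, assignments));
    }
}
```

How often such a check should run, and what the server does with in-flight writes for a region it gives up, are open questions that the sketch does not address.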
>> >
>> > In my case the problem didn't happen in the initial phase but after
>> > disabling and enabling a table with about 12K regions.
>> >
>> > -eran
>> >
>> >
>> >
>> > On Sun, Jul 3, 2011 at 23:49, Ted Yu <yuzhihong@gmail.com> wrote:
>> >
>> >> Let me try to answer some of your questions.
>> >> The two paragraphs below follow my reasoning, which runs in the reverse
>> >> order of the actual call sequence.
>> >>
>> >> For #4 below, the log indicates that the following was executed (note the
>> >> setOfflineInZooKeeper call):
>> >> private void assign(final RegionState state, final boolean setOfflineInZK,
>> >>     final boolean forceNewPlan) {
>> >>   for (int i = 0; i < this.maximumAssignmentAttempts; i++) {
>> >>     if (setOfflineInZK && !setOfflineInZooKeeper(state)) return;
>> >>
>> >> The above was due to the timeout which you noted in #2, which would have
>> >> caused TimeoutMonitor.chore() to run this code (line 1787):
>> >>
>> >>     for (Map.Entry<HRegionInfo, Boolean> e: assigns.entrySet()){
>> >>       assign(e.getKey(), false, e.getValue());
>> >>     }
>> >>
>> >> This means there is a lack of coordination between
>> >> AssignmentManager.TimeoutMonitor and OpenedRegionHandler.
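
To illustrate the kind of coordination that is missing, here is a toy model (not HBase code; the class, enum and method names are hypothetical) in which both the OPENED handler and the timeout path must win a compare-and-set on the region state before acting, so whichever runs second sees that the state has already moved on:

```java
import java.util.concurrent.atomic.AtomicReference;

final class RegionStateGuard {
    enum State { OFFLINE, PENDING_OPEN, OPEN }

    // A freshly assigned region starts out waiting for the region server to open it.
    private final AtomicReference<State> state =
        new AtomicReference<>(State.PENDING_OPEN);

    /** OPENED-event path: succeeds only if nobody has timed the region out yet. */
    boolean markOpened() {
        return state.compareAndSet(State.PENDING_OPEN, State.OPEN);
    }

    /** Timeout path: succeeds only if the region is still PENDING_OPEN. */
    boolean tryTimeoutReassign() {
        return state.compareAndSet(State.PENDING_OPEN, State.OFFLINE);
    }

    State current() {
        return state.get();
    }

    public static void main(String[] args) {
        RegionStateGuard region = new RegionStateGuard();
        System.out.println(region.markOpened());         // true: OPENED handled first
        System.out.println(region.tryTimeoutReassign()); // false: timeout path backs off
        System.out.println(region.current());            // OPEN
    }
}
```

In the real master the state also lives in ZooKeeper and in .META., so a single in-memory compare-and-set would not be sufficient; the sketch only shows the ordering guarantee one would want.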
>> >>
>> >> The reason I mention HBASE-3789 is that it is marked as an Incompatible
>> >> change and is in TRUNK already.
>> >> Applying HBASE-3789 to the 0.90 branch would change the behavior
>> >> (timing) of region assignment.
>> >>
>> >> I think it makes sense to evaluate the effect of HBASE-3789 in 0.90.4
>> >>
>> >> BTW, were the incorrect region assignments observed for a table with
>> >> multiple initial regions?
>> >> If so, I have HBASE-4010 in TRUNK, which speeds up initial region
>> >> assignment by about 50%.
>> >>
>> >> Cheers
>> >>
>> >> On Sun, Jul 3, 2011 at 12:02 PM, Eran Kutner <eran@gigya.com> wrote:
>> >>
>> >>> Ted,
>> >>> So if I understand correctly, the theory is that because of the issue
>> >>> fixed in HBASE-3789 the master took too long to detect that the region
>> >>> was successfully opened by the first server, so it force-closed it and
>> >>> transitioned it to a second server. But there are a few things about
>> >>> this scenario I don't understand, probably because I don't know enough
>> >>> about the inner workings of the region transition process, and I would
>> >>> appreciate it if you can help me understand:
>> >>> 1. The RS opened the region at 16:37:49.
>> >>> 2. The master started handling the opened event at 16:39:54 - this
>> >>> delay can probably be explained by HBASE-3789
>> >>> 3. At 16:39:54 the master log says: Opened region gs_raw_events,.....
>> >>> on hadoop1-s05.farm-ny.gigya.com
>> >>> 4. Then at 16:40:00 the master log says: master:60000-0x13004a31d7804c4
>> >>> Creating (or updating) unassigned node for
>> >>> 584dac5cc70d8682f71c4675a843c309 with OFFLINE state - why did it decide
>> >>> to take the region offline after learning it was successfully opened?
>> >>> 5. Then it tries to reopen the region on hadoop1-s05, which indicates
>> >>> in its log that the open request failed because the region was already
>> >>> open - why didn't the master use that information to learn that the
>> >>> region was already open?
>> >>> 6. At 16:43:57 the master decides the region transition timed out and
>> >>> starts forcing the transition - HBASE-3789 again?
>> >>> 7. Now the master forces the transition of the region to hadoop1-s02
>> >>> but there is no sign of that on hadoop1-s05 - why doesn't the old RS
>> >>> (hadoop1-s05) detect that it is no longer the owner and relinquish
>> >>> control of the region?
>> >>>
>> >>> Thanks.
>> >>>
>> >>> -eran
>> >>>
>> >>>
>> >>>
>> >>> On Sun, Jul 3, 2011 at 20:09, Ted Yu <yuzhihong@gmail.com> wrote:
>> >>>
>> >>>> HBASE-3789 should have sped up region assignment.
>> >>>> The patch for 0.90 is attached to that JIRA.
>> >>>>
>> >>>> You may prudently apply that patch.
>> >>>>
>> >>>> Regards
>> >>>>
>> >>>> On Sun, Jul 3, 2011 at 10:01 AM, Eran Kutner <eran@gigya.com> wrote:
>> >>>>
>> >>>>> Thanks Ted, but, as stated before, I'm already using 0.90.3, so either
>> >>>>> it's not fixed or it's not the same thing.
>> >>>>>
>> >>>>> -eran
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On Sun, Jul 3, 2011 at 17:27, Ted Yu <yuzhihong@gmail.com> wrote:
>> >>>>>
>> >>>>>> Eran:
>> >>>>>> I was thinking of this:
>> >>>>>> HBASE-3789  Cleanup the locking contention in the master
>> >>>>>>
>> >>>>>> though it doesn't directly handle the 'PENDING_OPEN for too long' case.
>> >>>>>>
>> >>>>>> https://issues.apache.org/jira/browse/HBASE-3741 is in 0.90.3 and is
>> >>>>>> actually close to the symptom you described.
>> >>>>>>
>> >>>>>> On Sun, Jul 3, 2011 at 12:00 AM, Eran Kutner <eran@gigya.com> wrote:
>> >>>>>>
>> >>>>>>> It does seem that both servers opened the same region around the same
>> >>>>>>> time.
>> >>>>>>> The region was offline because I disabled the table so I could change
>> >>>>>>> its TTL.
>> >>>>>>>
>> >>>>>>> Here is the log from hadoop1-s05:
>> >>>>>>> 2011-06-29 16:37:12,576 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open region: gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>> 2011-06-29 16:37:12,680 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Processing open of gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>> 2011-06-29 16:37:12,680 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x33004a38816050b Attempting to transition node 584dac5cc70d8682f71c4675a843c309 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
>> >>>>>>> 2011-06-29 16:37:12,711 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x33004a38816050b Successfully transitioned node 584dac5cc70d8682f71c4675a843c309 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
>> >>>>>>> 2011-06-29 16:37:12,711 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Opening region: REGION => {NAME => 'gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.', STARTKEY => 'GSLoad_1308518553_168_WEB204', ENDKEY => 'GSLoad_1308518810_1249_WEB204', ENCODED => 584dac5cc70d8682f71c4675a843c309, TABLE => {{NAME => 'gs_raw_events', FAMILIES => [{NAME => 'events', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '1', VERSIONS => '3', COMPRESSION => 'LZO', TTL => '604800', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
>> >>>>>>> 2011-06-29 16:37:12,711 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>> 2011-06-29 16:37:12,847 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hadoop1-m1:8020/hbase/gs_raw_events/584dac5cc70d8682f71c4675a843c309/events/1971818821800304360, isReference=false, isBulkLoadResult=false, seqid=1162228062, majorCompaction=false
>> >>>>>>> 2011-06-29 16:37:12,848 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.; next sequenceid=1162228063
>> >>>>>>> 2011-06-29 16:37:12,849 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x33004a38816050b Attempting to transition node 584dac5cc70d8682f71c4675a843c309 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
>> >>>>>>> 2011-06-29 16:37:12,875 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x33004a38816050b Successfully transitioned node 584dac5cc70d8682f71c4675a843c309 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
>> >>>>>>> 2011-06-29 16:37:12,951 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated row gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309. in region .META.,,1 with server=hadoop1-s05.farm-ny.gigya.com:60020, startcode=1307349217076
>> >>>>>>> 2011-06-29 16:37:12,951 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x33004a38816050b Attempting to transition node 584dac5cc70d8682f71c4675a843c309 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
>> >>>>>>> 2011-06-29 16:37:12,964 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x33004a38816050b Successfully transitioned node 584dac5cc70d8682f71c4675a843c309 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
>> >>>>>>> 2011-06-29 16:37:12,964 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>> 2011-06-29 16:40:00,878 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open region: gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>> 2011-06-29 16:40:00,878 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Processing open of gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>> 2011-06-29 16:40:01,079 WARN org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Attempted open of gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309. but already online on this server
>> >>>>>>> 2011-06-29 16:43:50,395 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction (major) requested for gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309. because User-triggered major compaction; priority=1, compaction queue size=1248
>> >>>>>>> 2011-06-29 20:19:49,906 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting major compaction on region gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>> 2011-06-29 20:19:49,906 INFO org.apache.hadoop.hbase.regionserver.Store: Started compaction of 1 file(s) in cf=events  into hdfs://hadoop1-m1:8020/hbase/gs_raw_events/584dac5cc70d8682f71c4675a843c309/.tmp, seqid=1162228062, totalSize=98.3m
>> >>>>>>> 2011-06-29 20:19:49,906 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://hadoop1-m1:8020/hbase/gs_raw_events/584dac5cc70d8682f71c4675a843c309/events/1971818821800304360, keycount=6882816, bloomtype=NONE, size=98.3m
>> >>>>>>> 2011-06-29 20:19:59,920 INFO org.apache.hadoop.hbase.regionserver.HRegion: completed compaction on region gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309. after 10sec
>> >>>>>>>
>> >>>>>>> And here is the one from hadoop1-s02:
>> >>>>>>> 2011-06-29 16:43:57,935 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open region: gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>> 2011-06-29 16:43:58,990 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Processing open of gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>> 2011-06-29 16:43:58,990 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x23004a31d8904de Attempting to transition node 584dac5cc70d8682f71c4675a843c309 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
>> >>>>>>> 2011-06-29 16:43:59,002 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x23004a31d8904de Successfully transitioned node 584dac5cc70d8682f71c4675a843c309 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
>> >>>>>>> 2011-06-29 16:43:59,002 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Opening region: REGION => {NAME => 'gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.', STARTKEY => 'GSLoad_1308518553_168_WEB204', ENDKEY => 'GSLoad_1308518810_1249_WEB204', ENCODED => 584dac5cc70d8682f71c4675a843c309, TABLE => {{NAME => 'gs_raw_events', FAMILIES => [{NAME => 'events', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '1', VERSIONS => '3', COMPRESSION => 'LZO', TTL => '604800', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
>> >>>>>>> 2011-06-29 16:43:59,003 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Instantiated gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>> 2011-06-29 16:43:59,204 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hadoop1-m1:8020/hbase/gs_raw_events/584dac5cc70d8682f71c4675a843c309/events/1971818821800304360, isReference=false, isBulkLoadResult=false, seqid=1162228062, majorCompaction=false
>> >>>>>>> 2011-06-29 16:43:59,205 INFO org.apache.hadoop.hbase.regionserver.HRegion: Onlined gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.; next sequenceid=1162228063
>> >>>>>>> 2011-06-29 16:43:59,205 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x23004a31d8904de Attempting to transition node 584dac5cc70d8682f71c4675a843c309 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
>> >>>>>>> 2011-06-29 16:43:59,212 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x23004a31d8904de Successfully transitioned node 584dac5cc70d8682f71c4675a843c309 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENING
>> >>>>>>> 2011-06-29 16:43:59,214 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated row gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309. in region .META.,,1 with server=hadoop1-s02.farm-ny.gigya.com:60020, startcode=1306919627544
>> >>>>>>> 2011-06-29 16:43:59,214 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x23004a31d8904de Attempting to transition node 584dac5cc70d8682f71c4675a843c309 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
>> >>>>>>> 2011-06-29 16:43:59,224 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x23004a31d8904de Successfully transitioned node 584dac5cc70d8682f71c4675a843c309 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
>> >>>>>>> 2011-06-29 16:43:59,224 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opened gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>> java.io.IOException: Got error in response to OP_READ_BLOCK self=/10.1.104.2:33356, remote=/10.1.104.2:50010 for file /hbase/gs_raw_events/584dac5cc70d8682f71c4675a843c309/events/1971818821800304360 for block 3674866614142268536_674205
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> Ted, can you please point me to J-D's bug fix you mentioned? Are you
>> >>>>>>> positive it's the same scenario? Data loss is a very serious problem
>> >>>>>>> for a DB.
>> >>>>>>> I'd really like to apply that patch ASAP, because when I run hbck I get
>> >>>>>>> over 400 regions which are multiply assigned.
>> >>>>>>> Last question: I understand the region's data is lost, but is there a
>> >>>>>>> way to at least make the table consistent again by somehow removing
>> >>>>>>> the lost region?
>> >>>>>>>
>> >>>>>>> Thanks.
>> >>>>>>>
>> >>>>>>> -eran
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Sat, Jul 2, 2011 at 01:46, Ted Yu <yuzhihong@gmail.com> wrote:
>> >>>>>>>
>> >>>>>>>>>> 2011-06-29 16:43:57,880 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_OPEN for too long, reassigning region=gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>>>
>> >>>>>>>> The double assignment should have been fixed by J-D's recent checkin.
>> >>>>>>>>
>> >>>>>>>> On Fri, Jul 1, 2011 at 3:14 PM, Stack <stack@duboce.net> wrote:
>> >>>>>>>>
>> >>>>>>>>> Is
>> >>>>>>>>> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>>>> the region that was having the issue?  If so, if you looked in
>> >>>>>>>>> hadoop1-s05's logs, was this region opened around 2011-06-29 16:43:57?
>> >>>>>>>>> Was it also opened on hadoop1-s02 not long after?  Did you say what
>> >>>>>>>>> version of hbase you are on?
>> >>>>>>>>>
>> >>>>>>>>> St.Ack
>> >>>>>>>>>
>> >>>>>>>>> On Fri, Jul 1, 2011 at 5:08 AM, Eran Kutner <eran@gigya.com> wrote:
>> >>>>>>>>>> Hi Stack,
>> >>>>>>>>>> I'm not sure what the log means. I do see references to two different
>> >>>>>>>>>> servers, but that would probably happen if there was a normal
>> >>>>>>>>>> transition, I assume. I'm using version 0.90.3.
>> >>>>>>>>>> Here are the relevant lines from the master logs:
>> >>>>>>>>>>
>> >>>>>>>>>> 2011-06-19 21:39:37,164 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: gs_raw_events,GSLoad_1308518553_168_WEB204,1308533691659.9000a5d8df9502efc90d2c23567e4658.: Daughters; gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309., gs_raw_events,GSLoad_1308518810_1249_WEB204,1308533970928.46f876a4e97be04edb35eb8f8959d482. from hadoop1-s05.farm-ny.gigya.com,60020,1307349217076
>> >>>>>>>>>> 2011-06-19 21:43:12,983 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Deleted daughter reference gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309., qualifier=splitA, from parent gs_raw_events,GSLoad_1308518553_168_WEB204,1308533691659.9000a5d8df9502efc90d2c23567e4658.
>> >>>>>>>>>> 2011-06-29 16:29:36,143 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309. (offlining)
>> >>>>>>>>>> 2011-06-29 16:29:36,146 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to serverName=hadoop1-s05.farm-ny.gigya.com,60020,1307349217076, load=(requests=0, regions=1654, usedHeap=1870, maxHeap=12483) for region gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>>>>> 2011-06-29 16:29:38,327 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase/unassigned/584dac5cc70d8682f71c4675a843c309 (region=gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309., server=hadoop1-s05.farm-ny.gigya.com,60020,1307349217076, state=RS_ZK_REGION_CLOSED)
>> >>>>>>>>>> 2011-06-29 16:29:38,327 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_CLOSED, server=hadoop1-s05.farm-ny.gigya.com,60020,1307349217076, region=584dac5cc70d8682f71c4675a843c309
>> >>>>>>>>>> 2011-06-29 16:30:53,742 DEBUG org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED event for 584dac5cc70d8682f71c4675a843c309
>> >>>>>>>>>> 2011-06-29 16:30:53,742 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Table being disabled so deleting ZK node and removing from regions in transition, skipping assignment of region gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>>>>> 2011-06-29 16:30:53,742 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x13004a31d7804c4 Deleting existing unassigned node for 584dac5cc70d8682f71c4675a843c309 that is in expected state RS_ZK_REGION_CLOSED
>> >>>>>>>>>> 2011-06-29 16:30:53,801 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x13004a31d7804c4 Successfully deleted unassigned node for region 584dac5cc70d8682f71c4675a843c309 in expected state RS_ZK_REGION_CLOSED
>> >>>>>>>>>> 2011-06-29 16:34:01,453 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Updated region gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309. in META
>> >>>>>>>>>> 2011-06-29 16:37:12,247 DEBUG
>> >>>>>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
>> >>>>>>>>>> master:60000-0x13004a31d7804c4 Creating (or updating)
>> >>>> unassigned
>> >>>>>> node
>> >>>>>>>> for
>> >>>>>>>>>> 584dac5cc70d8682f71c4675a843c309 with OFFLINE state
>> >>>>>>>>>> 2011-06-29 16:37:12,576 DEBUG
>> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: No
>> >> previous
>> >>>>>>>> transition
>> >>>>>>>>>> plan was found (or we are ignoring an existing plan) for
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>
>> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>>>>> so generated a random one;
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>
>> hri=gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.,
>> >>>>>>>>>> src=, dest=hadoop1-s05.farm-ny.gigya.com
>> >>> ,60020,1307349217076;
>> >>>> 5
>> >>>>>>>>> (online=5,
>> >>>>>>>>>> exclude=null) available servers
>> >>>>>>>>>> 2011-06-29 16:37:12,576 DEBUG
>> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Assigning
>> >>>>> region
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>
>> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>>>>> to hadoop1-s05.farm-ny.gigya.com,60020,1307349217076
>> >>>>>>>>>> 2011-06-29 16:37:13,102 DEBUG
>> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Handling
>> >>>>>>>>>> transition=RS_ZK_REGION_OPENED,
>> >>>>>>>>>> server=hadoop1-s05.farm-ny.gigya.com,60020,1307349217076,
>> >>>>>>>>>> region=584dac5cc70d8682f71c4675a843c309
>> >>>>>>>>>> 2011-06-29 16:39:54,075 DEBUG
>> >>>>>>>>>> org.apache.hadoop.hbase.master.handler.OpenedRegionHandler:
>> >>>>>> Handling
>> >>>>>>>>> OPENED
>> >>>>>>>>>> event for 584dac5cc70d8682f71c4675a843c309; deleting
>> >>> unassigned
>> >>>>>> node
>> >>>>>>>>>> 2011-06-29 16:39:54,075 DEBUG
>> >>>>>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
>> >>>>>>>>>> master:60000-0x13004a31d7804c4 Deleting existing unassigned
>> >>>> node
>> >>>>>> for
>> >>>>>>>>>> 584dac5cc70d8682f71c4675a843c309 that is in expected state
>> >>>>>>>>>> RS_ZK_REGION_OPENED
>> >>>>>>>>>> 2011-06-29 16:39:54,192 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x13004a31d7804c4 Successfully deleted unassigned node for region 584dac5cc70d8682f71c4675a843c309 in expected state RS_ZK_REGION_OPENED
>> >>>>>>>>>> 2011-06-29 16:39:54,326 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309. on hadoop1-s05.farm-ny.gigya.com,60020,1307349217076
>> >>>>>>>>>> 2011-06-29 16:40:00,598 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x13004a31d7804c4 Creating (or updating) unassigned node for 584dac5cc70d8682f71c4675a843c309 with OFFLINE state
>> >>>>>>>>>> 2011-06-29 16:40:00,877 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309. so generated a random one; hri=gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309., src=, dest=hadoop1-s05.farm-ny.gigya.com,60020,1307349217076; 5 (online=5, exclude=null) available servers
>> >>>>>>>>>> 2011-06-29 16:40:00,877 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309. to hadoop1-s05.farm-ny.gigya.com,60020,1307349217076
>> >>>>>>>>>> 2011-06-29 16:43:57,879 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309. state=PENDING_OPEN, ts=1309380052723
>> >>>>>>>>>> 2011-06-29 16:43:57,880 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_OPEN for too long, reassigning region=gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>> >>>>>>>>>> 2011-06-29 16:43:57,936 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309. state=PENDING_OPEN, ts=1309380052723
>> >>>>>>>>>> 2011-06-29 16:43:57,936 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309. so generated a random one; hri=gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309., src=, dest=hadoop1-s02.farm-ny.gigya.com,60020,1306919627544; 5 (online=5, exclude=null) available servers
>> >>>>>>>>>> 2011-06-29 16:43:57,936 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309. to hadoop1-s02.farm-ny.gigya.com,60020,1306919627544
>> >>>>>>>>>> 2011-06-29 16:43:59,022 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=hadoop1-s02.farm-ny.gigya.com,60020,1306919627544, region=584dac5cc70d8682f71c4675a843c309
>> >>>>>>>>>> 2011-06-29 16:43:59,221 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENING, server=hadoop1-s02.farm-ny.gigya.com,60020,1306919627544, region=584dac5cc70d8682f71c4675a843c309
>> >>>>>>>>>> 2011-06-29 16:43:59,226 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_OPENED, server=hadoop1-s02.farm-ny.gigya.com,60020,1306919627544, region=584dac5cc70d8682f71c4675a843c309
>> >>>>>>>>>> 2011-06-29 16:43:59,274 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED event for 584dac5cc70d8682f71c4675a843c309; deleting unassigned node
>> >>>>>>>>>> 2011-06-29 16:43:59,274 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x13004a31d7804c4 Deleting existing unassigned node for 584dac5cc70d8682f71c4675a843c309 that is in expected state RS_ZK_REGION_OPENED
>> >>>>>>>>>> 2011-06-29 16:43:59,296 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:60000-0x13004a31d7804c4 Successfully deleted unassigned node for region 584dac5cc70d8682f71c4675a843c309 in expected state RS_ZK_REGION_OPENED
>> >>>>>>>>>> 2011-06-29 16:43:59,375 WARN org.apache.hadoop.hbase.master.AssignmentManager: Overwriting 584dac5cc70d8682f71c4675a843c309 on serverName=hadoop1-s05.farm-ny.gigya.com,60020,1307349217076, load=(requests=0, regions=1273, usedHeap=2676, maxHeap=12483)
>> >>>>>>>>>> 2011-06-29 16:43:59,375 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309. on hadoop1-s02.farm-ny.gigya.com,60020,1306919627544
>> >>>>>>>>>>
>> >>>>>>>>>> Thanks.
>> >>>>>>>>>>
>> >>>>>>>>>> -eran
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> On Fri, Jul 1, 2011 at 09:05, Stack <stack@duboce.net> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> So, Eran, it seems as though two RegionServers were carrying the
>> >>>>>>>>>>> region?  One deleted a file (compaction on its side)?  Can you figure
>> >>>>>>>>>>> out whether two servers indeed had the same region?  (Check the master
>> >>>>>>>>>>> logs for this region's assignments.)
>> >>>>>>>>>>>
>> >>>>>>>>>>> What version of hbase?
>> >>>>>>>>>>>
>> >>>>>>>>>>> St.Ack
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Thu, Jun 30, 2011 at 3:58 AM, Eran Kutner <eran@gigya.com> wrote:
>> >>>>>>>>>>>> Hi,
>> >>>>>>>>>>>> I have a cluster of 5 nodes with one large table that currently has
>> >>>>>>>>>>>> around 12000 regions. Everything was working fine for a relatively
>> >>>>>>>>>>>> long time, until now.
>> >>>>>>>>>>>> Yesterday I significantly reduced the TTL on the table and initiated a
>> >>>>>>>>>>>> major compaction. This should have reduced the table size to about 20%
>> >>>>>>>>>>>> of its original size.
>> >>>>>>>>>>>> Today, I'm getting errors of inaccessible files on HDFS, for example:
>> >>>>>>>>>>>> java.io.IOException: Got error in response to OP_READ_BLOCK
>> >>>>>>>>>>>> self=/10.1.104.2:58047, remote=/10.1.104.2:50010 for file
>> >>>>>>>>>>>> /hbase/gs_raw_events/584dac5cc70d8682f71c4675a843c309/events/1971818821800304360
>> >>>>>>>>>>>> for block 3674866614142268536_674205
>> >>>>>>>>>>>>       at org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1487)
>> >>>>>>>>>>>>       at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1811)
>> >>>>>>>>>>>>       at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1948)
>> >>>>>>>>>>>>       at java.io.DataInputStream.read(DataInputStream.java:132)
>> >>>>>>>>>>>>       at org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:105)
>> >>>>>>>>>>>>       at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>> >>>>>>>>>>>>       at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>> >>>>>>>>>>>>       at org.apache.hadoop.io.compress.BlockDecompressorStream.rawReadInt(BlockDecompressorStream.java:128)
>> >>>>>>>>>>>>       at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:68)
>> >>>>>>>>>>>>       at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:75)
>> >>>>>>>>>>>>       at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
>> >>>>>>>>>>>>       at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>> >>>>>>>>>>>>       at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:102)
>> >>>>>>>>>>>>       at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1094)
>> >>>>>>>>>>>>       at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:1036)
>> >>>>>>>>>>>>       at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.seekTo(HFile.java:1433)
>> >>>>>>>>>>>>       at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:139)
>> >>>>>>>>>>>>       at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:96)
>> >>>>>>>>>>>>       at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:77)
>> >>>>>>>>>>>>       at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:1341)
>> >>>>>>>>>>>>       at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.<init>(HRegion.java:2269)
>> >>>>>>>>>>>>       at org.apache.hadoop.hbase.regionserver.HRegion.instantiateInternalScanner(HRegion.java:1126)
>> >>>>>>>>>>>>       at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1118)
>> >>>>>>>>>>>>       at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1102)
>> >>>>>>>>>>>>       at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1781)
>> >>>>>>>>>>>>       at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source)
>> >>>>>>>>>>>>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >>>>>>>>>>>>       at java.lang.reflect.Method.invoke(Method.java:597)
>> >>>>>>>>>>>>       at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
>> >>>>>>>>>>>>       at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I checked, and the file indeed doesn't exist on HDFS, apparently because
>> >>>>>>>>>>>> it was deleted. Here are the name node logs for this block:
>> >>>>>>>>>>>> 2011-06-19 21:39:36,651 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /hbase/gs_raw_events/584dac5cc70d8682f71c4675a843c309/.tmp/2096863423111131624. blk_3674866614142268536_674205
>> >>>>>>>>>>>> 2011-06-19 21:40:11,954 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.1.104.2:50010 is added to blk_3674866614142268536_674205 size 67108864
>> >>>>>>>>>>>> 2011-06-19 21:40:11,954 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.1.104.3:50010 is added to blk_3674866614142268536_674205 size 67108864
>> >>>>>>>>>>>> 2011-06-19 21:40:11,955 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.1.104.5:50010 is added to blk_3674866614142268536_674205 size 67108864
>> >>>>>>>>>>>> 2011-06-29 20:20:01,662 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.1.104.2:50010 to delete blk_3674866614142268536_674205
>> >>>>>>>>>>>> 2011-06-29 20:20:13,671 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.1.104.5:50010 to delete
>> >>>>>>>>>>>> blk_-4056387895369608597_675174 blk_-5017882805850873821_672281
>> >>>>>>>>>>>> blk_702373987100607684_672288 blk_-5357157478043290010_668506
>> >>>>>>>>>>>> blk_7118175133735412789_674903 blk_-3569812563715986384_675231
>> >>>>>>>>>>>> blk_8296855057240604851_669285 blk_-6483679172530609101_674268
>> >>>>>>>>>>>> blk_8738539715363739108_673682 blk_1744841904626813502_675238
>> >>>>>>>>>>>> blk_-6035315106100051103_674266 blk_-1789501623010070237_674908
>> >>>>>>>>>>>> blk_1944054629336265129_673689 blk_3674866614142268536_674205
>> >>>>>>>>>>>> blk_7930425446738143892_647410 blk_-3007186753042268449_669296
>> >>>>>>>>>>>> blk_-5482302621772778061_647416 blk_-3765735404924932181_672004
>> >>>>>>>>>>>> blk_7476090998956811081_675169 blk_7862291659285127712_646890
>> >>>>>>>>>>>> blk_-2666244746343584727_672013 blk_6039172613960915602_674206
>> >>>>>>>>>>>> blk_-8470884397893086564_646899 blk_4558230221166712802_668510
>> >>>>>>>>>>>> 2011-06-29 20:20:46,698 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.1.104.3:50010 to delete
>> >>>>>>>>>>>> blk_-7851606440036350812_671552 blk_9214649160203453845_647566
>> >>>>>>>>>>>> blk_702373987100607684_672288 blk_5958099369749234073_668143
>> >>>>>>>>>>>> blk_-5172218034084903173_673109 blk_-2934555181472719276_646476
>> >>>>>>>>>>>> blk_-1409986679370073931_672552 blk_-2786034325506235869_669086
>> >>>>>>>>>>>> blk_3674866614142268536_674205 blk_510158930393283118_673225
>> >>>>>>>>>>>> blk_916244738216205237_677068 blk_-4317027806407316617_670379
>> >>>>>>>>>>>> blk_8555705688850972639_673485 blk_-3765735404924932181_672004
>> >>>>>>>>>>>> blk_-5482302621772778061_647416 blk_-2461801145731752623_674605
>> >>>>>>>>>>>> blk_-8737702908048998927_672549 blk_-8470884397893086564_646899
>> >>>>>>>>>>>> blk_4558230221166712802_668510 blk_-4056387895369608597_675174
>> >>>>>>>>>>>> blk_-8675430610673886073_647695 blk_-6642870230256028318_668211
>> >>>>>>>>>>>> blk_-3890408516362176771_677483 blk_-3569812563715986384_675231
>> >>>>>>>>>>>> blk_-5007142629771321873_674548 blk_-3345355191863431669_667066
>> >>>>>>>>>>>> blk_8296855057240604851_669285 blk_-6595462308187757470_672420
>> >>>>>>>>>>>> blk_-2583945228783203947_674607 blk_-346988625120916345_677063
>> >>>>>>>>>>>> blk_4449525876338684218_674496 blk_2617172363857549730_668201
>> >>>>>>>>>>>> blk_8738539715363739108_673682 blk_-208904675456598428_679286
>> >>>>>>>>>>>> blk_-497549341281882641_646477 blk_-6035315106100051103_674266
>> >>>>>>>>>>>> blk_-2356539038067297411_672388 blk_-3881703084497103249_668137
>> >>>>>>>>>>>> blk_2214397881104950315_646643 blk_-5907671443455357710_673223
>> >>>>>>>>>>>> blk_-2431880309956605679_669204 blk_6039172613960915602_674206
>> >>>>>>>>>>>> blk_5053643911633142711_669194 blk_-2636977729205236686_674664
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I assume the file loss is somehow related to this change and the major
>> >>>>>>>>>>>> compaction that followed, because the same scan that is failing now was
>> >>>>>>>>>>>> working fine yesterday and that is the only change that happened on the
>> >>>>>>>>>>>> cluster. Any suggestions on what to do now?
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Thanks.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> -eran
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>
>>
>
>
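Stack's suggestion in the quoted thread (check the master logs for this region's assignments) could be approximated with a small script. The following is only a rough sketch, not an HBase tool: the function name find_multi_server_opens and the regular expression are illustrative assumptions, and the pattern assumes the unwrapped, single-line form of the "Opened region ... on ..." messages quoted above. It only produces candidates for review, since a region that was cleanly closed and moved will also show up on two servers over time.

```python
import re
from collections import defaultdict

# Assumption: each master log entry is on one line, in the form quoted in this
# thread: "<timestamp> DEBUG ...OpenedRegionHandler: Opened region <region> on <server>".
OPENED = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"OpenedRegionHandler: Opened region (?P<region>\S+) on (?P<server>\S+)"
)


def find_multi_server_opens(lines):
    """Return {region: [(timestamp, server), ...]} for regions opened on 2+ distinct servers."""
    opens = defaultdict(list)
    for line in lines:
        m = OPENED.match(line.strip())
        if m:
            opens[m.group("region")].append((m.group("ts"), m.group("server")))
    # Keep only regions whose open events name more than one server.
    return {
        region: events
        for region, events in opens.items()
        if len({server for _, server in events}) > 1
    }


# Two "Opened region" entries taken from the master log quoted in this thread.
SAMPLE = [
    "2011-06-29 16:39:54,326 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: "
    "Opened region gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928."
    "584dac5cc70d8682f71c4675a843c309. on hadoop1-s05.farm-ny.gigya.com,60020,1307349217076",
    "2011-06-29 16:43:59,375 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: "
    "Opened region gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928."
    "584dac5cc70d8682f71c4675a843c309. on hadoop1-s02.farm-ny.gigya.com,60020,1306919627544",
]

for region, events in find_multi_server_opens(SAMPLE).items():
    print(region)
    for ts, server in events:
        print("   ", ts, server)
```

To run it against a real master log, pass the open file object instead of SAMPLE; any flagged region still needs to be checked against the close and unassign events between the two opens.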
