accumulo-user mailing list archives

From Denis <de...@camfex.cz>
Subject Re: Offline tables on adding a tserver (Accumulo 1.6 regression?)
Date Tue, 13 Jan 2015 17:44:15 GMT
> Do you have any warnings/errors in the new server's logs?

On the smaller cluster, where I am trying to reproduce the problem: no.

On the big cluster, unfortunately, there are no local logs, as the tserver
logs were sent to the monitor :(
At the moment I cannot add a new tserver there to collect new logs, as
clients are using the cluster.

On 1/13/15, Eric Newton <eric.newton@gmail.com> wrote:
> The fact that the tablets are being taken offline means that the master is
> actively trying to balance.
>
> The master will periodically ask the new server to host the tablets.  Do
> you have any warnings/errors in the new server's logs?
>
> -Eric
>
>
> On Tue, Jan 13, 2015 at 11:48 AM, Denis <denis@camfex.cz> wrote:
>
>> >  If you jstack your new tablet server, does it show a deadlock?
>>
>> No
>>
>> On 1/13/15, Eric Newton <eric.newton@gmail.com> wrote:
>> > This may be a result of ACCUMULO-3372.  If you jstack your new tablet
>> > server, does it show a deadlock?
>> >
>> > $ jps -m
>> > 12345 Main tserver --address host:9997
>> >
>> > $ jstack 12345 | grep -i deadlock
>> > Deadlock detected
>> >
>> > This particular bug only happens at start-up.  There's a trivial patch
>> > (which you can find through the bug report), which will be in accumulo
>> > 1.6.2.
>> >
>> > -Eric
>> >
>> >
>> > On Mon, Jan 12, 2015 at 4:06 PM, Denis <denis@camfex.cz> wrote:
>> >
>> >> I have not yet tried anything newer than 1.6.1
>> >>
>> >> On 1/12/15, Josh Elser <elserj@apache.org> wrote:
>> >> > Denis wrote:
>> >> >> created https://issues.apache.org/jira/browse/ACCUMULO-3471
>> >> >
>> >> > Thanks a bunch!
>> >> >
>> >> >> BTW, In 1.6.1 also balancing may get stuck until the master server is
>> >> >> restarted.
>> >> >
>> >> > Is this a known issue in 1.6.1 that's been since fixed or is it still
>> >> > outstanding?
>> >> >
>> >> >> But then, after the master restart, balancing works very
>> >> >> "aggressively", putting many tablets offline for quite a long time
>> >> >> (minutes)
>> >> >>
>> >> >> On 1/11/15, Denis<denis@camfex.cz>  wrote:
>> >> >>> Sometimes it was left unbalanced, with the new tserver hosting zero
>> >> >>> tablets or far fewer than the others.
>> >> >>> So I had to restart the master to initiate the balancing process.
>> >> >>> Then balancing was performed slowly without putting thousands of
>> >> >>> tablets offline.
>> >> >>>
>> >> >>> On 1/11/15, John Vines<vines@apache.org>  wrote:
>> >> >>>> I have a hunch that the 1.4 version being used possibly had one or
>> >> >>>> more of the many bugs regarding balancing getting 'stuck', which was
>> >> >>>> typically resolved via bouncing the master. Denis, in 1.4 when you
>> >> >>>> brought your tserver back online, did you find that things were then
>> >> >>>> balanced, or did you just have a tserver up and things were left
>> >> >>>> unbalanced?
>> >> >>>>
>> >> >>>> On Sun, Jan 11, 2015 at 11:30 AM, Denis<denis@camfex.cz> wrote:
>> >> >>>>
>> >> >>>>> yes, per server
>> >> >>>>>
>> >> >>>>> On 1/11/15, Sean Busbey<busbey@cloudera.com> wrote:
>> >> >>>>>> On Sat, Jan 10, 2015 at 3:42 PM, Denis<denis@camfex.cz> wrote:
>> >> >>>>>> On 1/10/15, Christopher<ctubbsii@apache.org> wrote:
>> >> >>>>>>
>> >> >>>>>>> ...
>> >> >>>>>>> 3) how many tablets do you have per server?....
>> >> >>>>>> 3. about 6000
>> >> >>>>>>
>> >> >>>>>> Just to confirm, this is 6000 tablets per-server and not 6000
>> >> >>>>>> tablets per-table or overall, right?
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> --
>> >> >>>>>> Sean
>> >> >>>>>>
>> >> >
>> >>
>> >
>>
>
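
The jps/jstack check suggested earlier in the thread can be wrapped in a
small script so it is easy to repeat on each new tablet server. This is only
a sketch: the function names has_deadlock and check_tserver are made up for
illustration, and it assumes that jstack is on the PATH and that a thread
dump mentions the word "deadlock" when the JVM finds one (the same
assumption the grep in the thread makes).

```shell
#!/bin/sh
# Hypothetical helpers, not part of Accumulo or the JDK.

# has_deadlock: read a thread dump on stdin.
# Exit status 0 means the word "deadlock" appeared (case-insensitive).
has_deadlock() {
    grep -qi 'deadlock'
}

# check_tserver PID: run jstack against PID and print a one-line verdict.
# If jstack itself fails, the dump is empty and this reports "no deadlock",
# so check any jstack error output separately.
check_tserver() {
    if jstack "$1" | has_deadlock; then
        echo "possible deadlock in tserver pid $1"
    else
        echo "no deadlock reported for pid $1"
    fi
}

# Example, with the pid taken from `jps -m` as in the thread:
# check_tserver 12345
```

A clean result only rules out a JVM-level deadlock at the moment of the
dump; it says nothing about why the master keeps tablets offline.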
