hbase-user mailing list archives

From Jamie Cockrill <jamie.cockr...@gmail.com>
Subject Re: Regionserver tanked, can't seem to get master back up fully
Date Tue, 03 Aug 2010 13:22:51 GMT
PS, yes, that was coming from the master.

On 3 August 2010 14:22, Jamie Cockrill <jamie.cockrill@gmail.com> wrote:
> Hi JD,
>
> The cluster is on a separate network, so I'll see if any of the traces
> remain. As for the ulimit and xcievers bit, those are set up correctly
> as per the API doc you mention.
>
> Thanks
>
> Jamie
>
> On 2 August 2010 19:18, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>> Is that coming from the master? If so, it means that it was trying to
>> write recovered data from a failed region server and wasn't able to do
>> so. It sounds bad.
>>
>> - Can we get full stack traces of that error?
>> - Did you check the datanode logs for any exception? Very often
>> (strong emphasis on "very"), it's an issue with either ulimit or
>> xcievers. Is your cluster configured per the last bullet on that page?
>> http://hbase.apache.org/docs/r0.20.6/api/overview-summary.html#requirements
>>
>> Thx
>>
>> J-D
>>
>> On Mon, Aug 2, 2010 at 6:16 AM, Jamie Cockrill <jamie.cockrill@gmail.com> wrote:
>>> Hi All,
>>>
>>> I set off a long-running loading job over the weekend and it seems to
>>> have rather destroyed my hbase cluster. Most of the nodes were down
>>> this morning and upon restarting them, I'm now persistently getting
>>> the following message every few ms in the master logs:
>>>
>>> DfsClient: Could not complete file
>>> /hbase/.logs/compute17.cluster1.lan,60020,1280518716613/a filename
>>>
>>> That file is a zero-byte file on the HDFS. The data-nodes all look
>>> fine and don't seem to have had any trouble. I'm not especially fussed
>>> about having to rebuild that table and reload it, but the trouble is
>>> now that I can't start the cluster properly so I can drop the table.
>>>
>>> Does anyone know how I can remove the table or fix these errors manually?
>>> As I said, I'm not fussed about data-loss.
>>>
>>> thanks
>>>
>>> Jamie
>>>
>>
>
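
For anyone wanting to double-check the two settings J-D asks about, here is a minimal sketch, not an official HBase or Hadoop tool. It reads the open-file limit of the current process and looks up dfs.datanode.max.xcievers in the text of an hdfs-site.xml file. The function names and the thresholds (32768 open files, 2047 xcievers) are assumptions chosen for illustration; use whatever values the requirements page linked above gives for your version.

```python
import resource
import xml.etree.ElementTree as ET


def open_file_limit():
    """Return the soft open-file limit (ulimit -n) of the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


def read_hadoop_property(xml_text, name):
    """Return the value of a <property> in Hadoop-style config XML, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def check_settings(xml_text, nofile_limit, min_nofile=32768, min_xcievers=2047):
    """Return a list of human-readable problems; an empty list means none found.

    min_nofile and min_xcievers are illustrative thresholds (assumptions),
    not values taken from this thread.
    """
    problems = []
    if nofile_limit != resource.RLIM_INFINITY and nofile_limit < min_nofile:
        problems.append(
            "open file limit %d is below %d" % (nofile_limit, min_nofile)
        )
    raw = read_hadoop_property(xml_text, "dfs.datanode.max.xcievers")
    if raw is None:
        problems.append("dfs.datanode.max.xcievers is not set")
    elif int(raw) < min_xcievers:
        problems.append(
            "dfs.datanode.max.xcievers %s is below %d" % (raw, min_xcievers)
        )
    return problems
```

To use it, run it as the user that starts the datanode and region server, since the limit is per user, and pass in the contents of that node's hdfs-site.xml, for example check_settings(open(path_to_hdfs_site).read(), open_file_limit()), where path_to_hdfs_site is a placeholder for wherever your Hadoop conf directory lives.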
