hbase-user mailing list archives

From Nathan Harkenrider <nathan.harkenri...@gmail.com>
Subject Re: Data Loss During Bulk Load
Date Wed, 24 Mar 2010 19:25:25 GMT
Yeah, it's an MR job. We're running CDH2 0.20.1+169.56.

Based on the release notes below, this includes HDFS-127.
http://archive.cloudera.com/cdh/2/hadoop-0.20.1+169.56.releasenotes.html

Thanks,

Nathan

On Wed, Mar 24, 2010 at 11:47 AM, Stack <saint.ack@gmail.com> wrote:

> Is this MR?  If so, do you have HDFS-127 applied to your cluster?
>
> On Mar 24, 2010, at 11:22 AM, Nathan Harkenrider <
> nathan.harkenrider@gmail.com> wrote:
>
>> Thanks, Ryan.
>>
>> We have this config setting in place and are currently running an insert
>> of
>> 40 million rows into an empty pair of tables. The job has inserted 25
>> million rows so far, and we are not seeing any failed compact/split errors
>> in the log. I'll report back after the import is complete and we've
>> verified the integrity of the data.
>>
>> Regards,
>>
>> Nathan
>>
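For reference, one straightforward way to verify an import like the one Nathan describes is HBase's bundled RowCounter MapReduce job; the result shows up in the job's ROWS counter and can be checked against the expected 40 million. A minimal sketch (the table name is hypothetical):

    $ hbase org.apache.hadoop.hbase.mapreduce.RowCounter my_table
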
>> On Wed, Mar 24, 2010 at 11:12 AM, Ryan Rawson <ryanobjc@gmail.com> wrote:
>>
>>> You'll want this one:
>>>
>>> <property>
>>>   <name>dfs.datanode.socket.write.timeout</name>
>>>   <value>0</value>
>>> </property>
>>>
>>> A classic standby from just over a year ago.  It used to be in the
>>> recommended config and might not be anymore, but I am finding it
>>> necessary now.
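For context, this property normally lives in hdfs-site.xml, and because the region server is itself a DFS client it is often mirrored in hbase-site.xml as well. Its default of 480000 ms (8 minutes) is the same "480000 millis timeout" that appears in the exceptions later in this thread; a value of 0 disables the write-side timeout entirely. A minimal sketch, assuming hdfs-site.xml:

    <!-- Disable the datanode socket write timeout; 0 means wait
         indefinitely. The default, 480000 ms, is the timeout seen in
         the SocketTimeoutException messages quoted below. -->
    <property>
      <name>dfs.datanode.socket.write.timeout</name>
      <value>0</value>
    </property>
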
>>>
>>> On Wed, Mar 24, 2010 at 11:06 AM, Rod Cope <rod.cope@openlogic.com>
>>> wrote:
>>>
>>>> This describes my situation, too.  I never could get rid of the
>>>> SocketTimeoutExceptions, even after dozens of hours of research and
>>>> applying every tuning and configuration suggestion I could find.
>>>>
>>>> Rod
>>>>
>>>>
>>>> On 3/24/10 11:45 AM, "Tuan Nguyen" <tuan08@gmail.com> wrote:
>>>>
>>>>> Hi Nathan,
>>>>>
>>>>> We recently ran a performance test against HBase 0.20.3 and Hadoop
>>>>> 0.20.2. We have a quite similar problem to yours. In the first scan
>>>>> test, we noticed that we lose some data in certain columns of
>>>>> certain rows, and our logs had errors such as "Error Recovery for
>>>>> block", "Could not get the block", IOException, and
>>>>> "SocketTimeoutException: 480000 millis timeout"... and the test
>>>>> failed completely in the middle. After various tuning of the GC,
>>>>> caching, xcievers... we can finish the test without any data loss.
>>>>> Only the "SocketTimeoutException: 480000 millis timeout" errors are
>>>>> left in our logs.
>>>>>
>>>>> Tuan Nguyen!
>>>>
>>>>
>>>> --
>>>>
>>>> Rod Cope | CTO and Founder
>>>> rod.cope@openlogic.com
>>>> Follow me on Twitter @RodCope
>>>>
>>>> 720 240 4501    |  phone
>>>> 720 240 4557    |  fax
>>>> 1 888 OpenLogic    |  toll free
>>>> www.openlogic.com
>>>> Follow OpenLogic on Twitter @openlogic
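For reference, the "xcievers" tuning Tuan mentions above is the datanode transceiver limit, dfs.datanode.max.xcievers (the misspelling is historical). Its default of 256 is far too low for HBase, and raising it was standard advice at the time; a typical hdfs-site.xml entry, as a sketch:

    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>4096</value>
    </property>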
