hbase-user mailing list archives

From Doug Meil <doug.m...@explorysmedical.com>
Subject RE: Insert a lot of data in HBase
Date Tue, 21 Jun 2011 02:26:06 GMT
Just be aware that the data in the write buffer is held in the client, so it hasn't been sent to
the RegionServers yet.

So as long as your client doesn't die before the buffer is flushed, you should be OK.
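
To make the point about the client-side buffer concrete, here is a minimal sketch in plain Java. It is a toy model and not the HBase client API: the class BufferedClientSketch and its methods (put, flushCommits, rowsOnServer, rowsBuffered) are hypothetical names chosen for illustration. It also ignores that the real client flushes on its own once the buffer fills up. The only thing it shows is that, with auto-flush off, rows sit in the client until a flush happens, so a client crash before the flush would lose them regardless of the WAL on the server side.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a client-side write buffer. Hypothetical class, NOT the HBase API.
class BufferedClientSketch {
    // Stands in for data that has actually reached the RegionServers.
    private final List<String> serverRows = new ArrayList<>();
    // Lives only in the client process; gone if the client dies.
    private final List<String> writeBuffer = new ArrayList<>();
    private final boolean autoFlush;

    BufferedClientSketch(boolean autoFlush) {
        this.autoFlush = autoFlush;
    }

    // Buffer the row; send it right away only when auto-flush is on.
    void put(String row) {
        writeBuffer.add(row);
        if (autoFlush) {
            flushCommits();
        }
    }

    // Send everything buffered so far to the "server" and empty the buffer.
    void flushCommits() {
        serverRows.addAll(writeBuffer);
        writeBuffer.clear();
    }

    int rowsOnServer() {
        return serverRows.size();
    }

    int rowsBuffered() {
        return writeBuffer.size();
    }

    public static void main(String[] args) {
        BufferedClientSketch client = new BufferedClientSketch(false);
        for (int i = 0; i < 5; i++) {
            client.put("row-" + i);
        }
        System.out.println("before flush: server=" + client.rowsOnServer()
                + " buffered=" + client.rowsBuffered());
        client.flushCommits();
        System.out.println("after flush:  server=" + client.rowsOnServer()
                + " buffered=" + client.rowsBuffered());
    }
}
```

In a real loader the equivalent step would be an explicit flush (or closing the table) once the last put has been issued, ideally in a finally block so it also runs when the load loop throws.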


-----Original Message-----
From: saurabh.r.s@gmail.com [mailto:saurabh.r.s@gmail.com] On Behalf Of Sam Seigal
Sent: Monday, June 20, 2011 10:17 PM
To: user@hbase.apache.org
Subject: Re: Insert a lot of data in HBase

When using the write cache and calling setAutoFlush(false), is there a risk of data loss
even if the WAL is enabled?

On Mon, Jun 20, 2011 at 12:27 PM, Jeff Whiting <jeffw@qualtrics.com> wrote:

> There is the possibility that your keys have the same timestamp -- 
> especially if you are running multi-threaded.  If the puts are 
> buffered then it isn't outside the realm of possibility that they are 
> executed within the same millisecond.  If you have the same keys for 
> multiple puts you would "lose" data as you describe because it would 
> just update the row rather than inserting a new one.
>
> ~Jeff
>
>
> On 6/20/2011 1:16 PM, Laurent Hatier wrote:
>
>> I think there is a solution in your link, I will check it! :)
>>
>> 2011/6/20 Laurent Hatier <laurent.hatier@gmail.com>
>>
>>> My keys are the moment at which the data is inserted into HBase (so
>>> System.currentTimeMillis()*1000). As you can see, I use the put
>>> method to insert data... is there another way to insert data?
>>>
>>>
>>> 2011/6/20 Doug Meil <doug.meil@explorysmedical.com>
>>>
>>>> Look here in the HBase book for these, and other, tips.
>>>>
>>>> http://hbase.apache.org/book.html#performance
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of 
>>>> Jean-Daniel Cryans
>>>> Sent: Monday, June 20, 2011 2:03 PM
>>>> To: user@hbase.apache.org
>>>> Subject: Re: Insert a lot of data in HBase
>>>>
>>>> 4M is small data :)
>>>>
>>>> Could there be an overlap in the keys? Are you disabling autoflush 
>>>> and not flushing the write buffer? These are common 
>>>> errors/misconceptions.
>>>>
>>>> J-D
>>>>
>>>> On Mon, Jun 20, 2011 at 10:36 AM, Laurent Hatier
>>>> <laurent.hatier@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I'm new to HBase. I want to insert 4'000'000 rows into HBase (each
>>>>> row has 4 columns). I have already looked at the HBase wiki on how to
>>>>> insert data, but I have a problem: I lose data. When I do a COUNT with
>>>>> the shell, there are approximately 1'500'000 rows in the DB...
>>>>> I've tried creating multiple Puts and inserting them with a List, and
>>>>> I've already tried a single Put with four add calls, and opening and
>>>>> closing the socket each time I put a line, or the file read doesn't
>>>>> run...
>>>>> If anyone has an idea.
>>>>>
>>>>> Here is my code if you want to see it:
>>>>>
>>>>> List<Put> arrayPut = new ArrayList<Put>();
>>>>>
>>>>> arrayPut.add(new Put(Bytes.toBytes(id)));
>>>>> arrayPut.get(arrayPut.size() - 1).add(FAMILY_GEOLOC,
>>>>>     QUALIFIER_START, Bytes.toBytes(tStart));
>>>>> arrayPut.add(new Put(Bytes.toBytes(id)));
>>>>> arrayPut.get(arrayPut.size() - 1).add(FAMILY_GEOLOC,
>>>>>     QUALIFIER_END, Bytes.toBytes(tEnd));
>>>>> arrayPut.add(new Put(Bytes.toBytes(id)));
>>>>> arrayPut.get(arrayPut.size() - 1).add(FAMILY_GEOLOC,
>>>>>     QUALIFIER_COUNTRY, Bytes.toBytes(countryCode));
>>>>> arrayPut.add(new Put(Bytes.toBytes(id)));
>>>>> arrayPut.get(arrayPut.size() - 1).add(FAMILY_GEOLOC,
>>>>>     QUALIFIER_REGION, Bytes.toBytes(regionCode));
>>>>> table.put(arrayPut);
>>>>>
>>>>> --
>>>>> Laurent HATIER
>>>>> Étudiant en 2e année du Cycle Ingénieur à l'EISTI
>>>>>
>>>>>
>>>
>>> --
>>> Laurent HATIER
>>> Étudiant en 2e année du Cycle Ingénieur à l'EISTI
>>>
>>>
>>
>>
> --
> Jeff Whiting
> Qualtrics Senior Software Engineer
> jeffw@qualtrics.com
>
>
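
Jeff's point in the quoted thread (puts that share a row key update the same row instead of adding a new one) can be illustrated with a small plain-Java sketch. Everything here is an assumption made for illustration: the class KeyCollisionSketch, its method names, and the simulated clock readings are hypothetical, and a TreeMap stands in for a table keyed by row key. The second key scheme, which adds a counter to the low three digits freed up by the *1000, is one possible workaround and not something proposed in the thread; it only stays unique while fewer than 1000 keys are generated per counter cycle within the same millisecond.

```java
import java.util.TreeMap;

// Toy illustration of row-key collisions. Hypothetical class, NOT the HBase API.
class KeyCollisionSketch {

    // Store one value per key, like a table keyed by row key:
    // writing an existing key overwrites it instead of adding a row.
    static int distinctRows(long[] keys) {
        TreeMap<Long, String> table = new TreeMap<>();
        for (long key : keys) {
            table.put(key, "value");
        }
        return table.size();
    }

    // Keys built only from a millisecond clock reading, as in the thread.
    static long[] clockOnlyKeys(long[] clockReadingsMillis) {
        long[] keys = new long[clockReadingsMillis.length];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = clockReadingsMillis[i] * 1000;
        }
        return keys;
    }

    // Same keys, plus a running counter in the low three digits.
    static long[] clockPlusSequenceKeys(long[] clockReadingsMillis) {
        long[] keys = new long[clockReadingsMillis.length];
        long sequence = 0;
        for (int i = 0; i < keys.length; i++) {
            keys[i] = clockReadingsMillis[i] * 1000 + (sequence % 1000);
            sequence++;
        }
        return keys;
    }

    public static void main(String[] args) {
        // Simulated clock: three puts in one millisecond, two in the next.
        long[] readings = {100L, 100L, 100L, 101L, 101L};
        System.out.println("puts issued:            " + readings.length);
        System.out.println("rows, clock-only keys:  "
                + distinctRows(clockOnlyKeys(readings)));
        System.out.println("rows, clock + sequence: "
                + distinctRows(clockPlusSequenceKeys(readings)));
    }
}
```

With the simulated readings above, the clock-only scheme ends up with 2 rows for 5 puts, while the clock-plus-sequence scheme keeps all 5. A multi-threaded loader would need a shared, thread-safe counter (for example java.util.concurrent.atomic.AtomicLong) instead of the local variable used here.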