accumulo-dev mailing list archives

From dlmar...@comcast.net
Subject Re: Ingest speed
Date Tue, 05 May 2015 15:48:59 GMT

Your process seems sound; you likely just need to scale it up. If you are not seeing
wait times on the Accumulo monitor, then you have the ability to push more data. Are you processing
multiple JSON files concurrently?
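
If it helps, here is a rough sketch of what processing several files concurrently could look like: one task per file on a small thread pool, all feeding a single shared sink. Everything in it is illustrative. `ConcurrentIngest`, `ingestAll`, and the `Consumer<String>` sink are made-up names, the files are modelled as in-memory lists of lines, and the sink stands in for whatever parses a line and hands the resulting mutation to the BatchWriter. It assumes the sink is safe to call from several threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical helper, not part of Accumulo.
final class ConcurrentIngest {
    // Runs one task per "file" on a fixed-size pool and returns the total
    // number of lines handed to the shared sink.
    // Assumption: sink.accept() may be called from several threads at once.
    static long ingestAll(List<List<String>> files, int threads, Consumer<String> sink) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (List<String> file : files) {
                Callable<Long> task = () -> {
                    long n = 0;
                    for (String line : file) {
                        sink.accept(line); // parse + write would happen here
                        n++;
                    }
                    return n;
                };
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get(); // wait for each file to finish
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while ingesting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a file failed to ingest", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The pool size is the knob to turn: raise it while the monitor still shows no wait times, and stop once it does.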

----- Original Message -----

From: "Andrea Leoni" <andrealeoni88@gmail.com> 
To: dev@accumulo.apache.org 
Sent: Tuesday, May 5, 2015 11:32:54 AM 
Subject: Re: Ingest speed 

Thank you for your answer. 
Today I tried creating a big command file and pushing it to the shell (about 300k
inserts per file). As you said, it is too slow for me (about 600 rows inserted
per second).

I have been using Accumulo for just one week. I'm a noob, but I'm learning.

My app has to store a large amount of data.

The row is the timestamp, and the family/qualifier are the columns... I get my
data from a JSON file, so my app scans it for new records, parses it, and for
each record creates a mutation and pushes it to Accumulo with a BatchWriter...

Maybe I am doing something wrong, and fixing it could speed up my inserts.

Currently I:

LOOP 
1) read a JSON line
2) parse it
3) create a mutation
4) put the line's information into this mutation
5) use the BatchWriter to insert the mutation into Accumulo
END LOOP 

Is this all right? I know that steps 1) and 2) are slow, but they are necessary, and
I use the fastest JSON parser I've found online.
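
The loop above can be sketched roughly as follows. To keep it runnable without a cluster, it uses hypothetical stand-ins rather than the Accumulo classes: `ParsedRecord` plays the role of the mutation, `RecordSink` the role of the BatchWriter, and `BufferingSink` is a toy sink that sends records in batches of a fixed size. The parser is passed in as a function, since the JSON library is not named here. The point it illustrates is that the writer is created once, fed inside the loop, and closed once at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a mutation: row = timestamp, plus one column.
final class ParsedRecord {
    final String row, family, qualifier, value;
    ParsedRecord(String row, String family, String qualifier, String value) {
        this.row = row; this.family = family;
        this.qualifier = qualifier; this.value = value;
    }
}

// Hypothetical stand-in for the batch writer.
interface RecordSink {
    void add(ParsedRecord r); // only buffers
    void close();             // sends whatever is still buffered
}

// Toy sink that "sends" a batch whenever batchSize records are buffered.
final class BufferingSink implements RecordSink {
    private final int batchSize;
    private final List<ParsedRecord> buffer = new ArrayList<>();
    int flushes = 0; // number of batches sent
    int sent = 0;    // number of records sent
    BufferingSink(int batchSize) { this.batchSize = batchSize; }
    @Override public void add(ParsedRecord r) {
        buffer.add(r);
        if (buffer.size() >= batchSize) flush();
    }
    private void flush() {
        if (buffer.isEmpty()) return;
        sent += buffer.size();
        buffer.clear();
        flushes++;
    }
    @Override public void close() { flush(); }
}

final class IngestLoop {
    // Steps 1-5: read a line, parse it, hand the result to the sink.
    // The sink is opened by the caller and reused for the whole input.
    static long ingest(Iterable<String> lines,
                       Function<String, ParsedRecord> parser,
                       RecordSink sink) {
        long count = 0;
        for (String line : lines) {
            if (line.trim().isEmpty()) continue; // skip empty lines
            sink.add(parser.apply(line));        // no flush per record
            count++;
        }
        return count;
    }
}
```

If the real loop opens, flushes, or closes the writer for every record, each insert pays for a round trip of its own, which would defeat the batching.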

Thank you so much again! 
(and sorry again for my bad English!)



----- 
Andrea Leoni 
Italy 
Computer Engineering 