Try running it without threading to see whether it's a Cassandra problem or an issue with your threading code.
Perhaps split the file and run many single-threaded processes to load the data.
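
Something along these lines might do for the split-and-load approach. It is only a rough sketch in Python, since I don't know what language your loader is written in. The names split_file, load_part and insert_row are made up for illustration; insert_row stands in for whatever call actually writes a row to Cassandra in your client.

```python
import os
from itertools import islice


def split_file(path, lines_per_part, out_dir):
    """Split a text file into parts of at most lines_per_part lines.

    Returns the list of part file paths, in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    parts = []
    with open(path, "r", encoding="utf-8") as src:
        while True:
            # islice keeps consuming the same file iterator, so each
            # pass picks up where the previous chunk ended.
            chunk = list(islice(src, lines_per_part))
            if not chunk:
                break
            part_path = os.path.join(out_dir, f"part-{len(parts):05d}.tsv")
            with open(part_path, "w", encoding="utf-8") as dst:
                dst.writelines(chunk)
            parts.append(part_path)
    return parts


def load_part(part_path, insert_row):
    """Single-threaded loader for one part file.

    insert_row is a hypothetical callback that wraps the real
    Cassandra insert; it receives the list of TAB-separated fields.
    Returns the number of rows handed to insert_row.
    """
    count = 0
    with open(part_path, "r", encoding="utf-8") as src:
        for line in src:
            fields = line.rstrip("\n").split("\t")
            insert_row(fields)
            count += 1
    return count
```

Run split_file once, then start one process per part file, each calling load_part on its own part. If a single process still stalls at the same point, the threading is probably not the cause.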
Aaron
On 27 Jul, 2010, at 07:14 AM, Rana Aich <aichrana@gmail.com> wrote:
> Hi All,
>
> I have to load a huge quantity of data into Cassandra (~10 billion rows).
>
> I'm trying to load the data from files using multithreading.
>
> The idea is that each thread will read the TAB-delimited file and process a chunk of records.
>
> For example, Thread 1 reads lines 1-1000 and inserts into Cassandra.
> Thread 2 reads lines 1001-2000 and inserts into Cassandra.
> Thread 3 reads lines 2001-3000 and inserts into Cassandra.
>
> Thread 10 reads lines 9001-10000 and inserts into Cassandra.
> Thread 1 reads lines 10001-11000 and inserts into Cassandra.
> Thread 2 reads lines 11001-12000 and inserts into Cassandra.
>
> and so on...
>
> I'm testing with a small file of 200,000 records.
>
> But somehow the process gets stuck and doesn't proceed any further after processing, say, 16,000 records.
>
> I've attached my working file.
>
> Any help will be very much appreciated.
>
> Regards
>
> raich