cassandra-commits mailing list archives

From "Erik Forsberg (Commented) (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-3859) Add Progress Reporting to Cassandra OutputFormats
Date Wed, 22 Feb 2012 08:22:49 GMT


Erik Forsberg commented on CASSANDRA-3859:

bq. I am not seeing this on our end. Our job is running 50 reducers on our end, and it certainly
takes > timeout seconds (600 for us). It's progressing ...

Just to make sure we're measuring the same thing - are your reducers taking more than 600
seconds *after* the creation of sstables has finished?

For us, the creation of sstables takes ~10 minutes, and during that period the job is consuming
input, so Hadoop knows it's active. It's the loading phase that takes much longer, and the
task gets killed during it if I don't set mapred.task.timeout to a very high value.
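
To illustrate the failure mode described above, here is a minimal sketch of keeping a task alive while it waits on a long-running load. It is not taken from the attached patches. The class name `ProgressHeartbeat`, the method `awaitWithProgress`, and the polling interval are all hypothetical, and the `Runnable` stands in for whatever progress callback the job exposes (in a Hadoop task this would presumably be something like `Progressable.progress()`).

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: reports progress while waiting for a long-running load. */
public final class ProgressHeartbeat {
    private ProgressHeartbeat() {}

    /**
     * Blocks until {@code work} completes. Each time {@code intervalMillis}
     * elapses without completion, {@code progress} is invoked once so the
     * framework can see that the task is still alive.
     */
    public static <T> T awaitWithProgress(Future<T> work, Runnable progress, long intervalMillis)
            throws InterruptedException, ExecutionException {
        while (true) {
            try {
                // Wait at most one interval for the load to finish.
                return work.get(intervalMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException stillRunning) {
                // Not finished yet: signal liveness, then keep waiting.
                progress.run();
            }
        }
    }
}
```

With an interval well below the task timeout, the wait on the loading phase would no longer look idle to the framework, so the timeout would not need to be raised.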

bq. Brandon, one thing I could think of, is if they are adding a lot of batches, we don't
actually call progress until the loop is over.

Hmm.. what is "a batch" in this context?

> Add Progress Reporting to Cassandra OutputFormats
> -------------------------------------------------
>                 Key: CASSANDRA-3859
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Hadoop, Tools
>    Affects Versions: 1.1.0
>            Reporter: Samarth Gahire
>            Assignee: Brandon Williams
>            Priority: Minor
>              Labels: bulkloader, hadoop, mapreduce, sstableloader
>             Fix For: 1.1.0
>         Attachments: 0001-add-progress-reporting-to-BOF.txt, 0002-Add-progress-to-CFOF.txt
>   Original Estimate: 48h
>  Remaining Estimate: 48h
> When using the BulkOutputFormat to load data into Cassandra, the sstable loader should
report progress to the Hadoop job. Otherwise, if streaming for a particular task takes a
long time and no progress is reported, the job may kill the task with a timeout exception.
