cassandra-commits mailing list archives

From "Samarth Gahire (Commented) (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-3740) While using BulkOutputFormat unneccessarily look for the cassandra.yaml file.
Date Thu, 02 Feb 2012 07:39:55 GMT


Samarth Gahire commented on CASSANDRA-3740:

Cool! It's working perfectly with the updated patches.
Can you please explain:
1) What is the significance of "INPUT_INITIAL_THRIFT_ADDRESS" for BulkOutputFormat?
2) What am I supposed to provide there (if it is needed)?
3) Is there any need to provide the listen address of the Hadoop nodes for BulkOutputFormat? If
so, how do I provide it?

We are actually experiencing a problem while loading the data: it fails to connect
if the host the M/R job is running on is dual-stack, i.e. has both IPv4 and IPv6.
It does work when cassandra.yaml is provided, so maybe it is reading the listen address or
something else from cassandra.yaml.
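
To check whether a given Hadoop node really is dual-stack as seen from the JVM, a small diagnostic along these lines may help. It is only a sketch using the standard java.net classes; the class name DualStackCheck and its helper methods are made up for illustration and are not part of Cassandra or Hadoop. It lists the addresses on the local interfaces and reports whether both a non-loopback IPv4 and a non-loopback IPv6 address are present.

```java
import java.net.Inet4Address;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical diagnostic class; not part of Cassandra or Hadoop.
final class DualStackCheck {

    // Returns "IPv4", "IPv6", or "unknown" for the given address.
    static String family(InetAddress addr) {
        if (addr instanceof Inet4Address) return "IPv4";
        if (addr instanceof Inet6Address) return "IPv6";
        return "unknown";
    }

    // True if the list holds at least one non-loopback IPv4 address
    // and at least one non-loopback IPv6 address.
    static boolean isDualStack(List<InetAddress> addrs) {
        boolean v4 = false;
        boolean v6 = false;
        for (InetAddress a : addrs) {
            if (a.isLoopbackAddress()) continue;
            if (a instanceof Inet4Address) v4 = true;
            if (a instanceof Inet6Address) v6 = true;
        }
        return v4 && v6;
    }

    // Parses a literal IP address (no DNS lookup happens for literals).
    static InetAddress literal(String ip) {
        try {
            return InetAddress.getByName(ip);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IP literal: " + ip, e);
        }
    }

    // Collects every address bound to every local network interface.
    static List<InetAddress> localAddresses() throws SocketException {
        List<InetAddress> result = new ArrayList<>();
        Enumeration<NetworkInterface> ifaces = NetworkInterface.getNetworkInterfaces();
        if (ifaces == null) return result; // no interfaces reported
        for (NetworkInterface nif : Collections.list(ifaces)) {
            result.addAll(Collections.list(nif.getInetAddresses()));
        }
        return result;
    }

    public static void main(String[] args) throws SocketException {
        List<InetAddress> addrs = localAddresses();
        for (InetAddress a : addrs) {
            System.out.println(family(a) + "  " + a.getHostAddress());
        }
        System.out.println("dual-stack: " + isDualStack(addrs));
    }
}
```

If the node turns out to be dual-stack, one setting that may be worth trying is the standard JVM system property -Djava.net.preferIPv4Stack=true on the task JVMs. Whether that resolves this particular connection failure has not been confirmed here.
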
> While using BulkOutputFormat  unneccessarily look for the cassandra.yaml file.
> ------------------------------------------------------------------------------
>                 Key: CASSANDRA-3740
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 1.1
>            Reporter: Samarth Gahire
>            Assignee: Brandon Williams
>              Labels: cassandra, hadoop, mapreduce
>             Fix For: 1.1
>         Attachments: 0001-Make-DD-the-canonical-partitioner-source.txt, 0002-Prevent-loading-from-yaml.txt,
>                      0003-use-output-partitioner.txt, 0004-update-BOF-for-new-dir-layout.txt
> I am trying to use BulkOutputFormat to stream data from the map phase of a Hadoop job. I have
> set the Cassandra-related configuration using ConfigHelper. I have also looked into the Cassandra
> code, and it seems Cassandra takes care not to look for the cassandra.yaml file.
> But when I run the job, I still get the following error:
> {
> 12/01/13 11:30:04 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 12/01/13 11:30:04 INFO input.FileInputFormat: Total input paths to process : 1
> 12/01/13 11:30:04 INFO mapred.JobClient: Running job: job_201201130910_0015
> 12/01/13 11:30:05 INFO mapred.JobClient:  map 0% reduce 0%
> 12/01/13 11:30:23 INFO mapred.JobClient: Task Id : attempt_201201130910_0015_m_000000_0, Status : FAILED
> java.lang.Throwable: Child Error
>         at
> Caused by: Task process exit with nonzero status of 1.
>         at
> attempt_201201130910_0015_m_000000_0: Cannot locate cassandra.yaml
> attempt_201201130910_0015_m_000000_0: Fatal configuration error; unable to start server.
> }
> Also, let me know how I can make this cassandra.yaml file available to Hadoop mapreduce
