From: Ji Cheng
Date: Mon, 5 Sep 2011 21:07:38 +0800
Subject: Re: java.io.IOException: Could not get input splits
To: user@cassandra.apache.org

Hi. We hit the same problem here. Even the wordcount map/reduce example from the source tarball works fine with one node but fails with the same exception on a two-node cluster. CASSANDRA-3044 mentions that a temporary workaround is to disable node auto-discovery. Can anyone tell me how to do that in the wordcount example? Thanks.

On Fri, Sep 2, 2011 at 12:10 AM, Jian Fang wrote:
> Thanks. How soon will 0.8.5 be out? Is there any 0.8.5 snapshot version
> available?
>
> On Thu, Sep 1, 2011 at 11:57 AM, Jonathan Ellis wrote:
>> Sounds like https://issues.apache.org/jira/browse/CASSANDRA-3044,
>> fixed for 0.8.5
>>
>> On Thu, Sep 1, 2011 at 10:54 AM, Jian Fang wrote:
>> > Hi,
>> >
>> > I upgraded Cassandra from 0.8.2 to 0.8.4 and ran a Hadoop job to read
>> > data from Cassandra, but got the following errors:
>> >
>> > 11/09/01 11:42:46 INFO hadoop.SalesRankLoader: Start Cassandra reader...
>> > Exception in thread "main" java.io.IOException: Could not get input splits
>> >     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:157)
>> >     at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
>> >     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
>> >     at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
>> >     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
>> >     at com.barnesandnoble.hadoop.SalesRankLoader.run(SalesRankLoader.java:359)
>> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> >     at com.barnesandnoble.hadoop.SalesRankLoader.main(SalesRankLoader.java:408)
>> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >     at java.lang.reflect.Method.invoke(Method.java:597)
>> >     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>> > Caused by: java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: protocol = socket host = null
>> >     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>> >     at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>> >     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:153)
>> >     ... 12 more
>> > Caused by: java.lang.IllegalArgumentException: protocol = socket host = null
>> >     at sun.net.spi.DefaultProxySelector.select(DefaultProxySelector.java:151)
>> >     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:358)
>> >     at java.net.Socket.connect(Socket.java:529)
>> >     at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
>> >     at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
>> >     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.createConnection(ColumnFamilyInputFormat.java:243)
>> >     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:217)
>> >     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:70)
>> >     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:190)
>> >     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:175)
>> >     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>> >     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>> >     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> >     at java.lang.Thread.run(Thread.java:662)
>> >
>> > The code used to work on 0.8.2, and it is really strange to see host = null.
>> > My code is very similar to the word count example:
>> >
>> >     logger.info("Start Cassandra reader...");
>> >     Job job2 = new Job(getConf(), "SalesRankCassandraReader");
>> >     job2.setJarByClass(SalesRankLoader.class);
>> >     job2.setMapperClass(CassandraReaderMapper.class);
>> >     job2.setReducerClass(CassandraToFilesystem.class);
>> >     job2.setOutputKeyClass(Text.class);
>> >     job2.setOutputValueClass(IntWritable.class);
>> >     job2.setMapOutputKeyClass(Text.class);
>> >     job2.setMapOutputValueClass(IntWritable.class);
>> >     FileOutputFormat.setOutputPath(job2, new Path(outPath));
>> >
>> >     job2.setInputFormatClass(ColumnFamilyInputFormat.class);
>> >
>> >     ConfigHelper.setRpcPort(job2.getConfiguration(), "9260");
>> >     ConfigHelper.setInitialAddress(job2.getConfiguration(), "dnjsrcha02");
>> >     ConfigHelper.setPartitioner(job2.getConfiguration(), "org.apache.cassandra.dht.RandomPartitioner");
>> >     ConfigHelper.setInputColumnFamily(job2.getConfiguration(), KEYSPACE, columnFamily);
>> >     // ConfigHelper.setInputSplitSize(job2.getConfiguration(), 5000);
>> >     ConfigHelper.setRangeBatchSize(job2.getConfiguration(), batchSize);
>> >     SlicePredicate predicate = new SlicePredicate().setColumn_names(Arrays.asList(ByteBufferUtil.bytes(columnName)));
>> >     ConfigHelper.setInputSlicePredicate(job2.getConfiguration(), predicate);
>> >
>> >     job2.waitForCompletion(true);
>> >
>> > The Cassandra cluster includes 6 nodes and I am pretty sure they work fine.
>> >
>> > Please help.
>> >
>> > Thanks,
>> >
>> > John
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
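[Archive note] One detail worth adding for readers of this thread: the IllegalArgumentException at the root of the trace is raised by the JDK's default ProxySelector, not by Cassandra or Thrift. Socket.connect() asks the proxy selector about a "socket://host:port" URI before connecting, and the selector rejects any URI whose host is null, which is what happens when getSubSplits() hands createConnection() an endpoint it could not resolve. The stdlib-only sketch below (nothing Cassandra-specific; the class name NullHostDemo and the "socket:/" stand-in URI are made up for illustration, assuming a stock JDK DefaultProxySelector) reproduces the same failure mode:

```java
import java.net.ProxySelector;
import java.net.URI;

// Minimal stand-in for what Socket.connect() does internally: it asks the
// default ProxySelector about a "socket" URI built from the target host.
// When that host is null, sun.net.spi.DefaultProxySelector.select() throws
// IllegalArgumentException before any connection is attempted.
public class NullHostDemo {
    public static void main(String[] args) throws Exception {
        try {
            // "socket:/" is a syntactically valid URI whose host is null,
            // mimicking a split endpoint that was never resolved to a host.
            ProxySelector.getDefault().select(new URI("socket:/"));
            System.out.println("no exception");
        } catch (IllegalArgumentException e) {
            // Message has the form "protocol = <scheme> host = <host>"
            System.out.println(e.getMessage());
        }
    }
}
```

So the "protocol = socket host = null" message is purely a symptom: the real question, per CASSANDRA-3044, is why the split-endpoint discovery produced a null host in the first place.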