From: xeonmailinglist-gmail
To: "user@hadoop.apache.org"
Date: Mon, 27 Apr 2015 22:36:03 +0100
Subject: how to merge sequence files?
Message-ID: <553EABC3.7070109@gmail.com>

Hi,

I have several directories, each containing several files written with SequenceFileOutputFormat, with org.apache.hadoop.io.Text as both the key and the value type. I want to merge all these files into one.

I have looked at the join example [1], but it is not working. How do I merge SequenceFiles?

+ /root/Programs/hadoop/bin/hadoop jar /root/Programs/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar join -inFormat org.apache.hadoop.mapred.SequenceFileInputFormat -outFormat org.apache.hadoop.mapred.SequenceFileOutputFormat -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text /wiki-output,/wiki-output2 /aggregate-output
15/04/27 17:34:47 INFO client.RMProxy: Connecting to ResourceManager at /172.16.100.11:8040
java.lang.ClassCastException: class org.apache.hadoop.mapred.SequenceFileOutputFormat
        at java.lang.Class.asSubclass(Class.java:3168)
        at org.apache.hadoop.examples.Join.run(Join.java:112)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.examples.Join.main(Join.java:173)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:622)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:622)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
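
As a side note on the failure itself: the top frame of the trace is Class.asSubclass, which throws a ClassCastException when the class it is called on is not a subtype of the expected type. The sketch below reproduces that behaviour with JDK classes only. The names AsSubclassDemo and check, and the use of String, Integer and Number as stand-ins for the Hadoop classes, are assumptions made for illustration; this is not code from the Join example.

```java
// Illustration only: Integer, String and Number stand in for the Hadoop classes.
final class AsSubclassDemo {

    // Loads a class by name and checks it against an expected supertype
    // using Class.asSubclass, the method at the top of the stack trace.
    static String check(String className, Class<?> expected) {
        try {
            Class<?> loaded = Class.forName(className).asSubclass(expected);
            return "ok: " + loaded.getName();
        } catch (ClassCastException e) {
            // Thrown when the loaded class is not a subtype of 'expected'.
            return "ClassCastException: " + e.getMessage();
        } catch (ClassNotFoundException e) {
            return "ClassNotFoundException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Integer is a subtype of Number, so the check passes.
        System.out.println(check("java.lang.Integer", Number.class));
        // String is not a subtype of Number, so asSubclass throws.
        System.out.println(check("java.lang.String", Number.class));
    }
}
```

If the same thing is happening inside Join.run, the class given to -outFormat is presumably not a subtype of the output format type that the example expects, but I have not confirmed this against the Join source.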
