From: Nitin Pawar
To: user@hadoop.apache.org
Date: Fri, 27 Dec 2013 19:08:04 +0530
Subject: Re: Split the File using mapreduce

1) If you have a CSV file and need to do this often without writing a lot of code, create a Hive table with "," as the field delimiter, then select the columns you want from the table and write them to a file.

2) If you are good at scripting, look at Pig scripting and write the results to files.

3) If you want to do it through a MapReduce program of your own, take a look at MultipleOutputFormat and TextInputFormat.

On Fri, Dec 27, 2013 at 6:56 PM, Ranjini Rathinam wrote:
> Hi,
>
> I have a file with 16 fields such as
> id,name,sa,dept,exp,address,company,phone,mobile,project,redk,........ so on
>
> My scenario is to split the first eight attributes into one file and
> the other eight attributes into another file using a MapReduce program.
>
> So the first eight attributes and their values go in one file as
> id,name,sa,dept,exp,address,company,phone
>
> and the rest of the attributes and their values go in another file,
> using a MapReduce program.
>
> I am using Hadoop 0.20 and Java 1.6.
> Thanks in advance
>
> Regards,
> Ranjini.R

--
Nitin Pawar
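
For option 3, the per-record logic that a mapper would apply can be sketched in plain Java. This is only an illustration under stated assumptions: the class FieldSplitter, its methods, and the sample record are hypothetical and not part of Hadoop, the input is assumed to be simple CSV with no quoted or escaped commas, and the code sticks to Java 1.6 features. In a real job the two resulting strings would be written to two separate outputs rather than printed.

```java
// Plain-Java sketch of the per-record split; no Hadoop classes are used here.
// FieldSplitter and its methods are hypothetical names chosen for illustration.
class FieldSplitter {

    // Joins fields[from..to) back into one comma-separated string.
    static String join(String[] fields, int from, int to) {
        StringBuilder sb = new StringBuilder();
        for (int i = from; i < to; i++) {
            if (i > from) {
                sb.append(',');
            }
            sb.append(fields[i]);
        }
        return sb.toString();
    }

    // Returns a two-element array: the first `firstCount` fields, then the rest.
    // Assumes simple CSV: every comma is a field separator.
    static String[] splitLine(String line, int firstCount) {
        String[] fields = line.split(",", -1); // -1 keeps trailing empty fields
        int cut = Math.min(firstCount, fields.length);
        return new String[] {
            join(fields, 0, cut),
            join(fields, cut, fields.length)
        };
    }

    public static void main(String[] args) {
        // Made-up 16-field record, used only to exercise the split.
        String record = "1,asha,5000,it,3,chennai,acme,044123,98400,hdfs,x,y,z,a,b,c";
        String[] halves = splitLine(record, 8);
        System.out.println(halves[0]); // first eight fields
        System.out.println(halves[1]); // remaining fields
    }
}
```

Inside a mapper, splitLine would be called once per input line delivered by TextInputFormat, with each half sent to its own named output.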