Date: Sat, 16 Feb 2013 16:55:58 +0100
Subject: seqdirectory command in MapReduce
From: Claudio Reggiani
To: user@mahout.apache.org

Hello,

I have a text dataset. Running the "seqdirectory" command on it, I see that it is not implemented as a MapReduce job (the source code of SequenceFilesFromDirectory confirms this).

What if I have a big dataset stored in HDFS and would like to convert it to SequenceFile format? Do I need to write my own custom job, or does seqdirectory handle that?

Thanks
Claudio Reggiani