Date: Tue, 2 Jul 2013 13:20:37 -0700
Subject: Re: Observation with office-scraper plugin
From: Lewis John Mcgibbney
To: "dev@any23.apache.org"

This is all utter rubbish. Please see
https://issues.apache.org/jira/browse/ANY23-164

On Tue, Jul 2, 2013 at 12:47 PM, Lewis John Mcgibbney <lewis.mcgibbney@gmail.com> wrote:

> Hi,
> For the first time today I have a use case for the office-scraper plugin [0].
> The command line tools come in pretty handy here and I made the following
> observation.
> If you are working with xls (older) or xlsx (newer, 2007-2010) formats, the
> files need to be ***originally*** written in Microsoft Excel. I can only
> assume that this is because the MIME type metadata is written and maintained
> based on the original editor.
> For example, I created two Excel documents in LibreOffice (ouch) as I am
> using Ubuntu... I saved them to my desktop and ran
>
> law@CEE279Law3-Linux:~/Desktop$ any23 mimes file:///home/law/spec_table.xls
> Display all 190 possibilities? (y or n)
> Linux:~/Desktop$ any23 mimes file:///home/law/Desktop/spec_table.xls
>
> ------------------------------------------------------------------------
> Apache Any23 :: mimes
> ------------------------------------------------------------------------
>
> application/x-tika-msoffice
>
> ------------------------------------------------------------------------
> Apache Any23 SUCCESS
> Total time: 0s
> Finished at: Tue Jul 02 12:37:20 PDT 2013
> Final Memory: 25M/479M
> ------------------------------------------------------------------------
> Linux:~/Desktop$ any23 mimes file:///home/law/Desktop/spec_table.xlsx
>
> ------------------------------------------------------------------------
> Apache Any23 :: mimes
> ------------------------------------------------------------------------
>
> application/x-tika-ooxml
>
> ------------------------------------------------------------------------
> Apache Any23 SUCCESS
> Total time: 0s
> Finished at: Tue Jul 02 12:37:29 PDT 2013
> Final Memory: 25M/479M
> ------------------------------------------------------------------------
>
> When I do
>
> Linux:~/Desktop$ any23 verify ~/.any23/plugins
> ------------------------------------------------------------------------
> Apache Any23 :: verify
> ------------------------------------------------------------------------
>
> Plugin author :
> Plugin factory : class org.apache.any23.plugin.officescraper.ExcelExtractorFactory
> Plugin mime-types: application/vnd.ms-excel;q=0.1 application/msexcel;q=0.1 application/x-msexcel;q=0.1 application/x-ms-excel;q=0.1
> ------------------------------------------------------------------------
>
> The plugin will ***only*** work with the document formats
> application/vnd.ms-excel;q=0.1 application/msexcel;q=0.1
> application/x-msexcel;q=0.1 application/x-ms-excel;q=0.1
>
> So I am running between the library and my office punching in trivial
> spreadsheets to achieve what I want to do... the joys.
>
> Thanks
> Lewis
>
> [0] *http://s.apache.org/UaG*
>
> --
> *Lewis*
>

--
*Lewis*
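
For anyone reading this thread in the archive, the mismatch described in the quoted message can be checked mechanically. The following is a minimal Python sketch, not Any23's actual matching logic: it only compares the MIME types reported by `any23 mimes` against the list printed by `any23 verify`, both copied from the output quoted above. The helper names `parse_mime_list` and `is_declared` are hypothetical and exist only for this illustration.

```python
from typing import Dict

# Illustrative sketch only: this is NOT how Any23 itself selects extractors.
# All helper names below are hypothetical.


def parse_mime_list(spec: str) -> Dict[str, float]:
    """Parse 'type/subtype;q=0.1 type/subtype;q=0.1 ...' into {mime: q}."""
    result = {}
    for entry in spec.split():
        mime, _, params = entry.partition(";")
        q = 1.0  # assumed default weight when no q parameter is present
        for param in params.split(";"):
            key, _, value = param.partition("=")
            if key.strip() == "q" and value:
                q = float(value)
        result[mime.strip().lower()] = q
    return result


# Copied from the `any23 verify` output in the quoted message.
PLUGIN_MIME_TYPES = parse_mime_list(
    "application/vnd.ms-excel;q=0.1 application/msexcel;q=0.1 "
    "application/x-msexcel;q=0.1 application/x-ms-excel;q=0.1"
)

# Copied from the `any23 mimes` output in the quoted message.
DETECTED = {
    "spec_table.xls": "application/x-tika-msoffice",
    "spec_table.xlsx": "application/x-tika-ooxml",
}


def is_declared(mime: str) -> bool:
    """True if the plugin's printed mime-type list contains this exact type."""
    return mime.lower() in PLUGIN_MIME_TYPES


for name, mime in DETECTED.items():
    print(f"{name}: {mime} declared by plugin? {is_declared(mime)}")
```

Neither detected type appears in the plugin's printed list, which is the literal mismatch the quoted message reports. Whether that mismatch has anything to do with the editor that produced the files is a separate question; see ANY23-164 as referenced in the reply.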