any23-dev mailing list archives

From Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>
Subject Re: Observation with office-scraper plugin
Date Tue, 02 Jul 2013 20:20:37 GMT
This is all utter rubbish.
Please see
https://issues.apache.org/jira/browse/ANY23-164


On Tue, Jul 2, 2013 at 12:47 PM, Lewis John Mcgibbney <
lewis.mcgibbney@gmail.com> wrote:

> Hi,
> For the first time today I have a use case of the office-scraper plugin
> [0].
> The command line tools come in pretty handy here and I made the following
> observation.
> If you are working with xls (older) or xlsx (newer, 2007-2010) formats,
> they need to be ***originally*** written in Microsoft Excel. I can only
> assume that this is because the mimetype metadata is written and
> maintained based on the original editor.
> For example, I created two Excel documents in LibreOffice (ouch) as I am
> using Ubuntu. I saved them to my desktop and ran:
>
> law@CEE279Law3-Linux:~/Desktop$ any23 mimes
> file:///home/law/spec_table.xls
> Display all 190 possibilities? (y or n)
> Linux:~/Desktop$ any23 mimes file:///home/law/Desktop/spec_table.xls
>
> ------------------------------------------------------------------------
> Apache Any23 :: mimes
> ------------------------------------------------------------------------
>
> application/x-tika-msoffice
>
> ------------------------------------------------------------------------
> Apache Any23 SUCCESS
> Total time: 0s
> Finished at: Tue Jul 02 12:37:20 PDT 2013
> Final Memory: 25M/479M
> ------------------------------------------------------------------------
> Linux:~/Desktop$ any23 mimes file:///home/law/Desktop/spec_table.xlsx
>
> ------------------------------------------------------------------------
> Apache Any23 :: mimes
> ------------------------------------------------------------------------
>
> application/x-tika-ooxml
>
> ------------------------------------------------------------------------
> Apache Any23 SUCCESS
> Total time: 0s
> Finished at: Tue Jul 02 12:37:29 PDT 2013
> Final Memory: 25M/479M
> ------------------------------------------------------------------------
>
> When I do
>
> Linux:~/Desktop$ any23 verify ~/.any23/plugins
> ------------------------------------------------------------------------
> Apache Any23 :: verify
> ------------------------------------------------------------------------
>
> Plugin author    : <unknown>
> Plugin factory   : class org.apache.any23.plugin.officescraper.ExcelExtractorFactory
> Plugin mime-types: application/vnd.ms-excel;q=0.1 application/msexcel;q=0.1 application/x-msexcel;q=0.1 application/x-ms-excel;q=0.1
> ------------------------------------------------------------------------
>
> The plugin will ***only*** work with documents whose detected MIME type
> is one of: application/vnd.ms-excel;q=0.1 application/msexcel;q=0.1
> application/x-msexcel;q=0.1 application/x-ms-excel;q=0.1
>
> So I am running between the library and my office punching in trivial
> spreadsheets to achieve what I want to do... the joys.
>
> Thanks
> Lewis
>
> [0] *http://s.apache.org/UaG*
>
> --
> *Lewis*
>
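
The comparison made in the quoted message can be sketched in Python, using only the values shown in the `any23 verify` and `any23 mimes` output. The helper names (`parse_mime_list`, `is_declared`) are hypothetical, and this is an assumption-laden illustration of a plain set-membership check, not Any23's actual extractor-selection logic (the reply above points to ANY23-164 for the follow-up).

```python
# Values copied from the quoted `any23 verify` output (plugin side)
# and the quoted `any23 mimes` output (detection side).
PLUGIN_MIME_TYPES = (
    "application/vnd.ms-excel;q=0.1 application/msexcel;q=0.1 "
    "application/x-msexcel;q=0.1 application/x-ms-excel;q=0.1"
)

DETECTED = {
    "spec_table.xls": "application/x-tika-msoffice",
    "spec_table.xlsx": "application/x-tika-ooxml",
}


def parse_mime_list(raw):
    """Split a space-separated MIME list and drop parameters such as ';q=0.1'."""
    return {entry.split(";", 1)[0].strip().lower() for entry in raw.split()}


def is_declared(detected, declared):
    """Return True if the detected type (without parameters) is in the declared set."""
    return detected.split(";", 1)[0].strip().lower() in declared


declared = parse_mime_list(PLUGIN_MIME_TYPES)

for name, mime in DETECTED.items():
    # Hypothetical check: literal string comparison only.
    print(f"{name}: {mime} -> declared by plugin: {is_declared(mime, declared)}")
```

Under this literal comparison, neither detected type appears in the plugin's declared list, which matches the observation in the quoted message.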



-- 
*Lewis*
