tika-dev mailing list archives: October 2010

Geoff Jarrad (JIRA) [jira] Commented: (TIKA-506) Improve doc and docx parsing to include more things Fri, 01 Oct, 00:05
Nick Burch (JIRA) [jira] Commented: (TIKA-506) Improve doc and docx parsing to include more things Fri, 01 Oct, 11:39
Dennis Adler (JIRA) [jira] Created: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Fri, 01 Oct, 17:50
Dennis Adler (JIRA) [jira] Updated: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Fri, 01 Oct, 19:30
Ken Krugler (JIRA) [jira] Commented: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Fri, 01 Oct, 19:52
Ken Krugler (JIRA) [jira] Assigned: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Fri, 01 Oct, 19:52
Jukka Zitting (JIRA) [jira] Commented: (TIKA-241) Rar archive support Sun, 03 Oct, 19:48
Jukka Zitting (JIRA) [jira] Commented: (TIKA-433) Tika + Hadoop Sun, 03 Oct, 19:58
Chris A. Mattmann (JIRA) [jira] Commented: (TIKA-433) Tika + Hadoop Sun, 03 Oct, 20:02
Grant Ingersoll (JIRA) [jira] Commented: (TIKA-433) Tika + Hadoop Sun, 03 Oct, 20:10
Chris A. Mattmann (JIRA) [jira] Commented: (TIKA-433) Tika + Hadoop Sun, 03 Oct, 20:14
Jukka Zitting (JIRA) [jira] Resolved: (TIKA-169) Tika Web Service Servlet Sun, 03 Oct, 20:40
Jukka Zitting (JIRA) [jira] Resolved: (TIKA-426) Parsing javascript as XML Sun, 03 Oct, 21:10
Jukka Zitting (JIRA) [jira] Commented: (TIKA-429) Error parsing DTD Sun, 03 Oct, 21:19
Jukka Zitting (JIRA) [jira] Resolved: (TIKA-427) Parsing CSS as XML Sun, 03 Oct, 21:21
Jan Høydahl (JIRA) [jira] Created: (TIKA-523) Add application/ms-tnef as alias to application/vnd.ms-tnef Mon, 04 Oct, 14:56
Dennis Adler (JIRA) [jira] Commented: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Mon, 04 Oct, 17:54
Jan Høydahl (JIRA) [jira] Commented: (TIKA-490) Support for adding language profiles dynamically Mon, 04 Oct, 19:13
Chris A. Mattmann (JIRA) [jira] Commented: (TIKA-490) Support for adding language profiles dynamically Mon, 04 Oct, 20:00
Dennis Adler (JIRA) [jira] Commented: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Mon, 04 Oct, 21:42
Nick Burch (JIRA) [jira] Commented: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Mon, 04 Oct, 21:52
Dennis Adler (JIRA) [jira] Commented: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Mon, 04 Oct, 22:53
Geoff Jarrad (JIRA) [jira] Created: (TIKA-524) Unification of HTML output from Office, OOXML and Open Document parsers Mon, 04 Oct, 23:18
Sjoerd Smeets (JIRA) [jira] Commented: (TIKA-521) OutOfMemoryError Parsing XSLX File Mon, 04 Oct, 23:45
Geoff Jarrad (JIRA) [jira] Created: (TIKA-525) Mismatched start and end elements in HtmlParser Tue, 05 Oct, 00:35
Geoff Jarrad (JIRA) [jira] Created: (TIKA-526) OOXMLParser fails to extract text from within smart tags Tue, 05 Oct, 03:59
Geoff Jarrad (JIRA) [jira] Updated: (TIKA-526) OOXMLParser fails to extract text from within smart tags Tue, 05 Oct, 04:01
Nick Burch (JIRA) [jira] Commented: (TIKA-521) OutOfMemoryError Parsing XSLX File Tue, 05 Oct, 08:45
Sjoerd Smeets (JIRA) [jira] Commented: (TIKA-521) OutOfMemoryError Parsing XSLX File Tue, 05 Oct, 16:17
Dennis Adler (JIRA) [jira] Issue Comment Edited: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Tue, 05 Oct, 19:56
Jukka Zitting (JIRA) [jira] Commented: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Tue, 05 Oct, 20:04
Dennis Adler (JIRA) [jira] Updated: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Tue, 05 Oct, 23:35
Dennis Adler (JIRA) [jira] Issue Comment Edited: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Tue, 05 Oct, 23:41
Dennis Adler (JIRA) [jira] Issue Comment Edited: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Tue, 05 Oct, 23:41
Dennis Adler (JIRA) [jira] Commented: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Tue, 05 Oct, 23:58
Andrey Sidorenko (JIRA) [jira] Updated: (TIKA-516) Excel 5 files are inconsistently detected as either "application/msword" or "application/vnd.ms-excel" Wed, 06 Oct, 12:14
Nick Burch (JIRA) [jira] Resolved: (TIKA-516) Excel 5 files are inconsistently detected as either "application/msword" or "application/vnd.ms-excel" Wed, 06 Oct, 12:40
Staffan Olsson (JIRA) [jira] Updated: (TIKA-482) Refactor image and jpeg parsers for access to MetadataExtractor API Thu, 07 Oct, 09:15
Maxim Valyanskiy (JIRA) [jira] Commented: (TIKA-521) OutOfMemoryError Parsing XSLX File Thu, 07 Oct, 14:53
Dennis Adler (JIRA) [jira] Commented: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Thu, 07 Oct, 18:58
Nick Burch (JIRA) [jira] Commented: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Thu, 07 Oct, 20:31
Nick Burch (JIRA) [jira] Commented: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Thu, 07 Oct, 20:38
Jan Høydahl (JIRA) [jira] Created: (TIKA-527) Allow override mapping mime<-->parsers through config Fri, 08 Oct, 13:43
Bruno Dumon (JIRA) [jira] Created: (TIKA-528) Reuse tagsoup HtmlSchema instance across HtmlParsers (performance improvement) Sat, 09 Oct, 11:07
Bruno Dumon (JIRA) [jira] Updated: (TIKA-528) Reuse tagsoup HtmlSchema instance across HtmlParsers (performance improvement) Sat, 09 Oct, 11:09
Ken Krugler (JIRA) [jira] Assigned: (TIKA-528) Reuse tagsoup HtmlSchema instance across HtmlParsers (performance improvement) Sat, 09 Oct, 20:19
Ken Krugler (JIRA) [jira] Assigned: (TIKA-525) Mismatched start and end elements in HtmlParser Sat, 09 Oct, 20:23
Shinsuke Sugaya (JIRA) [jira] Updated: (TIKA-422) Wrong charset conversion in some RTF documents. Sun, 10 Oct, 01:12
Jukka Zitting (JIRA) [jira] Commented: (TIKA-527) Allow override mapping mime<-->parsers through config Sun, 10 Oct, 19:17
Dennis Adler (JIRA) [jira] Commented: (TIKA-522) AutoDetectParser treats HTML/XML files as Audio Sun, 10 Oct, 22:55
Jan Høydahl (JIRA) [jira] Updated: (TIKA-527) Allow override mapping mime<-->parsers through config Mon, 11 Oct, 06:22
Jan Høydahl (JIRA) [jira] Commented: (TIKA-527) Allow override mapping mime<-->parsers through config Mon, 11 Oct, 07:48
Sjoerd Smeets (JIRA) [jira] Commented: (TIKA-521) OutOfMemoryError Parsing XSLX File Tue, 12 Oct, 00:01
Sjoerd Smeets (JIRA) [jira] Updated: (TIKA-521) OutOfMemoryError Parsing XSLX File Tue, 12 Oct, 00:01
Radek (JIRA) [jira] Created: (TIKA-529) IBM420 charset detection's isLamAlef is allocation-happy Tue, 12 Oct, 01:01
Radek (JIRA) [jira] Updated: (TIKA-529) IBM420 charset detection's isLamAlef is allocation-happy Tue, 12 Oct, 01:29
Radek (JIRA) [jira] Updated: (TIKA-529) IBM420 charset detection's isLamAlef is allocation-happy Tue, 12 Oct, 01:41
Radek (JIRA) [jira] Updated: (TIKA-529) IBM420 charset detection's isLamAlef is allocation-happy Tue, 12 Oct, 01:41
Ken Krugler (JIRA) [jira] Assigned: (TIKA-529) IBM420 charset detection's isLamAlef is allocation-happy Tue, 12 Oct, 02:50
Radek (JIRA) [jira] Updated: (TIKA-529) IBM420 charset detection's isLamAlef is allocation-happy Tue, 12 Oct, 03:09
Radek (JIRA) [jira] Updated: (TIKA-529) IBM420 charset detection's isLamAlef is allocation-happy Tue, 12 Oct, 03:09
Radek (JIRA) [jira] Commented: (TIKA-529) IBM420 charset detection's isLamAlef is allocation-happy Tue, 12 Oct, 04:31
Sjoerd Smeets (JIRA) [jira] Created: (TIKA-530) InvalidFormatException on a PackagePart in OOXML Tue, 12 Oct, 16:59
Ken Krugler (JIRA) [jira] Resolved: (TIKA-528) Reuse tagsoup HtmlSchema instance across HtmlParsers (performance improvement) Tue, 12 Oct, 20:27
Ken Krugler (JIRA) [jira] Issue Comment Edited: (TIKA-528) Reuse tagsoup HtmlSchema instance across HtmlParsers (performance improvement) Tue, 12 Oct, 20:31
Cristian Vat (JIRA) [jira] Updated: (TIKA-422) Wrong charset conversion in some RTF documents. Tue, 12 Oct, 21:42
Cristian Vat (JIRA) [jira] Commented: (TIKA-422) Wrong charset conversion in some RTF documents. Tue, 12 Oct, 21:48
Cristian Vat (JIRA) [jira] Updated: (TIKA-422) Wrong charset conversion in some RTF documents. Wed, 13 Oct, 00:54
Jukka Zitting (JIRA) [jira] Reopened: (TIKA-446) Upgrade to PDFBox 1.2.1 Thu, 14 Oct, 09:11
Jukka Zitting (JIRA) [jira] Updated: (TIKA-446) Upgrade to PDFBox 1.3.0 Thu, 14 Oct, 09:13
Cristian Vat (JIRA) [jira] Updated: (TIKA-422) Wrong charset conversion in some RTF documents. Thu, 14 Oct, 18:54
Cristian Vat (JIRA) [jira] Commented: (TIKA-422) Wrong charset conversion in some RTF documents. Thu, 14 Oct, 20:34
Sjoerd Smeets (JIRA) [jira] Created: (TIKA-531) xmpTPg:NPages creates invalid XML Fri, 15 Oct, 22:28
Reinhard Schwab (JIRA) [jira] Created: (TIKA-532) missing spaces in text extraction of BodyContentHandler Sat, 16 Oct, 17:50
qubit configuration file Sat, 16 Oct, 19:27
Geoff Jarrad (JIRA) [jira] Created: (TIKA-533) Mis-detection of zip-within-zip as application/vnd.apple.iwork, with no output by CLI app Sun, 17 Oct, 22:23
Geoff Jarrad (JIRA) [jira] Updated: (TIKA-533) Mis-detection of zip-within-zip as application/vnd.apple.iwork, with no output by CLI app Sun, 17 Oct, 22:25
Nick Burch (JIRA) [jira] Commented: (TIKA-533) Mis-detection of zip-within-zip as application/vnd.apple.iwork, with no output by CLI app Sun, 17 Oct, 22:51
Geoff Jarrad (JIRA) [jira] Commented: (TIKA-533) Mis-detection of zip-within-zip as application/vnd.apple.iwork, with no output by CLI app Mon, 18 Oct, 00:21
Jukka Zitting (JIRA) [jira] Commented: (TIKA-533) Mis-detection of zip-within-zip as application/vnd.apple.iwork, with no output by CLI app Mon, 18 Oct, 10:06
Geoff Jarrad (JIRA) [jira] Created: (TIKA-534) MetadataException: Unsupported component id error parsing jpg Mon, 18 Oct, 22:53
Geoff Jarrad (JIRA) [jira] Updated: (TIKA-534) MetadataException: Unsupported component id error parsing jpg Mon, 18 Oct, 22:55
Geoff Jarrad (JIRA) [jira] Updated: (TIKA-534) MetadataException: Unsupported component id error parsing jpg Mon, 18 Oct, 23:46
Staffan Olsson (JIRA) [jira] Commented: (TIKA-482) Refactor image and jpeg parsers for access to MetadataExtractor API Tue, 19 Oct, 05:56
Jukka Zitting (JIRA) [jira] Created: (TIKA-535) Implement Apache project branding requirements Tue, 19 Oct, 09:53
Nick Burch (JIRA) [jira] Commented: (TIKA-482) Refactor image and jpeg parsers for access to MetadataExtractor API Tue, 19 Oct, 12:20
Nick Burch (JIRA) [jira] Commented: (TIKA-533) Mis-detection of zip-within-zip as application/vnd.apple.iwork, with no output by CLI app Tue, 19 Oct, 14:59
Nick Burch (JIRA) [jira] Commented: (TIKA-533) Mis-detection of zip-within-zip as application/vnd.apple.iwork, with no output by CLI app Tue, 19 Oct, 15:56
Jukka Zitting (JIRA) [jira] Updated: (TIKA-533) Mis-detection of zip files as application/vnd.apple.iwork Tue, 19 Oct, 16:30
Nick Burch (JIRA) [jira] Commented: (TIKA-533) Mis-detection of zip files as application/vnd.apple.iwork Tue, 19 Oct, 17:52
Alex Skochin (JIRA) [jira] Updated: (TIKA-422) Wrong charset conversion in some RTF documents. Thu, 21 Oct, 14:37
Alex Skochin (JIRA) [jira] Issue Comment Edited: (TIKA-422) Wrong charset conversion in some RTF documents. Thu, 21 Oct, 14:41
Alex Skochin (JIRA) [jira] Issue Comment Edited: (TIKA-422) Wrong charset conversion in some RTF documents. Thu, 21 Oct, 15:15
Jukka Zitting Gearing up for Tika 0.8 Thu, 21 Oct, 19:28
Mattmann, Chris A (388J) Re: Gearing up for Tika 0.8 Thu, 21 Oct, 22:29
Jukka Zitting (JIRA) [jira] Created: (TIKA-536) Updated site layout Fri, 22 Oct, 13:25
Jukka Zitting (JIRA) [jira] Commented: (TIKA-536) Updated site layout Fri, 22 Oct, 13:29
Nick Burch Re: Gearing up for Tika 0.8 Fri, 22 Oct, 15:11
Nick Burch (JIRA) [jira] Commented: (TIKA-536) Updated site layout Fri, 22 Oct, 15:24
Nick Burch (JIRA) [jira] Commented: (TIKA-530) InvalidFormatException on a PackagePart in OOXML Fri, 22 Oct, 16:55