tika-dev mailing list archives: May 2010

Christian Kohlschütter (JIRA) [jira] Created: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages Fri, 07 May, 20:15
Christian Kohlschütter (JIRA) [jira] Updated: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages Fri, 07 May, 20:20
Christian Kohlschütter (JIRA) [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages Fri, 14 May, 11:07
Alex Ott Re: [jira] Updated: (TIKA-402) Support for Keynote and Pages documents Mon, 31 May, 07:23
Alex Ott Re: [jira] Updated: (TIKA-402) Support for Keynote and Pages documents Mon, 31 May, 07:30
Alex Ott Re: [jira] Updated: (TIKA-402) Support for Keynote and Pages documents Mon, 31 May, 09:13
Alex Ott Re: [jira] Updated: (TIKA-402) Support for Keynote and Pages documents Mon, 31 May, 19:17
Andrew Khoury (JIRA) [jira] Created: (TIKA-434) Bug in TagSoup causes IOException Thu, 27 May, 21:09
Andrew Khoury (JIRA) [jira] Commented: (TIKA-434) Bug in TagSoup causes IOException Thu, 27 May, 21:47
Andrew Khoury (JIRA) [jira] Updated: (TIKA-434) Bug in TagSoup causes IOException Thu, 27 May, 21:47
Andrew Khoury (JIRA) [jira] Updated: (TIKA-434) Bug in TagSoup causes IOException Thu, 27 May, 21:49
Andrew Khoury (JIRA) [jira] Updated: (TIKA-434) Bug in TagSoup causes IOException Thu, 27 May, 21:49
Andrew Khoury (JIRA) [jira] Updated: (TIKA-434) Bug in TagSoup causes IOException Thu, 27 May, 21:51
Andrzej Bialecki Re: Attributes in XHTML output Tue, 11 May, 09:40
Andrzej Bialecki Re: Attributes in XHTML output Tue, 11 May, 15:04
Apache Hudson Server Hudson build is back to normal : Tika-trunk #312 Tue, 11 May, 14:20
Chris A. Mattmann (JIRA) [jira] Assigned: (TIKA-379) Html elements and attributes not available in XHTML representation Wed, 05 May, 04:44
Chris A. Mattmann (JIRA) [jira] Created: (TIKA-421) DOAP file to recognize Tika on projects.a.o Sat, 08 May, 17:44
Chris A. Mattmann (JIRA) [jira] Resolved: (TIKA-421) DOAP file to recognize Tika on projects.a.o Sat, 08 May, 17:56
Chris A. Mattmann (JIRA) [jira] Commented: (TIKA-421) DOAP file to recognize Tika on projects.a.o Sat, 08 May, 18:00
Chris A. Mattmann (JIRA) [jira] Assigned: (TIKA-391) Intermittent errors detecting xls files Fri, 21 May, 17:41
Chris A. Mattmann (JIRA) [jira] Commented: (TIKA-391) Intermittent errors detecting xls files Fri, 21 May, 19:42
Chris A. Mattmann (JIRA) [jira] Created: (TIKA-432) Include NOTICE and LICENSE file updates for NCAR NetCDF parser lib Fri, 21 May, 19:48
Chris A. Mattmann (JIRA) [jira] Resolved: (TIKA-432) Include NOTICE and LICENSE file updates for NCAR NetCDF parser lib Fri, 21 May, 19:54
Chris A. Mattmann (JIRA) [jira] Updated: (TIKA-391) Intermittent errors detecting xls files Fri, 21 May, 20:02
Chris A. Mattmann (JIRA) [jira] Resolved: (TIKA-379) Html elements and attributes not available in XHTML representation Sun, 30 May, 23:51
Chris A. Mattmann (JIRA) [jira] Commented: (TIKA-402) Support for iWork documents Mon, 31 May, 16:48
Christoph Weidling (JIRA) [jira] Created: (TIKA-435) After using the GUI part of the cli sometimes temporary files are not removed. Mon, 31 May, 12:18
Daan de Wit Re: Tika now listed on projects.a.o Wed, 12 May, 06:37
Dave Meikle Re: [jira] Commented: (TIKA-396) Parser Attachements from Outlook Messages Sun, 02 May, 16:40
David Tran (JIRA) [jira] Created: (TIKA-423) Parse docx and output to text file missing words Mon, 17 May, 03:04
David Tran (JIRA) [jira] Updated: (TIKA-423) Parse docx and output to text file missing words Mon, 17 May, 03:04
David Tran (JIRA) [jira] Updated: (TIKA-423) Parse docx and output to text file missing words Mon, 17 May, 03:06
Erik Hetzner (JIRA) [jira] Created: (TIKA-425) Exception parsing mp3 Wed, 19 May, 00:10
Erik Hetzner (JIRA) [jira] Created: (TIKA-426) Parsing javascript as XML Wed, 19 May, 00:30
Erik Hetzner (JIRA) [jira] Created: (TIKA-427) Parsing CSS as XML Wed, 19 May, 00:34
Erik Hetzner (JIRA) [jira] Created: (TIKA-428) Unexpected RuntimeException when parsing PPTM (?) file Wed, 19 May, 01:30
Erik Hetzner (JIRA) [jira] Commented: (TIKA-428) Unexpected RuntimeException when parsing PPTM (?) file Wed, 19 May, 01:30
Erik Hetzner (JIRA) [jira] Created: (TIKA-429) Error parsing DTD Wed, 19 May, 20:33
Erik Hetzner (JIRA) [jira] Created: (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. Fri, 21 May, 17:59
Erik Hetzner (JIRA) [jira] Commented: (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. Fri, 21 May, 18:01
Gerd Bremer (JIRA) [jira] Commented: (TIKA-425) Exception parsing mp3 Wed, 19 May, 11:14
Gerd Bremer (JIRA) [jira] Issue Comment Edited: (TIKA-425) Exception parsing mp3 Wed, 19 May, 11:16
Gerd Bremer (JIRA) [jira] Updated: (TIKA-425) Exception parsing mp3 Wed, 19 May, 12:33
Grant Ingersoll (JIRA) [jira] Created: (TIKA-433) Tika + Hadoop Tue, 25 May, 21:13
Grant Ingersoll (JIRA) [jira] Commented: (TIKA-433) Tika + Hadoop Wed, 26 May, 10:38
Grant Ingersoll (JIRA) [jira] Commented: (TIKA-433) Tika + Hadoop Wed, 26 May, 12:43
Ian Holsman Re: confirm unsubscribe from dev@tika.apache.org Thu, 27 May, 18:52
Jukka Zitting Alternative RTF parsers (Was: [jira] Commented: (TIKA-422) Wrong charset conversion in some RTF documents.) Wed, 12 May, 14:34
Jukka Zitting Re: Boilerpipe issue with Maven central repository Fri, 21 May, 08:20
Jukka Zitting Re: Improved handling of attributes Wed, 26 May, 15:02
Jukka Zitting Re: Improved handling of attributes Wed, 26 May, 15:28
Jukka Zitting (JIRA) [jira] Created: (TIKA-419) Allow parser lookup from a custom class loader Tue, 04 May, 15:56
Jukka Zitting (JIRA) [jira] Resolved: (TIKA-419) Allow parser lookup from a custom class loader Tue, 04 May, 16:09
Jukka Zitting (JIRA) [jira] Commented: (TIKA-422) Wrong charset conversion in some RTF documents. Wed, 12 May, 13:06
Jukka Zitting (JIRA) [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages Wed, 12 May, 13:25
Jukka Zitting (JIRA) [jira] Commented: (TIKA-242) Incremental configuration AutoDetectParser Wed, 12 May, 13:41
Jukka Zitting (JIRA) [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages Wed, 12 May, 13:58
Jukka Zitting (JIRA) [jira] Resolved: (TIKA-415) Findbugs: XHTMLDowngradeHandler equals() comparing different types Wed, 12 May, 14:38
Jukka Zitting (JIRA) [jira] Resolved: (TIKA-417) Unable to parse the content for UCS2 Litte Endian encoded file Wed, 12 May, 15:34
Jukka Zitting (JIRA) [jira] Commented: (TIKA-418) RuntimeException while getting content for ppsx, ppsm, pptm, thmx and xps file types Wed, 12 May, 15:46
Jukka Zitting (JIRA) [jira] Commented: (TIKA-402) Support for Keynote and Pages documents Wed, 12 May, 16:44
Jukka Zitting (JIRA) [jira] Commented: (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. Wed, 26 May, 08:45
Jukka Zitting (JIRA) [jira] Commented: (TIKA-430) Automatically let all valid XHTML 1.0 attributes through from HTML documents Wed, 26 May, 08:47
Jukka Zitting (JIRA) [jira] Commented: (TIKA-429) Error parsing DTD Wed, 26 May, 08:51
Jukka Zitting (JIRA) [jira] Resolved: (TIKA-425) Exception parsing mp3 Wed, 26 May, 09:28
Jukka Zitting (JIRA) [jira] Resolved: (TIKA-428) Unexpected RuntimeException when parsing PPTM (?) file Wed, 26 May, 09:34
Jukka Zitting (JIRA) [jira] Commented: (TIKA-418) RuntimeException while getting content for ppsx, ppsm, pptm, thmx and xps file types Wed, 26 May, 09:46
Jukka Zitting (JIRA) [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages Wed, 26 May, 10:16
Jukka Zitting (JIRA) [jira] Commented: (TIKA-427) Parsing CSS as XML Wed, 26 May, 10:20
Jukka Zitting (JIRA) [jira] Resolved: (TIKA-424) Avoid ArrayIndexOutOfBoundsException on some mp3 files Wed, 26 May, 10:24
Jukka Zitting (JIRA) [jira] Commented: (TIKA-418) RuntimeException while getting content for ppsx, ppsm, pptm, thmx and xps file types Wed, 26 May, 10:26
Jukka Zitting (JIRA) [jira] Resolved: (TIKA-413) DWG Parser Wed, 26 May, 12:14
Jukka Zitting (JIRA) [jira] Commented: (TIKA-433) Tika + Hadoop Wed, 26 May, 12:49
Jukka Zitting (JIRA) [jira] Commented: (TIKA-402) Support for Keynote and Pages documents Wed, 26 May, 15:13
Jukka Zitting (JIRA) [jira] Commented: (TIKA-416) Out-of-process text extraction Thu, 27 May, 21:51
Jukka Zitting (JIRA) [jira] Updated: (TIKA-402) Support for iWork documents Mon, 31 May, 15:04
Jukka Zitting (JIRA) [jira] Commented: (TIKA-402) Support for iWork documents Mon, 31 May, 16:38
Jukka Zitting (JIRA) [jira] Resolved: (TIKA-435) After using the GUI part of the cli sometimes temporary files are not removed. Mon, 31 May, 17:00
Julien Nioche (JIRA) [jira] Updated: (TIKA-379) Html elements and attributes not available in XHTML representation Tue, 04 May, 09:19
Julien Nioche (JIRA) [jira] Commented: (TIKA-433) Tika + Hadoop Wed, 26 May, 07:34
Julien Nioche (JIRA) [jira] Commented: (TIKA-430) Automatically let all valid XHTML 1.0 attributes through from HTML documents Wed, 26 May, 09:28
Julien Nioche (JIRA) [jira] Commented: (TIKA-433) Tika + Hadoop Wed, 26 May, 11:14
Ken Krugler Attributes in XHTML output Tue, 11 May, 00:56
Ken Krugler Re: Attributes in XHTML output Tue, 11 May, 13:22
Ken Krugler Html5 parsing spec Tue, 18 May, 19:54
Ken Krugler Boilerpipe issue with Maven central repository Fri, 21 May, 00:58
Ken Krugler Improved handling of attributes Fri, 21 May, 01:08
Ken Krugler Re: Boilerpipe issue with Maven central repository Fri, 21 May, 03:50
Ken Krugler Re: Improved handling of attributes Thu, 27 May, 16:16
Ken Krugler (JIRA) [jira] Assigned: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages Fri, 07 May, 20:45
Ken Krugler (JIRA) [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages Fri, 07 May, 20:47
Ken Krugler (JIRA) [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages Wed, 12 May, 13:43
Ken Krugler (JIRA) [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages Wed, 12 May, 16:34
Ken Krugler (JIRA) [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages Sun, 16 May, 13:36
Ken Krugler (JIRA) [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages Sun, 16 May, 13:42
Ken Krugler (JIRA) [jira] Created: (TIKA-430) Automatically let all valid XHTML 1.0 attributes through from HTML documents Fri, 21 May, 01:04
Ken Krugler (JIRA) [jira] Assigned: (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. Wed, 26 May, 17:10
Ken Krugler (JIRA) [jira] Commented: (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. Wed, 26 May, 17:12
Martijn v Groningen Re: [jira] Updated: (TIKA-402) Support for Keynote and Pages documents Mon, 31 May, 09:10