[ https://issues.apache.org/jira/browse/TIKA-379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856815#action_12856815 ]

Julien Nioche commented on TIKA-379:
------------------------------------

There is actually a special treatment for the elements in HEAD done in the class HtmlHandler so simply adding *link* to the HTMLMapper does not solve the problem.

> Html elements and attributes not available in XHTML representation
> -------------------------------------------------------------------
>
>                 Key: TIKA-379
>                 URL: https://issues.apache.org/jira/browse/TIKA-379
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.7
>            Reporter: Julien Nioche
>            Priority: Critical
>
> The following HTML document:
> document 1 titlejotain suomeksi
> is rendered as the following xhtml by Tika:
> </head><body>document 1 titlejotain suomeksi</body></html>
> with the lang attribute getting lost. The lang is not stored in the metadata either.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira