lucene-dev mailing list archives

From "Erick Erickson (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (SOLR-1020) PreAnalyzed field analyzer
Date Sat, 16 Mar 2013 19:02:12 GMT

     [ https://issues.apache.org/jira/browse/SOLR-1020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson closed SOLR-1020.
--------------------------------


Closed as part of SPRING_CLEANING_2013; we can reopen if necessary.
                
> PreAnalyzed field analyzer
> --------------------------
>
>                 Key: SOLR-1020
>                 URL: https://issues.apache.org/jira/browse/SOLR-1020
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>    Affects Versions: 1.3
>            Reporter: Karl Wettin
>            Priority: Minor
>         Attachments: SOLR-1020.txt
>
>
> An Analyzer that produces a TokenStream from XML input containing a marshalled TokenStream. It also includes a static TokenStream XML marshaller.
> I kind of pulled this out of my pocket without testing it in a real environment, in order to get some comments on the solution before I add it to my project. So consider it a beta-patch.
> It uses the JSR 173 XML streaming API (StAX), which ships with Java 1.6; for Java 1.5 a compatible implementation can be downloaded from https://sjsxp.dev.java.net/
> XSD:
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified"
>            xmlns:xs="http://www.w3.org/2001/XMLSchema">
>     <xs:element name="tokens" type="tokensType"/>
>     <xs:complexType name="tokensType">
>         <xs:sequence>
>             <xs:element type="tokenType" name="token" maxOccurs="unbounded"/>
>         </xs:sequence>
>     </xs:complexType>
>     <xs:complexType name="tokenType">
>         <xs:sequence>
>             <xs:element type="xs:int" name="positionIncrement" maxOccurs="1"/>
>             <xs:element type="xs:string" name="term" minOccurs="1" maxOccurs="1"/>
>             <xs:element type="xs:string" name="type" maxOccurs="1"/>
>             <xs:element type="xs:int" name="startOffset" maxOccurs="1"/>
>             <xs:element type="xs:int" name="endOffset" maxOccurs="1"/>
>             <xs:element type="xs:int" name="flags" maxOccurs="1"/>
>             <xs:element type="payloadType" name="payload" maxOccurs="1"/>
>         </xs:sequence>
>     </xs:complexType>
>     <xs:complexType name="payloadType">
>         <xs:choice maxOccurs="1" minOccurs="1">
>             <xs:element type="bytesType" name="bytes"/>
>             <xs:element type="xs:string" name="hex"/>
>             <xs:element type="xs:string" name="base64"/>
>         </xs:choice>
>     </xs:complexType>
>     <xs:complexType name="bytesType">
>         <xs:sequence>
>             <xs:element type="xs:byte" name="byte" maxOccurs="unbounded" minOccurs="1"/>
>         </xs:sequence>
>     </xs:complexType>
> </xs:schema>
> {code}
> Even though the XSD defines several ways to encode a payload (<bytes>, <hex> and <base64>), only <hex> is currently supported.
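
Since only <hex> payloads are supported, a reader has to turn a string such as fffefd into raw bytes before it can attach them to a token. A minimal sketch of that one step, written without reference to the attached patch: HexPayload and decode are hypothetical names and are not taken from SOLR-1020.txt.

```java
// Hypothetical helper for illustration only; not part of the SOLR-1020 patch.
final class HexPayload {
    private HexPayload() {}

    /** Decodes a hex string such as "fffefd" into raw payload bytes. */
    static byte[] decode(String hex) {
        String s = hex.trim();
        if (s.length() % 2 != 0) {
            throw new IllegalArgumentException(
                "hex payload must have an even number of digits: " + s);
        }
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            // Character.digit returns -1 for anything that is not a hex digit.
            int hi = Character.digit(s.charAt(2 * i), 16);
            int lo = Character.digit(s.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit in payload: " + s);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [-1, -2, -3], because Java bytes are signed.
        System.out.println(java.util.Arrays.toString(decode("fffefd")));
    }
}
```

Rejecting odd-length or non-hex input up front is a design choice made here so that a malformed document fails loudly instead of producing a truncated payload.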
> Example XML:
> {code:xml}
> <tokens>
>   <token>
>     <positionIncrement>1</positionIncrement>
>     <term>term</term>
>     <type>type</type>
>     <startOffset>0</startOffset>
>     <endOffset>3</endOffset>
>     <flags>65535</flags>
>     <payload><hex>fffefd</hex></payload>
>   </token>
> </tokens>
> {code}
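
For anyone who wants to experiment with the format, the example document above can be read with the StAX cursor API (javax.xml.stream) mentioned in the description. This is only a sketch under stated assumptions, not the code from the attached patch: TokenXmlReader and ParsedToken are hypothetical names, the holder is a plain Java object rather than a Lucene Token, the defaults used for missing fields are a guess, and the payload is kept as an undecoded hex string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical reader for the <tokens> format; not part of the SOLR-1020 patch.
final class TokenXmlReader {

    /** Plain holder for one <token> element (assumed defaults, not a Lucene class). */
    static final class ParsedToken {
        int positionIncrement = 1;
        String term;
        String type;
        int startOffset;
        int endOffset;
        int flags;
        String payloadHex; // content of <payload><hex>, left undecoded
    }

    static List<ParsedToken> parse(String xml) {
        List<ParsedToken> tokens = new ArrayList<>();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            ParsedToken current = null;
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = r.getLocalName();
                    if ("token".equals(name)) {
                        current = new ParsedToken();
                    } else if (current != null) {
                        readField(r, name, current);
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "token".equals(r.getLocalName())) {
                    tokens.add(current);
                    current = null;
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed token XML", e);
        }
        return tokens;
    }

    // getElementText() is only called on text-only elements; <payload> has a
    // child element, so it falls through to the default branch.
    private static void readField(XMLStreamReader r, String name, ParsedToken t)
            throws XMLStreamException {
        switch (name) {
            case "positionIncrement": t.positionIncrement = intText(r); break;
            case "term":              t.term = r.getElementText(); break;
            case "type":              t.type = r.getElementText(); break;
            case "startOffset":       t.startOffset = intText(r); break;
            case "endOffset":         t.endOffset = intText(r); break;
            case "flags":             t.flags = intText(r); break;
            case "hex":               t.payloadHex = r.getElementText().trim(); break;
            default:                  break; // <payload>, unsupported <bytes>/<base64>
        }
    }

    private static int intText(XMLStreamReader r) throws XMLStreamException {
        return Integer.parseInt(r.getElementText().trim());
    }

    public static void main(String[] args) {
        String xml = "<tokens><token><positionIncrement>1</positionIncrement>"
                + "<term>term</term><type>type</type><startOffset>0</startOffset>"
                + "<endOffset>3</endOffset><flags>65535</flags>"
                + "<payload><hex>fffefd</hex></payload></token></tokens>";
        ParsedToken t = parse(xml).get(0);
        System.out.println(t.term + " " + t.startOffset + "-" + t.endOffset
                + " payload=" + t.payloadHex);
    }
}
```

A real Analyzer would feed these values into a TokenStream instead of a list; the list keeps the sketch independent of any particular Lucene version.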

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

