lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "JsonPreAnalyzedParser" by AndrzejBialecki
Date Fri, 11 May 2012 20:04:31 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "JsonPreAnalyzedParser" page has been changed by AndrzejBialecki:
http://wiki.apache.org/solr/JsonPreAnalyzedParser

New page:
= JsonPreAnalyzedParser format =

This is the default serialization format used by the PreAnalyzedField type. It uses a top-level
JSON map with the following keys:

 * `v` - version key (required). Currently the supported version is `1`.
 * `str` - stored string value of a field (optional). You can use at most one of `str` or `bin`.
 * `bin` - stored binary value of a field (optional). The binary value must be Base64-encoded.
 * `tokens` - serialized token stream (optional). This is a JSON list.

Any other top-level key is silently ignored.
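
The top-level rules above can be sketched in a few lines of Python. This is only an illustration, not the parser's own code: the helper names `build_pre_analyzed` and `check_top_level` are hypothetical, and accepting the version as either the string `"1"` or the number `1` is an assumption made here for leniency.

```python
import base64
import json


def build_pre_analyzed(stored=None, tokens=None):
    """Serialize a version-1 map. `stored` may be str, bytes, or None.

    Taking a single `stored` argument means `str` and `bin` can never
    both be emitted, which mirrors the "at most one" rule.
    """
    doc = {"v": "1"}
    if isinstance(stored, bytes):
        # Binary values are carried as Base64 text.
        doc["bin"] = base64.b64encode(stored).decode("ascii")
    elif isinstance(stored, str):
        doc["str"] = stored
    elif stored is not None:
        raise TypeError("stored must be str, bytes, or None")
    if tokens is not None:
        doc["tokens"] = list(tokens)
    return json.dumps(doc, ensure_ascii=False)


def check_top_level(doc):
    """Check an already-parsed map against the rules listed above."""
    # Assumption: both "1" and 1 are accepted as the version value.
    if str(doc.get("v")) != "1":
        raise ValueError("missing or unsupported version key 'v'")
    if "str" in doc and "bin" in doc:
        raise ValueError("use at most one of 'str' or 'bin'")
    if "tokens" in doc and not isinstance(doc["tokens"], list):
        raise ValueError("'tokens' must be a JSON list")
    # Unknown top-level keys are ignored, so they are not checked here.
    return doc
```

For instance, `build_pre_analyzed(b"\x00\x01")` yields a map whose `bin` value is `AAE=`.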

== Token stream serialization ==

The token stream is expressed as a JSON list of JSON maps. Each map consists of the following
keys and values:

 * `t` - token key (required). The value is a UTF-8 string that represents the current token.
 * `s`, `e` - start / end offset keys (optional - either both or neither must be present). The values are the start and end offsets of the token, respectively - both non-negative integers.
 * `i` - position increment key (optional - if missing, a value of `1` is assumed). The value is a non-negative integer that represents the position increment attribute.
 * `p` - payload key (optional). The value is a Base64 encoded payload value.
 * `y` - type key (optional). The value is a string, which is the token type name.
 * `f` - flags key (optional). The value is a string representing an integer value in hexadecimal format.
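
As a rough illustration of the keys listed above, the following Python sketch builds one token map. The function name `token_to_map` and its parameters are hypothetical. Writing the flags as lowercase hexadecimal without a `0x` prefix, and omitting `i` when it equals the default of `1`, are assumptions; the text above does not specify either.

```python
import base64


def token_to_map(term, start=None, end=None, pos_inc=1,
                 payload=None, type_name=None, flags=None):
    """Return a dict with the token keys t, s, e, i, p, y, f."""
    # Offsets come as a pair: both present or both absent.
    if (start is None) != (end is None):
        raise ValueError("give both start and end offsets, or neither")
    tok = {"t": term}
    if start is not None:
        if start < 0 or end < 0:
            raise ValueError("offsets must be non-negative")
        tok["s"] = start
        tok["e"] = end
    if pos_inc < 0:
        raise ValueError("position increment must be non-negative")
    if pos_inc != 1:
        # A missing 'i' is read as 1, so the default is left out here.
        tok["i"] = pos_inc
    if payload is not None:
        tok["p"] = base64.b64encode(payload).decode("ascii")
    if type_name is not None:
        tok["y"] = type_name
    if flags is not None:
        # Assumption: plain lowercase hex digits, no "0x" prefix.
        tok["f"] = format(flags, "x")
    return tok
```

A list of such maps can then be stored under the `tokens` key of the top-level map.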

== Example ==

{{{#!json
{
 "v":"1",
 "str":"test ąćęłńóśźż",
 "tokens":[
  {
   "e":128,
   "i":22,
   "p":"DQ4KDQsODg8=",
   "s":123,
   "t":"one",
   "y":"word"
  },
  {
   "e":8,
   "i":1,
   "s":5,
   "t":"two",
   "y":"word"
  },
  {
   "e":22,
   "i":1,
   "s":20,
   "t":"three",
   "y":"foobar"
  }
 ]
}
}}}
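
To show how the example above reads back, here is a small Python sketch that parses it, decodes the first token's payload, and applies the default position increment of `1` where `i` is absent. The variable names are illustrative only. The running totals are plain sums of the increments; how they map onto absolute positions depends on the consumer and is not covered here.

```python
import base64
import json
from itertools import accumulate

# The example document from above, trimmed of whitespace.
EXAMPLE = """
{"v":"1","str":"test ąćęłńóśźż","tokens":[
 {"e":128,"i":22,"p":"DQ4KDQsODg8=","s":123,"t":"one","y":"word"},
 {"e":8,"i":1,"s":5,"t":"two","y":"word"},
 {"e":22,"i":1,"s":20,"t":"three","y":"foobar"}
]}
"""

doc = json.loads(EXAMPLE)
tokens = doc["tokens"]

# The payload is Base64 text; decoding gives the raw bytes.
payload = base64.b64decode(tokens[0]["p"])

# A missing "i" key means an increment of 1.
increments = [tok.get("i", 1) for tok in tokens]
running_totals = list(accumulate(increments))

for tok, total in zip(tokens, running_totals):
    print(tok["t"], tok.get("s"), tok.get("e"), tok.get("y"), total)
print(payload.hex())  # 0d0e0a0d0b0e0e0f
```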
