lucene-java-user mailing list archives

From 秋水 <>
Subject Re:[java-user]How did you guys store category info
Date Thu, 20 Sep 2012 03:36:33 GMT
Replying to myself: I had not explored the API docs enough.
I found some interesting analyzers after sending the last email.

At 2012-09-20 10:23:09,"秋水" <> wrote:
>My project may need tree-style category information. How should I store it so that all leaf documents under a given category node can be retrieved?
>My current thought is to store the category hierarchy vertically, in separate fields: "level 1", "level 2", and so on, with a "level last" field appended. I have no idea yet how easy that would be to use.
>Before that, I wanted to store the layered category info in a single field, like "/usr/bin/...", but that does not seem to work well when a path element is a term or phrase that contains spaces.
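
One possible way around the spaces problem, sketched here under assumptions rather than taken from the thread: expand each path into all of its ancestor prefixes before indexing, and index every prefix as a single not-analyzed term. An exact-term match on "/usr" would then hit every document filed anywhere below "/usr". The class `CategoryPaths` and its method `ancestors` are hypothetical names, not part of Lucene, and the sketch assumes "/" never appears inside a category name. Lucene's PathHierarchyTokenizer emits similar prefixes at analysis time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: expands a category path into
// every ancestor prefix so each prefix can be indexed as one exact term.
class CategoryPaths {

    // Assumes '/' is the separator and never occurs inside a category name.
    static List<String> ancestors(String path) {
        List<String> prefixes = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String segment : path.split("/")) {
            if (segment.isEmpty()) {
                continue; // skip the empty piece before the leading '/'
            }
            current.append('/').append(segment);
            prefixes.add(current.toString());
        }
        return prefixes;
    }

    public static void main(String[] args) {
        // Segments may contain spaces because nothing splits on whitespace.
        for (String term : ancestors("/usr/local bin/my tools")) {
            System.out.println(term);
        }
    }
}
```

Each returned string would be added to the same not-analyzed field of the document, so retrieving a whole subtree becomes a plain exact-term lookup instead of a prefix or wildcard scan.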
>Not-analyzed fields can only be matched by the exact term. Building the field value as a custom sequence of terms seems impossible, and I also cannot query by matching the initial terms from the beginning of a field.
>I found there is a "regexQuery" contrib, which I have not tried yet. There is no detailed demo or how-to for it.
>I have not studied term vector storage or a self-defined byte stream yet. Both seem too complicated, and not a wise choice for a Lucene project.
>Maybe I could store the root category with a "root" word and force all subcategories to use other words, or just store this kind of info in a DB and reference it by id. Or, as a dirty workaround, use "encodeURIComponent"-like functions or a reversible encryption transform.
>Then use a proximity query, "1 2 3"~0.
>I read an article on IBM's site about geographic search in Lucene. The author referred to a geohash function, which produces hash values with the same prefix for positions inside the same district. A good approach. But wildcard queries do not seem to work very well in some situations, and there is the extra analysis and search cost to consider, since Lucene is designed for full-text search, not for "string begins with" matching on field values.
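
To make the geohash idea concrete, here is a minimal sketch of the standard geohash encoding: longitude and latitude bits are interleaved, starting with longitude, and every five bits become one base-32 character. This is not the code from the article mentioned above; the class name `GeoHashSketch` and the method `encode` are hypothetical.

```java
// Hypothetical helper illustrating the standard geohash encoding.
class GeoHashSketch {

    // Geohash base-32 alphabet (digits plus letters, without a, i, l, o).
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    static String encode(double lat, double lon, int length) {
        double latLo = -90, latHi = 90, lonLo = -180, lonHi = 180;
        StringBuilder hash = new StringBuilder();
        boolean lonTurn = true; // bits alternate, longitude first
        int bits = 0;
        int bitCount = 0;
        while (hash.length() < length) {
            if (lonTurn) {
                double mid = (lonLo + lonHi) / 2;
                if (lon >= mid) { bits = (bits << 1) | 1; lonLo = mid; }
                else            { bits = bits << 1;       lonHi = mid; }
            } else {
                double mid = (latLo + latHi) / 2;
                if (lat >= mid) { bits = (bits << 1) | 1; latLo = mid; }
                else            { bits = bits << 1;       latHi = mid; }
            }
            lonTurn = !lonTurn;
            if (++bitCount == 5) { // five bits make one base-32 character
                hash.append(BASE32.charAt(bits));
                bits = 0;
                bitCount = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(57.64911, 10.40744, 5));
        System.out.println(encode(57.64, 10.40, 5));
    }
}
```

Positions in the same cell share a prefix, so "everything in this district" turns into a "starts with" match on the hash. Note that two points close together but on opposite sides of a cell boundary can still get different prefixes.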
>Ah, the PrefixQuery... How do I use a different analyzer for text that does not follow normal English word segmentation? E.g. "a1b2c3" is always turned into the terms "a 1 b 2 3". Otherwise I index it as not-analyzed, and then neither wildcard matching nor "PrefixQuery" seems possible.
>In my project the requirements are not clear yet. Such a joke, ha.
>Thanks for taking the time to read my nonsense.