manifoldcf-dev mailing list archives

From "Jens Jahnke (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CONNECTORS-1081) Documentation: elasticsearch index creation.
Date Thu, 23 Oct 2014 13:52:34 GMT
Jens Jahnke created CONNECTORS-1081:
---------------------------------------

             Summary: Documentation: elasticsearch index creation.
                 Key: CONNECTORS-1081
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1081
             Project: ManifoldCF
          Issue Type: Improvement
          Components: Documentation
            Reporter: Jens Jahnke
            Priority: Minor


Hi,

this may be useful for the documentation.

Here are some simple steps for creating an Elasticsearch index and a mapping for it.
{code}
% curl -XPUT 'http://localhost:9200/manifoldcf'
% curl -XPUT 'http://localhost:9200/manifoldcf/attachment/_mapping' -d '
{
  "attachment" : {
    "_source" : {
      "excludes" : [ "file" ]
    },
    "properties" : {
      "allow_token_document" : {
        "type" : "string"
      },
      "allow_token_parent" : {
        "type" : "string"
      },
      "allow_token_share" : {
        "type" : "string"
      },
      "attributes" : {
        "type" : "string"
      },
      "createdOn" : {
        "type" : "string"
      },
      "deny_token_document" : {
        "type" : "string"
      },
      "deny_token_parent" : {
        "type" : "string"
      },
      "deny_token_share" : {
        "type" : "string"
      },
      "lastModified" : {
        "type" : "string"
      },
      "shareName" : {
        "type" : "string"
      },
      "file" : {
        "type" : "attachment",
        "path" : "full",
        "fields" : {
          "file" : {
            "store" : true,
            "term_vector" : "with_positions_offsets",
            "type" : "string"
          }
        }
      }
    }
  }
}'
{code}

This creates an index called {{manifoldcf}} with a mapping named {{attachment}}. The mapping
has some generic fields for access tokens and a field {{file}} that uses the Elasticsearch
attachment mapper plugin. The {{file}} field is stored with term vectors
({{"term_vector" : "with_positions_offsets"}}) so that it can be highlighted.
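
To check that highlighting works, a search along the following lines may help. This is only a sketch: it assumes Elasticsearch runs on {{localhost:9200}} as in the commands above, and the helper names {{highlight_query}} and {{search_highlighted}} as well as the {{ES_URL}} variable are made up for this example.

```shell
#!/bin/sh
# Assumption: Elasticsearch listens on localhost:9200, as in the commands above.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print a search request body that matches on the "file" field and asks for
# highlighted fragments from it. The term is inserted verbatim, so it must not
# contain double quotes or backslashes.
highlight_query() {
  printf '{"query":{"match":{"file":"%s"}},"highlight":{"fields":{"file":{}}}}\n' "$1"
}

# Send the query to the "attachment" type of the "manifoldcf" index.
# (Hypothetical helper; it is only defined here, not called.)
search_highlighted() {
  curl -s -XPOST "$ES_URL/manifoldcf/attachment/_search?pretty" \
       -d "$(highlight_query "$1")"
}

# Example call, once some documents have been indexed:
#   search_highlighted 'connector'
```

If the mapping is in place, each hit should carry a {{highlight}} section with fragments from {{file}}.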

The following part excludes the {{file}} field from the stored {{_source}} JSON, which reduces
the index size significantly. Be aware that you shouldn't do this if you need to re-index data
on the Elasticsearch side or you want access to the whole document.

{code}
"_source" : {
  "excludes" : [ "file" ]
},
{code}
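
One rough way to confirm the exclusion is to fetch the {{_source}} of an indexed document and look for a {{file}} key. The sketch below assumes Elasticsearch on {{localhost:9200}}; the helper names, the {{ES_URL}} variable and the document id are placeholders for this example. The {{grep}} check is deliberately crude and would also trip on a {{file}} key nested anywhere else in the document.

```shell
#!/bin/sh
# Assumption: Elasticsearch listens on localhost:9200, as in the commands above.
ES_URL="${ES_URL:-http://localhost:9200}"

# Succeed when the JSON read from stdin contains no "file" key.
source_lacks_file() {
  ! grep -q '"file"[[:space:]]*:'
}

# Fetch only the _source of one document and run the check on it.
# (Hypothetical helper; $1 is the id of a document that was already indexed.)
check_source() {
  curl -s -XGET "$ES_URL/manifoldcf/attachment/$1/_source" | source_lacks_file
}

# Example call with a placeholder id:
#   check_source 'some-document-id' && echo 'file is not kept in _source'
```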



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
