lucene-solr-user mailing list archives

From startrekfan <startrekfa...@freenet.de>
Subject Re: Solr related questions
Date Fri, 13 Oct 2017 11:50:08 GMT
Thank you for your answer.

To 3.)
The file is on server A, my program is on server B, and Solr is on server
C. If I use a normal HTTP (REST) POST, my program has to fetch the file
content from server A to server B and then post it from server B to server
C, as there is no open connection between A and C. So the file has to be
transmitted twice.
Is there a way to tell Solr to read the file _directly_ from server A (e.g.
via SMB)?

Thank you
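
If the A -> B -> C double hop cannot be avoided, the program on server B can
at least relay the file as a stream rather than loading it completely into
memory first. The following is only a minimal sketch of that idea in Python,
not SolrNet code: the function name relay, the chunk size, and the in-memory
stand-ins for the two connections are all assumptions made for illustration.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, tune as needed


def relay(source, sink, chunk_size=CHUNK_SIZE):
    """Copy bytes from source to sink chunk by chunk.

    Only one chunk is held in memory at a time.
    Returns the total number of bytes copied.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        sink.write(chunk)
        total += len(chunk)
    return total


# In-memory stand-ins (assumptions): "source" plays the file on server A,
# "sink" plays the outgoing request body towards Solr on server C.
source = io.BytesIO(b"x" * 200_000)
sink = io.BytesIO()
copied = relay(source, sink)
```

This does not remove the second transfer; it only keeps server B from
buffering the whole file between the two hops.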


Amrit Sarkar <sarkaramrit2@gmail.com> wrote on Fri, 13 Oct 2017 at
12:51:

> Hi,
>
> > 1.) I created a core and tried to simplify the managed-schema file. But if
> > I remove all "unnecessary" fields/fieldtypes, I get errors like: field
> > "_version_" is missing, type "boolean" is missing and so on. Why do I have
> > to define these types/fields? Which fields/fieldtypes are required?
>
>
> Solr expects the primitive field names and types to be present in the
> schema, though a better explanation of this should exist. "_version_" and
> a unique id field are mandatory for each document: "_version_" holds the
> current version of the document, which is used for syncing across nodes and
> for atomic updates of documents.
>
> > 2.) Can I modify the managed-schema remotely/by program, e.g. with a POST
> > request, or only by editing the managed-schema file directly?
>
> Sure, the Schema API has been available for a while:
> https://lucene.apache.org/solr/guide/6_6/schema-api.html
>
> > 3.) When I have a service (SolrNet client) that pushes a file from a
> > fileserver to Solr, will it cause two times the traffic (from the
> > fileserver to my service and from the service to Solr)? Is there a chance
> > to index the file directly? (I need to add additional attributes to the
> > index document.)
>
>
> Two times the traffic? Where? Solr will receive the docs once, so we are
> good on that part. Please use SolrJ to index documents if possible, as it
> is the most up-to-date client; if you are on SolrCloud, use
> CloudSolrClient. Regarding indexing files directly, you can use the DIH
> (DataImportHandler), depending on the file format (CSV, XML, JSON), but
> mind that it is single-threaded.
>
> Hope this clarifies some of it.
>
> Amrit Sarkar
> Search Engineer
> Lucidworks, Inc.
> 415-589-9269
> www.lucidworks.com
> Twitter http://twitter.com/lucidworks
> LinkedIn: https://www.linkedin.com/in/sarkaramrit2
>
> On Fri, Oct 13, 2017 at 3:10 PM, startrekfan <startrekfan75@freenet.de>
> wrote:
>
> > Hello,
> >
> > I have some Solr related questions:
> >
> > 1.) I created a core and tried to simplify the managed-schema file. But if
> > I remove all "unnecessary" fields/fieldtypes, I get errors like: field
> > "_version_" is missing, type "boolean" is missing and so on. Why do I have
> > to define these types/fields? Which fields/fieldtypes are required?
> >
> > 2.) Can I modify the managed-schema remotely/by program, e.g. with a POST
> > request, or only by editing the managed-schema file directly?
> >
> > 3.) When I have a service (SolrNet client) that pushes a file from a
> > fileserver to Solr, will it cause two times the traffic (from the
> > fileserver to my service and from the service to Solr)? Is there a chance
> > to index the file directly? (I need to add additional attributes to the
> > index document.)
> >
> > Thank you
> >
>
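
For question 2, a Schema API call of the kind linked above can be sketched
as follows. This is only an illustration in Python: the base URL, the core
name "mycore", the field name "author", and the helper
build_add_field_request are assumptions, and the field type "string" must
already be defined in the target schema. The request is built but not sent.

```python
import json
import urllib.request


def build_add_field_request(base_url, core, name, field_type,
                            stored=True, indexed=True):
    """Build (but do not send) a Schema API "add-field" POST request."""
    payload = {
        "add-field": {
            "name": name,
            "type": field_type,  # must be a field type known to the schema
            "stored": stored,
            "indexed": indexed,
        }
    }
    url = f"{base_url.rstrip('/')}/{core}/schema"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, core, and field names.
req = build_add_field_request(
    "http://localhost:8983/solr", "mycore", "author", "string"
)
# urllib.request.urlopen(req) would send it to a running Solr instance.
```

Sending the request only works against a core that uses a managed (mutable)
schema; with a hand-edited, read-only schema the file has to be changed
directly.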
