lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Help needed in breaking large index file into smaller ones
Date Mon, 09 Jan 2017 23:54:36 GMT
Why do you have a requirement that the indexes be < 4G? If it's
arbitrarily imposed, why bother?

Or is it a non-negotiable requirement imposed by the platform you're on?

I ask because just splitting the files into smaller ones won't help if
you then keep indexing: the merge process will just recreate the large
files.

You might be able to do something with the settings in
TieredMergePolicy in the first place to stop generating files > 4G.
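
For reference, the TieredMergePolicy setting that bounds the size of a
merged segment is maxMergedSegmentMB (5 GB by default in Lucene). Whatever
value is chosen, it helps to check what is actually on disk. Below is a
minimal sketch in Python that lists files in an index directory above a
size limit. The function name, the default limit, and the example path are
illustrative assumptions, not part of Solr or Lucene.

```python
import os

# The 4G ceiling from the original question, taken here as 4 GiB (an assumption).
LIMIT_BYTES = 4 * 1024**3


def oversized_files(index_dir, limit_bytes=LIMIT_BYTES):
    """Return (name, size_in_bytes) for regular files directly inside
    index_dir that are larger than limit_bytes, largest first."""
    hits = []
    with os.scandir(index_dir) as entries:
        for entry in entries:
            # Skip subdirectories and symlinks; only plain index files count.
            if entry.is_file(follow_symlinks=False):
                size = entry.stat(follow_symlinks=False).st_size
                if size > limit_bytes:
                    hits.append((entry.name, size))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Example call (hypothetical path, adjust to your core's data directory):
# for name, size in oversized_files("/var/solr/data/mycore/data/index"):
#     print(f"{name}: {size / 1024**3:.2f} GiB")
```

Running this before and after a change to the merge settings shows whether
new merges still produce files over the limit.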

Best,
Erick

On Mon, Jan 9, 2017 at 3:27 PM, Anshum Gupta <anshum@anshumgupta.net> wrote:
> Can you provide more information about:
> - Are you using Solr in standalone or SolrCloud mode? What version of Solr?
> - Why do you want this? Lack of disk space? Uneven distribution of data on
> shards?
> - Do you want this data together i.e. as part of a single collection?
>
> You can check out the following APIs:
> SPLITSHARD:
> https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api3
> MIGRATE:
> https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api12
>
> Among other things, make sure you have enough spare disk-space before
> trying out the SPLITSHARD API in particular.
>
> -Anshum
>
>
>
> On Mon, Jan 9, 2017 at 12:08 PM Mikhail Khludnev <mkhl@apache.org> wrote:
>
>> Perhaps you can copy this index into a separate location. Delete the odd
>> docs from the former index and the even docs from the latter, and then
>> force merge each location to a single segment separately.
>> Perhaps shard splitting in SolrCloud does something like that.
>>
>> On Mon, Jan 9, 2017 at 1:12 PM, Narsimha Reddy CHALLA <
>> chnreddy15@gmail.com>
>> wrote:
>>
>> > Hi All,
>> >
>> >       My Solr server has a few large index files (say ~10G). I am looking
>> > for some help on breaking them into smaller ones (each < 4G) to satisfy
>> > my application requirements. Are there any such tools available?
>> >
>> > Appreciate your help.
>> >
>> > Thanks
>> > NRC
>> >
>>
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>>
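
The SPLITSHARD call and the disk-space caution mentioned in the quoted
replies can be sketched as follows. This is a rough Python illustration,
not a definitive procedure: SPLITSHARD only applies in SolrCloud mode, the
base URL, collection name, and shard name are placeholders, and the
headroom factor of 2.0 is an assumption chosen for the example rather than
a figure from the Solr documentation.

```python
import shutil
from urllib.parse import urlencode


def splitshard_url(base_url, collection, shard):
    """Build a Collections API URL that asks SolrCloud to split one shard.

    base_url is the Solr root, e.g. "http://localhost:8983/solr" (placeholder).
    """
    params = urlencode(
        {"action": "SPLITSHARD", "collection": collection, "shard": shard}
    )
    return f"{base_url.rstrip('/')}/admin/collections?{params}"


def has_headroom(index_bytes, path, factor=2.0):
    """Return True if the filesystem holding `path` has at least
    factor * index_bytes free. The factor is an assumed safety margin."""
    return shutil.disk_usage(path).free >= factor * index_bytes


# Example (hypothetical names): build the URL, then issue it with any HTTP
# client once the free-space check passes.
# url = splitshard_url("http://localhost:8983/solr", "mycollection", "shard1")
# ok = has_headroom(10 * 1024**3, "/var/solr/data")
```

The URL is only constructed here, not sent, so the request can be reviewed
before anything changes on the cluster.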
