lucene-solr-user mailing list archives

From Luca Quarello <lucaquare...@gmail.com>
Subject Re: Solr5.X document loss in splitting shards
Date Tue, 29 Dec 2015 18:37:28 GMT
Hi,
The only way I have found to solve my problem is to do the split using a
Solr instance configured in standalone mode.

curl "http://localhost:8983/solr/admin/cores?action=SPLIT&core=sepa&path=/nas_perf_2/FRAGMENTS/17MINDEXES/1/index&path=/nas_perf/FRAGMENTS/17MINDEXES/2/index"
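
GW's point below about URL encoding applies to this request as well: the query string carries several `&`-separated parameters and two filesystem paths, so it is safer to build it programmatically than to type it by hand. The following is a minimal sketch, not the command I actually ran. The helper name `build_split_url` is hypothetical; the host, core name, and paths are the ones from the command above.

```python
from urllib.parse import urlencode


def build_split_url(base_url, core, paths):
    """Build a CoreAdmin SPLIT URL with every parameter value URL-encoded.

    One `path` parameter is emitted per target index directory,
    mirroring the repeated `path=` parameters in the curl command.
    """
    params = [("action", "SPLIT"), ("core", core)]
    params += [("path", p) for p in paths]
    # urlencode percent-encodes reserved characters such as "/" in values
    return f"{base_url}/admin/cores?{urlencode(params)}"


url = build_split_url(
    "http://localhost:8983/solr",
    "sepa",
    [
        "/nas_perf_2/FRAGMENTS/17MINDEXES/1/index",
        "/nas_perf/FRAGMENTS/17MINDEXES/2/index",
    ],
)
print(url)
```

The resulting string can be passed to curl inside quotes, or to any HTTP client, without the shell interpreting the `&` characters.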

Does shard splitting work properly for large shards in SolrCloud mode?

Thanks!

On Mon, Dec 28, 2015 at 2:58 PM, GW <thegeoforce@gmail.com> wrote:

> I don't use Curl but there are a couple of things that come to mind
>
> 1: Maybe use document routing with the shards. Use an "!" in your unique
> ID. I'm using gmail to read this and it sucks for searching content so if
> you have done this please ignore this point. Example: If you were storing
> documents per domain, your unique field values would look like
> www.domain1.com!123, www.domain1.com!124, www.domain2.com!35, etc.
>
> This should create a two segment hash for searching shards. I do this in
> blind faith as a best practice as it is mentioned in the docs.
>
> 2: Curl works best with URL encoding. I was using Curl at one time and I
> noticed some strange results w/o url encoding
>
> What are you using to write your client?
>
> Best,
>
> GW
>
>
>
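
To make GW's routing suggestion concrete, here is a small sketch of how unique IDs with a "!" prefix could be built and grouped on the client side. It only illustrates the ID format from the example above; it does not reproduce the hash that Solr computes from the prefix. The function names `routed_id` and `route_key_of` are hypothetical.

```python
from collections import defaultdict


def routed_id(route_key, doc_id):
    """Join a route key (here, a domain) and a document ID with '!'."""
    if "!" in route_key:
        raise ValueError("route key must not contain '!'")
    return f"{route_key}!{doc_id}"


def route_key_of(unique_id):
    """Return the part before the first '!', or None if there is no prefix."""
    prefix, sep, _ = unique_id.partition("!")
    return prefix if sep else None


ids = [
    routed_id("www.domain1.com", 123),
    routed_id("www.domain1.com", 124),
    routed_id("www.domain2.com", 35),
]

# Group IDs by route key; documents sharing a key are meant to be routed together
by_key = defaultdict(list)
for uid in ids:
    by_key[route_key_of(uid)].append(uid)

print(dict(by_key))
```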
> On 27 December 2015 at 19:35, Shawn Heisey <apache@elyograg.org> wrote:
>
> > On 12/26/2015 11:21 AM, Luca Quarello wrote:
> > > I have a SOLR 5.3.1 CLOUD with two nodes and 8 shards per node.
> > >
> > > Each shard is about 35 million documents (35025882) and 16 GB in size.
> > >
> > >
> > >    - I launch the SPLIT command on a shard (shard 13) in the ASYNC way:
> >
> > <snip>
> >
> > > The newly created shards have:
> > > 13430316 documents (5.6 GB) and 13425924 documents (5.59 GB).
> >
> > Where are you looking that shows you the source shard has 35 million
> > documents?  Be extremely specific.
> >
> > The following screenshot shows one place you might be looking for this
> > information -- the core overview page:
> >
> >
> https://www.dropbox.com/s/311n49wkp9kw7xa/admin-ui-core-overview.png?dl=0
> >
> > Is the core overview page where you are looking, or is it somewhere else?
> >
> > I'm asking because "Max Doc" and "Num Docs" on the core overview page
> > mean very different things.  The difference between them is the number
> > of deleted docs, and the split shards are probably missing those deleted
> > docs.
> >
> > This is the only idea that I have.  If it's not that, then I'm as
> > clueless as you are.
> >
> > Thanks,
> > Shawn
> >
> >
>
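
Shawn's point about "Max Doc" and "Num Docs" can be checked with simple arithmetic on the counts quoted above. This is only a sketch: it shows the size of the gap between the source shard and the two sub-shards, and the thread does not establish whether the 35025882 figure was Max Doc or Num Docs. The helper name `deleted_docs` is hypothetical.

```python
def deleted_docs(max_doc, num_docs):
    """Deleted documents are the difference between Max Doc and Num Docs."""
    return max_doc - num_docs


source_count = 35_025_882                # count reported for the source shard
sub_counts = [13_430_316, 13_425_924]    # counts reported for the two new shards

# Documents unaccounted for after the split
missing = source_count - sum(sub_counts)
print(missing)  # 8169642

# If the source figure was Max Doc and the sub-shard totals reflect live
# documents only, the gap would equal the number of deleted documents.
print(deleted_docs(source_count, sum(sub_counts)) == missing)
```

If Num Docs on the source shard turns out to be close to 26856240, the split lost nothing; if Num Docs is itself 35025882, the gap is real document loss.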
