incubator-blur-dev mailing list archives

From Garrett Barton <garrett.bar...@gmail.com>
Subject Re: Reworking the data model
Date Sun, 13 Oct 2013 15:26:18 GMT
How would you update a single document then without a unique key?
On Oct 13, 2013 9:53 AM, "Aaron McCurry" <amccurry@gmail.com> wrote:

> On Sun, Oct 13, 2013 at 9:45 AM, Tim Williams <williamstw@gmail.com>
> wrote:
>
> > On Sun, Oct 13, 2013 at 8:19 AM, Aaron McCurry <amccurry@gmail.com> wrote:
> > > I had another thought yesterday that might be even simpler while being
> > > able to maintain all current features.
> > >
> > > Instead of having:
> > >
> > > Row with rowId (DocumentCollection with docCollectionId)
> > > Record with recordId (Document with docId)
> > >    - Dropping Family
> > > Column with name and value (Field with name and value)
> > >
> > > We drop Row/DocumentCollection altogether and we don't require docId to
> > > be unique.
> > >
> > > So it would be:
> > >
> > > Document with docId
> > > Field with name and value
> > >
> > > And the new rule would be that wherever there are documents that share
> > > the same document id, you get the same effects as the
> > > Row/DocumentCollection. This would remove the need for multiple ids
> > > (rowId and recordId), and it would be logically the same as normal
> > > Lucene. The difference that Blur would add is the ability to join on
> > > documentId by default. We could also configure the table to allow for
> > > duplicate document ids or not, that way users can choose whether or not
> > > they need the document id join capability.
> > >
> > > What do you all think?
> >
> > The idea of getting rid of the "container" as a first class construct
> > is compelling.  I don't find grouping by docid intuitive.  Maybe leave
> > docid as a user field - typically distinct - and use a docGroupId to
> > bind them?
> >
>
> Can't really do that, because the docGroupId (in your suggestion) has to be
> the key we distribute on, so that all the documents are co-located in the
> same shard.  And I feel that having 2 different ids is a big part of the
> confusion: what they are and when to use which.
>
> If we added an attribute in the table to allow for duplicate docIds or not,
> that would at least let the end user decide whether it's a grouping table or
> not.
>
> I don't know where to go with this. I'm trying to make this as intuitive as
> possible for the typical case, which is just plain old Lucene documents,
> while still supporting the current features.
>
> Aaron
>
>
> >
> > --tim
> >
>
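
The proposal quoted above (documents keyed by a possibly non-unique docId, co-located by that id, with a per-table switch for duplicates) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Blur's API: `Table`, `shard_for`, `allow_duplicate_doc_ids`, the shard count, and the CRC32 routing are all hypothetical names and choices made up for the example.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count, chosen only for the example


def shard_for(doc_id, num_shards=NUM_SHARDS):
    """Route by docId so every document sharing an id lands in one shard."""
    # CRC32 is used here only because it is deterministic across runs;
    # the real distribution function is not specified in the thread.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


class Table:
    """Toy table: each shard maps docId -> list of documents (field dicts)."""

    def __init__(self, num_shards=NUM_SHARDS, allow_duplicate_doc_ids=True):
        self.num_shards = num_shards
        # Hypothetical table attribute: grouping table (True) or
        # plain one-document-per-id table (False).
        self.allow_duplicate_doc_ids = allow_duplicate_doc_ids
        self.shards = [defaultdict(list) for _ in range(num_shards)]

    def add(self, doc_id, fields):
        group = self.shards[shard_for(doc_id, self.num_shards)][doc_id]
        if group and not self.allow_duplicate_doc_ids:
            raise ValueError("duplicate docId not allowed: " + doc_id)
        group.append(dict(fields))

    def group(self, doc_id):
        """Return all documents sharing doc_id (the implicit 'row')."""
        shard = self.shards[shard_for(doc_id, self.num_shards)]
        return list(shard.get(doc_id, []))
```

With duplicates allowed, adding two documents under the same docId yields a group of two that lives in a single shard. With duplicates disallowed, the second add is rejected and the table behaves like one document per id. The sketch does not address how to update a single document inside a group, which is the open question raised at the top of this message.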
