manifoldcf-dev mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: MongoDB Repository and Output Connectors
Date Tue, 27 May 2014 17:55:39 GMT
The GridFS connector has now been committed to trunk.

Karl


On Tue, May 27, 2014 at 9:13 AM, Piergiorgio Lucidi
<piergiorgio@apache.org> wrote:

> Hi Karl,
>
> 2014-05-27 15:00 GMT+02:00 Karl Wright <daddywri@gmail.com>:
>
> > Hi Piergiorgio,
> >
> > For a MongoDB output connector, I take it the goal is to deliver binary
> > documents to MongoDB?  I'm a bit confused then -- what do you mean by
> > metadata?  Or do you intend to deliver both?
> >
>
> This GridFS connector is a repository connector, and it only reads
> binaries; it does not handle the other types of values that we can store
> in MongoDB without GridFS.
>
> I mean that we need a MongoDB Repository Connector to get content that is
> not binary, and we also need a MongoDB Output Connector to store values in
> MongoDB.
>
> Piergiorgio
>
>
> >
> > I'm trying to figure out if you will wind up in a position similar to the
> > Amazon Cloud Search connector, which needs a metadata-extraction pipeline
> > in order to work at all.
> >
> > Karl
> >
> >
> >
> > On Tue, May 27, 2014 at 8:42 AM, Piergiorgio Lucidi
> > <piergiorgio@apache.org> wrote:
> >
> > > WOW! Great work!
> > >
> > > Anyway, this covers only GridFS, so it handles only binaries and not the
> > > other kinds of values, such as strings.
> > > I think we can add a similar connector for any type of metadata, to index
> > > all the values stored in the database.
> > >
> > > This means that we should add a dedicated MongoDB connector in addition to
> > > this one for GridFS.
> > >
> > > What do you think?
> > > Please let me know.
> > >
> > > Piergiorgio
> > >
> > >
> > > 2014-05-27 13:41 GMT+02:00 Karl Wright <daddywri@gmail.com>:
> > >
> > > > The license is apparently Apache 2.0.
> > > >
> > > > I've created CONNECTORS-945 to track this; calling it the MongoDB/GridFS
> > > > connector.  I would be very interested in hearing whether this has anything
> > > > to do with Piergiorgio's thoughts on the matter.
> > > >
> > > > Karl
> > > >
> > > >
> > > >
> > > > On Tue, May 27, 2014 at 6:56 AM, Karl Wright <daddywri@gmail.com> wrote:
> > > >
> > > > > Looks good!
> > > > >
> > > > > What is the license on mongo-java-driver?
> > > > >
> > > > > Karl
> > > > >
> > > > > Sent from my Windows Phone
> > > > > From: Muhammed Olgun
> > > > > Sent: 5/27/2014 5:19 AM
> > > > > To: dev@manifoldcf.apache.org
> > > > > Subject: Re: MongoDB Repository and Output Connectors
> > > > > Hi,
> > > > >
> > > > > I have also done some MongoDB-related work. It is GridFS-specific: I wrote
> > > > > a repository connector. I would like to share my code. Maybe we can
> > > > > combine it with the MongoDB connector, or they can stay separate.
> > > > >
> > > > > https://github.com/molgun/MCF-GridFS-Connector
> > > > >
> > > > > On 26 May 2014, at 15:42, Piergiorgio Lucidi <piergiorgio@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hi Karl,
> > > > > >
> > > > > >
> > > > > > 2014-05-26 13:17 GMT+02:00 Wright, Karl <karl.wright@here.com>:
> > > > > >
> > > > > >>  Hi Piergiorgio,
> > > > > >>
> > > > > > >> When you say 'huge', how many documents do you mean?  Do you know of any
> > > > > > >> specific installations, and if so, how many documents do they work with?
> > > > > >>
> > > > > >
> > > > > > >> I think it is fine to develop connectors for MongoDB, but if the number of
> > > > > > >> documents is truly huge, there will need to be a way of dividing up the
> > > > > > >> task.
> > > > > >>
> > > > > >
> > > > > > > I imagine the typical usage of MongoDB, for example keeping User Generated
> > > > > > > Content (UGC). In that case the information is just a small section of
> > > > > > > data used to implement social features for a website or a portal.
> > > > > >
> > > > > > > I'm sorry, I don't have real-world numbers right now, but we could estimate
> > > > > > > thousands of content items. Actually I'm working on a project that is
> > > > > > > starting to use MongoDB, and the use case is UGC.
> > > > > >
> > > > > > > Please consider that MongoDB divides its contents into databases, and
> > > > > > > each database holds different collections of contents.
> > > > > > > This means that the connector should ask for a database and a specific
> > > > > > > collection to work on, because MongoDB doesn't support joining data from
> > > > > > > different collections; it is not a relational database.
> > > > > >
> > > > > > > So I think that it should be very easy to implement in ManifoldCF.
> > > > > >
> > > > > > Piergiorgio
> > > > > >
> > > > > >
> > > > > >>
> > > > > >> Karl
> > > > > >>
> > > > > >> Sent from my Windows Phone
> > > > > >> ------------------------------
> > > > > >> From: Piergiorgio Lucidi
> > > > > >> Sent: 5/26/2014 6:01 AM
> > > > > >> To: dev@manifoldcf.apache.org
> > > > > >> Subject: MongoDB Repository and Output Connectors
> > > > > >>
> > > > > >> Hi guys,
> > > > > >>
> > > > > > >> I think that it could be very useful to add both a MongoDB repository
> > > > > > >> connector and a MongoDB output connector to ManifoldCF.
> > > > > > >> Over the past few days I have taken a look at MongoDB, and I think that I
> > > > > > >> can start to implement these connectors.
> > > > > >>
> > > > > > >> The repository connector could make sense because MongoDB is a document
> > > > > > >> repository, so some people may need to create smart indexes in a search
> > > > > > >> server; I'm thinking about huge repositories.
> > > > > > >> The MongoDB search engine could be very slow in some scenarios compared
> > > > > > >> to modern search engines.
> > > > > >>
> > > > > > >> The output connector could make sense because you probably want to convert
> > > > > > >> your wide data to a flat view in MongoDB, so that you can run queries in a
> > > > > > >> flat way. Here you probably don't need performance, just some good queries
> > > > > > >> for analysis to create your monthly or weekly reports with MongoDB.
> > > > > >>
> > > > > > >> I know that MongoDB is not a search server, but it is used a lot for
> > > > > > >> creating dashboards and reports, and the usage is similar to a search
> > > > > > >> engine.
> > > > > >>
> > > > > >> What do you think about this?
> > > > > >>
> > > > > >> Please let me know.
> > > > > >> Thank you all.
> > > > > >>
> > > > > >> Cheers,
> > > > > >> Piergiorgio
> > > > > >>
> > > > > >> --
> > > > > >> Piergiorgio Lucidi
> > > > > >> Open Source ECM Specialist
> > > > > >> http://www.open4dev.com
> > > > > >>
> > > > > >> --
> > > > > >> <http://www.open4dev.com>
> > > > > >> Piergiorgio Lucidi <http://www.open4dev.com>
> > > > > >> Open Source ECM Specialist
> > > > > >> <http://www.open4dev.com>http://www.open4dev.com
> > > > >
> > > >
> > > >
> > >
> >
> >
>
