manifoldcf-dev mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: MongoDB Repository and Output Connectors
Date Tue, 27 May 2014 13:00:23 GMT
Hi Piergiorgio,

For a MongoDB output connector, I take it the goal is to deliver binary
documents to MongoDB?  I'm a bit confused then -- what do you mean by
metadata?  Or do you intend to deliver both?

I'm trying to figure out if you will wind up in a position similar to the
Amazon Cloud Search connector, which needs a metadata-extraction pipeline
in order to work at all.

Karl



On Tue, May 27, 2014 at 8:42 AM, Piergiorgio Lucidi
<piergiorgio@apache.org> wrote:

> WOW! Great work!
>
> Anyway, this only covers GridFS, so it handles only binaries and ignores
> all the other values, such as strings.
> I think we can add a similar connector that handles any type of metadata,
> so that all the values stored in the database get indexed.
>
> This means that we should add a dedicated MongoDB Connector in addition to
> this one for GridFS.
>
> What do you think?
> Please let me know.
>
> Piergiorgio
>
>
> 2014-05-27 13:41 GMT+02:00 Karl Wright <daddywri@gmail.com>:
>
> > The license is apparently Apache 2.0.
> >
> > I've created CONNECTORS-945 to track this; calling it the MongoDB/GridFS
> > connector.  I would be very interested in hearing whether this has
> > anything to do with Piergiorgio's thoughts on the matter.
> >
> > Karl
> >
> >
> >
> > On Tue, May 27, 2014 at 6:56 AM, Karl Wright <daddywri@gmail.com> wrote:
> >
> > > Looks good!
> > >
> > > What is the license on mongo-java-driver?
> > >
> > > Karl
> > >
> > > Sent from my Windows Phone
> > > From: Muhammed Olgun
> > > Sent: 5/27/2014 5:19 AM
> > > To: dev@manifoldcf.apache.org
> > > Subject: Re: MongoDB Repository and Output Connectors
> > > Hi,
> > >
> > > I did some MongoDB-related work too. It’s GridFS-specific, and I wrote
> > > a repository connector. I would like to share my code. Maybe we can
> > > combine it with the MongoDB connector. They can be separate too.
> > >
> > > https://github.com/molgun/MCF-GridFS-Connector
> > >
> > > On 26 May 2014, at 15:42, Piergiorgio Lucidi <piergiorgio@apache.org>
> > > wrote:
> > >
> > > > Hi Karl,
> > > >
> > > >
> > > > 2014-05-26 13:17 GMT+02:00 Wright, Karl <karl.wright@here.com>:
> > > >
> > > >>  Hi Piergiorgio,
> > > >>
> > > >> When you say 'huge', how many documents do you mean?  Do you know of
> > > >> any specific installations, and if so, how many documents do they work
> > > >> with?
> > > >>
> > > >
> > > >> I think it is fine to develop connectors for MongoDB, but if the
> > > >> number of documents is truly huge, there will need to be a way of
> > > >> dividing up the task.
> > > >>
> > > >
> > > > I imagine the typical usage of MongoDB is, for example, storing User
> > > > Generated Content (UGC). This means that the information is just a small
> > > > section of data used to implement social features for a website or a
> > > > portal.
> > > >
> > > > I'm sorry, I don't have real-world numbers right now, but we could
> > > > estimate thousands of content items. I'm currently working on a project
> > > > that is starting to use MongoDB, and the use case is UGC.
> > > >
> > > > Please consider that MongoDB divides its contents into databases, and
> > > > each database has different collections of content.
> > > > This means that the connector should ask for a database and a specific
> > > > collection to work on, because MongoDB doesn't support joining data
> > > > from different collections; it is not a relational database.
> > > >
> > > > So I think that it should be very easy to implement in ManifoldCF.
> > > >
> > > > Piergiorgio
> > > >
> > > >
> > > >>
> > > >> Karl
> > > >>
> > > >> Sent from my Windows Phone
> > > >> ------------------------------
> > > >> From: Piergiorgio Lucidi
> > > >> Sent: 5/26/2014 6:01 AM
> > > >> To: dev@manifoldcf.apache.org
> > > >> Subject: MongoDB Repository and Output Connectors
> > > >>
> > > >> Hi guys,
> > > >>
> > > >> I think that it could be very useful to add both of the connectors
> > > >> related to MongoDB to ManifoldCF.
> > > >> Over the past few days I have taken a look at MongoDB, and I think that I
> > > >> can start to implement these connectors.
> > > >>
> > > >> The repository connector could make sense because MongoDB is a document
> > > >> repository, so some people could need to create smart indexes in a
> > > >> search server; I'm thinking about huge repositories.
> > > >> The MongoDB search engine could be very slow in some scenarios compared
> > > >> to modern search engines.
> > > >>
> > > >> The output connector could make sense because you probably want to
> > > >> convert your wide data to a flat view in MongoDB, so that you can run
> > > >> flat queries against it.
> > > >> Here you probably don't need performance, just some good analysis
> > > >> queries to create your monthly or weekly reports with MongoDB.
> > > >>
> > > >> I know that MongoDB is not a search server but it is used a lot for
> > > >> creating dashboards and reports and the usage is similar to a search
> > > >> engine.
> > > >>
> > > >> What do you think about this?
> > > >>
> > > >> Please let me know.
> > > >> Thank you all.
> > > >>
> > > >> Cheers,
> > > >> Piergiorgio
> > > >>
> > > >> --
> > > >> Piergiorgio Lucidi
> > > >> Open Source ECM Specialist
> > > >> http://www.open4dev.com
> > > >>
> > >
> >
> > --
> > Piergiorgio Lucidi
> > Open Source ECM Specialist
> > http://www.open4dev.com
> >
>
