lucene-solr-user mailing list archives

From Joel Bernstein <joels...@gmail.com>
Subject Re: Solr and hadoop
Date Thu, 25 Sep 2014 17:58:16 GMT
Hi Tom,

I am not aware of a Solr InputFormat implementation yet. The /export
handler, which outputs entire sorted result sets, was designed to support
these types of bulk export operations efficiently. I think a Solr
InputFormat would be an excellent project to begin working on.
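
As a rough illustration of the kind of request a reader built on /export
might issue, here is a minimal sketch that only assembles the request URL.
The host, collection name, and field names are hypothetical, and the
helper build_export_url is made up for this example; it is not part of
Solr or of any existing InputFormat. It assumes the handler is mounted at
/export and that a sort and a field list must both be supplied.

```python
from urllib.parse import urlencode


def build_export_url(base_url, collection, query, sort, fields):
    """Assemble a URL for Solr's /export handler (illustrative helper)."""
    # Assumption: /export needs both a sort and a field list,
    # so reject requests that omit either one.
    if not sort or not fields:
        raise ValueError("/export requires both a sort and a field list")
    params = {"q": query, "sort": sort, "fl": ",".join(fields)}
    return f"{base_url.rstrip('/')}/{collection}/export?{urlencode(params)}"


# Hypothetical host, collection, and field names.
url = build_export_url(
    "http://localhost:8983/solr", "collection1", "*:*", "id asc", ["id", "title_s"]
)
print(url)
```

An InputFormat could issue one such request per shard and stream the
sorted results to each mapper, but that split logic is not shown here.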

Also, SOLR-6526 is underway to provide SolrCloud with native streaming
aggregation capabilities.


Joel Bernstein
Search Engineer at Heliosearch

On Thu, Sep 25, 2014 at 12:34 PM, Tom Chen <tomchen1000@gmail.com> wrote:

> I'm aware of the MapReduceIndexerTool (MRIT). That might be solving the
> indexing part -- the OutputFormat part.
>
> But what I asked about is making Solr index data available to Hadoop
> MapReduce, so that Solr can act as a data store the way HDFS does.
> With a Solr InputFormat, we can make the Solr index data available to
> Hadoop MapReduce. Along the same lines, we can also make Solr index data
> available to Hive, Spark, etc., like es-hadoop does.
>
> Best,
> Tom
>
>
>
> On Thu, Sep 25, 2014 at 10:26 AM, Michael Della Bitta <
> michael.della.bitta@appinions.com> wrote:
>
> > Yes, there's SolrInputDocumentWritable and MapReduceIndexerTool, plus the
> > Morphline stuff (check out
> > https://github.com/markrmiller/solr-map-reduce-example).
> >
> > Michael Della Bitta
> >
> > Applications Developer
> >
> > o: +1 646 532 3062
> >
> > appinions inc.
> >
> > “The Science of Influence Marketing”
> >
> > 18 East 41st Street
> >
> > New York, NY 10017
> >
> > t: @appinions <https://twitter.com/Appinions> | g+:
> > plus.google.com/appinions
> > <https://plus.google.com/u/0/b/112002776285509593336/112002776285509593336/posts>
> > w: appinions.com <http://www.appinions.com/>
> >
> > On Thu, Sep 25, 2014 at 9:58 AM, Tom Chen <tomchen1000@gmail.com> wrote:
> >
> > > I wonder if Solr has an InputFormat and OutputFormat like the
> > > EsInputFormat and EsOutputFormat that are provided by Elasticsearch
> > > for Hadoop (es-hadoop).
> > >
> > > Is it possible for Solr to provide such integration with Hadoop?
> > >
> > > Best,
> > > Tom
> > >
> >
>
