mahout-user mailing list archives

From Pat Ferrel <>
Subject Re: Setting up a recommender
Date Wed, 31 Jul 2013 18:39:06 GMT
OK, looks like there *is* some magic in the Lucid config. I believe all I need to do is write
out the docs as Solr XML, defining a field for each similarity type and one for the doc name.
The rest can be done by standard Lucid hand configuration. I believe this will minimally handle
#3 below.
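
For illustration, here is a rough sketch of what writing the docs as Solr XML might look like.
It uses Solr's standard <add><doc><field name="..."> update format. The field names ("id",
"similarity", "cross_similarity") and the function names are hypothetical, and it assumes the
similarity fields are declared multi-valued in the schema so each similar item ID gets its own
<field> element. It is not the code actually used here.

```python
import xml.etree.ElementTree as ET


def item_to_solr_doc(item_id, similar_items, cross_similar_items=None):
    """Build one Solr <doc> element for a single row of the similarity matrix.

    Field names are hypothetical; the similarity fields are assumed to be
    multi-valued, so every similar item ID is written as its own <field>.
    """
    doc = ET.Element("doc")
    ET.SubElement(doc, "field", name="id").text = item_id
    for other in similar_items:
        ET.SubElement(doc, "field", name="similarity").text = other
    # The cross-similarity row is optional, as in the message above.
    for other in cross_similar_items or []:
        ET.SubElement(doc, "field", name="cross_similarity").text = other
    return doc


def build_add_xml(rows):
    """Wrap (item_id, similar_items, cross_similar_items) rows in an <add> element."""
    add = ET.Element("add")
    for item_id, similar_items, cross_similar_items in rows:
        add.append(item_to_solr_doc(item_id, similar_items, cross_similar_items))
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(build_add_xml([
        ("item1", ["item2", "item3"], ["item9"]),
        ("item2", ["item1"], None),
    ]))
```

The resulting string could be saved to a file and indexed like any other Solr XML update
document.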

On Jul 31, 2013, at 11:20 AM, Pat Ferrel <> wrote:

A few architectural questions:

I created a local instance of the LucidWorks Search on my dev machine. I can quite easily
save the similarity vectors from the DRMs into docs at special locations and index them with
LucidWorks. But to ingest the docs and put them in separate fields of the same index, we need
some new code (unless I've missed some Lucid config magic) that does the indexing and integrates
with LucidWorks.

I imagine two indexes. One index for the similarity matrix and optionally the cross-similarity
matrix in two fields of type 'string'. Another index for users' history--we could put the
docs there for retrieval by user ID. The user history docs then become the query on the similarity
index and would return recommendations. Any history collected or generated in realtime could
be used as the query too.
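
To make the "history becomes the query" idea concrete, here is a minimal sketch that turns
a user's history (a list of item IDs) into parameters for a Solr select request against the
similarity index. The parameters q, rows, fl and wt are standard Solr query parameters; the
field name "similarity", the function names, and the example base URL are assumptions for
illustration only.

```python
from urllib.parse import urlencode


def history_to_query_params(history_item_ids, field="similarity", rows=10):
    """Turn a user's item history into Solr query parameters.

    Each item ID becomes a quoted term, and the terms are ORed together
    against the (hypothetical) similarity field.
    """
    if not history_item_ids:
        raise ValueError("history must contain at least one item ID")
    quoted = []
    for item_id in history_item_ids:
        # Escape backslashes first, then double quotes, inside the quoted term.
        escaped = item_id.replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"{}"'.format(escaped))
    q = "{}:({})".format(field, " OR ".join(quoted))
    return {"q": q, "rows": rows, "fl": "id,score", "wt": "json"}


def select_url(base_url, params):
    """Build the full select URL; base_url is the path to the index (assumed)."""
    return base_url.rstrip("/") + "/select?" + urlencode(params)


if __name__ == "__main__":
    params = history_to_query_params(["item1", "item2"])
    print(select_url("http://localhost:8983/solr/similarity", params))
```

The IDs of the returned docs, ranked by score, would be the recommendations. The same function
works whether the history comes from the user-history index or was collected in realtime.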

Is this what you imagined, Ted? Especially WRT Lucid integration?

Someone could probably donate their free-tier EC2 instance and set this up pretty easily.
Not sure if this would fit in free-tier memory, but it might for small data sets.

To get this available for actual use we'd need:
1-- An instance with an IP address somewhere to run the ingestion and customized LucidWorks
2-- Synthetic data created using Ted's tool.
3-- Customized Solr indexing code for integration with LucidWorks? Not sure how this is done.
I can do the Solr part but have not looked into Lucid integration yet.
4-- Flesh out the rest of Ted's outline, but 1-3 will give a minimal running example.

Assuming I've got this right, does someone want to help with these?

Another way to approach this is to create a standalone codebase that requires Mahout and
Solr and supplies an API something like the proposed Mahout SGD online recommender or Myrrix.
This would be easier to consume but would lack all the UI and inspection code of LucidWorks.
