mahout-user mailing list archives

From Dileepa Jayakody <dileepajayak...@gmail.com>
Subject Building a reputation analysis engine for email using Mahout
Date Thu, 20 Mar 2014 10:59:14 GMT
Hi All,

My name is Dileepa Jayakody, an MSc research student from the University of
Moratuwa, Sri Lanka. My research project (ReputationBox) is about
predicting the goodness of incoming emails (based on a calculated
reputation score) by analysing previous email conversations, email
correspondents, their interests, etc. I think it will work like a
recommendation engine for emails, rating and classifying incoming emails
based on a reputation score.

The basic flow of my application is as follows:

1. The user authorizes my application, ReputationBox, to connect to their
mailbox and read email.
2. ReputationBox performs an initial reputation-analysis process to build a
reputation-index over the past emails imported as a batch. (This initial
reputation-index will be used as the training data to analyse new incoming
emails.)
3. New emails are polled by or pushed to the ReputationBox server, and
reputation-analysis is performed in real time to predict their reputation.
4. Email reputation data is stored in the application.
5. The ReputationBox client web-app presents the reputation data of the new
emails (based on this data, the client could be implemented as a
priority-inbox, spam-filter, email categorizer, etc.).
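
To make steps 2 and 3 more concrete, here is a rough sketch of the idea in
plain Python (it does not use Mahout). The `Email` and `ReputationIndex`
names, and the use of the user's past reply rate per sender and topic as the
reputation signal, are only illustrative assumptions, not the actual design.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Email:
    # Hypothetical, simplified view of an email for this sketch.
    sender: str
    topics: list = field(default_factory=list)  # e.g. keywords from the subject
    replied: bool = False  # known only for past (training) emails


class ReputationIndex:
    """Toy reputation-index: reply rates per sender and per topic."""

    def __init__(self):
        self._seen = defaultdict(int)
        self._replied = defaultdict(int)

    @staticmethod
    def _keys(email):
        # Features considered here: the correspondent and each topic.
        return [("sender", email.sender)] + [("topic", t) for t in email.topics]

    def train(self, past_emails):
        # Step 2: build the index from a batch of past emails.
        for email in past_emails:
            for key in self._keys(email):
                self._seen[key] += 1
                if email.replied:
                    self._replied[key] += 1

    def score(self, email):
        # Step 3: average the reply rates of every feature seen before.
        # Falls back to a neutral 0.5 when nothing about the email is known.
        rates = [
            self._replied.get(key, 0) / self._seen[key]
            for key in self._keys(email)
            if self._seen.get(key, 0) > 0
        ]
        return sum(rates) / len(rates) if rates else 0.5
```

With an index trained on the imported batch, each new email would be passed
to `score()` and the result stored as its reputation (step 4). A real
implementation would replace the reply-rate heuristic with a trained model.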

I would like to seek advice on how to develop the reputation-analysis
component of my application using Apache Mahout. I'm looking at the people,
topics and actions mentioned in an email to derive its reputation. The
high-level architecture diagram of the ReputationBox system is at [1].

I also plan to deploy my application on Google App Engine (GAE). Can Mahout
be deployed on GAE?
I'm also planning to use Apache Isis to develop ReputationBox as a
domain-driven application. This is a proposed project for GSoC.
For more information on my application, please see the JIRA issue [2].

Looking forward to your suggestions.

Thanks,
Dileepa

[1]
https://issues.apache.org/jira/secure/attachment/12634802/EmailReputationSystem_v2.png
[2] https://issues.apache.org/jira/browse/ISIS-736
