mahout-user mailing list archives

From Jake Mannix
Subject Re: [slightly off topic] Determining Importance
Date Mon, 03 Jan 2011 18:48:52 GMT
I've got one word for you, Grant:



On Mon, Jan 3, 2011 at 8:54 AM, Grant Ingersoll wrote:

> Hi,
> I wanted to pick people's brains a little bit on the subject of determining
> importance.  This isn't necessarily Mahout related, although I think we have
> some tools that help in the area.
> One of the emerging trends these days, it seems, with all our connectivity
> and content, is a notion of importance/priority.  Some examples:
> 1. Google now has "Priority Inbox", for instance, and I think most would
> agree that for things like Twitter and Facebook it would be really nice if
> you could separate the important updates/people from the less important.
> 2. Identifying important phrases, etc. in text across a corpus.
> 3. One of the things I think most researchers do when exploring a new topic
> is to identify the one or two seminal papers in the field, read them, and
> then read the ones that cite those papers and so on.
> 4. Take in all the day's news and figure out what the key articles are to
> read (in some sense it's picking the most representative document in a
> cluster), or figure out that the article talking about raising Federal
> income taxes is likely more important than the one talking about raising
> local sales tax (or vice versa!)
> 5. PageRank, TextRank, etc. and other approaches to calculating authority
> What I'm looking for is help in researching this area.  Is there a name for
> this (sub-)field (importance theory? prioritization theory?), particularly
> one within machine learning and NLP that is geared towards this?  I realize some
> (most) of these problems can be solved with classifiers amongst other things
> like graph algorithms (particularly ones that use the social graph), but it
> also seems like the area is bigger than a particular implementation, so I
> wanted to hear what others thought.  How would you go about solving these
> problems?  Do you have any pointers to useful references on the subject
> (theoretical or practical)?  What other examples have you run up against?
> Thanks,
> Grant
