mahout-user mailing list archives

From Sean Owen <sro...@gmail.com>
Subject Re: GenericJDBCDataModel problem (getItems, getUsers)
Date Thu, 05 Nov 2009 10:33:26 GMT
On Thu, Nov 5, 2009 at 8:32 AM, Mirko
<idonthaveenoughinformation@googlemail.com> wrote:
> Hi Sean,
> I used v0.1. I got the latest snapshot (0.3) now and saw that a lot changed
> in the meanwhile; things seem a bit more complicated now. I will switch to
> 0.3 but that seems to be more tricky to implement on my data (e.g. I
> currently use Strings, not longs, for items and users in my data). Probably
> it'll be easier to copy my data to an already implemented DB (MySQL) and
> process them there instead of customizing the DataModel to fit my DB.

Yes, I would encourage you to switch, since a lot has improved. But
yes, the change from Strings to longs for IDs is the biggest hurdle.
Using longs does enable quite a bit of performance improvement.

Look for the "IDMigrator" class, which can serve as a temporary
solution. It's a helper class that maintains a mapping between
longs and Strings. It can be used with a JDBCDataModel to translate
back and forth for your database.

Of course you'll probably want to ultimately use numbers in your
database too if possible, but this works in the short term. We can
discuss more how to do it if you go this way.
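
To make the idea of a String-to-long mapping concrete, here is a
minimal, self-contained sketch. It is not the library's IDMigrator
API: the class name StringIdMapper and both method names are
hypothetical, and the choice of hashing with MD5 and keeping the
reverse mapping in memory is an assumption made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Mahout.
class StringIdMapper {

    // Reverse mapping so a long ID can be turned back into its String.
    private final Map<Long, String> longToString = new HashMap<>();

    // Derives a stable long from the String by taking the first
    // 8 bytes of its MD5 digest, and remembers the mapping.
    long toLongID(String stringID) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(stringID.getBytes(StandardCharsets.UTF_8));
            long hash = 0L;
            for (int i = 0; i < 8; i++) {
                hash = (hash << 8) | (digest[i] & 0xFFL);
            }
            longToString.put(hash, stringID);
            return hash;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Returns the original String, or null if this long was never mapped.
    String toStringID(long longID) {
        return longToString.get(longID);
    }
}
```

With something like this, String IDs would be converted with
toLongID() before they reach the recommender, and results converted
back with toStringID() before they are shown. An in-memory map is lost
on restart, so a real setup would need to persist the mapping.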


> As I understand the code, the GenericJDBCDatamodel works with any JDBC
> Datasource when adjusting the SQL_KEY values. Since my VirtuosoDB is a
> common JDBC javax.sql.Datasource it should not matter whether I send SQL or
> RDF-in-SQL, as long as the SQL ResultSets are of the expected structure.
> Would you agree on that? Or, what else do you think I do have to customize?

Yes, that's right. One possibility is that the SQL isn't querying
quite what you intend, or isn't being executed as you think it is,
and so returns no results. For example, if the order of the
placeholders (the '?') is wrong, things won't work.
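
One rough way to check placeholder order is to print the query with
the values filled in, in the order they would be bound. The sketch
below is only a debugging aid: the class PlaceholderCheck, its render()
method, and the table and column names are all hypothetical. It does
naive text substitution (it does not understand '?' inside string
literals) and must not be used to build SQL that is actually executed.

```java
// Hypothetical debugging helper; not part of Mahout or JDBC.
class PlaceholderCheck {

    // Replaces each '?' with the next parameter, left to right,
    // mirroring the positional order in which a PreparedStatement
    // would bind parameters 1, 2, 3, ...
    static String render(String sql, Object... params) {
        StringBuilder out = new StringBuilder();
        int next = 0;
        for (char c : sql.toCharArray()) {
            if (c == '?' && next < params.length) {
                Object p = params[next++];
                out.append(p instanceof String ? "'" + p + "'" : String.valueOf(p));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed query shape: the user ID is expected first, then the item ID.
        String sql = "SELECT preference FROM prefs WHERE user_id=? AND item_id=?";
        System.out.println(render(sql, 7L, 42L));  // intended order
        System.out.println(render(sql, 42L, 7L));  // swapped: valid SQL, wrong rows
    }
}
```

The swapped version is still valid SQL, which is why this kind of
mistake tends to show up as empty results and not as an error.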

This is why I wonder if you can look at the log statements from the
library (at debug level) which show the SQL statements being executed.
Or simply attach a debugger and find out that way.

The other possibility is, as I remember, the iterators can't throw an
exception when they hit a problem. They log the error and close the
iterator. This could also explain this output. Again, you'd have to
look into the log files, or a debugger, to confirm what's happening.
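
To show how that behavior can look like "no data", here is a small
sketch of an iterator that logs a failure and ends quietly. It is not
the library's code: QuietIterator and its Supplier-based source are
hypothetical, and only illustrate the log-and-close pattern described
above.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration of an iterator that cannot propagate errors.
class QuietIterator<T> implements Iterator<T> {

    private static final Logger log = Logger.getLogger(QuietIterator.class.getName());

    // Source of rows: returns null at the end, may throw on failure.
    private final Supplier<T> source;
    private T next;
    private boolean closed;

    QuietIterator(Supplier<T> source) {
        this.source = source;
        advance();
    }

    // Fetches the next element; on any failure, logs it and closes.
    private void advance() {
        try {
            next = source.get();
            if (next == null) {
                closed = true;
            }
        } catch (RuntimeException e) {
            log.log(Level.WARNING, "Error while iterating; closing iterator", e);
            next = null;
            closed = true;
        }
    }

    @Override
    public boolean hasNext() {
        return !closed;
    }

    @Override
    public T next() {
        if (closed) {
            throw new NoSuchElementException();
        }
        T result = next;
        advance();
        return result;
    }
}
```

If the source fails on the very first fetch, hasNext() is false from
the start, so the caller sees an empty result and the only trace of
the problem is in the log.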

I'm suggesting one of these two things should be investigated.
