mahout-user mailing list archives

From Manuel Blechschmidt <Manuel.Blechschm...@gmx.de>
Subject Re: R: R: Using recommenders with String identifiers
Date Thu, 08 Mar 2012 15:14:33 GMT
Hi Claudia,
yes, in a way. With an IDMigrator it depends on how you store the mappings: you can keep them
in memory, in a database, or in a file.

Furthermore, if you used strings as identifiers, they would be copied multiple times and would
therefore take up several times their own size in memory.
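
For a rough sense of the numbers, here is a minimal sketch comparing the raw payload of a long with that of a 36-character UUID string held as UTF-16. The class name IdSizeSketch is made up for illustration, and the figures ignore JVM object headers and array overhead, so real String objects cost more than this.

```java
import java.nio.charset.StandardCharsets;

// Back-of-the-envelope comparison of raw ID payload sizes (ignores JVM object overhead).
final class IdSizeSketch {
    // Bytes of character data when the string is encoded as UTF-16 without a byte-order mark.
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // How many longs would fit into the character data of the given string.
    static int ratioToLong(String s) {
        return utf16Bytes(s) / Long.BYTES;
    }

    public static void main(String[] args) {
        String uuid = "936DA01F-9ABD-4D9D-80C7-02AF85C822A8";
        System.out.println("long:   " + Long.BYTES + " bytes");
        System.out.println("string: " + utf16Bytes(uuid) + " bytes of character data");
        System.out.println("ratio:  " + ratioToLong(uuid) + "x");
    }
}
```

For this UUID the character data comes to 72 bytes against 8 bytes for a long, a factor of 9 before any per-object overhead is counted.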

So you could write a recommender implementation that does the String-to-long mapping transparently
for the user and put it on GitHub. Currently there is a lack of easy-to-understand examples.
I tried to help a little with my facebook-recommender-demo.
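
A minimal sketch of what such a transparent String-to-long mapping could look like, kept in memory only. StringIdMapper and its methods are hypothetical names, not Mahout classes, and the MD5-based hashing is an assumption chosen for illustration; in a real project you would more likely plug an IDMigrator into the recommender instead.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Mahout: hashes string IDs to longs
// and remembers the reverse mapping in memory.
final class StringIdMapper {
    private final Map<Long, String> longToString = new HashMap<>();

    // Derive a long from the first 8 bytes of the MD5 digest and record the mapping.
    long toLongId(String stringId) {
        byte[] digest;
        try {
            digest = MessageDigest.getInstance("MD5")
                    .digest(stringId.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
        long id = 0L;
        for (int i = 0; i < 8; i++) {
            id = (id << 8) | (digest[i] & 0xFF);
        }
        longToString.put(id, stringId);
        return id;
    }

    // Returns null when the long ID was never produced by this mapper.
    String toStringId(long longId) {
        return longToString.get(longId);
    }
}
```

The idea is to call toLongId when loading preferences and toStringId when handing recommendations back, so callers only ever see strings. Collisions between two different strings are very unlikely with 64 bits but not impossible, and this sketch does not handle them.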

/Manuel

On 08.03.2012, at 15:52, Claudia Grieco wrote:

> I understand, but with IDMigrator I still need the memory to store the
> long-string mappings, don't I?
> 
> -----Original Message-----
> From: Sebastian Schelter [mailto:ssc@apache.org]
> Sent: Thursday, 8 March 2012 15:27
> To: user@mahout.apache.org
> Subject: Re: R: Using recommenders with String identifiers
> 
> Here's some details on the memory usage of Strings in Java:
> 
> http://www.javamex.com/tutorials/memory/string_memory_usage.shtml
> 
> On 08.03.2012 14:53, Manuel Blechschmidt wrote:
>> Hello Claudia,
>> the reason longs are used is pure efficiency. When you have a lot of
>> items and a lot of users and you use Strings as identifiers, you will
>> need a lot of memory just to store them. Furthermore, operations such as
>> equals or hash code computation will take longer.
>> 
>> A long takes 8 bytes (64 bits), while a UUID string (e.g.
>> 936DA01F-9ABD-4D9D-80C7-02AF85C822A8) encoded as UTF-16 takes 72 bytes for
>> its characters alone. That means a UUID would consume at least 9x the
>> memory a long takes.
>> 
>> /Manuel
>> 
>> 
>> On 08.03.2012, at 14:27, Claudia Grieco wrote:
>> 
>>> Do you think it's worth the work to change the internal code of Mahout in
>>> order to use string identifiers?
>>> Thanks 
>>> Claudia
>>> 
>>> -----Original Message-----
>>> From: Manuel Blechschmidt [mailto:Manuel.Blechschmidt@gmx.de]
>>> Sent: Monday, 5 March 2012 11:28
>>> To: user@mahout.apache.org
>>> Subject: Re: Using recommenders with String identifiers
>>> 
>>> Hi Claudia,
>>> you have to use an IDMigrator.
>>> 
>>> The following project shows an example:
>>> https://github.com/ManuelB/facebook-recommender-demo
>>> 
>>> 
>>> https://github.com/ManuelB/facebook-recommender-demo/blob/master/src/main/java/de/apaxo/bedcon/FacebookRecommender.java
>>> 
>>> Good luck
>>>   Manuel
>>> 
>>> On 05.03.2012, at 09:53, Claudia Grieco wrote:
>>> 
>>>> Hi guys,
>>>> 
>>>> I'd like to use Mahout to implement a recommender, but I'm encountering a
>>>> problem:
>>>> 
>>>> IDs of items and users are represented in Mahout as long integers, while my
>>>> data comes from an external database that uses strings to identify items and
>>>> users.
>>>> 
>>>> Any suggestion as to how I can fix this problem?
>>>> 
>>>> Thanks a lot
>>>> 
>>>> Claudia
>>>> 
>>> 
>>> -- 
>>> Manuel Blechschmidt
>>> Dortustr. 57
>>> 14467 Potsdam
>>> Mobil: 0173/6322621
>>> Twitter: http://twitter.com/Manuel_B
>>> 
>>> 
>> 
> 

-- 
Manuel Blechschmidt
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B

