mahout-dev mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: Mahout DSL vs Spark
Date Wed, 30 Apr 2014 04:57:45 GMT
No, i think my take is more common. i can't immediately find an
authoritative reference, but there's a definition of vertical and horizontal
concatenation. So i assume it is intuitive to call blocks that produce a
vertical concatenation vertical blocks. [1]

[1]
http://www-rohan.sdsu.edu/doc/matlab/toolbox/simulink/slref/matrixconcatenation.html
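
For readers following the thread, here is a minimal plain-Scala sketch of the two slice orientations under discussion and of the row-block behaviour described for mapBlock, using the 2x3 matrix requested further down. It does not use the Mahout DSL or Spark: `rowRange`, `colRange`, `mapRowBlocks`, and the row-index keys are hypothetical stand-ins chosen for illustration, not Mahout's API.

```scala
// Plain-Scala illustration only; every name below is hypothetical, not Mahout's API.
object BlockOrientationSketch {
  type Matrix = Vector[Vector[Double]]

  // Analogue of A(rows, ::): all columns of the selected rows.
  def rowRange(a: Matrix, rows: Range): Matrix =
    rows.map(a(_)).toVector

  // Analogue of A(::, cols): all rows, restricted to the selected columns.
  def colRange(a: Matrix, cols: Range): Matrix =
    a.map(row => cols.map(row(_)).toVector)

  // Toy stand-in for mapBlock: split the matrix into blocks of FULL rows,
  // apply f to each (row indices, block) pair, and stack the results.
  def mapRowBlocks(a: Matrix, rowsPerBlock: Int)(f: (Vector[Int], Matrix) => Matrix): Matrix =
    a.indices.toVector.grouped(rowsPerBlock).flatMap { keys =>
      val block = keys.map(a(_))
      val out = f(keys, block)
      // The block height must stay the same so the total row count is preserved.
      require(out.length == block.length, "block height must not change")
      out
    }.toVector

  def main(args: Array[String]): Unit = {
    val a: Matrix = Vector(Vector(1.0, 2.0, 3.0), Vector(4.0, 5.0, 6.0))
    println(rowRange(a, 0 to 0)) // first row, all columns
    println(colRange(a, 1 to 2)) // all rows, last two columns
    println(mapRowBlocks(a, 1)((_, block) => block.map(_.map(_ + 1.0))))
  }
}
```

The sketch deliberately avoids the labels "vertical" and "horizontal", since which slice deserves which name is exactly what the thread disputes. It only shows that each block handed to the function consists of full rows and that the function may change values or width but not the block's height.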


On Tue, Apr 29, 2014 at 9:52 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:

> hm. i really did not think of it. i thought vertical blocks are those
> that sit one on top of the other, as if one is building a vertical tower.
>
> let me check what the official math terminology is.
>
>
> On Tue, Apr 29, 2014 at 9:47 PM, Anand Avati <avati@gluster.org> wrote:
>
>>
>>
>>
>> On Tue, Apr 29, 2014 at 9:20 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>>
>>> actually I imply vertical slicing as A(100 to 200, ::). if it is the
>>> other way around it is a typo.
>>>
>>
>> Isn't that counter-intuitive? Isn't the syntax A(row,col), so that
>> A(100 to 200, ::) means all columns of rows 100 through 200, which makes
>> them horizontal slices, no?
>>
>>
>>
>>>
>>> strictly speaking this doc is working notes, not a manual (i.e. i just
>>> filled it in as i went with the design so i don't forget myself). i guess
>>> there's a gap between it and an actual doc. I suggested keeping it for
>>> reference (since it exists) but creating html-based wiki/cms doc pages
>>> instead. this is a todo.
>>>
>>>
>>> On Tue, Apr 29, 2014 at 7:19 PM, Anand Avati <avati@gluster.org> wrote:
>>>
>>>>
>>>> On Mon, Apr 28, 2014 at 11:15 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Apr 28, 2014 at 7:23 PM, Anand Avati <avati@gluster.org> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> I'm not sure I completely understand mapBlock. Can you please give a
>>>>>> concrete example (with a simple 2x3 matrix) of how mapBlock works? I
>>>>>> have a reasonable understanding of how Spark partitions and distributes
>>>>>> the data of its RDDs. Based on that, and knowing how H2O distributes
>>>>>> data, I feel it is a matter of providing thin logic and a wrapper to
>>>>>> make something built on Spark work on H2O. That being said, I want to
>>>>>> make sure I do not misunderstand or make wrong assumptions about
>>>>>> mapBlock, hence the request for a concrete example.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>
>>>>> Anand,
>>>>>
>>>>> concrete examples are given and explained in scala/spark bindings
>>>>> documentation on Mahout website.
>>>>>
>>>>> Also, there's a talk and slides from last Mahout meetup that also
>>>>> discuss Mahout DRM structure and access to it in case of sparkbindings.
>>>>>
>>>>> Come back if you still have questions after that (along with
>>>>> suggestions what can be improved in the docs to make things easier).
>>>>>
>>>>
>>>> Dmitriy,
>>>> Thanks for the link, now I understand what's happening with mapBlock(),
>>>> and it is exactly how I had understood initially (before un-understanding
>>>> :p). I don't see it being a huge problem to provide a mapBlock() over H2O.
>>>> The part which confused me (both in your email and in
>>>> ScalaSparkBindings.pdf) is this -
>>>>
>>>> page 17:
>>>>
>>>> ...
>>>> Vertical block
>>>>   A(::, 100 to 200)
>>>> ...
>>>> mapBlock provides ... "vertical blockified tuples of the matrix"
>>>>
>>>> The term "Vertical block" as a description of A(::, 100 to 200) is
>>>> intuitive and feels "right".
>>>>
>>>> But then when mapBlock is described as presenting "vertical block"ified
>>>> tuples, maybe it is just me, but it sounds as if mapBlock gives you a
>>>> subset of full columns in the form of a Matrix (while it actually provides
>>>> a subset of full rows in the form of a Matrix). It was this interpretation
>>>> of the orthogonal orientation associated with "vertical block"(ified
>>>> tuples) that caused my confusion.
>>>>
>>>> It would be very helpful if the documentation on that page explicitly
>>>> stated that mapBlock presents a subset of full rows. It feels obvious
>>>> looking backwards, but the terminology was confusing initially. It is
>>>> somewhat implied in a later statement "...should not change the height of
>>>> the block, in order to provide correct total matrix row count ...", but
>>>> that wasn't good enough in the first parse.
>>>>
>>>> Thanks!
>>>>
>>>> PS: It might be helpful if
>>>> http://mahout.apache.org/users/sparkbindings/ScalaSparkBindings.pdf is
>>>> made available under doc/ in the repository for future code inspectors.
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
