cocoon-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject [RT] from linkmaps to flowmaps
Date Mon, 05 Mar 2001 12:13:41 GMT
When evaluating the possible integration between web-app frameworks and
C2, I came away with the feeling that 'sitemaps' are a hyperplane
projection of the 'site' multidimensional object, but one that forces
some aspects to be crosscut across resources.

Sitemaps are based on matching: the sitemap is a declarative language
that describes a site and how its resources are generated out of logic
components and their connections.

Sitemaps are great for publishing since all the concerns get separated
cleanly and elegantly for publishing needs (that is, stateless
requests), but they are not so elegant for other types of serving
(stateful requests).

Let's make a geometrical example: take a transparent cube. Draw a red
"+" sign on one side and a green "-" sign on the opposite.

Suppose that you want your friend on the other side of the world to
recreate the exact cube. This requires you to transmit all the
information of the cube, but you only have 2D pictures you can send (no
3D models or such).

Suppose also that these pictures are expensive and you'd like to use the
least number to transmit the information. Of course, the pictures must
completely determine the cube.

It can be easily shown that in this case, one carefully chosen
projection of the cube contains all the required information. The SoC
concept is here translated into "no side overlap". Rotating the cube in
front of the camera gives you at least one position where there is no
overlap (try this at home, if you wish :)

What does this mean for us? Well, the sitemap is a "picture" (hyperplane
projection) of the site "cube" (multidimensional object) where all the
"overlap" (crosscutting aspects) is removed for the drawn sides
(stateless resource serving).

Is this the only possible projection that has such properties? No, it's
just one of the possible ones, but it's one of the few to have been
designed with this notion of SoC right in the core design.

Now, let us add the need for stateful resource serving (that is,
web-apps). This is equivalent (in our cube example) to drawing a blue
"#" sign and a green "$" sign on two other opposite faces.

Can the same picture remove all overlaps and explain the cube to our
friend with no possible problems? It might be the case for some signs,
but generally speaking, no, you need at least two pictures.

This example is very simple but clearly shows what happens when we
project an object onto lower-dimensional surfaces: overlap is generated
and information is lost, thus multiple views (rotations!) are required
to eliminate ambiguities.

Digital 3D modelers know this very well: their tools give them full
ability to move the view, change projection and do a bunch of visual
tricks to help them understand a 3D system on a 2D window (the
screen).

For Cocoon, we are providing only one such view: it would be like
forcing Pixar artists to design their ToyStory characters on one single
view. Clearly a nightmare.

                                  - o -

But let's move on: C1 is so limited in functionality that it is not
even worth commenting on, so let's skip it.

C2 gives you a clear and elegant projection of the site using a
"rotation" (given by the pipeline model and the declarative matching
architecture) that is very effective in separating crosscutting
concerns.

But is it really so?

A while ago, somebody (Jeremy?) outlined the need for "linkmaps", since
"linking" could be identified as a crosscutting aspect of resources.

I've been thinking about this very much lately and I came to the
conclusion that there is some truth in this, even if the concept of
considering linking as crosscutting is not, by itself, useful.

Let me explain: link information can be stored internally or externally.
HTML is only able to express linking information internally, while other
hypermedia systems (like XLink, for example) are able to provide
external linking information.

The web showed that external linking information is not that useful for
massive scalability of hypermedia, but there are cases where it could be
useful.

The concept of external linking is very similar to the relational
concept for RDBMS. Given two sets of resources, another set adds
information about the relations (links) between the two sets.

So, for example, given a bunch of professor home pages and a bunch of
student homepages, a "course" could be seen as a temporally limited
external link set between these resources.

But the distinction between internal links and external links is purely
semantic: a page that internally lists the link to the professor page
and the links to the student pages uses only internal links, yet totally
removes crosscutting.

Google shows that it is also possible to "harvest" link information by
crawling and provide useful information by the analysis of the link
topology.

I have come to the conclusion that simple internal unidirectional
hyperlinks are expressive enough to cover all the hypermedia
requirements and yet maintain SoC if well designed.

Well designed means: 

 1) plan and design the URI space!
 2) estimate link degradation
 3) agglomerate links depending on their order of 
    degradation, possibly on different pages.

Links degrade differently depending on the quality of the information
they convey. Many believe that broken links are a web plague, but I
disagree: expired links are much worse and much more expensive for the
whole web society.

Back to the college course example: a professor might be linked to a
particular course and the year of the course may not be evident. If the
course is now given by another professor and there is no time indication
in the link page, the link is misleading: it conveys degraded
information. In fact, it's even worse than broken.

Linking creates a topology and enhances the information contained in the
system in a way that can scale to billions of pages. Again SoC.

So, let's follow the projection example and ask ourselves: can linkmaps
provide help to isolate the linking aspects and focus more attention on
link design and degradation maintenance?

To answer such a question we must define what a "linkmap" is: a
collection of relations between resources.

XLink provides sufficient expressivity to write such a thing and
XPointer provides sufficient granularity to link very specific parts of
a resource.

But is this really useful?

How does this impact server-side transformations? What "view" of the
resource does the XPointer map?

The act of linking requires two strong contracts: the resource starting
the link and the resource ending the link. For example

  /document/news/today#xpointer(//p[3]) --> /something

or

 /slides/12 --(next)--> /slides/13

where the link has a role.

Is it of any use to manage linking information externally from the
linked resources? Is it any more scalable or less expensive?

I strongly doubt it for publishing needs: in fact, text and links all
reside in the same concern island (that of content) and the degradation
of text is very likely to go along with the degradation of the link
itself. Thus, separating them is not only useless in terms of SoC, but
also harmful since it requires unnecessary operations.

Linkmaps are useful to provide navigation help, provide topological
analysis and the like, but these linkmaps are obtained by crawling the
link space, not the other way around. (Cocoon2 already uses such a link
harvesting system for command line usage)
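
To make the "obtained by crawling" direction concrete, here is a minimal
Python sketch of harvesting a linkmap from internal links. It is only an
illustration: the page URIs, the PAGES dictionary and the function names
are all hypothetical, and this is not how Cocoon2's own link harvesting
is implemented.

```python
from collections import defaultdict
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag found in one page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def harvest_linkmap(pages):
    """Build {source URI: sorted target URIs} from the pages' own links."""
    linkmap = {}
    for uri, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        linkmap[uri] = sorted(set(collector.hrefs))
    return linkmap


def inbound(linkmap):
    """Invert the linkmap: {target URI: list of source URIs}."""
    reverse = defaultdict(list)
    for source, targets in linkmap.items():
        for target in targets:
            reverse[target].append(source)
    return dict(reverse)


# Hypothetical pages: a course page linking a professor and a student.
PAGES = {
    "/courses/2001/xml": '<a href="/people/prof-a">Professor</a> '
                         '<a href="/people/student-b">Student</a>',
    "/people/prof-a": '<a href="/courses/2001/xml">My course</a>',
    "/people/student-b": "<p>No links here.</p>",
}

LINKMAP = harvest_linkmap(PAGES)
```

The linkmap here is derived data: the pages stay the only place where
links are authored, and inbound(LINKMAP) gives the topological view.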

                                 - o -

If linkmaps don't seem that useful for publishing, is the same true for
webapps?

Let me pose a simple yet important consideration: what is a link?

Links are the difference between text and hypertext and provide a bunch
of useful features that mainly touch the SoC paradigm: as long as their
location remains the same (contract), two linked resources can
dynamically evolve with no impact one on the other.

Strangely enough, hypertext is as old as text itself: bibliographies
are, in fact, linkmaps and it is entirely possible (look at
http://citeseer.nj.nec.com for an example) to analyze the link topology
between scientific articles using bibliographies as linkmaps.

So, a link is a context: it states normally "for more information, visit
the linked resource". Links are a sort of in-place-bibliographies++.

I'm sure you smell something burning at this point: is the above really
true?

HTML includes one hypertext semantic, the <a> tag, and later on added a
few behavior modifiers, mostly due to the design deficiencies of the
'frame' concept.

But point&click (one of the most important HCI inventions of the last
decade) associated hyperlink behavior with the notion of "replace the
current screen with the resource linked here".

But this is ENTIRELY ACCIDENTAL! There is nothing in the hyperlink
concept that dictates that you have to clear the current window and
redraw the linked page. This was something invented in Mosaic, along
with the 'back' button, the 'home' button, the 'bookmark', etc.

Links provide contextualization. IE was the first browser to understand
this concept well by the use of link popups, visual helps that guide the
user in the use of the link. Some research at Xerox PARC goes even
further, providing more useful pull-down information on the link itself,
presented in-place but extracted from the linked resource.

But 99.9% of the web population is so damn used to the idea that linking
means jumping that nobody even questions the notion.

Why am I bothering so much?

Well, since links provide navigation, thus flow, many don't even see the
difference.

I've come to the conclusion that this implicit aspect overlap between
navigation and flow is creating much of the friction that is perceived
when "using" webapps, and that it is directly translated into friction
during development, which makes the effort more frustrating and
expensive.

The solution I envision is the creation of a "flowmap" where the web-app
designer is not focused on intermediate locations (resources), but on
the flow of the application, much like procedural languages let you
program by writing a flow, not filling memory slots declaratively with
instructions.

The 'flow' of an application cannot be determined by looking at the
sitemap. The flow aspect of the site is crosscutting between sitemaps
and resources.

Here, the creation of solid URI contracts to enforce the overlap between
navigation and flow can help, but it's useless to provide a solid URI if
it cannot be bookmarked, indexed or referenced because it is completely
stateful, and thus meaningful only if the requester provides information
on its identity.

Following this train of thought, I have the perception that actions
belong to the flowmap rather than the sitemap: they are a great concept,
but they are used in the wrong place and this is evident since the
reusability of actions is normally much less than the reusability of
other sitemap components.

                                 - o -

So, what is a flowmap?

Take a piece of paper, avoid thinking you are writing an application for
the web, so avoid taking into consideration technical constraints (like
the request/response paradigm, intrinsic statelessness, low bandwidth,
proxying, etc..) and draw a diagram of the latest application you wrote.

What you should outline, at this stage, is the 'skeleton' of the flow:
this should be drawn to give the other people in the team the overall
picture of what the application is doing.

So, let's see, let's take something like slashdot.org:
 

         identification
               ^
               |
               v
 (enter) ---> home ---> (exit)
               ^
               |
               +--> read article <-()-> post comment
               | 
               +-()-> set preferences
               | 
               +-()-> add article
               |
               +-(editor)-> accept submission

where:

 (enter) indicates the entrance of the application
 (exit) guess what? :)
 ---> indicates a 'general flow' [no restriction on identity]
 <---> indicates a 'bidirectional flow' 
 -(???)-> indicates a 'restricted flow'
 -()-> indicates an 'identified flow'

An 'identified flow' requires that the user is identified.

A 'restricted flow' is a specialized 'identified flow' that requires
identification and membership in a specific identification group which
has enough rights to enter.

We can start noting a few properties of a flowmap:
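
As a rough illustration of the notation, here is a minimal Python sketch
that encodes a few of the arrows from the diagram and checks whether a
requestor may follow them. The Flow class, its field names and the user
names are hypothetical, invented only for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Flow:
    """One arrow of the flowmap."""
    source: str
    target: str
    identified: bool = False      # the -()-> arrow
    group: Optional[str] = None   # the -(group)-> arrow

    def allows(self, user: Optional[str],
               groups: FrozenSet[str] = frozenset()) -> bool:
        """True if this requestor may follow the arrow.

        user is None for an anonymous requestor.
        """
        if self.group is not None:
            # Restricted: identification plus group membership.
            return user is not None and self.group in groups
        if self.identified:
            return user is not None
        return True  # general flow, no restriction on identity


# A few arrows from the slashdot-like diagram, keyed by target stage.
FLOWS = {
    "read article": Flow("home", "read article"),
    "set preferences": Flow("home", "set preferences", identified=True),
    "add article": Flow("home", "add article", identified=True),
    "accept submission": Flow("home", "accept submission", group="editor"),
}
```

With this encoding, FLOWS["accept submission"].allows("bob",
frozenset({"editor"})) holds, while the same call for a user outside
the "editor" group, or for an anonymous requestor, does not.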

1) there must be no end stages, the flowmap must describe a fully
connected graph between the (enter) and (exit) dummy stages.

This is very important: the absence of such a fully connected graph does
*NOT* make the application unusable, but this is only because of the
"back" functionality in your browser.

In many devices, the back button might not be available, and, in any
case, the back button is an outside feature of the browsing experience:
it should only be needed when the user makes a mistake, and the
application should not force him to use it by leaving him with no exit.

In short, a fully connected flowmap avoids the 'dead end' problem.
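
The 'dead end' check can be sketched in a few lines of Python. This is
only one possible reading of property 1 (every stage must be reachable
from (enter) and must be able to reach (exit)); the function names and
the two example graphs are hypothetical.

```python
from collections import deque


def reachable(graph, start):
    """Breadth-first search: all stages reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def problem_stages(graph, enter="enter", exit_="exit"):
    """Stages that are not on any path from enter to exit."""
    stages = set(graph) | {t for targets in graph.values() for t in targets}
    # Reverse every arrow so we can ask "who can reach exit?".
    reverse = {stage: set() for stage in stages}
    for source, targets in graph.items():
        for target in targets:
            reverse[target].add(source)
    from_enter = reachable(graph, enter)
    to_exit = reachable(reverse, exit_)
    return sorted(s for s in stages
                  if s not in from_enter or s not in to_exit)


# A flowmap where the user can only leave some stages with "back".
ONE_WAY = {
    "enter": ["home"],
    "home": ["exit", "read article", "set preferences"],
    "read article": ["post comment"],
}

# The same flowmap with explicit ways back to home.
CONNECTED = {
    "enter": ["home"],
    "home": ["exit", "read article", "set preferences"],
    "read article": ["post comment", "home"],
    "post comment": ["read article"],
    "set preferences": ["home"],
}
```

problem_stages(ONE_WAY) reports the three stages the user cannot leave
without the browser's help; problem_stages(CONNECTED) reports none.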

2) the size of the flowmap represents a good estimate of the web-app
complexity and gives a fuzzy metric.

This is very important: while the sitemap size gives a fuzzy metric of
the publishing effort, and its growth over time indicates how the site
scales in production, the flowmap size gives an estimate of the effort
required to implement the application.

3) the flowmap gives an immediate view of the application and, most
importantly, immediately identifies the 'restricted areas' and the
groups that are allowed to enter them.

This creates a contract between the administration concern island (who
is in charge of designing the restrictions) and the logic concern island
(who is in charge of implementing them).

                                 - o -

This is already long enough so I'll stop here.

A few questions need to be answered and I'd like to hear your comments
on them:

1) can the flowmap be used not only for description, but directly for
design like we do for the sitemap?

2) if so, is there a way to completely remove aspect overlap between
sitemaps and flowmaps and make them totally independent given a few
solid contracts between the two?

3) if so, which contracts? is URI space enough?

4) where would actions be described?

5) whose concern would flowmap maintenance be?

6) would the introduction of the flowmap concept require changes in
Cocoon's core design?

Ok, your turn now :)

Thanks for your patience.
                
-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<stefano@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------

