sling-users mailing list archives

From sam ” <skyn...@gmail.com>
Subject Re: Sling URL Mapping Questions
Date Tue, 07 Feb 2012 15:15:53 GMT
I do it like this:

/etc/map/adobe/sling:match = http/adobe.example.com/(.+)
/etc/map/adobe/sling:internalRedirect = /content/adobe/$1
/etc/map/day/sling:match = http/day.example.com/(.+)
/etc/map/day/sling:internalRedirect = /content/day/$1


Then I find every <a href="/content/adobe/foo/bar.html"> and transform it to
<a href="http://adobe.example.com/foo/bar.html">. Likewise,
<a href="/content/day/foo/bar.html"> becomes
<a href="http://day.example.com/foo/bar.html">. I do this with a
TransformerFactory:

@Component(immediate = true, label = "canonical url stuff")
@Service
@Properties({
    @Property(name = "pipeline.mode", value = "global"),
    @Property(name = "service.ranking", intValue = -1)
})
public class CanonicalHrefFactory implements TransformerFactory {
    @Override
    public Transformer createTransformer() {
        return new CanonicalHref();
    }

    private static class CanonicalHref extends ContentHandlerDelegate {
        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
            final ContentHandler contentHandler = getContentHandler();
            // Only <a> elements get their attributes rewritten; everything else passes through unchanged.
            final Attributes modified = "a".equalsIgnoreCase(localName)
                    ? DO_THE_HREF_URL_REWRITE_HERE_TO_YOUR_LIKING(attributes)
                    : attributes;
            contentHandler.startElement(uri, localName, qName, modified);
        }
    }
}


In DO_THE_HREF_URL_REWRITE_HERE_TO_YOUR_LIKING(), I could look into /etc/map
and find a suitable entry by comparing each node's sling:internalRedirect
property against the href. Once the longest match is found, I can parse the
sling:match of that node to get the hostname. But, as you can see, a
sling:match regex might not contain a single hostname. It could be something
like
http/(adobe|day)\.com/(.+)
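
A minimal sketch of that longest-match lookup, assuming the /etc/map entries have already been read into a plain map from sling:internalRedirect prefix to external base URL. The class name, the map, and toExternal() are hypothetical and only illustrate the idea; this is not Sling API, and it covers only the simple case where each entry has one fixed hostname.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LongestRedirectMatch {
    // Hypothetical flattened view of /etc/map:
    // sling:internalRedirect prefix -> external base URL taken from sling:match.
    static final Map<String, String> REDIRECTS = new LinkedHashMap<>();
    static {
        REDIRECTS.put("/content/adobe", "http://adobe.example.com");
        REDIRECTS.put("/content/day", "http://day.example.com");
    }

    // Returns the external URL for a resource path, or the path unchanged
    // when no internalRedirect prefix matches.
    static String toExternal(String resourcePath) {
        String bestPrefix = null;
        for (String prefix : REDIRECTS.keySet()) {
            // Match whole path segments only, so /content/day does not match /content/daytime.
            boolean matches = resourcePath.equals(prefix)
                    || resourcePath.startsWith(prefix + "/");
            if (matches && (bestPrefix == null || prefix.length() > bestPrefix.length())) {
                bestPrefix = prefix;
            }
        }
        if (bestPrefix == null) {
            return resourcePath;
        }
        return REDIRECTS.get(bestPrefix) + resourcePath.substring(bestPrefix.length());
    }
}
```

With the two entries above, toExternal("/content/adobe/foo/bar.html") yields http://adobe.example.com/foo/bar.html, and a path outside both prefixes is returned as-is.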

I don't think
org.apache.sling.jcr.resource.internal.JcrResourceResolver.resolve(HttpServletRequest,
String) is injective: URLs a, b, and c can all resolve to d, so the inverse
of that isn't a function.
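
To make the non-injectivity concrete, here is a toy model using plain java.util.regex rather than Sling's resolver. It takes the sling:match pattern quoted above and assumes a hypothetical sling:internalRedirect of /content/site/$2 (the thread does not say what the redirect for that pattern would be). Two different hosts then land on the same internal path, so the hostname cannot be recovered from the path alone.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NonInjectiveDemo {
    // The sling:match pattern from the message, in scheme/host/path form.
    static final Pattern MATCH = Pattern.compile("http/(adobe|day)\\.com/(.+)");

    // Toy forward mapping: assumes an internalRedirect of /content/site/$2.
    // Returns null when the input does not match the pattern.
    static String resolve(String schemeHostPath) {
        Matcher m = MATCH.matcher(schemeHostPath);
        if (!m.matches()) {
            return null;
        }
        // Group 1 (the host) is dropped here, which is what loses information.
        return "/content/site/" + m.group(2);
    }
}
```

Both resolve("http/adobe.com/foo.html") and resolve("http/day.com/foo.html") give /content/site/foo.html, so any reverse mapping would have to pick a host by some extra rule.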



If siteA never links to siteB, you can just implement TransformerFactory and
strip the /content/<siteName> prefix from each href (as long as your
repository is structured consistently).
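
A sketch of what that prefix-stripping rewrite could look like as the body of the href rewrite method. The class and method names are hypothetical; the only library calls are the standard org.xml.sax ones shipped with the JDK, and the sketch assumes every site lives directly under /content/<siteName>.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

final class HrefPrefixStripper {
    // Drops a leading /content/<siteName> segment pair, e.g.
    // /content/adobe/foo/bar.html -> /foo/bar.html. Other hrefs are left alone.
    static String stripSitePrefix(String href) {
        return href.replaceFirst("^/content/[^/]+(?=/)", "");
    }

    // Returns a copy of the attributes with the href value rewritten, if present.
    static Attributes rewriteHref(Attributes attributes) {
        AttributesImpl copy = new AttributesImpl(attributes);
        int index = copy.getIndex("href");
        if (index >= 0) {
            copy.setValue(index, stripSitePrefix(copy.getValue(index)));
        }
        return copy;
    }
}
```

Absolute URLs and paths outside /content are untouched because the pattern is anchored at the start of the href.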

On Tue, Feb 7, 2012 at 7:38 AM, David Gonzalez <davidjgonzalez@gmail.com> wrote:

> Sam, doesn't /etc/map require a root mapping, which can't be a regex
> (at least not for outgoing mapping)? How would I structure
> the /etc/map nodes to match only on the resource path? Would I just put
> the resource mapping directly under the scheme (http) node in lieu of the
> root mapping?
>
> Thanks
>
>
>
> On Feb 7, 2012, at 7:18 AM, "sam ”" <skynare@gmail.com> wrote:
>
> > You can do the rewrite in the HTTP server.
> >
> > For the URLs appearing in HTML, you can use the rewriter:
> > http://sling.apache.org/site/output-rewriting-pipelines-orgapacheslingrewriter.html
> >
> > Or, since your mappings are simple, you can roll your own utility that
> > walks /etc/map for sling:internalRedirect and finds the longest
> > internalRedirect that matches the resourcePath.
> > Once found, you can construct the URL from there.
> >
> >
> > On Mon, Feb 6, 2012 at 10:42 PM, David G. <davidjgonzalez@gmail.com> wrote:
> >
> >> Hey,
> >>
> >> I'm using dispatcher running under httpd as cache.
> >>
> >> One of the things I am trying to get around is serving pages from the
> >> usual /content/<site>/<lang>/page.html structure.
> >>
> >> I need to validate, but I think I could
> >>
> >> 1) handle incoming rewrites: mysite.com/page.html >
> >> /content/mysite/en/page.html
> >> 2) use the JCR Resource Resolver mappings to rewrite all my in-page links
> >> to point at /page.html
> >>
> >> I haven't looked at the source code to see why sling can't handle
> >> bi-directional mapping when using regex (it seems like it should be able
> >> to, but I must be missing something).
> >>
> >> Thanks
> >>
> >> --
> >> David Gonzalez
> >> Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
> >>
> >>
> >> On Monday, February 6, 2012 at 12:29 PM, James Stansell wrote:
> >>
> >>> On Mon, Feb 6, 2012 at 5:26 AM, David Gonzalez <davidjgonzalez@gmail.com> wrote:
> >>>
> >>>> Does mod-rewrite support rewriting all the links in the documents
> >>>> returned in the response?
> >>>>
> >>>
> >>>
> >>> Probably not. In fact right now a lot of our links are
> >>> /content/<site>/en/page.html and we have a rewrite rule which gives a
> >>> redirect to /page.html.
> >>>
> >>> It should be possible to use a sling filter to modify the links when
> >>> serving the page but we haven't looked into that yet.
> >>>
> >>>
> >>>> Have you seen perf hits doing this? (I'm assuming every html response
> >>>> must be parsed and rewritten.)
> >>>>
> >>>
> >>>
> >>> As far as I know our performance concerns are in other areas. Our sling is
> >>> actually part of CQ5 so we already were using httpd in order to host the
> >>> dispatcher plugin for caching the pages. Plus we are using mod_rewrite for
> >>> rewriting 1000s of legacy URLs so I don't think we ever considered another
> >>> option.
> >>>
> >>>
> >>>> Are there any gotchas with mod_rewrite that you've run into rewriting
> >>>> incoming and outgoing urls?
> >>>>
> >>>
> >>>
> >>> Our biggest problems have been with the legacy URLs. I guess a general
> >>> gotcha could be the regexes for the rewrite; not thinking of anything else.
> >>>
> >>> If we were using plain sling we would probably be caching with varnish. I
> >>> wonder if that has any rewrite support? Are you using a web cache?
> >>>
> >>>
> >>
> >>
> >>
>
