hc-httpclient-users mailing list archives

From Brent Putman <putm...@georgetown.edu>
Subject Parsing of Link header elements containing query parameters
Date Fri, 11 Mar 2016 04:10:55 GMT
Hi,
I'm working with a REST API which returns a Link entity header to
indicate "rel" links (previous, next, etc.) for pagination over more
results than are returned in a single call.  In their docs they
specifically reference this very outdated (and non-standard) spec [1],
but it seems to be quite similar to the more current RFC 5988 [2].

The individual URI values in the Link header value contain query
parameters.  Here is the HC library wire trace of the entire header:

2016-03-10 22:36:31.354 [DEBUG] : org.apache.http.wire: http-outgoing-0
<< "Link:
<https://georgetown.test.instructure.com/api/v1/accounts/self/users?page=1&per_page=10>;
rel="current",<https://georgetown.test.instructure.com/api/v1/accounts/self/users?page=2&per_page=10>;
rel="next",<https://georgetown.test.instructure.com/api/v1/accounts/self/users?page=1&per_page=10>;
rel="first",<https://georgetown.test.instructure.com/api/v1/accounts/self/users?page=559&per_page=10>;
rel="last"[\r][\n]"


When I attempt to extract this header from the HttpResponse and display
the individual element values using code similar to:

Header linkHeader = httpResponse.getFirstHeader("Link");
// getFirstHeader returns null when the response carries no Link header
if (linkHeader != null) {
    for (HeaderElement element : linkHeader.getElements()) {
        System.out.println("Saw HeaderElement: " + element);
        System.out.println("HeaderElement name: " + element.getName());
        System.out.println("HeaderElement value: " + element.getValue());
    }
}


I'm seeing output such as:

Saw HeaderElement:
<https://georgetown.test.instructure.com/api/v1/accounts/self/users?page=1&per_page=10>;
rel=current
HeaderElement name:
<https://georgetown.test.instructure.com/api/v1/accounts/self/users?page
HeaderElement value: 1&per_page=10>


So, it's splitting on the first '=' character to determine the element
name vs value, which looks odd.  And there doesn't seem to be a way in
the API to get the value of the HeaderElement minus the parameters.

Is this:
1) A bug in HttpClient's HeaderElement parsing?
2) A mistake on the part of the server sending these particular URL
values (i.e. perhaps should be encoded in some way)?
3) Neither: perhaps, given knowledge of the specific header syntax and
semantics, the name/value API is not appropriate for it, and I need to
handle these values manually by, for example:
     A) Stitching the URI back together manually as the name + "=" + value
     B) Splitting the HeaderElement#toString() on the semi-colon

#3 makes me nervous at the moment since I don't fully understand the
issues at hand.
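
For reference, here is roughly what I imagine option 3 would look like if
done by hand. This is only a minimal sketch, not a full RFC 5988 parser: it
bypasses HeaderElement and works on the raw header value (for example from
Header#getValue()). The class and method names are hypothetical, and it
assumes each link has a single rel value, no escaped quotes, and no
semicolons inside quoted parameter values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of HttpClient.
class LinkHeaderSketch {

    /** Maps each rel value to its target URI, in header order. */
    static Map<String, String> parseLinks(String headerValue) {
        Map<String, String> links = new LinkedHashMap<>();
        int pos = 0;
        while (true) {
            // The target URI is everything between '<' and the next '>',
            // so '=' and '&' in the query string are left untouched.
            int open = headerValue.indexOf('<', pos);
            if (open < 0) break;
            int close = headerValue.indexOf('>', open + 1);
            if (close < 0) break; // malformed value: stop parsing
            String uri = headerValue.substring(open + 1, close);

            // Parameters run until the next comma outside a quoted string.
            int end = close + 1;
            boolean quoted = false;
            while (end < headerValue.length()) {
                char c = headerValue.charAt(end);
                if (c == '"') {
                    quoted = !quoted;
                } else if (c == ',' && !quoted) {
                    break;
                }
                end++;
            }

            // Simplification: split parameters on ';' without quote handling.
            for (String param : headerValue.substring(close + 1, end).split(";")) {
                int eq = param.indexOf('=');
                if (eq < 0) continue; // empty piece or parameter without a value
                String name = param.substring(0, eq).trim();
                String value = param.substring(eq + 1).trim();
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                if (name.equalsIgnoreCase("rel")) {
                    links.put(value, uri);
                }
            }
            pos = end + 1;
        }
        return links;
    }

    public static void main(String[] args) {
        String value = "<https://example.invalid/users?page=2&per_page=10>; rel=\"next\"";
        System.out.println(parseLinks(value).get("next"));
    }
}
```

With something like this, the "next" page would be looked up as
parseLinks(linkHeader.getValue()).get("next"), which avoids having to stitch
the URI back together from the element name and value.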

I'm trying to read through relevant HTTP specs to better understand the
nuance of the header value syntax.  But I know there are people on the
list who are knowledgeable on the specs and may have a quick answer, so
wanted to pose the question in the meantime.

Thanks,
Brent

[1] http://www.w3.org/Protocols/9707-link-header.html
[2] https://tools.ietf.org/html/rfc5988

