Content Transformation Guidelines 1.0
Would it perhaps be better to give this specification a more informative title, or at least some sort of informative subtitle? The phrase "Content Transformation" sounds to an uninitiated reader as if it could apply to anything from the use of the data manipulation language (e.g. SQL) in a database management system, to the use of XSLT, or the SAX or DOM interfaces, to transform XML documents, to the use of dynamic HTML techniques to transform data in the browser. Perhaps "Mobile Web Content Transformation"? Or "Content Transformation for Mobile Presentation"? Surely there are ways of making it easier for potential readers to see whether the document is relevant to their concerns or not. This isn't the first W3C spec to have such a generic title; the experience of the XML Schema specification, however, leads me to commend to you urgently the wisdom of having a more specific, more informative, less generic title for your document.
--Michael Sperberg-McQueen, W3C XML Activity
LC-2018 from C. M. Sperberg-McQueen <cmsmcq@acm.org>
Copyright © 2008 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This document provides guidance to content transformation proxies and content providers as to how to inter-work when delivering Web content.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is a Last Call Working Draft of Content Transformation Guidelines 1.0, expected to become a W3C Recommendation. The W3C Membership and other interested parties are invited to review the document and send comments to public-bpwg-comments@w3.org (with public archive) through 16 September 2008.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document has been produced by the Mobile Web Best Practices Working Group as part of the Mobile Web Initiative.
Since its publication as a First Public Working Draft on 14 April 2008, the Content Transformation Guidelines 1.0 document has been almost entirely re-written. The guidelines were extended, made more precise and re-worded for clarity. In particular:
"X-Device-"<original header name> HTTP header was confirmed.
link element were detailed.
The Working Group notes that it has already identified a guideline considered to be at risk: the notification of users as defined in section 4.1.4 Serving Cached Responses may turn out to be difficult to implement from a user experience point of view and may be removed in future versions of the document based on the feedback received.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
1 Introduction (Non-Normative)
1.1 Purpose
1.2 Audience
1.3 Scope
1.4 Summary of Requirements
2 Terminology (Normative)
2.1 Types of Proxy
2.2 Types of Transformation
3 Conformance (Normative)
3.1 Classes of Product
3.2 Normative and Informative Parts
3.3 Normative Language for Conformance Requirements
3.4 Content Deployment Conformance
3.5 Transformation Deployment Conformance
4 Behavior of Components (Normative)
4.1 Proxy Forwarding of Request
4.1.1 Applicable HTTP Methods
4.1.2 no-transform directive in Request
4.1.3 Treatment of Requesters that are not Web browsers
4.1.4 Serving Cached Responses
4.1.5 Alteration of HTTP Header Values
4.1.5.1 Content Tasting
4.1.5.2 Avoiding "Request Unacceptable" Responses
4.1.5.3 User Selection of Restructured Experience
4.1.5.4 Sequence of Requests
4.1.5.5 Original Headers
4.1.6 Additional HTTP Headers
4.1.6.1 Proxy Treatment of Via Header
4.2 Server Response to Proxy
4.2.1 Use of HTTP 406 Status
4.2.2 Server Origination of Cache-Control: no-transform
4.2.3 Varying Representations
4.2.3.1 Use of Vary HTTP Header
4.2.3.2 Indication of Intended Presentation Media Type of Representation
4.3 Proxy Forwarding of Response to User Agent
4.3.1 Receipt of Cache-Control: no-transform
4.3.2 Receipt of Warning: 214 Transformation Applied
4.3.3 Server Rejection of HTTP Request
4.3.4 Receipt of Vary HTTP Header
4.3.5 Link to "handheld" Representation
4.3.6 Proxy Decision to Transform
4.3.6.1 Alteration of Response
4.3.6.2 HTTPS Link Re-writing
5 Testing (Normative)
A References
B Example Transformation Interactions (Non-Normative)
B.1 Basic Content Tasting by Proxy
B.2 Optimization based on Previous Server Interaction
B.3 Optimization based on Previous Server Interaction, Server has Changed its Operation
B.4 Server Response Indicating that this Representation is Intended for the Target Device
B.5 Server Response Indicating that another Representation is Intended for the Target Device
C Applicability to Transforming Solutions which are Out of Scope (Non-Normative)
D Scope for Future Work (Non-Normative)
D.1 POWDER
D.2 link HTTP Header
D.3 Sources of Device Information
D.4 Inter Proxy Communication
D.5 Amendment to and Refinement of HTTP
E Administrative Arrangements (Non-Normative)
F Acknowledgments (Non-Normative)
1 Introduction (Non-Normative)
Introduction: "From the point of view of this document, Content Transformation is the manipulation in various ways, by proxies, of requests made to and content delivered by an origin server with a view to making it more suitable for mobile presentation." I had to read the sentence three times in order to understand it. I think the sentence can be simplified. Proposed Amendment: Convert the sentence into something clearer.
LC-2012 from JOSE MANUEL CANTERA FONSECA <jmcf@tid.es>
From the point of view of this document, Content Transformation is the manipulation in various ways, by proxies, of requests made to and content delivered by an origin server with a view to making it more suitable for mobile presentation.
The W3C Mobile Web Best Practices Working Group neither approves nor disapproves of Content Transformation, but recognizes that it is being deployed widely across mobile data access networks. The deployments differ widely from each other, with many non-standard HTTP implications, and with no well-understood means either of identifying the presence of such transforming proxies or of controlling their actions. This document establishes a framework to allow that to happen.
The overall objective of this document is to provide a means, as far as is practical, for users to be provided with at least a "functional user experience" [Device Independence Glossary] of the Web, when mobile, taking into account the fact that an increasing number of content providers create experiences specially tailored to the mobile context which they do not wish to be altered by third parties. Equally it takes into account the fact that there remain a very large number of Web sites that do not provide a functional user experience when perceived on many mobile devices.
The audience for this document is creators of Content Transformation proxies, purchasers and operators of such proxies and content providers whose services may be accessed by means of such proxies.
The recommendations in this document refer only to "Web browsing" - i.e. access by user agents that are intended primarily for interaction by users with HTML Web pages (Web browsers) using HTTP. Clients that interact with proxies using mechanisms other than HTTP (and that typically involve the download of a special client) are out of scope, and are considered to be a distributed user agent. Proxies which are operated in the control of or under the direction of the operator of an origin server are similarly considered to be a distributed origin server and hence out of scope.
The BPWG is not chartered to create new technology - its role is to advise on best practice for use of existing technology. In satisfying Content Transformation requirements, existing HTTP headers, directives and behaviors must be respected, and as far as is practical, no extensions to [RFC 2616 HTTP] are to be used.
This section summarizes what the actors (users, user agents, transforming proxies and origin servers) need to communicate to each other. It is recognized that several transformation proxies may be present but their interactions are not discussed in detail. The relevant scenario is as follows:
The needs of these actors are as follows:
The user agent needs to be able to tell the Content Transformation proxy and the origin server:
what type of mobile device and which user agent is being used;
that all Content Transformation should be avoided.
The Content Transformation proxy needs to be able to tell the origin server:
that some degree of Content Transformation (restructuring and recoding) can be performed;
that content is being requested on behalf of something else and what that something else is;
that the request headers have been altered and what the original ones were.
The origin server needs to be able to tell the Content Transformation proxy:
that it varies the representation of its responses according to device type and other factors;
that it is not permissible to perform Content Transformation;
that it has media-specific representations;
that it is unable or unwilling to deal with the request in its present form.
The Content Transformation proxy needs to be able to tell the user agent:
that it has applied transformations of various kinds to the content.
The Content Transformation proxy needs to be able to interact with the user:
to allow the user to disable its features;
to alert the user to the fact that it has transformed content and to allow access to an untransformed representation of the content.
Note:
A more extensive discussion of the requirements for these guidelines can be found in "Content Transformation Landscape" [CT Landscape].
2.1 Types of Proxy
* Section 2.1 - "Alteration of HTTP requests and responses is not prohibited by HTTP other than in the circumstances referred to in [RFC2616 HTTP] Section 13.5.2." This isn't true; section 14.9.5 needs to be referenced here as well.
LC-2066 from Mark Nottingham <mnot@mnot.net>
Alteration of HTTP requests and responses is not prohibited by HTTP other than in the circumstances referred to in [RFC 2616 HTTP] Section 13.5.2.
HTTP defines two types of proxy: transparent proxies and non-transparent proxies. As discussed in [RFC 2616 HTTP] Section 1.3, Terminology:
"A transparent proxy is a proxy that does not modify the request or response beyond what is required for proxy authentication and identification. A non-transparent proxy is a proxy that modifies the request or response in order to provide some added service to the user agent, such as group annotation services, media type transformation, protocol reduction, or anonymity filtering. Except where either transparent or non-transparent behavior is explicitly stated, the HTTP proxy requirements apply to both types of proxies."
This document elaborates the behavior of non-transparent proxies, when used for Content Transformation in the context discussed in [CT Landscape].
2.2 Types of Transformation
1) Section 2.2.1: The CTG distinguishes between restructuring, recoding and optimization. This is a useful approach, and the distinction could be used more systematically across the document. However, without a formal definition of these terms, various parties are left with too much leeway when classifying some operations into one or the other of the categories. This may entail inconsistencies regarding the interpretation of the guidelines. The guidelines should:
a) Define formally each of the three categories, possibly on the basis of language theory. As an example, optimization seems to be related to equivalent token streams (for textual content), whereas recoding seems to deal with equivalent parse trees. Some operations are reversible, others are not. The W3C is home to technologies such as XSLT, so there should be competence there to help ground definitions on solid formal concepts. Basing such definitions on formal language theory is a suggestion, not a requirement; other formally grounded definitions are possible.
b) Define exactly how to classify an operation that spans several categories. As an example, converting HTML to XHTML while at the same time eliminating comments and redundant white space should amount to a recoding.
LC-2050 from casays <casays@yahoo.com>
Transforming proxies can carry out a wide variety of operations. In this document we categorize these operations as follows:
Alteration of Requests
Transforming proxies process requests in a number of ways, especially replacement of various request headers to avoid HTTP 406 Status responses (if a server can not provide content that is compatible with the original HTTP request headers) and at user request.
Alteration of Responses
There are three classes of operation on responses:
Restructuring content
Recoding content
Optimizing content
The Content Transformation Guidelines specification has two classes of products:
A Content Deployment is the provision of resources intended for retrieval by user agents. Provisions that are applicable to a Content Deployment are identified in this document by use of the terms "origin server", "server" and "Web site" in the singular or plural.
A Transformation Deployment is the provision of non-transparent components in the path of HTTP requests and responses. Provisions that are applicable to a Transformation Deployment are identified in this document by use of the term "transforming proxy" or "proxy" in the singular or plural.
Normative parts of this document are identified by the use of "(Normative)" following the section name. Informative parts are identified by use of "(Non-Normative)" following the section name.
The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this Recommendation have the meaning defined in [RFC 2119].
3.4 Content Deployment Conformance
* Section 3.4 / 3.5 "A [Content|Transformation] Deployment conforms to these guidelines if it follows the statements..." What does "follows" mean here -- if they conform to all MUST level requirements? SHOULD and MUST?
LC-2067 from Mark Nottingham <mnot@mnot.net>
A Content Deployment conforms to these guidelines if it follows the statements in 4.2 Server Response to Proxy.
A Transformation Deployment conforms to these guidelines if it follows the statements in 4.1 Proxy Forwarding of Request, 4.3 Proxy Forwarding of Response to User Agent and 5 Testing (Normative).
4.1 Proxy Forwarding of Request
Also, I see that CTG does not mention "whitelists". I think it should, since many transcoders manage that. The rule (consistently with the concept that transcoders must err on the side of not transcoding) should be that whitelists can only specify which potentially mobile sites can be forced to be transcoded (and not the other way around as happens to be common today, thus potentially forcing mobile developers to ask operators in different countries to whitelist their service, which is of course unacceptable).
LC-2003 from Luca Passani <passani@eunet.no>
4.1.1 Applicable HTTP Methods
"Proxies should not intervene in methods other than GET, POST, HEAD and PUT." I can't think of any good reason for that. If a request using an extension method wants to avoid transformation, it can always include the no-transform directive.
LC-2034 from Mark Baker <distobj@acm.org>
1) Section 4.1.1 Add to the section: Proxies MUST NOT convert POST methods into GET ones, or vice-versa. Rationale: This kind of transformation may make exchanges between clients and servers inoperative. In particular, this kind of substitution has been known to cause problems for content downloading applications in the mobile Web.
LC-2019 from casays <casays@yahoo.com>
Proxies should not intervene in methods other than GET, POST, HEAD and PUT.
User agents sometimes issue HTTP HEAD requests in order to determine if a resource is of a type and/or size that they are capable of handling. A transforming proxy may convert a HEAD request into a GET request (in order to determine the characteristics of a transformed response that it would return if the user agent subsequently issued a GET request for the same resource).
If the HTTP method is altered from HEAD to GET, proxies should (providing such action is in accordance with normal HTTP caching rules) cache the response so that a second GET request for the same content is not required (see also 4.1.4 Serving Cached Responses).
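The HEAD-to-GET conversion and the caching recommendation above can be sketched as follows. This is a minimal illustration, not an implementation prescribed by these guidelines: the function name handle_head, the callables fetch_get and is_cacheable, the dictionary used as a cache and the (status, headers, body) tuple are all hypothetical, and a real proxy would apply the full HTTP caching rules rather than a single predicate.

```python
def handle_head(url, cache, fetch_get, is_cacheable):
    """Answer a HEAD request by issuing a GET upstream and caching the result.

    cache        -- dict mapping URL to a (status, headers, body) tuple (assumed)
    fetch_get    -- hypothetical callable that performs the upstream GET
    is_cacheable -- hypothetical predicate standing in for HTTP caching rules
    """
    response = cache.get(url)
    if response is None:
        # The HEAD request is converted into a GET so that the proxy can
        # inspect the response it would later transform.
        response = fetch_get(url)
        if is_cacheable(response):
            # Keep the response so that a subsequent GET for the same
            # resource does not require a second upstream request.
            cache[url] = response
    status, headers, _body = response
    # A response to HEAD carries headers only, never a body.
    return status, headers, b""
```

With this arrangement a later GET for the same URL can be answered from the cache, which is the behavior the paragraph above recommends when normal caching rules permit it.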
4.1.2 no-transform directive in Request
* Section 4.1.2 "If the request contains a Cache-Control: no-transform directive proxies must forward the request unaltered to the server, other than to comply with transparent HTTP behaviour and as noted below." I'm not sure what this sentence means.
LC-2068 from Mark Nottingham <mnot@mnot.net>
If the request contains a Cache-Control: no-transform directive
proxies must forward the request unaltered to the server,
other than to comply with transparent HTTP behavior and as noted below (see 4.1.6 Additional HTTP Headers).
Note:
An example of the use of Cache-Control: no-transform is the
issuing of asynchronous HTTP requests, perhaps by means of
XMLHTTPRequest [XHR], which may include such a
directive in order to prevent transformation of both the request and the
response.
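As an illustration of the check a proxy would need to make, the sketch below tests whether a request carries a Cache-Control: no-transform directive. It assumes that headers are available as a list of (name, value) pairs; the function name has_no_transform is hypothetical, and the comma splitting is deliberately naive (it does not handle commas inside quoted strings).

```python
def has_no_transform(headers):
    """Return True if any Cache-Control header carries no-transform.

    headers -- list of (name, value) pairs; this representation is an
               assumption made for the sketch.
    """
    for name, value in headers:
        # Header field names are case-insensitive.
        if name.lower() != "cache-control":
            continue
        for directive in value.split(","):
            # Directive names are case-insensitive; ignore any "=value" part.
            token = directive.split("=", 1)[0].strip().lower()
            if token == "no-transform":
                return True
    return False
```

A proxy using such a check would forward the request unaltered whenever it returns True, subject to the exceptions noted above.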
http://www.w3.org/TR/2008/WD-ct-guidelines-20080801/ Section 4.1.3 ... "The mechanism by which a proxy recognizes the user agent as a Web browser should use evidence from the HTTP request, in particular the User-Agent and Accept headers." Please clarify -- is this just the *existence* of those headers, or the specific values? If it is the specific values, then please provide some guidance (or a normative alternative) that new user agents can use, before their names propagate to various whitelists. -jJ
LC-2044 from Jim Jewett <jimjjewett@gmail.com>
4.1.3 Treatment of Requesters that are not Web browsers
* Section 4.1.3 "Proxies must act as though a no-transform directive is present (see 4.1.2 no-transform directive in Request) unless they are able positively to determine that the user agent is a Web browser." How do they "positively" determine this? Using heuristics is far from a guaranteed mechanism. Moreover, what is the reasoning behind this? If the intent is to only allow transformation of content intended for presentation to humans, it would be better to say that. In any case, putting a MUST-level requirement on this seems strange.
LC-2069 from Mark Nottingham <mnot@mnot.net>
Proxies must act as though a no-transform
directive is present (see 4.1.2 no-transform directive in Request) unless
they are able positively to determine that the user agent is a Web browser.
The mechanism by which a proxy recognizes the user agent as a Web browser
should use evidence from the HTTP request, in
particular the User-Agent and Accept headers.
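The guidelines do not define the recognition mechanism, so the following is only one possible sketch of a heuristic based on the User-Agent and Accept headers. The rule used here (a non-empty User-Agent plus an Accept header naming an HTML media type) is an assumption chosen for illustration, as are the names BROWSER_MEDIA_TYPES, looks_like_browser and must_treat_as_no_transform; a deployed proxy would need considerably stronger evidence.

```python
# Media types assumed, for this sketch only, to indicate a Web browser.
BROWSER_MEDIA_TYPES = ("text/html", "application/xhtml+xml")


def looks_like_browser(headers):
    """Guess whether a request comes from a Web browser.

    headers -- dict of request header names to values (assumed shape).
    """
    # Header field names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    user_agent = lowered.get("user-agent", "").strip()
    accept = lowered.get("accept", "").lower()
    # Strip parameters such as ";q=0.9" from each listed media type.
    accepted = [part.split(";", 1)[0].strip() for part in accept.split(",")]
    return bool(user_agent) and any(t in accepted for t in BROWSER_MEDIA_TYPES)


def must_treat_as_no_transform(headers):
    """Act as though no-transform is present unless a browser is recognized."""
    return not looks_like_browser(headers)
```

The second function expresses the default stated above: when the evidence is insufficient, the proxy behaves as if the request carried a no-transform directive.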
4.1.4 Serving Cached Responses
* Section 4.1.4 "Proxies should follow standard HTTP procedures in respect to caching..." This seems a strange way to phrase it, and I don't think it's useful to use RF2616 language here.
LC-2070 from Mark Nottingham <mnot@mnot.net>
Proxies should follow standard HTTP procedures in respect of caching and should use cached copies of resources where this is in accordance with those procedures.
In some circumstances, proxies may paginate responses and where this is the case a request may be for a subsequent page of a previously requested resource. In this case proxies may for the sake of consistency of representation serve stale data but when doing so should notify the user that this is the case and should provide a simple means of retrieving a fresh copy.
The rest of the 4.1.5.* sections all seem to be basically "Here's some things that some proxies do". By listing them, are you saying these are good and useful things, i.e. best practices? If so, perhaps that should be made explicit.
LC-2038 from Mark Baker <distobj@acm.org>
I posted the following message in the WMLProgramming mailing list.
People have suggested that I publish it as a formal comment to the CTG
draft, so here it is, under the heading "Allowing modifications of the
HTTP header field user-agent: rationale missing".
Eduardo Casais
areppim AG
Bern Switzerland
------------------
I would like to review (a last time) an issue that reoccurs in all
discussions about transcoders.
> Changing User Agent or other headers is not prohibited by HTTP
The first thing to stress is that the user-agent is essential to drive
content selection and generation processes, both in the mobile Web and
in the desktop Web.
a) In the mobile Web, the user-agent is directly associated to the
actual device, and hence serves as a key to characteristics such as
screen dimensions, preferred content types, etc. The advent of
uaprof/ccpp was supposed to make this mapping unnecessary, but it is
not the case: uaprof descriptions are often missing, point to invalid
URL, omit important information, or are just plain unreliable. Device
databases like WURFL, based on user-agent mappings, thus remain
indispensable.
b) In the desktop (non-mobile) Web, developers have long relied upon
the user-agent to identify the browsers issuing requests in order to
tailor content to their "quirks". This has been going on at least
since the times of the Netscape / IE wars.
Let us now examine the use cases of a mobile Web browser accessing the
Internet, and evaluate the relevance of the user-agent -- assuming
that transcoders systematically substitute the original value with a
new one.
1.a User-agent-switcher.
The Web server is able, based on the user-agent, to
provide a mobile-optimized or a full-web service.
It therefore needs the original user-agent; modifying
it is unhelpful.
2.a Mobile Web only.
2.a.1 Generic content.
The server returns generic mobile content, without
customizing it for any specific user-agent.
This kind of application is rare, and often
corresponds to surviving examples of text-only
services developed for older PDA and WAP 1 devices.
Since the server does not use the original user
agent, modifying it is useless.
2.a.2 Mobile with default.
The server returns mobile-optimized content. When
not recognizing the user-agent, it returns a default,
best-effort representation, perhaps with a message
suggesting that the content is tailored for mobile
devices.
Since the server relies upon the user-agent, and is
able to return a default representation, modifying it
is unhelpful.
2.a.3 No default.
The server returns mobile-optimized content, but will
return an error (page with "unsupported browser" warning
or return code "request not acceptable") whenever it
does not recognize the user-agent. In this case, modifying
the original user-agent is most unhelpful, as it guarantees
that the server will not recognize it as a valid
mobile one. If the server does not, for whatever reason
(e.g. incomplete device database), recognize a mobile
user-agent, then there might be a case for modifying it
towards an acceptable mobile one -- but transcoders
precisely do the reverse: they change a mobile user-agent
to an exotic full-Web one. Hence, a modification of the
original user-agent is unhelpful in all situations.
3.a Full Web only.
3.a.1 Generic content.
The Web server serves generic full-Web content, without
looking at the user-agent. In this case, modification of
the original user-agent is useless.
3.a.2 Tailored full-web with default.
The server returns full-web content customized for specific
full-web user-agents (e.g. IE 6.0, IE 7.0 and Firefox 2.0),
and serves a default representation, perhaps with a warning
("This site is best viewed with the following browsers:...")
for other user-agents. In this case, modification of the
original user-agent is either useless (in any case a default
representation will be returned), or unhelpful (the default
representation is probably better downgradable than one
specifically customized for a very specific full-Web browser).
3.a.3 Tailored with no default.
The server returns content tailored for specific full-Web
browsers, and an error for other unrecognized or unsupported
user-agents.
Here there is a case to substitute the original user-agent to
force retrieval of content. However, this works only if the
"fake" user-agent precisely corresponds to one of those
accepted by the server -- but transcoders do not tailor their
substitute user-agents with respect to the application server:
they include only general hints (like Mozilla/x.y) in the hope
this is enough to determine content generation.
Hence, the generic substitution of user-agents performed by
transcoders is not appropriate here.
Conclusion: in two cases, modification of the user-agent is useless,
in three it is detrimental, in one it is either useless or
detrimental, and in one it could be helpful, but it is currently done
inappropriately.
Let us consider the interesting symmetric use-cases: a full-Web mobile
device accessing the Internet.
1.b User-Agent switcher
Following the same reasoning as in 1.a, we find that
modification of the original user-agent is unhelpful.
2.b Mobile Web only.
2.b.1 Generic content.
Whatever user-agent, the server returns generic mobile-optimized
content. A modification of the original user-agent is useless.
2.b.2 Tailored with default.
The server returns mobile-optimized content, and a default
representation for unrecognized user-agent. Modifying it is
therefore useless -- the default representation will be returned
whether the original (full-Web) or the substitute (pseudo-full
Web) agent appears in the request.
2.b.3 Tailored, no default.
Following the same reasoning as in 2.a.3, the substitution of
original (full-Web) user-agent by a fake (full-Web) one is
useless, as it will anyway return an error.
3.b Full Web only
3.b.1 Generic content.
If the server returns generic full-Web content whatever the user
agent, then modifications of the user-agent are useless.
3.b.2 Tailored with default.
Following the reasoning in 3.a.2, modifying the original
full-Web user agent is either unhelpful (because the server
could have recognized the mobile device's agent), or useless
(the same default representation would be returned).
3.b.3 Tailored, no default.
Here the server may recognize the full-Web user-agent of the
mobile device; it is therefore unhelpful to modify it. Or it
might not support that specific user-agent, in which case it
would be sensible to substitute one that is effectively
supported by the server; however, this is not what transcoders
do: they provide a generic, not a real user-agent instead --
this is inappropriate.
So one situation where it is detrimental, four where it is useless,
one where it is either useless or detrimental, and one where it is
either useless or inappropriately done. It is also an acid test: do
transcoders modify requests from full-Web capable mobile browsers? If
so, something seriously weird is going on, as the excuse has generally
been to make full-Web content available to non-full-Web capable devices.
From this examination, one can only conclude that proponents of the
preservation of the original user-agent do not have to justify their
position and established practice. Rather, the onus is on the
proponents of the substitution of the user-agent to argue in favour of
their approach, which disrupts established practice. There is
basically only one use case where changing the mobile user agent to a
desktop user agent might help, but it remains to demonstrate:
a) The relevance of the scenario. Perhaps people at Google could let
one of their crawlers roam over a few tens of thousands of WWW sites
to gather statistics on the relative frequency of each aforementioned
scenario.
b) The benefits resulting from handling that specific scenario.
c) That (a) and (b), taken together, are so overwhelming that they
more than compensate the disruptions caused in all other use-cases.
If another use case outside the framework I have presented here pops
up, this does not reduce the need for an assessment based on (a), (b),
(c).
As a final remark, I would like to note that transcoders have been
operating in the mobile Web for a long time. It started with
adaptation of HTML for PDA (Web clipping) and HTML to HDML conversion,
continued with HTML to WML, before arriving at the current crop of
content adaptation. In the old times, developers of content adaptation
software were wary of modifying the user-agent: turning generic WWW
content into a form suitable for mobile devices is so fraught with
difficulties that one would take every chance to let a server return
mobile optimized content (based on the user-agent) if it could. It is
only fairly recently that, without much justification, transcoders
have started in a systematic way to overwrite the user-agent field.
I think I have said everything I wanted regarding the CTG. The
document requires quite some rework -- nothing exceptional, since it
is a draft. I will lean back and wait for the results of this round of
revisions. Till then, readers of the WMLprogramming and W3C BPWG lists
can rejoice in the knowledge that my long-winded posts are abating at
last.
E. Casais
LC-2054 from casays <casays@yahoo.com>
The styleguide should spell out very clearly "The Transcoder is NOT allowed to change the User-Agent String".
LC-2005 from EdPimentl <edpimentl@gmail.com>
- the styleguide should spell out very clearly "The Transcoder is NOT
allowed to change the User-Agent String".
I understand that the current document says "do not change headers", but
at the same time, there are clauses ("the user has specifically requested a
restructured desktop experience") which would allow abusive transcoders to
find an excuse and keep being abusive of the rights of content owners.
Preventing transcoders from changing the UA string is an effective way to
avoid this abuse.
LC-1996 from Luca Passani <passani@eunet.no>
* Section 4.1.5 Bullet points one and three are get-out-of-jail-free cards for non-transparent proxies to ignore no-transform and do other anti-social things. They should either be tightened up considerably, or removed.
LC-2071 from Mark Nottingham <mnot@mnot.net>
* Section 4.1.5 What is a "restructured desktop experience"?
LC-2072 from Mark Nottingham <mnot@mnot.net>
* Section 4.1.5 "proxies should use heuristics including comparisons of domain name to assess whether resources form part of the same "Web site." I don't think the W3C should be encouraging vendors to implement yet more undefined heuristics for this task; there are several approaches already in use (e.g., in cookies, HTTP, security context, etc.); please pick one and refer to it specifically.
LC-2073 from Mark Nottingham <mnot@mnot.net>
5) Section 4.1.5. Statement to be added: "The request MUST NOT be altered whenever the URI of the request indicates that the resource being accessed is able to provide mobile-optimized content, e.g. the domain is *.mobi, wap.*, m.*, mobile.*, pda.*, imode.*, iphone.*, or the leading portion of the path is /m/ or /mobile/." Rationale: The guidelines make the assumption that all requests may first undergo a transformation before possibly falling back on a transformationless mode of operation. This is unwarranted, and does not correspond to the way many deployed proxies operate. Obviously, it is rather pointless to go all the way to send a request to the server and wait for its response in order to detect whether the resources accessed are for mobile use, when it is already possible to do this by inspecting the request of the client. The addition to the guidelines covers this situation, and corresponds to the state of the art in transformation proxies. It is also consistent with the heuristic serving to determine whether a response is already mobile-optimized. Following this new guideline improves the performance of the entire content delivery chain without loss of functionality, and is congruent with the stated objective of the guidelines of not disturbing mobile-optimized content.
LC-2049 from casays <casays@yahoo.com>
Hello, As the technical lead for SingleClick Systems mobile development, I'm writing to protest the W3C's failure to provide a clear rule against the modification of the User-Agent header. As mobile developers, my team spends a lot of time creating the mobile experience *we* want our users to see. If our users are subjected to the confusing experience of transcoding, we lose money. I urge the W3C to adopt the standards set forth by Luca Passani's Manifesto, of which I'm sure you're aware. Sincerely, Terren SuydamLC-2017 from Terren Suydam <terren@singleclicksystems.com>
4.1.5 Alteration of HTTP Header Values RFC 2616 already says a lot about this. See sec 13.5.2 for example. "The theoretical idempotency of GET requests is not always respected by servers. In order, as far as possible, to avoid mis-operation of such content, proxies should avoid issuing duplicate requests and specifically should not issue duplicate requests for comparison purposes." First of all, do you mean "safe" or "idempotent"? That you refer only to GET suggests safety, but the second sentence suggests you are referring to idempotency. So please straighten that out. Oh, and there's nothing "theoretical" about GET's safety or idempotency; it's by definition, in fact. Secondly, if the server changes something important because it received a GET request, then that's its problem. Likewise, if it changes something non-idempotently because it received a PUT request, that's also something it has to deal with. In both cases though, the request itself is idempotent (and safe with GET), so I see no merit to that advice that you offer ... unless of course the problem you refer to is pervasive which clearly isn't the case. I also wonder if most of 4.1.5 shouldn't just defer to 2616. As is, large chunks of this section (as well as others) specify a protocol which is a subset of HTTP 1.1. (see also the RFC 2119 comment above)LC-2036 from Mark Baker <distobj@acm.org>
4.1.5 Alteration of HTTP Header Values5) Section 4.1.5. Statement to be added: "In so far as the transformation carried out by the proxies is to make content intended for a certain class A of devices available to devices of another class B, then requests MUST NOT be modified whenever a client of a certain class is accessing content intended for its class. If the class of request (either mobile-optimized or full-Web) is not unambiguously determined from the URI pattern, the proxy MUST take into account the original user-agent to avoid unnecessary transformations." Rationale: It is obviously pointless to transform full-Web content accessed by full-Web capable devices (or vice-versa, transforming mobile-optimized content for devices with mobile browsers). Two cases illustrate the situation. a) When full-Web devices such as advanced HTC PDAs, iPhones or tablets access the Web, there is no guarantee that an established server will include a no-transform directive; in fact, it might explicitly leave it out to allow transformation to cater for non-full-Web capable devices. Further, the proposed heuristics will not work: the MIME types of returned content will indicate full-Web content (e.g. text/html), as well as the DOCTYPE (e.g. -//W3C//DTD HTML 4.01//EN). b) When i-Mode terminal accessing i-Mode applications, there is no guarantee that the corresponding servers return a no-transform directive (since it is irrelevant for i-Mode applications). Heuristics may not work either, since content is largely returned as text/html, and without any DOCTYPE declaration.LC-2053 from casays <casays@yahoo.com>
Other than to comply with transparent HTTP operation, proxies should not modify request headers unless:
the user would be prohibited from accessing content as a result of the server responding that the request is "unacceptable" (see 4.3.3 Server Rejection of HTTP Request);
the user has specifically requested a restructured desktop experience;
the request is part of a sequence of requests to the same Web site and either it is technically infeasible not to adjust the request because of earlier interaction, or because doing so preserves consistency of user experience.
These circumstances are detailed in the following sections.
Note:
In this section, the concept of "Web site" is used (rather than "origin server") as some origin servers host many different Web sites. Since the concept of "Web site" is not strictly defined, proxies should use heuristics including comparisons of domain name to assess whether resources form part of the same "Web site".
4.1.5.1 Content Tasting
* Section 4.1.5.1 Proxies (and other clients) are allowed to and do reissue requests; by disallowing it, you're profiling HTTP, not providing guidelines. LC-2074 from Mark Nottingham <mnot@mnot.net>
The theoretical idempotency of GET requests is not always respected by servers. In order, as far as possible, to avoid mis-operation of such content, proxies should avoid issuing duplicate requests and specifically should not issue duplicate requests for comparison purposes.
* Section 4.1.5.2 Again, not specifying the heuristics is going to lead to differences in behaviour, which will cause content authors to have to account for this as well. * Section 4.1.5.2 "A proxy must not re-issue a POST/PUT request..." Is this specific to POST and PUT, or all requests with bodies, or...?LC-2075 from Mark Nottingham <mnot@mnot.net>
4.1.5.2 Avoiding "Request Unacceptable" Responses
I don't understand the need for 4.1.5.2. The second paragraph in particular seems overly specific, as proxies should obviously not be retrying POST requests unless an error - any error - was received. PUT messages can be retried because they're idempotent. LC-2037 from Mark Baker <distobj@acm.org>
A proxy may reissue a request with altered HTTP header values if a previous request with unaltered values resulted in the origin server rejecting the request as "unacceptable" (see 4.3.3 Server Rejection of HTTP Request). A proxy may apply heuristics of various kinds to assess, in advance of sending unaltered header values, whether the request is likely to cause a "request unacceptable" response. If it determines that this is likely, it may alter header values without first sending unaltered values, provided that it subsequently assesses the response as described under 4.3.4 Receipt of Vary HTTP Header below, is prepared to reissue the request with unaltered headers, and is prepared to alter its subsequent behavior toward that Web site so that unaltered headers are sent.
A proxy must not re-issue a POST/PUT request with altered headers when the response to the unaltered POST/PUT request has HTTP status code 200 (in other words, it may send the altered POST/PUT request only when the unaltered one resulted in an HTTP 406 response, and not when it resulted in an HTTP 200 response that the proxy merely judges to be a "request unacceptable" response).
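The re-issue rules in this subsection can be sketched as a small decision function. This is a minimal illustration only: the function name and the `judged_unacceptable` flag (standing in for whatever heuristic a proxy uses under 4.3.3) are hypothetical and not part of this specification.

```python
def may_reissue_with_altered_headers(method: str, status: int,
                                     judged_unacceptable: bool = False) -> bool:
    """Decide whether a request may be re-sent with altered headers.

    `judged_unacceptable` is a hypothetical flag meaning the proxy's own
    heuristics consider a 200 response equivalent to a 406 (see 4.3.3).
    """
    if status == 406:
        # An explicit rejection permits a retry for any method.
        return True
    if status == 200 and judged_unacceptable:
        # A heuristic "rejection" never justifies repeating a POST or PUT.
        return method.upper() not in {"POST", "PUT"}
    return False
```

Under these assumptions a GET that returns a 200 "browser not supported" page may be retried, while a POST with the same outcome may not.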
Proxies may offer users an option to choose to view a restructured experience even when a Web site offers a choice of user experience. If a user has made such a choice then proxies may alter header values when requesting resources in order to reflect that choice, but must, on receipt of an indication from a Web site that it offers alternative representations (see 4.2.3.2 Indication of Intended Presentation Media Type of Representation), inform the user of that and allow them to select an alternative representation.
Proxies should assume that by default users will wish to receive a representation prepared by the Web site. Proxies must assess whether a user's expressed preference for a restructured representation is still valid if a Web site changes its choice of representations (see 4.3.4 Receipt of Vary HTTP Header).
From 4.1.5.4, "When requesting resources that form part of the representation of a resource (e.g. style sheets, images), proxies should make the request for such resources with the same headers as the request for the resource from which they are referenced.". Why? There may be lots of reasons for using different headers on these requests. For example, I'd expect the Accept header to be different for a stylesheet than for an image. What are you trying to accomplish with this restriction? LC-2039 from Mark Baker <distobj@acm.org>
4.1.5.4 Sequence of Requests
* Section 4.1.5.4 Use of the term 'representation' is confusing here; please pick another one. * Section 4.1.5.4 Using the same headers is often not a good idea. More specific, per-header advice would be more helpful. LC-2076 from Mark Nottingham <mnot@mnot.net>
When requesting resources that form part of the representation of a resource (e.g. style sheets, images), proxies should make the request for such resources with the same headers as the request for the resource from which they are referenced.
For the purpose of consistency of representation, proxies
may request linked resources (e.g. those referenced
using the a element) that form part of the same Web site as
a previously requested resource with the same headers as the resource
from which they are referenced.
When requesting linked resources that do not form part of the same Web site as the resource from which they are linked, proxies should not base their choice of headers on a consistency of presentation premise.
Original headers MUST not be changed (User-Agent string has a special place, but also the UAProf x-wap-profile is very very relevant).LC-2006 from EdPimentl <edpimentl@gmail.com>
4.1.5.5 defines a protocol. This should be in an Internet Draft, not in a guidelines document.LC-2040 from Mark Baker <distobj@acm.org>
2) Section 4.1.5.5 Statement to be inserted: "Except when explicitly provided for by RFC2616 to comply with HTTP operations, a proxy MUST NOT delete HTTP header fields received upstream from the client or downstream from the server." Rationale: deployed transcoders have been known to filter out entire HTTP fields, preventing servers from performing adequate content delivery. In some environments, this behaviour seems to have affected x-wap-profile in particular. The statement makes it clear that deleting HTTP header fields is in violation of the Web standards.LC-2046 from casays <casays@yahoo.com>
- original headers MUST not be changed (User-Agent string has a special place, but also the UAProf x-wap-profile is very very relevant). This makes it unnecessary to explain how original header values are recast to different headers (this is not supposed to happen in any case). In short, 4.1.5.5 should be removed.LC-1997 from Luca Passani <passani@eunet.no>
4.1.5.5 Since User-Agent has been the topic of some controversy in comments, just wanted to voice support for the recommendation as written here. While it is vital to preserve information about the mobile device, this does not imply that User-Agent cannot be changed if that information is otherwise preserved. Preserving the User-Agent through a transforming proxy is misleading; the request is *not* coming from a mobile device, but through a proxy. The origin server should be aware of this.LC-2014 from Sean Owen <srowen@google.com>
4.1.5.5 Original Headers
* Section 4.1.5.5 This is specifying new protocol elements; this is becoming a protocol, not guidelines. LC-2077 from Mark Nottingham <mnot@mnot.net>
When forwarding an HTTP request with altered HTTP headers, proxies
must include in the request copies of the
unaltered header values in the form "X-Device-"<original
header name>. For example, if the User-Agent
header has been altered, an X-Device-User-Agent header
must be added with the value of the received
User-Agent header.
Note:
The X-Device- prefix was chosen primarily on the basis
that this is an already existing convention. It is noted that the
values encoded in such headers may not ultimately derive from a
device; they are merely received headers. The treatment of received
X-Device headers, which may be received where there are
multiple transforming proxies, is undefined (see D Scope for Future Work).
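The X-Device- convention described in this subsection can be sketched as follows. The function name and the sample header values are hypothetical, and header names are matched exactly here for brevity, although HTTP header names are case-insensitive in practice.

```python
from typing import Dict


def forward_with_altered_headers(received: Dict[str, str],
                                 altered: Dict[str, str]) -> Dict[str, str]:
    """Return outgoing headers, preserving each altered original value
    in a matching "X-Device-<original header name>" header."""
    outgoing = dict(received)
    for name, new_value in altered.items():
        original = received.get(name)
        if original is not None and original != new_value:
            # Keep a copy of the value exactly as it was received.
            outgoing["X-Device-" + name] = original
        outgoing[name] = new_value
    return outgoing


example = forward_with_altered_headers(
    {"User-Agent": "ExamplePhone/1.0", "Accept": "text/html"},
    {"User-Agent": "ExampleDesktop/5.0"},
)
```

Headers that are not altered get no X-Device- copy, so `example` carries X-Device-User-Agent but no X-Device-Accept.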
Irrespective of the presence of a no-transform directive:
proxies should add the IP address of the initiator
of the request to the end of a comma separated list in an
X-Forwarded-For HTTP header;
proxies must include a Via HTTP
header (see 4.1.6.1 Proxy Treatment of Via Header).
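Appending the initiator's address to X-Forwarded-For might look like the sketch below. The function name is hypothetical and the addresses are documentation examples; the only behavior taken from the text is that the address is added to the end of a comma-separated list.

```python
from typing import Dict


def append_forwarded_for(headers: Dict[str, str], initiator_ip: str) -> Dict[str, str]:
    """Return a copy of `headers` with `initiator_ip` appended to the end
    of the comma-separated X-Forwarded-For list."""
    updated = dict(headers)
    existing = headers.get("X-Forwarded-For", "").strip()
    # Earlier entries are left untouched; the new address goes last.
    updated["X-Forwarded-For"] = (
        f"{existing}, {initiator_ip}" if existing else initiator_ip
    )
    return updated
```

Because the new address always goes last, the list keeps the order in which the request passed through each hop.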
4.1.6.1 Proxy Treatment of Via Header
* Section 4.1.6.1 When a proxy inserts the URI to make a claim of conformance, exactly what are they claiming -- all must-level requirements are met? Should-level? What is the use case for this information? LC-2078 from Mark Nottingham <mnot@mnot.net>
Proxies must (in accordance with RFC
2616) include a Via HTTP header indicating their presence
and should indicate their conformance to this
Recommendation by including a comment in the Via HTTP
header consisting of the URI "http://www.w3.org/ns/ct".
When forwarding Via headers, proxies should
not alter them in any way.
Note:
According to [RFC 2616 HTTP]
Section 14.45,
Via header comments "may be removed
by any recipient prior to forwarding the message". However, the
justification for removing such comments is based on memory
limitations of early proxies; most modern proxies do not suffer such
limitations.
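A minimal sketch of adding a Via entry with the conformance comment follows. It assumes the RFC 2616 layout of a Via entry (protocol version, received-by host, optional parenthesized comment); the function name and host names are hypothetical.

```python
from typing import Dict

CT_CONFORMANCE_URI = "http://www.w3.org/ns/ct"


def add_via(headers: Dict[str, str], received_by: str,
            protocol: str = "1.1") -> Dict[str, str]:
    """Return a copy of `headers` with this proxy's Via entry appended.

    Existing Via entries, including their comments, are forwarded unchanged.
    """
    own_entry = f"{protocol} {received_by} ({CT_CONFORMANCE_URI})"
    updated = dict(headers)
    existing = headers.get("Via", "").strip()
    updated["Via"] = f"{existing}, {own_entry}" if existing else own_entry
    return updated
```

For a request with no earlier Via header, `add_via({}, "ct.example.org")` yields the value `1.1 ct.example.org (http://www.w3.org/ns/ct)`.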
4.2 Server Response to Proxy
The use of MUST on the CTG when referring to the role of the server should not be allowed, since irresponsible transcoding companies will use this to disrupt service and destroy the user experience, setting us back many years. We can accept RECOMMENDED, and only RECOMMENDED. LC-2007 from EdPimentl <edpimentl@gmail.com>
4.2.1 Use of HTTP 406 Status
* Section 4.2.1 Requiring servers to respond with 406 is profiling HTTP; HTTP currently allows the server to send a 'default' representation even when the headers say that the client doesn't prefer it. LC-2079 from Mark Nottingham <mnot@mnot.net>
Servers should respond with an HTTP 406 Status (and not an HTTP 200 Status) if a request cannot be satisfied with content that meets the criteria specified by values of the HTTP request headers.
* Section 4.2.2 "Servers must include a Cache-Control: no-transform directive if one is received in the HTTP request." Why?LC-2080 from Mark Nottingham <mnot@mnot.net>
4.2.2 Server Origination of Cache-Control: no-transform
4.2.2 "Servers must include a Cache-Control: no-transform directive if one is received in the HTTP request." Why? What does the transformability of a request body have to do with the transformability of the associated response body? LC-2041 from Mark Baker <distobj@acm.org>
Servers must include a Cache-Control:
no-transform directive if one is received in the HTTP request.
Servers should include a Cache-Control:
no-transform directive if, for any reason, they wish to inhibit
transformation of the response.
Note:
Including a Cache-Control:
no-transform directive can disrupt the behavior of WAP/WML
proxies, because it can inhibit such proxies from converting WML to
WMLC.
Servers should take account of user agent capabilities and formulate an appropriate experience according to those capabilities. Servers should provide a means for users to select among available representations, should default to the last selected representation and should provide a means of changing the selection.
* Section 4.2.3.1 "Servers may base their actions on knowledge... but should not choose an Internet content type for a response based on an assumption or heuristics about behaviour of any intermediaries." Why not? LC-2081 from Mark Nottingham <mnot@mnot.net>
4.2.3.1 Use of Vary HTTP Header
4.2.3.1 If a server varies its representation according to examination of received HTTP headers then it must include a Vary HTTP header indicating this to be the case. If, in addition to, or instead of HTTP headers, a server varies its representation based on other factors (e.g. source IP Address) then it must, in accordance with [RFC 2616 HTTP]<http://www.w3.org/TR/2008/WD-ct-guidelines-20080801/#ref-HTTP>, include a Vary header containing the value '*'. What should contain the Vary HTTP Header when a server varies its representation according to examination of received HTTP headers? Best Regards LC-2008 from JOSE MANUEL CANTERA FONSECA <jmcf@tid.es>
If a server varies its representation according to examination of
received HTTP headers then it must include a
Vary HTTP header indicating this to be the case. If, in
addition to, or instead of HTTP headers, a server varies its
representation based on other factors (e.g. source IP Address) then it
must, in accordance with [RFC 2616 HTTP], include a Vary header containing the value '*'.
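One way a server might derive its Vary value is sketched below. The helper name is hypothetical, and listing the examined request header names in the Vary value follows RFC 2616 rather than anything stated in this paragraph.

```python
from typing import Iterable, Optional


def vary_value(headers_examined: Iterable[str],
               uses_other_factors: bool = False) -> Optional[str]:
    """Return the Vary header value for a response, or None if the
    representation does not vary at all."""
    if uses_other_factors:
        # Variation on anything besides request headers (e.g. source IP).
        return "*"
    # Drop duplicate names while keeping first-seen order.
    names = list(dict.fromkeys(headers_examined))
    return ", ".join(names) if names else None
```

A server that selects content by User-Agent and Accept would send `Vary: User-Agent, Accept`, while one that also keys on source IP address would send `Vary: *`.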
Servers may base their actions on knowledge of the
behavior of specific transforming proxies, as identified in a
Via header, but should not choose an
Internet content type for a response based on an assumption or
heuristics about the behavior of any intermediaries (e.g. a server should
not choose Content-Type: application/vnd.wap.xhtml+xml
solely on the basis that it suspects that proxies will not transform
content of this type).
4.2.3.2 " In HTML content it should indicate the medium for which the representation is intended by including a link element identifying in its media attribute the target presentation media types of this representation and setting the href attribute to a valid local reference (i.e. use the fragment identifier (see [RFC 3986]<http://www.w3.org/TR/2008/WD-ct-guidelines-20080801/#ref-rfc-3986> section 3.5<http://www.tools.ietf.org/html/rfc3986.html#section-3.5>) added to the URI of the document being served to point to a valid target within the document)." Why it has to be a fragment identifier within the page? If you do so, strictly speaking you are saying that an specific fragment of the current page is an alternative representation for the media handheld for the current page. That's not true, as the whole page is such representation. Proposed Amendment: As per RFC 3986 section 4.4 [Amended: was RFC 1808 initially] an empty relative URI href="" resolves to complete base URL, so it is suggested to use this mechanism to point to the current resource <link rel="alternate" media="handheld" type="text/html" href="" /> (another option is to suggest the usage of the URI that points to the current resource. ) Best RegardsLC-2009 from JOSE MANUEL CANTERA FONSECA <jmcf@tid.es>
4.2.3.2 Note: "The presence of link elements which do not contain a valid local reference does not indicate one way or another whether this representation is formatted for the presentation media types listed." This note is useless, it says nothing. Proposed Amendment: As per my previous comments on this, the note should be dropped. Best RegardsLC-2011 from JOSE MANUEL CANTERA FONSECA <jmcf@tid.es>
4.2.3.2 Indication of Intended Presentation Media Type of Representation
4.2.3.2 "In addition it should include link elements identifying the target presentation media types of other available representations by setting the media attribute to indicate those representations and the href attribute to a URI without a fragment identifier." This is totally wrong as I may have other representations of the current resource (for example in RDF or as text) in specific sections of my page and in that case I could use a fragment identifier. Proposed Amendment: Avoid the suggestions about fragment identifiers as they are misleading Best Regards LC-2010 from JOSE MANUEL CANTERA FONSECA <jmcf@tid.es>
If a server has distinct representations that vary according to the
target presentation media type, it should inhibit
transformation of the response by including a Cache-Control:
no-transform directive (see 4.2.2 Server Origination of Cache-Control: no-transform).
In HTML content it should indicate the medium for
which the representation is intended by including a link
element whose media attribute identifies the target
presentation media types of this representation and whose
href attribute is a valid local reference (i.e. a
fragment identifier (see [RFC 3986]
section 3.5) added to
the URI of the document being served, pointing to a valid target within
the document).
In addition it should include link
elements identifying the target presentation media types of other
available representations by setting the media attribute to
indicate those representations and the href attribute to a
URI without a fragment identifier.
Note:
The presence of link elements which do not contain a
valid local reference does not indicate one way or another whether
this representation is formatted for the presentation media types
listed.
Note:
Some examples of the use of the link element are
included below in B Example Transformation Interactions.
4.3 Proxy Forwarding of Response to User Agent
2) Section 4.3 Following item to be added to the guidelines: A Content Transformation Proxy receiving a response that contains a non-empty meta-tag "Copyright" MUST NOT restructure or recode the content, nor dependent resources (such as pictures or videos, or other textual content). Rationale: The following content alterations may constitute willful copyright violations: a) Insertion of additional links, advertisements, navigation elements (e.g. scroll bars), or extraneous content. b) Modification of the look and feel (e.g. through reorganization of the page, filtering out elements such as pictograms). The following alterations may also constitute defacement of trade marks and other registered elements: c) Changing the representation of trademarks, logos and other registered design elements, as a change in the representation may affect its legibility, the colours or the colour palette. The following alterations may also constitute deactivation of or bypassing IPR protections: d) Re-encoding images or videos which embed steganographic IPR protection information may eliminate or render ineffective these mechanisms. A copyright meta-tag, in the absence of any other indication, is enough to signal that the content must not be transformed. This meta-tag is widely used in the WWW, and its inclusion in content as a meta-tag (invisible to end-users) is precisely intended to control automatic processing in the delivery chain. LC-2020 from casays <casays@yahoo.com>
1) Section 4.2.2: Statement to be inserted: "As per sections 13.5.2 and 14.9.5 of RFC2616, proxies MUST NOT modify neither the header fields Content-Encoding, Content-Range and Content-Type, nor the body originating from a server that includes a no-transform directive in its response. Proxies that do not follow this rule do not conform to the HTTP protocol." Rationale: there actually are transcoders which, despite no-transform directives, modify the body of the responses from server. The statement reminds the users of such proxies that they must be configured so as not to violate IETF standards.LC-2045 from casays <casays@yahoo.com>
Consistently with my other comment that no extra content should be added to transcoded web sites, I think that this should apply even more strongly to mobile-optimised sites. Unfortunately, I see a lot of transcoder deployments where operators and/or transcoder vendors feel entitled to add advertisement and extra navigation bars to existing mobile optimised content. Because of this, I suggest the following addition as a note to "4.3.1": "Note: It should be stressed that, in case of a Cache-Control: no-transform directive, adding any extra content (such as banners, navigation bars and links not available in the original application) is not admissible" Thank you Luca Passani LC-2091 from Luca Passani <passani@eunet.no>
4.3.1 Receipt of Cache-Control: no-transform
On section 4.3.6 it is mentioned the possibility of also checking the existence of meta HTTP-Equiv directives on the HTML response in addition to standard HTTP headers. However, this should be explicitly clarified and if so should apply to any server-generated header. Proposed Amendment: Clarify explicitly if proxies should check standard HTTP headers or meta HTTP-Equiv headers or both. Best Regards LC-2013 from JOSE MANUEL CANTERA FONSECA <jmcf@tid.es>
If the response includes a Cache-Control: no-transform directive
then proxies must not alter the response, except to comply with
transparent HTTP behavior and
except as follows.
If a proxy determines that a resource as currently represented is likely to cause serious mis-operation of the user agent then it may advise the user that this is the case and must provide the option for the user to continue with unaltered content.
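Detecting the directive on a response could be sketched as below. The function name is hypothetical, and the sketch inspects only the HTTP header, not any meta http-equiv markup in the body.

```python
from typing import Dict


def has_no_transform(headers: Dict[str, str]) -> bool:
    """Return True if the Cache-Control header carries no-transform."""
    directives = headers.get("Cache-Control", "").split(",")
    # Directive names are compared case-insensitively.
    return any(d.strip().lower() == "no-transform" for d in directives)
```

A proxy following this section would leave the body untouched whenever this returns True, apart from the user-advisory case described above.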
* Section 4.3.2 Why can't proxies transform something that has already been transformed?LC-2082 from Mark Nottingham <mnot@mnot.net>
4.3.2 Receipt of Warning: 214 Transformation Applied
4.3.2 "If the response includes a Warning: 214 Transformation Applied HTTP header, proxies must not apply further transformation. " Why? The transformation indicated by the warning may have been the result of a server-side transformation which a client-side proxy may deem suboptimal, and so want to retransform. I see no problem with that. LC-2042 from Mark Baker <distobj@acm.org>
If the response includes a Warning: 214 Transformation Applied
HTTP header, proxies must not apply further
transformation.
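A rough check for an existing 214 warning is sketched below. The function name is hypothetical, and the regular expression is a simplification: it looks for the warn-code at the start of any comma-separated warning value and does not fully parse quoted warning text.

```python
import re
from typing import Dict

# Matches warn-code 214 at the start of the header or after a comma.
_WARN_214 = re.compile(r"(?:^|,)\s*214\s")


def has_transformation_warning(headers: Dict[str, str]) -> bool:
    """Return True if the Warning header reports code 214."""
    return bool(_WARN_214.search(headers.get("Warning", "")))
```

Under this section, a proxy would skip its own transformation step whenever this check succeeds.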
4.3.3 Server Rejection of HTTP Request
* Section 4.3.3 Sniffing content for error messages is dangerous, and also unlikely to work. E.g., will you sniff for all languages and all possible phrases? How will you avoid false positives? Remove this section and require content providers to get it right. People may still do this in their products, but there's no reason to codify it. LC-2083 from Mark Nottingham <mnot@mnot.net>
For compatibility with servers that do not implement this Recommendation (see 4.2.1 Use of HTTP 406 Status), a proxy may treat responses with an HTTP 200 Status as though they were responses with an HTTP 406 Status if it has determined that the content (e.g. "Your browser is not supported") is equivalent to a response with an HTTP 406 Status.
4.3.4 Receipt of Vary HTTP Header
* Section 4.3.4 What's the purpose behind this behaviour? LC-2084 from Mark Nottingham <mnot@mnot.net>
If, in response to an HTTP request with altered headers that was not preceded
by an HTTP request with unaltered headers, a proxy receives a response
containing a Vary header referring to one of the altered
headers, then it should request the resource again with
unaltered headers. It should also update whatever heuristics
it uses so that unaltered headers are presented first in subsequent requests
for this resource, and it should resume the behavior
described under 4.1.5.2 Avoiding "Request Unacceptable" Responses to avoid
rejection of subsequent requests.
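The retry condition in this paragraph can be sketched as a predicate. The function and parameter names are hypothetical; as an assumption of this sketch, a Vary value of "*" is not treated as referring to any particular altered header.

```python
from typing import Iterable, Optional


def should_retry_unaltered(vary: Optional[str],
                           altered_names: Iterable[str],
                           preceded_by_unaltered: bool) -> bool:
    """Return True if the resource should be requested again with
    unaltered headers, based on the Vary header of the response."""
    if preceded_by_unaltered or not vary:
        return False
    # Header names are compared case-insensitively.
    varied = {name.strip().lower() for name in vary.split(",")}
    return any(name.lower() in varied for name in altered_names)
```

A proxy that altered only User-Agent and then sees `Vary: User-Agent, Accept-Encoding` would, under these assumptions, repeat the request with the original User-Agent.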
If the response is an HTML response and it contains a <link
rel="alternate" media="handheld" /> element, the CT-proxy
should request and process the referenced resource,
unless the resource referenced is the current resource as determined by the
presence of link elements as discussed under 4.2.3.2 Indication of Intended Presentation Media Type of Representation.
3) Section 4.3.6 Under "Examples of mobile specific DOCTYPEs:", add: -//WAPFORUM//DTD WML 1.3//EN -//WAPFORUM//DTD WML 1.1//EN Rationale: WML is still in use in the mobile Web. Responses of this type are precisely the kind that should not be transformed, as WML is intrinsically targeted at mobile devices only. WML can also be delivered over HTTP, so the draft applies to this content as well.LC-2021 from casays <casays@yahoo.com>
4) Section 4.3.6 The proposed heuristics seem to fail completely for i-mode sites, because: a) i-Mode sites often do not have peculiar URL distinguishing them from (non-mobile) sites, and in any case the prefix imode.* is not included in the list of URI patterns to check for. b) i-Mode sites return mostly their content as text/html. c) i-Mode sites do not include a DOCTYPE. d) The markup for i-Mode does not cater for the utilization of the link element as proposed in the draft, which is therefore not included in i-Mode content. e) i-Mode servers do not have much use for the "no-transform" directive, and hence do not necessarily implement it.LC-2022 from casays <casays@yahoo.com>
Hi, I think that CTG should mention the fact that, in case of transcoding, no extra content should be injected without the consent of the original content owner. The idea is to avoid that W3C protocols/guidelines implicitly endorse the attempt by those who manage the transcoder to monetize on the effort/investment of other people. Of course, there is also a point that injecting extra content will invariably affect usability negatively and as such should be avoided. I suggest the following addition: "4.3.6.3 Injection of external content In its effort to optimise the user experience of non-mobile optimised sites, a proxy *should not* inject extra content into the transcoded pages, where the term 'extra content' refers to text, links, banners and other multimedia content which is not available on the original untranscoded page. Addition of links aimed at implementing pagination and navigational shortcuts is admissible. Note: For clarity, it is emphasised that W3C does not endorse injection of third-party content into a transcoded page without the explicit consent of the content owner" Can this comment be added to the tracking system? Thank you Luca PassaniLC-2090 from Luca Passani <passani@eunet.no>
- the "application/xhtml+xml" MIME type should be the basis for a heuristic that informs transcoders that no transcoding must be applied. The rationale for this is obvious: this MIME type is being used for mobile content virtually exclusively these days LC-1998 from Luca Passani <passani@eunet.no>
4) Section 4.3.6. Complete the statement: "the URI of the response (following redirection or as indicated by the Content-Location HTTP header) indicates that the resource is intended for mobile use (e.g. the domain is *.mobi, wap.*, m.*, mobile.*" ADD: , pda.*, imode.*, iphone.*, "or the leading portion of the path is /m/ or /mobile/);" Rationale: URL with pattern imode.* and pda.* have been in use for many years, and unambiguously indicate sites that are optimized for i-Mode devices or for PDA (Palm, PPC, IEMobile). URL of the form iphone.* have started to appear, providing experience specifically tailored to i-Phones; they do not need, and should not be transformed either.LC-2048 from casays <casays@yahoo.com>
- There should be restrictions over how short a page transcoders are allowed to reformat. In no case should a page smaller than 10kb be reformatted (ideally this threshold should be higher, but 10kb will make it consistent with BT, so it would be a step in the right direction)LC-1999 from Luca Passani <passani@eunet.no>
- Navigation bars: this is something that I would like to introduce in the Manifesto too. In no event should a transcoder add extra footers or headers (logos, extra navbars, advertisement and similar) without the consent of the content owner.LC-2000 from Luca Passani <passani@eunet.no>
- The list of "safe" URL patterns should be improved to support iphone.* and */iphone/LC-2002 from Luca Passani <passani@eunet.no>
4.3.6 Proxy Decision to Transform
3) Section 4.3.6 The third bullet under "examples of heuristics" is to be split into two points:
"the Content-Type of the response is known to be specific to the device or class of device. At a minimum, the following MIME types intended for mobile Web browsers MUST represent mobile-optimized content:
Browsing XHTML-related:
application/vnd.wap.xhtml+xml
application/xhtml+xml
Browsing WML-related:
text/vnd.wap.wml
application/vnd.wap.wmlc
text/vnd.wap.wml+xml
text/vnd.wap.wmlscript
application/vnd.wap.wmlscriptc
image/vnd.wap.wbmp
application/vnd.wap.wbxml
Browsing and downloading:
application/vnd.wap.multipart.mixed
application/vnd.wap.multipart.related
application/vnd.wap.multipart.alternative
application/vnd.wap.multipart.form-data
In addition, the following MIME types of the form */x-up-* SHOULD be considered as representing mobile-optimized content, at a minimum:
Legacy Openwave:
image/x-up-wpng
image/x-up-bmp
The range of MIME types is intended to cover typical mobile browsing applications. Transformations specified by the relevant standards are allowed (WAP-236 WAE specifications 19.12.2001, WAP-192 WBXML specifications 25.7.2001, WAP-191 WML specifications 19.2.2000 and predecessors, WAP-193 WMLScript specifications 25.10.2000). In accordance with Internet standards and practices, a proxy SHOULD determine whether content is mobile-optimized FIRST by examining the HTTP header field content-type, before inspecting the XML declaration and its associated DOCTYPE."
Rationale: Inspection of the HTTP field Content-type is a usual mode of operation amongst transcoders. It is also simpler and safer than applying heuristics on DOCTYPE, because inspecting the content of a body requires one to deal with character encoding issues (see RFC3023, XML 1.1 sections 4.3.3 and E), or parsing multipart-structured content; these are unnecessary when handling HTTP fields.
Finally, specifying a minimum set of required MIME types to take into account helps ensure that proxies will exhibit a standard behaviour, and that non-textual content types for which there is no DOCTYPE (notably mobile-specific image formats) are properly dealt with. A normative document cannot leave full freedom to implementors to select whichever subset of content types are to be considered mobile-optimized or not.
4) Section 4.3.6 The second part of the bullet split as described in (b) is to contain the following:
"other aspects of the response such as the DOCTYPE are known to be specific to the device or class of device. At a minimum, the following DOCTYPEs MUST be considered as mobile-specific:
XHTML mobile profile:
-//OMA//DTD XHTML Mobile 1.2//EN
-//WAPFORUM//DTD XHTML Mobile 1.1//EN
-//WAPFORUM//DTD XHTML Mobile 1.0//EN
XHTML basic:
-//W3C//DTD XHTML Basic 1.1//EN
-//W3C//DTD XHTML Basic 1.0//EN
-//OPENWAVE//DTD XHTML 1.0//EN
-//OPENWAVE//DTD XHTML Mobile 1.0//EN
XHTML i-Mode:
-//i-mode group (ja)//DTD XHTML i-XHTML (Locale/Ver.=ja/1.0) 1.0//EN
-//i-mode group (ja)//DTD XHTML i-XHTML (Locale/Ver.=ja/1.1) 1.0//EN
-//i-mode group (ja)//DTD XHTML i-XHTML (Locale/Ver.=ja/2.0) 1.0//EN
-//i-mode group (ja)//DTD XHTML i-XHTML (Locale/Ver.=ja/2.1) 1.0//EN
-//i-mode group (ja)//DTD XHTML i-XHTML (Locale/Ver.=ja/2.2) 1.0//EN
[list completed in http://lists.w3.org/Archives/Public/public-bpwg-comments/2008JulSep/0150.html with:
-//i-mode group (ja)//DTD XHTML i-XHTML (Locale/Ver.=ja/2.3) 1.0//EN]
Compact HTML:
-//W3C//DTD Compact HTML 1.0 Draft//EN
-//BBSW//DTD Compact HTML 2.0//EN
The following DOCTYPEs MUST be considered as mobile-specific. Transformations explicitly provided for by the relevant standards are allowed (WAP-192 WBXML specifications 25.7.2001, WAP-236 WAE specifications 19.12.2001, WAP-191 WML specifications 19.2.2000 and predecessors, WAP-193 WMLScript specifications 25.10.2000).
WML:
-//WAPFORUM//DTD WML 1.0//EN
-//WAPFORUM//DTD WML 1.1//EN
-//WAPFORUM//DTD WML 1.2//EN
-//WAPFORUM//DTD WML 1.3//EN
-//WAPFORUM//DTD WML 2.0//EN
-//PHONE.COM//DTD WML 1.1//EN
-//OPENWAVE.COM//DTD WML 1.3//EN
The range of MIME types is intended to cover typical mobile browsing applications."
Rationale: A normative document cannot leave full freedom to implementors to select whichever subset of DOCTYPEs are considered mobile-optimized or not. This helps ensure that transformation proxies exhibit a standard behaviour.
LC-2052 from casays <casays@yahoo.com>
In the absence of a Vary or no-transform directive
(or a meta HTTP-Equiv element containing Cache-Control:
no-transform) proxies should apply heuristics
to the response to determine whether it is appropriate to restructure or recode it (in the presence of such
directives, heuristics should not be used.)
Examples of heuristics:
The Web site (see note) has previously shown that it is contextually aware, even if the present response does not indicate this;
a claim of mobileOK Basic [mobileOK Basic Tests] conformance is indicated;
the Content-Type or other aspects of the response (such
as the DOCTYPE) are known to be specific to the device or class of
device;
Examples of mobile specific DOCTYPEs:
-//OMA//DTD XHTML Mobile 1.2//EN
-//WAPFORUM//DTD XHTML Mobile 1.1//EN
-//WAPFORUM//DTD XHTML Mobile 1.0//EN
-//W3C//DTD XHTML Basic 1.1//EN
-//W3C//DTD XHTML Basic 1.0//EN
the user agent has linearization or zoom capabilities or other features which allow it to present the content unaltered;
the URI of the response (following redirection or as indicated by the
Content-Location HTTP header) indicates that the
resource is intended for mobile use (e.g. the domain is *.mobi,
wap.*, m.*, mobile.* or the leading portion of the path is /m/ or
/mobile/);
the response contains client-side scripts that may mis-operate if the resource is restructured;
the response is an HTML response and it includes
<link> elements specifying
alternatives according to presentation media type.
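Two of the heuristics above (the URI patterns and the mobile-specific DOCTYPEs) are mechanical enough to sketch in code. The following Python is only an illustration under stated assumptions: the function names are hypothetical, the patterns are exactly the examples listed above and nothing more, and a real proxy would combine these checks with the other heuristics rather than rely on them alone.

```python
from urllib.parse import urlsplit

# DOCTYPE public identifiers given above as examples of mobile-specific DOCTYPEs.
MOBILE_DOCTYPES = (
    "-//OMA//DTD XHTML Mobile 1.2//EN",
    "-//WAPFORUM//DTD XHTML Mobile 1.1//EN",
    "-//WAPFORUM//DTD XHTML Mobile 1.0//EN",
    "-//W3C//DTD XHTML Basic 1.1//EN",
    "-//W3C//DTD XHTML Basic 1.0//EN",
)


def uri_suggests_mobile(uri: str) -> bool:
    """Hypothetical helper: match the example patterns *.mobi, wap.*, m.*,
    mobile.* and a leading path segment of /m/ or /mobile/."""
    parts = urlsplit(uri)
    labels = (parts.hostname or "").lower().split(".")
    if labels[-1] == "mobi":
        return True
    if labels[0] in {"wap", "m", "mobile"}:
        return True
    path = parts.path.lower()
    return path.startswith("/m/") or path.startswith("/mobile/")


def doctype_suggests_mobile(document_start: str) -> bool:
    """Hypothetical helper: look for a listed public identifier near the
    start of an already-decoded response body."""
    return any(doctype in document_start for doctype in MOBILE_DOCTYPES)
```

Either check returning true would be one piece of evidence against restructuring the response; neither is conclusive on its own.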
4.3.6.1 Alteration of Response
5) Section 4.3.6.1 I miss any discussion or reference in the document about the issue of character encodings. Transforming content across different charsets is a mine-field and affects a number of aspects:
a) Content may rely upon widely different character encodings, depending on the targeted devices and markets. In particular, the trio China - Japan - Korea (CJK) continues to rely on a number of encodings (such as Shift_JIS, BIG5, etc) whose handling is a complex matter; for instance, there are not necessarily bijective mappings between these encodings and others, including UTF-8.
b) Documents may have multi-encoding representations. Different encodings may be associated with external entities through the charset attribute (see HTML 4.0.1). How transformation proxies deal with such a situation is left undefined.
c) Similarly, the draft does not explain what happens when a server associates an attribute accept-charset to a form, and whether proxies respect or manipulate such information.
d) In i-Mode, and at least in the Softbank environment (Japan), unreserved character points in the character encoding space are used to represent pictograms. Any attempt to convert these characters directly will fail; they should therefore not be transformed, but preserved, taking into account the fact that the character points thus referred to differ between Unicode and Shift_JIS, and that DoCoMo and Softbank do not use the same code points for the same pictograms.
A consequence of all this is that if a proxy does not operate natively with the character encoding of the content returned by the server, or is not able to ensure a bijective mapping between this encoding and other encodings it deals with, recurrent and irrecoverable problems will creep in.
A simple way that could go some way towards alleviating this risk would be to forbid any transformation if the server announces (either via the HTTP field Content-type: charset=..., the XML declaration, or a meta-tag) an encoding different from ASCII or perhaps UTF-8.
LC-2023 from casays <casays@yahoo.com>
A proxy should strive for the best possible user experience that the user agent supports. It should only alter the format, layout, dimensions etc. to match the specific capabilities of the user agent. For example, when resizing images, they should only be reduced so that they are suitable for the specific user agent, and this should not be done on a generic basis.
If a proxy alters the response then:
It must add a Warning 214 Transformation
Applied HTTP header;
The altered content should validate according to an appropriate published formal grammar;
It should indicate to the user that the content has been transformed for mobile presentation and provide an option to view the original, unmodified content.
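The first requirement above can be sketched as follows. This is a minimal Python illustration, not a definitive implementation: the function name and the `ct-proxy.example` agent name are hypothetical, and the header value follows the general `warn-code warn-agent "warn-text"` shape of the HTTP/1.1 Warning header.

```python
from typing import List, Tuple

Header = Tuple[str, str]


def with_transformation_warning(headers: List[Header], agent: str) -> List[Header]:
    """Return a copy of the response headers with a Warning 214 entry appended.

    `agent` is the name the proxy uses to identify itself in the warning
    (a host name or pseudonym); the value used in examples is hypothetical.
    """
    warning = ("Warning", f'214 {agent} "Transformation applied"')
    return list(headers) + [warning]
```

For example, `with_transformation_warning([("Content-Type", "text/html")], "ct-proxy.example")` leaves the existing header in place and adds the warning after it.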
I am the founder of Goowallet, a Mobile Banking / Payment private label service provider. After reading the Last Call comments we are very concerned that many of these recommendations will seriously impact security, privacy and trust. We are therefore 100% opposed to allowing this. Disrupting HTTPS the way transcoders do today is probably illegal and certainly unethical. HTTPS is built to guarantee end2end security. Breaking end2end security is probably illegal. Man in the Middle/Interfering with HTTPS should not be permissible under any circumstances. Making it possible for (allowing) an Operator to now attempt to dismantle the security of the internet in favor of transcoding will seriously, significantly and negatively impact the banking and financial industry. Data protection rules and regulations. If allowed, this will also impact the national security of all law-abiding nations.
LC-2004 from EdPimentl <edpimentl@gmail.com>
6) Section 4.3.6.2 The possibility to break the end-to-end security of an HTTPS connection is unacceptable and must be forbidden. This jeopardizes the set-up of mobile e-commerce, which had difficulties to get established in part because of the point-to-point, hop-wise secure connection with WTLS, and makes a sham of security for other applications that require it. Besides, there is no guarantee that transformations performed by a proxy preserve the content being exchanged between client and server to a point that does not further disturb the secure exchange. As an example, there is no explicit prohibition in the draft against turning POST requests into GET ones, the resizing of images may make visual captchas unreadable, and reordering elements may make forms or security information difficult to figure out at the client side.LC-2024 from casays <casays@yahoo.com>
Dom, thanks for your request for review. With respect to the guidelines regarding the rewriting of HTTPS URIs, we notice that any such rewriting will break any use of TLS for authenticating the client to the server (e.g., use of TLS client certificates). Similarly, any applications on top of HTTPS that rely on TLS channel bindings would detect the proxy's intervention as an attack, and lead to a broken user experience; see RFC 5056 for more details about channel bindings. We recommend that you discuss this aspect with the IETF TLS Working Group. Regards,LC-2085 from Thomas Roessler <tlr@w3.org>
c) Similarly, the guidelines leave completely open the way of how to "provide the option to avoid decryption...", and do not require it to be an OSTENSIBLE one. If the users miss the alternate (or rather, the original) link, they may unwillingly and unconsciously access the server without the expected security. As an example, a small icon (perhaps representing a key) in a corner of the first page accessed via HTTPS, and linking to the end-to-end HTTPS link, fully conforms to the guidelines. How many users would notice it and understand its significance?
LC-2028 from casays <casays@yahoo.com>
d) Informing the user that there are security implications in the way he chooses to access the server, and providing him with an alternative link to it risks causing the following reactions: i. WWW-beginners may simply not bother reading the advice and always take the default action, which according to the guidelines seems to correspond to taking the less safe, point-to-point HTTPS connection. ii. Somewhat WWW-knowledgeable users, aware of the existence of Trojan horses and phishing, may reel at the invitation to try alternative links. If they are curious and examine the URI of the current page, they may further suspect foul play, as the rewritten URI may not match the one they accessed originally. iii. Expert WWW-users will understand the implications of the proxy set-up, but may be wary at using its services for HTTPS links -- after all, what is the guarantee that the proxy will not misuse or unintentionally disclose private information in a point-to-point connection? And if there is a proxy acting as middle-man, what is the guarantee that the end-to-end HTTPS link is actually an end-to-end one and the proxy is not just performing some other tricky manipulations? Overall, fiddling with HTTPS connections risks reducing, rather than increasing, the willingness of end-users to access the mobile Web. A relevant point is that these end-users may actually assign the fault with the untrustworthy connections to the content or application provider, rather than to the operator of the proxy.LC-2029 from casays <casays@yahoo.com>
e) The guidelines allow the client to go through a first point-to-point session establishment with the proxy, and if so desired, through a second end-to-end session establishment with the server. Establishing an HTTPS connection is a somewhat heavy process for wireless devices, requiring the delivery and possibly acceptance of certificates. A double initiation procedure reduces, rather than increases, usability at session start.LC-2030 from casays <casays@yahoo.com>
4.3.6.2 I think the Note here is a good one, but may be worth expanding, since it is apparently already unclear to some how HTTPS works here. The very purpose of HTTPS is to ensure that content is not modified or read by third parties in transit, which means a transforming proxy cannot jump into an HTTPS conversation between mobile device and origin server. So there's not actually a question of whether it's illegal or unethical -- it's simply not possible (unless you have cracked SSL). It can only create a secure connection between the mobile device and itself, and between itself and the origin server. This is indeed a situation that the end user needs to understand: I suggest wording along these lines, take it or leave it as you see fit --
URIs which begin with the https scheme, when accessed, are secured against eavesdropping and modification by third parties by the SSL protocol. It is therefore not possible for a third-party transforming proxy to participate directly in such a connection between mobile device and origin server. Transforming proxies may still transform content of https resources, but at best, it involves creating a separate secure connection between device and proxy, and between proxy and origin server. These communications are secure but the secured content is of course visible to the transforming proxy. This may of course be undesirable to an end user. Therefore if a proxy rewrites https links, replacement links MUST at least use the https scheme as well, and the proxy MUST use https to communicate with the origin server. In addition the proxy MUST clearly advise the user that the potentially sensitive contents of the communication will be visible to the proxy, and must give the user an option to opt out.
LC-2015 from Sean Owen <srowen@google.com>
f) In the absence of any requirements regarding the reliability of proxies and their operating environment, one can only wonder why anybody would choose a point-to-point connection through an uncontrollable middle-man over an end-to-end one if the intent is to access private information safely over the mobile Web. The experience of WTLS taught some hard lessons there.LC-2031 from casays <casays@yahoo.com>
Having looked at the conversation you are having here, I think there is conflicting information about how HTTPS is handled by transcoding servers. I understand that not all transcoders work the same, but some do perform a man-in-the-middle attack, and IMO this should not be endorsed by the W3C guidelines. The way many transcoders work is that they run instances of real web browsers (talking about tens or hundreds of Internet Explorer instances running in the memory of the server here). This means that there is no way for content owners to protect against transcoders simply because the server is talking to a legitimate web browser, exchanging real certificates, logging in with real passwords, establishing secure SSL connections and all the rest. The point of the Content Transformation Guidelines seems to be "some users may want to continue using the service at the cost of degrading security". Well, this is not up to the user to decide, I am afraid. HTTPS is also about non-repudiation and the fact that users must not be able to say "I did not do it" at a later stage. The fact that transcoders have found a technical way to by-pass HTTPS security does not mean that they have the right to do it. Nor does it mean that end-users can take advantage of it. Luca
LC-2016 from Luca Passani <passani@eunet.no>
g) The guidelines rely upon a fundamentally flawed assumption: in the HTTPS connection, the client is the only party concerned with security, and which must take a decision as to whether to access resources over a point-to-point or end-to-end link. This is incorrect: there are actually two parties to the secure connection, client and server, both with legitimate security concerns. The server has thus as much a right to determine whether it wants to provide services over a point-to-point connection as the client. I can very well imagine that for instance banking, electronic commerce or social networking application servers may decide to sever point-to-point connections rather than providing services over them, and inform the end-user about the reasons. Unfortunately, because of the flawed assumption of the guidelines, there is strictly no way a server may reliably detect whether it is communicating over a point-to-point link or not. Consider:
i. The proxy rewrites links but the replacement links must have HTTPS; hence for the server communication obviously takes place over HTTPS.
ii. If the proxy preserves the HTTP header fields (such as user-agent, accept, accept-charset, etc), which is actually encouraged by the guidelines, then the server cannot detect that transformations may be taking place.
iii. Further, the "via" HTTP header field does not constitute a proper mechanism to detect the presence of a transformation proxy, and whether HTTPS is point-to-point or end-to-end. First, the comment "http://www.w3.org/ns/ct" indicating the presence of a transformation proxy is not mandatory, as per the guidelines. Secondly, RFC2616 authorizes proxies to use a pseudonym instead of a domain name for the "received-by" part of their hop, which does not necessarily have a meaning for servers.
The server is therefore not in a position to take educated decisions as to its secure communications with clients through a transformation proxy.
LC-2032 from casays <casays@yahoo.com>
Overall, the guidelines follow the rule that accessing the WWW is the prime intent of the end-user, and that security comes only second. Hence the approach of defaulting to the transformation chain, with the possibility of opting out of it. This is a questionable assumption precisely in the context of secure transactions. There, secure access is the paramount requirement, and must therefore be fulfilled by any proxy set-up, with the possibility to opt-in to the unsecure transformation chain.LC-2033 from casays <casays@yahoo.com>
- Messing with HTTPS should not be permissible under any circumstances. Disrupting HTTPS the way transcoders do today is probably illegal and certainly unethical. HTTPS is built to guarantee end2end security. Breaking end2end security is probably illegal and certainly not an activity that W3C should endorse in any way.
LC-2001 from Luca Passani <passani@eunet.no>
a) The guidelines state: "[...] and must provide the option to avoid decryption and transformation of the resources the links refer to." This stipulation theoretically allows manipulations of the HTTPS stream that are not strictly related to decryption and transformation of the content. What is required is that the client may establish an HTTPS connection with the server in the exact, undisturbed context as if the proxy were a transparent one, performing no transformations whatsoever.
LC-2026 from casays <casays@yahoo.com>
4.3.6.2 HTTPS Link Re-writing
b) The guidelines do not state that the users "must be advised of the security implications of rewriting HTTPS links" BEFORE they have a chance to perform any operation with the target site. If the advice takes place after an operation, then users may unknowingly access the server through the point-to-point HTTPS connection instead of the end-to-end one. As an example, a small icon (perhaps representing a question mark) in a corner of the first page accessed via HTTPS, and pointing to a description of the consequences of the rewritten HTTPS links, fully conforms to the guidelines. How many users would notice it? How many would click on it, take the time to read its content fully (and understand it), before performing any further action?
LC-2027 from casays <casays@yahoo.com>
If the response contains links whose URIs have the scheme
https, proxies may rewrite them, so that they can
transform the content of the linked resources, only if the
following provision is met: a proxy that rewrites such links
must advise the user of the security implications
of doing so and must provide the option to avoid
decryption and transformation of the resources the links refer to.
If a proxy re-writes HTTPS links, replacement links
must have the scheme https.
Note:
For clarity it is emphasized that it is not possible for a transforming proxy to transform content accessed via an HTTPS link without breaking end to end security.
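The requirement that replacement links keep the https scheme lends itself to a simple check. The Python below is a minimal sketch under stated assumptions: the function name is hypothetical, the draft does not define what a rewritten link looks like, and the other provisions (advising the user, offering the option to avoid decryption and transformation) are outside what such a check can cover.

```python
from urllib.parse import urlsplit


def replacement_keeps_https(original_uri: str, replacement_uri: str) -> bool:
    """Hypothetical check: a rewritten https link must itself use https.

    Links that were not https to begin with are not covered by this
    provision, so they pass trivially.
    """
    if urlsplit(original_uri).scheme.lower() != "https":
        return True
    return urlsplit(replacement_uri).scheme.lower() == "https"
```

A proxy could run such a check over every link it rewrites and refuse to emit a replacement that would downgrade the scheme.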
A References
2) Sections A and D Since 2005, the Open Mobile Alliance has been working on a Standard Transcoding Interface, and has published specifications for it. The usage scenario is different: the STI is meant for servers offering transformation services on demand via a Web services interface, whereas the usage scenario of the CTG is for proxies that intercept all HTTP flows between clients and servers. However, there are several aspects that may overlap -- in the requirements or the definition of the acceptable limits during transcoding (e.g. content size). A reference to this standard, and a discussion of the relation between the CTG and the OMA specification is in order.
LC-2051 from casays <casays@yahoo.com>
Note:
The following examples refer to requests with the GET method.
Request resource with original headers
If the response is a 406 response:
If the response contains Cache-Control:
no-transform, forward it
Otherwise re-request with altered headers
If the response is a 200 response:
If the response contains Vary: User-Agent, an
appropriate link element or header, or Cache-Control:
no-transform, forward it
Otherwise assess whether the 200 response is a form of "Request Unacceptable"
If it is not, forward it
If it is, re-request with altered headers
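The decision steps above can be sketched as a small function. This is only an illustration in Python, assuming header names have already been normalised to the spellings shown; the names `Action`, `decide`, `has_alternate_link` and `looks_unacceptable` are hypothetical, and how a proxy judges a 200 response to be a form of "Request Unacceptable" is left to the caller.

```python
from enum import Enum
from typing import Dict


class Action(Enum):
    FORWARD = "forward the response"
    REREQUEST_ALTERED = "re-request with altered headers"


def has_token(header_value: str, token: str) -> bool:
    """True if a comma-separated header value contains the token (case-insensitive)."""
    return token.lower() in [t.strip().lower() for t in header_value.split(",")]


def decide(status: int, headers: Dict[str, str],
           has_alternate_link: bool = False,
           looks_unacceptable: bool = False) -> Action:
    """Apply the steps above to the response to a request sent with original headers."""
    no_transform = has_token(headers.get("Cache-Control", ""), "no-transform")
    if status == 406:
        return Action.FORWARD if no_transform else Action.REREQUEST_ALTERED
    if status == 200:
        varies_on_ua = has_token(headers.get("Vary", ""), "User-Agent")
        if varies_on_ua or has_alternate_link or no_transform:
            return Action.FORWARD
        return Action.REREQUEST_ALTERED if looks_unacceptable else Action.FORWARD
    # Other status codes are not covered by the steps above; forwarding
    # them unchanged is an assumption of this sketch.
    return Action.FORWARD
```

Keeping the "Request Unacceptable" judgement as an input makes the flow itself easy to test independently of whatever heuristic a proxy uses for that judgement.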
Proxy receives a request for resource P that it has not encountered before
Proxy forwards this request
Response is 200 OK containing the text "Unsupported browser. Please get a different one or use a CT proxy."
Proxy determines that this equates to a 406 Status and re-requests the resource from the origin server with altered headers (emulating a well known desktop browser)
Response is a desktop oriented representation of the resource
Proxy transforms this response into content that the user agent can display well and forwards it
Proxy receives a further request for the resource P
Based on evidence from the previous interaction (e.g. that there
was no Vary header, that the response was not targeted at only
the previous user in that there was no Cache-Control: private
directive) the CT proxy forwards the request with altered headers
Response is a desktop oriented representation of the resource
Proxy transforms this response into content that the user agent can display well and forwards it
Proxy receives a request for resource P, that it has previously encountered as in B.2 Optimization based on Previous Server Interaction
Proxy forwards request with altered headers
Response is 200 OK containing a Vary: User-Agent
header
Proxy notices that behavior has changed and re-issues request with original headers
Response is 200 OK and proxy forwards it
Proxy receives a request for resource P
Proxy forwards request with original headers
Response is 200 OK with Vary: User-Agent and
<link type="alternate" media="handheld" href="P#id"
/> where id is a document local reference
Proxy forwards response as designed specifically for the requesting device
Proxy receives a request for resource P
Proxy forwards request with original headers
Response is 200 OK with <link type="alternate"
media="handheld" href="Q" /> and Q is not P
Proxy requests Q with original headers
Response is 200 OK and proxy forwards it
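The last two examples turn on whether the handheld alternative named in the link element is the document itself (P#id) or a different resource (Q). A minimal Python sketch of that distinction follows; the class and function names are hypothetical, and because the examples above write type="alternate" whereas HTML normally expresses the link relationship in the rel attribute, the sketch accepts either spelling as an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class HandheldLinkFinder(HTMLParser):
    """Collect href values of link elements marked as handheld alternatives."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        media = [m.strip().lower() for m in (a.get("media") or "").split(",")]
        is_alternate = "alternate" in (a.get("rel"), a.get("type"))
        if tag == "link" and is_alternate and "handheld" in media and a.get("href"):
            self.hrefs.append(a["href"])


def classify_handheld_link(page_uri: str, html: str) -> str:
    """Return 'none', 'self', or the absolute URI of a different alternative."""
    finder = HandheldLinkFinder()
    finder.feed(html)
    base, _ = urldefrag(page_uri)
    for href in finder.hrefs:
        target, _ = urldefrag(urljoin(page_uri, href))
        return "self" if target == base else target
    return "none"
```

A result of 'self' corresponds to forwarding the response as it stands; a URI result corresponds to requesting that resource with the original headers.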
There are a number of well-known examples of solutions that seem to their users as though they are using a browser, but because the client software communicates with an in-network component using proprietary protocols and techniques, it is the combination of the client and the in-network component that is regarded as the HTTP User Agent. The communication between the client and the in-network component is therefore out of scope of this document.
Additionally, where some kind of administrative arrangement exists between a transforming proxy and an origin server for the purposes of transforming content on the origin server's behalf, this is also out of scope of this document.
In both of the above cases, it is recommended that proxies, when forwarding requests to origin servers, adhere to the provisions of this document in respect of providing information about the device and the original IP address.
The BPWG believes that POWDER will represent a powerful mechanism by which a server may express transformation preferences. Future work in this area may recommend the use of POWDER to provide a mechanism for origin servers to indicate more precisely which alternatives they have and what transformation they are willing to allow on them, and in addition to provide for Content Transformation proxies to indicate which services they are able to perform.
D.2 link HTTP Header
"D.2 link HTTP Header The BPWG believes that the link HTTP header which was removed from recent drafts of HTTP, and which is under discussion for re-introduction, would represent a more general and flexible mechanism than use of the HTML link element, as discussed in this recommendation." This is totally misleading. The link header was removed in RFC2616 (RFC, not a draft), and that was in 1999 (so, not "recent"). BR, Julian
LC-1995 from Julian Reschke <julian.reschke@gmx.de>
The BPWG believes that the link HTTP header which was removed from
recent drafts of HTTP, and which is under discussion for re-introduction, would
represent a more general and flexible mechanism than use of the HTML
link element, as discussed in this recommendation.
The process of adapting content at the origin server, or transforming it in a proxy is likely to have a dependency on a repository of device descriptions. An origin server's willingness to allow a transforming proxy to transform content may depend on its evaluation of the trustworthiness of device description data that is being used. There is scope for enhancement of the trust relationship by some means of indicating this.
D.4 Inter Proxy Communication
3) Cascaded proxies.
a. Section 4.1.3 Statement to be inserted: "Whenever the requester is another transformation proxy, the receiving proxy MUST treat it as a non-browser agent. The receiving proxy SHOULD rely upon the presence of alternative X-Device- HTTP fields and the values in the via HTTP field as per 4.1.6.1 to detect that it is placed downstream from a chain of proxies."
b. Section 4.3.2 Statement to be inserted: "As per section 14.46 of RFC2616, 214 Transformation applied MUST be added by an intermediate cache or proxy if it applies any transformation changing the content-coding (as specified in the Content-Encoding header) or media-type (as specified in the Content-Type header) of the response, or the entity-body of the response, unless this Warning code already appears in the response. A proxy receiving a 214 code MUST NOT change it."
c. Section D.4 Eliminate entirely.
Rationale: together with 4.3.1, 4.3.2, 4.1.2, 4.1.5.5, 4.1.6.1, 4.3.6.1, these changed sections entirely solve the issue of cascading proxies in a standards-compliant way.
LC-2047 from casays <casays@yahoo.com>
There is scope for further work to define how multiple proxies may inter-operate. A common case of multiple proxies is where a network provider transforming proxy and a search engine transforming proxy are both present.
The BPWG believes that amendments to HTTP are needed to improve the interoperability of transforming proxies. For example, HTTP does not provide a way to distinguish between prohibition of any kind of transformation and the prohibition only of restructuring (and not recoding or compression).
At present HTTP does not provide a mechanism for communicating original header
values (hence the use of X-Device- headers as discussed under
4.1.5 Alteration of HTTP Header Values).
A number of mechanisms exist in HTTP which might be exploited given more precise
definition of their operation - for example the OPTIONS method and
the HTTP 300 (Multiple Choices) Status.
It is noted that there are means which fall outside of the scope of this document for establishing user preferences with content transformation proxies. It is anticipated that proxies will maintain preferences on a user by user and Web site by Web site basis, and will change their behavior in the light of changing circumstances as discussed under 4.3.4 Receipt of Vary HTTP Header.
The editor acknowledges contributions of various kinds from members of the Mobile Web Best Practices Working Group Content Transformation Task Force.
The editor acknowledges significant written contributions from:
In general, I must state unequivocally that our experience with current transformation proxies deployed throughout the world has always been negative, since all proxies seem to transform original mobile content regardless, with results ranging from passable to outrageously unusable. The draft, while an interesting attempt to bring some order in the wild practices that abound in the mobile Web, is still vague and incomplete in several points, and thus, in its present form, may not stem some of the more egregious forms of transcoding we have witnessed so far.LC-2025 from casays <casays@yahoo.com>
Here's my comments. In summary, the group really needs to decide whether this is a guidelines document, or a protocol. It can't be both. A lot of work remains.LC-2043 from Mark Baker <distobj@acm.org>
To the W3C Mobile Web Best Practices Working Group: The Internet Architecture Board has reviewed the subject document, and notes that it has previously reviewed related work done in the IETF in the Open Pluggable Edge Services (OPES) Working Group. In its preview and review of OPES work, the IAB expressed its concerns about privacy, control, monitoring, and accountability of such services in RFC 3238 [ http://tools.ietf.org/html/rfc3238 ]. We have no specific architectural concerns with the "Content Transformation Guidelines" document as written; it does seem to take into account the questions raised during the OPES discussions. We would like, though, to make that explicit by specifically documenting that you reviewed and considered the issues in RFC 3238. Barry Leiba, for the Internet Architecture Board ( http://iab.org )LC-2097 from Barry Leiba <leiba@watson.ibm.com>
My name is Dennis Bournique. I write about mobile browsing, primarily from a user perspective, at http://wapreview.com. I've done a little web development, mostly mobile specific sites, but I'm by no means an expert on the technical side of this issue. Putting on my user hat, I'd like to make a request that the Content Transformation Guidelines include a requirement that content transformation proxies "must" provide end users (consumers of web content) with a way to turn off transformation both globally and on a site by site basis. As an end user, I’ve experienced both the joys and the frustrations of using content transformation proxies. In general, I believe in content transformation as a valuable tool to make web content, which would otherwise be difficult or impossible to use, available through the limited browsers of many mobile phones. I have also been frustrated when a carrier or content provider unilaterally imposes content transformation with no way for me to disable it. I've been unable to access content through content transformation proxies that was previously available on the same device using a direct connection. This has happened both with installable content such as midlets and ringtones and also with pure html and xhtml pages, including mobile optimized pages and those that are not. I have also seen my secure end to end HTTPS traffic being forced through content transformation proxies, exposing it to the potential for a "man in the middle" attack. I understand that the Guidelines are intended to prevent these sorts of problems by specifying when content transformation proxies must allow content to flow directly between server and user agent without modification. This is good, but no technical solution can ever be perfect. There will always be edge cases where content transformation does more harm than good. For this reason it is important that end users have the option to opt-out of content transformation. 
I propose that the Guidelines be amended to include the following or similar language:
"...1. Content transformation proxies, if they are modifying traffic between a server and a user agent in any way, MUST provide a mechanism allowing the end user to resubmit the request and disable content transformation for the duration of the current session."
"...2. Content transformation proxies must provide a means for end users of that proxy to disable all content transformation until they take explicit action to re-enable it."
LC-2065 from Dennis Bournique <db@wapreview.com>
Old text:
Request resource with original headers
  If the response is a 406 response:
    If the response contains Cache-Control: no-transform, forward it
    Otherwise re-request with altered headers
  If the response is a 200 response:
    If the response contains Vary: User-Agent, an appropriate link element or header, or Cache-Control: no-transform, forward it
    Otherwise assess whether the 200 response is a form of "Request Unacceptable"
      If it is not, forward it
      If it is, re-request with altered headers
BUT WHERE IS THE TRANSCODING?
New text:
Request resource with original headers
  If the response is a 406 response:
    If the response contains Cache-Control: no-transform, forward it
    Otherwise re-request with altered headers
  If the response is a 200 response:
    If the response contains Vary: User-Agent, an appropriate link element or header, or Cache-Control: no-transform, forward it
    Otherwise assess whether the 200 response is a form of "Request Unacceptable"
      If it is not, TRANSCODE it
      If it is, re-request with altered headers
LC-2089 from Heiko Gerlach <heiko.gerlach@vodafone.com>
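The decision flow in the proposed new text can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of the Guidelines: the names `Action`, `decide`, `has_mobile_link` and `looks_unacceptable` are hypothetical. The comment does not say how to detect "an appropriate link element or header" or a "Request Unacceptable" body, so both are passed in as booleans decided elsewhere.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward"
    RE_REQUEST = "re-request with altered headers"
    TRANSCODE = "transcode"


def _has_token(headers, name, token):
    """Case-insensitive check for a comma-separated token in a header."""
    for key, value in headers.items():
        if key.lower() == name.lower():
            tokens = [t.strip().lower() for t in value.split(",")]
            if token.lower() in tokens:
                return True
    return False


def decide(status, headers, has_mobile_link=False, looks_unacceptable=False):
    """Return the proxy action for a response to the original-header request.

    has_mobile_link and looks_unacceptable are assumptions: the quoted text
    names these checks but does not define how they are made.
    """
    no_transform = _has_token(headers, "Cache-Control", "no-transform")
    if status == 406:
        return Action.FORWARD if no_transform else Action.RE_REQUEST
    if status == 200:
        varies_on_ua = _has_token(headers, "Vary", "User-Agent")
        if varies_on_ua or has_mobile_link or no_transform:
            return Action.FORWARD
        # The only step that differs between the old and the new text:
        # the old text forwards here, the new text transcodes.
        return Action.RE_REQUEST if looks_unacceptable else Action.TRANSCODE
    return None  # other status codes are not covered by the quoted text
```

For example, `decide(200, {"Vary": "User-Agent"})` yields `Action.FORWARD`, while a plain 200 response with none of the listed signals yields `Action.TRANSCODE`.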
A purely editorial note: your markup is reusing the ID sec-purpose
(perhaps others) more than once. This makes the document invalid:
<a name="sec-purpose" id="sec-purpose">From the point of view of this
document, Content Transformation is the
manipulation in various ways, by proxies, of
requests made to and content
delivered by an origin server with a view to making
it more suitable for mobile
presentation.</a></p><p><a name="sec-purpose"
id="sec-purpose">The W3C Mobile Web Best Practices Working Group neither
approves nor disapproves of Content Transformation, but
recognizes that is being deployed widely across
mobile data access networks. The
deployments are widely divergent to each other,
with many non-standard HTTP
implications, and no well-understood means either
of identifying the presence of
such transforming proxies, nor of controlling their
actions. This document
establishes a framework to allow that to
happen.</a></p><p><a name="sec-purpose" id="sec-purpose">The overall
objective of this document is to provide a means, as far as is
practical, for users to be provided with at least a
</a><a
href="http://www.w3.org/TR/di-gloss/#def-functional-user-experience">"functional
user experience"</a>
<a href="#ref-DIGLOSS">[Device Independence
Glossary]</a> of the Web, when mobile, taking into account the
fact that an increasing number of content providers
create experiences specially
tailored to the mobile context which they do not
wish to be altered by third
parties. Equally it takes into account the fact
that there remain a very large
number of Web sites that do not provide a
<em>functional user
experience</em> when perceived on many mobile
devices.</p>
LC-2064 from Elliotte Harold <elharo@metalab.unc.edu>
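A repeated ID like the one reported above can be found mechanically. The following is a minimal sketch using Python's standard `html.parser` module; the names `IdCollector` and `duplicate_ids` are hypothetical, and the sample string is a shortened stand-in for the quoted markup, not the full document.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Counts how often each id attribute value occurs in the markup."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with lowercased names.
        for name, value in attrs:
            if name == "id" and value:
                self.ids[value] += 1


def duplicate_ids(markup):
    """Return a dict of id values that occur more than once, with counts."""
    collector = IdCollector()
    collector.feed(markup)
    collector.close()
    return {value: n for value, n in collector.ids.items() if n > 1}


# Shortened stand-in for the markup quoted in the comment.
sample = (
    '<p><a name="sec-purpose" id="sec-purpose">First paragraph</a></p>'
    '<p><a name="sec-purpose" id="sec-purpose">Second paragraph</a></p>'
    '<p id="other">Third paragraph</p>'
)
print(duplicate_ids(sample))
```

An empty result means no ID is reused; any entry names an ID that would have to be renamed or removed for the document to validate.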