W3C

Would it perhaps be better to give this specification a more informative
title, or at least some sort of informative subtitle?

The phrase "Content Transformation" sounds to an uninitiated reader
as if it could apply to anything from the use of the data manipulation
language (e.g. SQL) in a database management system, to the use of
XSLT, or the SAX or DOM interfaces, to transform XML documents, to
the use of dynamic HTML techniques to transform data in the browser.

Perhaps "Mobile Web Content Transformation"?  Or "Content Transformation
for Mobile Presentation"?  Surely there are ways of making it easier
for potential readers to see whether the document is relevant to their
concerns or not.

This isn't the first W3C spec to have such a generic title; the experience
of the XML Schema specification, however, leads me to commend to you
urgently the wisdom of having a more specific, more informative, less
generic title for your document.

--Michael Sperberg-McQueen
   W3C XML Activity
LC-2018 from C. M. Sperberg-McQueen <cmsmcq@acm.org>
Content Transformation Guidelines 1.0

W3C Working Draft 1 August 2008

This version:
http://www.w3.org/TR/2008/WD-ct-guidelines-20080801/
Latest version:
http://www.w3.org/TR/ct-guidelines/
Previous version:
http://www.w3.org/TR/2008/WD-ct-guidelines-20080414/
Editor:
Jo Rabin, mTLD Top Level Domain (dotMobi)

Abstract

This document provides guidance to content transformation proxies and content providers as to how to inter-work when delivering Web content.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This is a Last Call Working Draft of Content Transformation Guidelines 1.0, expected to become a W3C Recommendation. The W3C Membership and other interested parties are invited to review the document and send comments to public-bpwg-comments@w3.org (with public archive) through 16 September 2008.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document has been produced by the Mobile Web Best Practices Working Group as part of the Mobile Web Initiative.

Since its publication as a First Public Working Draft on 14 April 2008, the Content Transformation Guidelines 1.0 document has been almost entirely re-written. The guidelines were extended, made more precise and re-worded for clarity. In particular:

The Working Group notes that it has already identified a guideline considered to be at risk: the notification of users as defined in section 4.1.4 Serving Cached Responses may turn out to be difficult to implement from a user-experience point of view and may be removed in future versions of the document based on the feedback received.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

Table of Contents

1 Introduction (Non-Normative)

    1.1 Purpose

    1.2 Audience

    1.3 Scope

    1.4 Summary of Requirements

2 Terminology (Normative)

    2.1 Types of Proxy

    2.2 Types of Transformation

3 Conformance (Normative)

    3.1 Classes of Product

    3.2 Normative and Informative Parts

    3.3 Normative Language for Conformance Requirements

    3.4 Content Deployment Conformance

    3.5 Transformation Deployment Conformance

4 Behavior of Components (Normative)

    4.1 Proxy Forwarding of Request

        4.1.1 Applicable HTTP Methods

        4.1.2 no-transform directive in Request

        4.1.3 Treatment of Requesters that are not Web browsers

        4.1.4 Serving Cached Responses

        4.1.5 Alteration of HTTP Header Values

            4.1.5.1 Content Tasting

            4.1.5.2 Avoiding "Request Unacceptable" Responses

            4.1.5.3 User Selection of Restructured Experience

            4.1.5.4 Sequence of Requests

            4.1.5.5 Original Headers

        4.1.6 Additional HTTP Headers

            4.1.6.1 Proxy Treatment of Via Header

    4.2 Server Response to Proxy

        4.2.1 Use of HTTP 406 Status

        4.2.2 Server Origination of Cache-Control: no-transform

        4.2.3 Varying Representations

            4.2.3.1 Use of Vary HTTP Header

            4.2.3.2 Indication of Intended Presentation Media Type of Representation

    4.3 Proxy Forwarding of Response to User Agent

        4.3.1 Receipt of Cache-Control: no-transform

        4.3.2 Receipt of Warning: 214 Transformation Applied

        4.3.3 Server Rejection of HTTP Request

        4.3.4 Receipt of Vary HTTP Header

        4.3.5 Link to "handheld" Representation

        4.3.6 Proxy Decision to Transform

            4.3.6.1 Alteration of Response

            4.3.6.2 HTTPS Link Re-writing

5 Testing (Normative)

Appendices

A References

B Example Transformation Interactions (Non-Normative)

    B.1 Basic Content Tasting by Proxy

    B.2 Optimization based on Previous Server Interaction

    B.3 Optimization based on Previous Server Interaction, Server has Changed its Operation

    B.4 Server Response Indicating that this Representation is Intended for the Target Device

    B.5 Server Response Indicating that another Representation is Intended for the Target Device

C Applicability to Transforming Solutions which are Out of Scope (Non-Normative)

D Scope for Future Work (Non-Normative)

    D.1 POWDER

    D.2 link HTTP Header

    D.3 Sources of Device Information

    D.4 Inter Proxy Communication

    D.5 Amendment to and Refinement of HTTP

E Administrative Arrangements (Non-Normative)

F Acknowledgments (Non-Normative)


Introduction

"From the point of view of this document, Content Transformation is the
manipulation in various ways, by proxies, of requests made to and content
delivered by an origin server with a view to making it more suitable for
mobile presentation."

I had to read the sentence three times in order to understand it. I
think the sentence can be simplified.

Proposed Amendment

Convert the sentence into something clearer.
LC-2012 from JOSE MANUEL CANTERA FONSECA <jmcf@tid.es>
1 Introduction (Non-Normative)

1.1 Purpose

From the point of view of this document, Content Transformation is the manipulation in various ways, by proxies, of requests made to and content delivered by an origin server with a view to making it more suitable for mobile presentation.

The W3C Mobile Web Best Practices Working Group neither approves nor disapproves of Content Transformation, but recognizes that it is being deployed widely across mobile data access networks. The deployments diverge widely from each other, with many non-standard HTTP implications, and with no well-understood means either of identifying the presence of such transforming proxies or of controlling their actions. This document establishes a framework to allow that to happen.

The overall objective of this document is to provide a means, as far as is practical, for users to be provided with at least a "functional user experience" [Device Independence Glossary] of the Web, when mobile, taking into account the fact that an increasing number of content providers create experiences specially tailored to the mobile context which they do not wish to be altered by third parties. Equally it takes into account the fact that there remain a very large number of Web sites that do not provide a functional user experience when perceived on many mobile devices.

1.2 Audience

The audience for this document is creators of Content Transformation proxies, purchasers and operators of such proxies and content providers whose services may be accessed by means of such proxies.

1.3 Scope

The recommendations in this document refer only to "Web browsing" - i.e. access by user agents that are intended primarily for interaction by users with HTML Web pages (Web browsers) using HTTP. Clients that interact with proxies using mechanisms other than HTTP (and that typically involve the download of a special client) are out of scope, and are considered to be a distributed user agent. Proxies which are operated in the control of or under the direction of the operator of an origin server are similarly considered to be a distributed origin server and hence out of scope.

The BPWG is not chartered to create new technology - its role is to advise on best practice for use of existing technology. In satisfying Content Transformation requirements, existing HTTP headers, directives and behaviors must be respected, and as far as is practical, no extensions to [RFC 2616 HTTP] are to be used.

1.4 Summary of Requirements

This section summarizes what the actors (users, user agents, transforming proxies and origin servers) need to communicate to each other. It is recognized that several transformation proxies may be present but their interactions are not discussed in detail. The relevant scenario is as follows:

[Figure: Interactions between actors]

The needs of these actors are as follows:

  1. The user agent needs to be able to tell the Content Transformation proxy and the origin server:

    1. what type of mobile device and which user agent is being used;

    2. that all Content Transformation should be avoided.

  2. The Content Transformation proxy needs to be able to tell the origin server:

    1. that some degree of Content Transformation (restructuring and recoding) can be performed;

    2. that content is being requested on behalf of something else and what that something else is;

    3. that the request headers have been altered and what the original ones were.

  3. The origin server needs to be able to tell the Content Transformation proxy:

    1. that it varies the representation of its responses according to device type and other factors;

    2. that it is not permissible to perform Content Transformation;

    3. that it has media-specific representations;

    4. that it is unable or unwilling to deal with the request in its present form.

  4. The Content Transformation proxy needs to be able to tell the user agent:

    1. that it has applied transformations of various kinds to the content.

  5. The Content Transformation proxy needs to be able to interact with the user:

    1. to allow the user to disable its features;

    2. to alert the user to the fact that it has transformed content and to allow access to an untransformed representation of the content.

Note:

A more extensive discussion of the requirements for these guidelines can be found in "Content Transformation Landscape" [CT Landscape].

2 Terminology (Normative)

* Section 2.1 - "Alteration of HTTP requests and responses is not  
prohibited by HTTP other than in the circumstances referred to in  
[RFC2616 HTTP] Section 13.5.2."  This isn't true; section 14.9.5 needs  
to be referenced here as well.
LC-2066 from Mark Nottingham <mnot@mnot.net>
2.1 Types of Proxy

Alteration of HTTP requests and responses is not prohibited by HTTP other than in the circumstances referred to in [RFC 2616 HTTP] Section 13.5.2.

HTTP defines two types of proxy: transparent proxies and non-transparent proxies. As discussed in [RFC 2616 HTTP] Section 1.3, Terminology:

"A transparent proxy is a proxy that does not modify the request or response beyond what is required for proxy authentication and identification. A non-transparent proxy is a proxy that modifies the request or response in order to provide some added service to the user agent, such as group annotation services, media type transformation, protocol reduction, or anonymity filtering. Except where either transparent or non-transparent behavior is explicitly stated, the HTTP proxy requirements apply to both types of proxies."

This document elaborates the behavior of non-transparent proxies, when used for Content Transformation in the context discussed in [CT Landscape].

1) Section 2.2.1:

The CTG distinguishes between restructuring, recoding and
optimization. This is a useful approach, and the distinction
could be used more systematically across the document. However,
without a formal definition of these terms, various parties
are left with too much leeway when classifying some operations
into one or the other of the categories. This may entail inconsistencies
regarding the interpretation of the guidelines.

The guidelines should:
a) Define formally each of the three categories, possibly on
the basis of language theory. As an example, optimization seems
to be related to equivalent token streams (for textual content),
whereas recoding seems to deal with equivalent parse trees. Some
operations are reversible, others are not. The W3C is home to 
technologies such as XSLT, so there should be competence there 
to help ground definitions on solid formal concepts. Basing such 
definitions on formal language theory is a suggestion, not a 
requirement; other formally grounded definitions are possible.
b) Define exactly how to classify an operation that spans several
categories. As an example, converting HTML to XHTML while at the
same time eliminating comments and redundant white space should
amount to a recoding.
LC-2050 from casays <casays@yahoo.com>
2.2 Types of Transformation

Transforming proxies can carry out a wide variety of operations. In this document we categorize these operations as follows:

  1. Alteration of Requests

    Transforming proxies process requests in a number of ways, notably by replacing various request headers, either to avoid HTTP 406 Status responses (returned when a server cannot provide content that is compatible with the original HTTP request headers) or at the user's request.

  2. Alteration of Responses

    There are three classes of operation on responses:

    1. Restructuring content

      Restructuring content is a process whereby the original layout is altered so that content is added or removed or where the spatial or navigational relationship of parts of content is altered, e.g. by linearization or pagination. It includes also rewriting of URIs so that subsequent requests route via the proxy handling this response.

    2. Recoding content

      Recoding content is a process whereby the layout of the content remains the same, but details of its encoding may be altered. Examples include re-encoding HTML as XHTML, correcting invalid markup in HTML, and converting images between formats (but not, for example, reducing animations to static images).

    3. Optimizing content

      Optimizing content includes removing redundant white space, re-compressing images (without loss of fidelity) and compressing for transfer.
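The three classes of response alteration above can be restated as a small lookup table. The following Python sketch is illustrative only: the enum and the operation names are hypothetical labels chosen for this example, and the assignments simply repeat the examples given in this section. It does not settle how an operation spanning several classes should be treated.

```python
from enum import Enum


class ResponseAlteration(Enum):
    """The three classes of operation on responses listed in section 2.2."""
    RESTRUCTURING = "restructuring"
    RECODING = "recoding"
    OPTIMIZATION = "optimization"


# Hypothetical operation names; each assignment restates an example from 2.2.
EXAMPLE_OPERATIONS = {
    "linearization": ResponseAlteration.RESTRUCTURING,
    "pagination": ResponseAlteration.RESTRUCTURING,
    "uri_rewriting_via_proxy": ResponseAlteration.RESTRUCTURING,
    "html_to_xhtml": ResponseAlteration.RECODING,
    "markup_correction": ResponseAlteration.RECODING,
    "image_format_conversion": ResponseAlteration.RECODING,
    "whitespace_removal": ResponseAlteration.OPTIMIZATION,
    "lossless_image_recompression": ResponseAlteration.OPTIMIZATION,
    "transfer_compression": ResponseAlteration.OPTIMIZATION,
}


def classify(operation: str) -> ResponseAlteration:
    """Return the class of a known example operation; raise KeyError otherwise."""
    return EXAMPLE_OPERATIONS[operation]
```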

3 Conformance (Normative)

3.1 Classes of Product

The Content Transformation Guidelines specification has two classes of products:

Content Deployment

A Content Deployment is the provision of resources intended for retrieval by user agents. Provisions that are applicable to a Content Deployment are identified in this document by use of the terms "origin server", "server" and "Web site" in the singular or plural.

Transformation Deployment

A Transformation Deployment is the provision of non-transparent components in the path of HTTP requests and responses. Provisions that are applicable to a Transformation Deployment are identified in this document by use of the term "transforming proxy" or "proxy" in the singular or plural.

3.2 Normative and Informative Parts

Normative parts of this document are identified by the use of "(Normative)" following the section name. Informative parts are identified by use of "(Non-Normative)" following the section name.

3.3 Normative Language for Conformance Requirements

The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this Recommendation have the meaning defined in [RFC 2119].

* Section 3.4 / 3.5 "A [Content|Transformation] Deployment conforms to  
these guidelines if it follows the statements..."  What does "follows"  
mean here -- if they conform to all MUST level requirements? SHOULD  
and MUST?
LC-2067 from Mark Nottingham <mnot@mnot.net>
3.4 Content Deployment Conformance

A Content Deployment conforms to these guidelines if it follows the statements in 4.2 Server Response to Proxy.

3.5 Transformation Deployment Conformance

A Transformation Deployment conforms to these guidelines if it follows the statements in 4.1 Proxy Forwarding of Request, 4.3 Proxy Forwarding of Response to User Agent and 5 Testing (Normative).

4 Behavior of Components (Normative)

Also, I see that CTG does not mention "whitelists". I think it should,
since many transcoders manage that. The rule (consistently with the concept
that transcoders must err on the side of not transcoding) should be that
whitelists can only specify which potentially mobile sites can be forced to
be transcoded (and not the other way around as happens to be common today,
thus potentially forcing mobile developers to ask operators in different
countries to whitelist their service, which is of course unacceptable).
LC-2003 from Luca Passani <passani@eunet.no>
4.1 Proxy Forwarding of Request

4.1.1 Applicable HTTP Methods

"Proxies should not intervene in methods other than GET, POST, HEAD and
PUT."

I can't think of any good reason for that.  If a request using an
extension method wants to avoid transformation, it can always include
the no-transform directive.
LC-2034 from Mark Baker <distobj@acm.org>
1) Section 4.1.1

Add to the section:

Proxies MUST NOT convert POST methods into GET ones, or vice-versa.

Rationale: This kind of transformation may make exchanges between
clients and servers inoperative. In particular, this kind of
substitution has been known to cause problems for content downloading
applications in the mobile Web.
LC-2019 from casays <casays@yahoo.com>
4.1.1 Applicable HTTP Methods

Proxies should not intervene in methods other than GET, POST, HEAD and PUT.

User agents sometimes issue HTTP HEAD requests in order to determine if a resource is of a type and/or size that they are capable of handling. A transforming proxy may convert a HEAD request into a GET request (in order to determine the characteristics of a transformed response that it would return if the user agent subsequently issued a GET request for the same resource).

If the HTTP method is altered from HEAD to GET, proxies should (providing such action is in accordance with normal HTTP caching rules) cache the response so that a second GET request for the same content is not required (see also 4.1.4 Serving Cached Responses).
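As a rough illustration of the method handling described in this section, the following Python sketch decides whether a proxy may intervene and which method it forwards. The function name, the `convert_head` switch and the dictionary keys are assumptions made for this example; whether a response may actually be cached still depends on normal HTTP caching rules, which the sketch does not check.

```python
INTERVENABLE_METHODS = {"GET", "POST", "HEAD", "PUT"}


def plan_request(method: str, convert_head: bool = True) -> dict:
    """Decide how a transforming proxy treats a request method (section 4.1.1).

    Returns a dict with hypothetical keys:
      intervene      - whether the proxy may intervene at all
      forward_method - the method sent on to the origin server
      cache_response - whether the proxy should try to cache the response
                       (subject to HTTP caching rules, not checked here)
    """
    method = method.upper()
    if method not in INTERVENABLE_METHODS:
        # Other methods (e.g. DELETE, OPTIONS) are passed through untouched.
        return {"intervene": False, "forward_method": method, "cache_response": False}
    if method == "HEAD" and convert_head:
        # HEAD may become GET so the proxy can inspect the transformed result;
        # caching avoids a second GET for the same content.
        return {"intervene": True, "forward_method": "GET", "cache_response": True}
    return {"intervene": True, "forward_method": method, "cache_response": False}
```

Note that the sketch never changes POST into GET or the reverse; only the HEAD to GET conversion mentioned in this section is modelled.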

* Section 4.1.2 "If the request contains a Cache-Control: no-transform  
directive proxies must forward the request unaltered to the server,  
other than to comply with transparent HTTP behaviour and as noted  
below."  I'm not sure what this sentence means.
LC-2068 from Mark Nottingham <mnot@mnot.net>
4.1.2 no-transform directive in Request

If the request contains a Cache-Control: no-transform directive proxies must forward the request unaltered to the server, other than to comply with transparent HTTP behavior and as noted below (see 4.1.6 Additional HTTP Headers).

Note:

An example of the use of Cache-Control: no-transform is the issuing of asynchronous HTTP requests, perhaps by means of XMLHTTPRequest [XHR], which may include such a directive in order to prevent transformation of both the request and the response.
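A proxy has to detect the directive before it can honor it. The following Python sketch shows one way to do so, assuming the request headers are available as a simple name-to-value mapping (a hypothetical representation chosen for this example). It splits the header value on commas, which is a simplification that ignores commas inside quoted strings.

```python
def has_no_transform(headers: dict) -> bool:
    """Return True if a Cache-Control header carries the no-transform directive."""
    for name, value in headers.items():
        # Header field names are case-insensitive.
        if name.lower() != "cache-control":
            continue
        # Directives are comma-separated and their names are case-insensitive.
        for directive in value.split(","):
            if directive.strip().lower() == "no-transform":
                return True
    return False
```

A proxy following this section would forward the request unaltered whenever `has_no_transform(request_headers)` is true, apart from the additions described in 4.1.6.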

http://www.w3.org/TR/2008/WD-ct-guidelines-20080801/


Section 4.1.3 ...

"""
The mechanism by which a proxy recognizes the user agent as a Web
browser should use evidence from the HTTP request, in particular the
User-Agent and Accept headers.
"""

Please clarify -- is this just the *existence* of those headers, or
the specific values?  If it is the specific values, then please
provide some guidance (or a normative alternative) that new user
agents can use, before their names propagate to various whitelists.

-jJ
LC-2044 from Jim Jewett <jimjjewett@gmail.com>
* Section 4.1.3 "Proxies must act as though a no-transform directive  
is present (see 4.1.2 no-transform directive in Request) unless they  
are able positively to determine that the user agent is a Web  
browser."  How do they "positively" determine this? Using heuristics is
far from a guaranteed mechanism. Moreover, what is the reasoning  
behind this? If the intent is to only allow transformation of content  
intended for presentation to humans, it would be better to say that.  
In any case, putting a MUST-level requirement on this seems strange.
LC-2069 from Mark Nottingham <mnot@mnot.net>
4.1.3 Treatment of Requesters that are not Web browsers

Proxies must act as though a no-transform directive is present (see 4.1.2 no-transform directive in Request) unless they are able positively to determine that the user agent is a Web browser. The mechanism by which a proxy recognizes the user agent as a Web browser should use evidence from the HTTP request, in particular the User-Agent and Accept headers.
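The guideline does not define how a Web browser is recognized, so any implementation has to make its own choices. The following Python sketch shows the default-to-no-transform logic only; the token list and media types are assumptions made for this example, not values taken from this document, and a real deployment would need far better evidence than substring matching.

```python
# Hypothetical evidence lists, chosen for illustration only.
BROWSER_UA_TOKENS = ("mozilla", "opera")
BROWSER_MEDIA_TYPES = ("text/html", "application/xhtml+xml")


def looks_like_browser(headers: dict) -> bool:
    """Crude positive test using the User-Agent and Accept request headers."""
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    user_agent = lowered.get("user-agent", "")
    accept = lowered.get("accept", "")
    ua_ok = any(token in user_agent for token in BROWSER_UA_TOKENS)
    accept_ok = any(media_type in accept for media_type in BROWSER_MEDIA_TYPES)
    # Both headers must supply evidence; a missing header counts against.
    return ua_ok and accept_ok


def treat_as_no_transform(headers: dict) -> bool:
    """Act as though no-transform is present unless a browser is positively identified."""
    return not looks_like_browser(headers)
```

The important design point is the direction of the default: absence of evidence leads to no transformation, never the other way around.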

* Section 4.1.4 "Proxies should follow standard HTTP procedures in  
respect to caching..."  This seems a strange way to phrase it, and I  
don't think it's useful to use RFC2616 language here.
LC-2070 from Mark Nottingham <mnot@mnot.net>
4.1.4 Serving Cached Responses

Proxies should follow standard HTTP procedures in respect of caching and should use cached copies of resources where this is in accordance with those procedures.

In some circumstances, proxies may paginate responses and where this is the case a request may be for a subsequent page of a previously requested resource. In this case proxies may for the sake of consistency of representation serve stale data but when doing so should notify the user that this is the case and should provide a simple means of retrieving a fresh copy.
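The decision described in this section can be sketched as a small function. This Python example is illustrative only: the function name, parameters and dictionary keys are hypothetical, freshness is assumed to have been determined elsewhere under normal HTTP caching rules, and the sketch assumes a proxy that chooses to exercise the option of serving stale pages.

```python
def plan_cached_response(is_fresh: bool, is_subsequent_page: bool) -> dict:
    """Decide how to use a cached copy under section 4.1.4 (illustrative only).

    is_fresh           - the cached copy may still be served under normal
                         HTTP caching rules (determined elsewhere)
    is_subsequent_page - the request is for a later page of a response the
                         proxy has already paginated
    """
    if is_fresh:
        return {"serve_cached": True, "notify_user": False, "offer_refresh": False}
    if is_subsequent_page:
        # Stale, but served for consistency with the pages already shown;
        # the user is told and given a simple way to fetch a fresh copy.
        return {"serve_cached": True, "notify_user": True, "offer_refresh": True}
    # Stale and not part of a paginated sequence: revalidate or refetch.
    return {"serve_cached": False, "notify_user": False, "offer_refresh": False}
```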

The rest of the 4.1.5.* sections all seem to be basically "Here's some
things that some proxies do".  By listing them, are you saying these
are good and useful things, i.e. best practices?  If so, perhaps that
should be made explicit.
LC-2038 from Mark Baker <distobj@acm.org>
I posted the following message in the WMLProgramming mailing list.
People have suggested that I publish it as a formal comment to the CTG
draft, so here it is, under the heading "Allowing modifications of the
HTTP header field user-agent: rationale missing".

Eduardo Casais
areppim AG
Bern Switzerland
------------------

I would like to review (a last time) an issue that reoccurs in all
discussions about transcoders.

> Changing User Agent or other headers is not prohibited by HTTP

The first thing to stress is that the user-agent is essential to drive
content selection and generation processes, both in the mobile Web and
in the desktop Web. 

a) In the mobile Web, the user-agent is directly associated to the
actual device, and hence serves as a key to characteristics such as
screen dimensions, preferred content types, etc. The advent of
uaprof/ccpp was supposed to make this mapping unnecessary, but it is
not the case: uaprof descriptions are often missing, point to invalid
URLs, omit important information, or are just plain unreliable. Device
databases like WURFL, based on user-agent mappings, thus remain
indispensable.
b) In the desktop (non-mobile) Web, developers have long relied upon
the user-agent to identify the browsers issuing requests in order to
tailor content to their "quirks". This has been going on at least
since the times of the Netscape / IE wars.

Let us now examine the use cases of a mobile Web browser accessing the
Internet, and evaluate the relevance of the user-agent -- assuming
that transcoders systematically substitute the original value with a
new one.

1.a User-agent-switcher.
	The Web server is able, based on the user-agent, to
	provide a mobile-optimized or a full-web service.
	It therefore needs the original user-agent; modifying
	it is unhelpful.
2.a Mobile Web only.
	2.a.1 Generic content.
	The server returns generic mobile content, without
	customizing it for any specific user-agent.
	This kind of application is rare, and often
	corresponds to surviving examples of text-only
	services developed for older PDA and WAP 1 devices.
	Since the server does not use the original user
	agent, modifying it is useless.
	2.a.2 Mobile with default.
	The server returns mobile-optimized content. When
	not recognizing the user-agent, it returns a default,
	best-effort representation, perhaps with a message
	suggesting that the content is tailored for mobile
	devices.
	Since the server relies upon the user-agent, and is
	able to return a default representation, modifying it
	is unhelpful.
	2.a.3 No default.
	The server returns mobile-optimized content, but will
	return an error (page with "unsupported browser" warning
	or return code "request not acceptable") whenever it
	does not recognize the user-agent. In this case, modifying
	the original user-agent is most unhelpful, as it guarantees 
	that the server will not recognize it as a valid
	mobile one. If the server does not, for whatever reason
	(e.g. incomplete device database), recognize a mobile
	user-agent, then there might be a case for modifying it
	towards an acceptable mobile one -- but transcoders 
	precisely do the reverse: they change a mobile user-agent
	to an exotic full-Web one. Hence, a modification of the
	original user-agent is unhelpful in all situations.
3.a Full Web only.
	3.a.1 Generic content.
	The Web server serves generic full-Web content, without
	looking at the user-agent. In this case, modification of
	the original user-agent is useless.
	3.a.2 Tailored full-web with default.
	The server returns full-web content customized for specific
	full-web user-agents (e.g. IE 6.0, IE 7.0 and Firefox 2.0),
	and serves a default representation, perhaps with a warning
	("This site is best viewed with the following browsers:...")
	for other user-agents. In this case, modification of the
	original user-agent is either useless (in any case a default
	representation will be returned), or unhelpful (the default
	representation is probably better downgradable than one
	specifically customized for a very specific full-Web browser).
	3.a.3 Tailored with no default.
	The server returns content tailored for specific full-Web
	browsers, and an error for other unrecognized or unsupported
	user-agents.
	Here there is a case to substitute the original user-agent to
	force retrieval of content. However, this works only if the
	"fake" user-agent precisely corresponds to one of those
	accepted by the server -- but transcoders do not tailor their
	substitute user-agents with respect to the application server:
	they include only general hints (like Mozilla/x.y) in the hope
	this is enough to determine content generation.
	Hence, the generic substitution of user-agents performed by
	transcoders is not appropriate here.

Conclusion: in two cases, modification of the user-agent is useless,
in three it is detrimental, in one it is either useless or
detrimental, and in one it could be helpful, but it is currently done
inappropriately.

Let us consider the interesting symmetric use-cases: a full-Web mobile
device accessing the Internet.

1.b User-Agent switcher
	Following the same reasoning as in 1.a, we find that
        modification of the original user-agent is unhelpful.
2.b Mobile Web only.
	2.b.1 Generic content.
	Whatever user-agent, the server returns generic mobile-optimized
	content. A modification of the original user-agent is useless.
	2.b.2 Tailored with default.
	The server returns mobile-optimized content, and a default
	representation for unrecognized user-agents. Modifying it is
	therefore useless -- the default representation will be returned
	whether the original (full-Web) or the substitute (pseudo-full
	Web) agent appears in the request.
	2.b.3 Tailored, no default.
	Following the same reasoning as in 2.a.3, the substitution of 
	original (full-Web) user-agent by a fake (full-Web) one is 
	useless, as it will anyway return an error.
3.b Full Web only
	3.b.1 Generic content.
	If the server returns generic full-Web content whatever the user
	agent, then modifications of the user-agent are useless.
	3.b.2 Tailored with default.
	Following the reasoning in 3.a.2, modifying the original
        full-Web user agent is either unhelpful (because the server
        could have recognized the mobile device's agent), or useless
        (the same default representation would be returned).
	3.b.3 Tailored, no default.
	Here the server may recognize the full-Web user-agent of the
        mobile device; it is therefore unhelpful to modify it. Or it
        might not support that specific user-agent, in which case it
        would be sensible to substitute one that is effectively
        supported by the server; however, this is not what transcoders
        do: they provide a generic, not a real user-agent instead --
        this is inappropriate.

So one situation where it is detrimental, four where it is useless,
one  where it is either useless or detrimental, and one where it is
either useless or inappropriately done. It is also an acid test: do
transcoders modify requests from full-Web capable mobile browsers? If
so, something seriously weird is going on, as the excuse has generally
been to make full-Web content available to non-full-Web capable devices.

From this examination, one can only conclude that proponents of the
preservation of the original user-agent do not have to justify their
position and established practice. Rather, the onus is on the
proponents of the substitution of the user-agent to argue in favour of
their approach, which disrupts established practice. There is
basically only one use case where changing the mobile user agent to a
desktop user agent might help, but it remains to demonstrate:

a) The relevance of the scenario. Perhaps people at Google could let
one of their crawlers roam over a few tens of thousands of WWW sites
to gather statistics on the relative frequency of each aforementioned
scenario.
b) The benefits resulting from handling that specific scenario.
c) That (a) and (b), taken together, are so overwhelming that they
more than compensate for the disruptions caused in all other use-cases.

If another use case outside the framework I have presented here pops
up, this does not reduce the need for an assessment based on (a), (b),
(c). 

As a final remark, I would like to note that transcoders have been
operating in the mobile Web for a long time. It started with
adaptation of HTML for PDA (Web clipping) and HTML to HDML conversion,
continued with HTML to WML, before arriving at the current crop of
content adaptation. In the old times, developers of content adaptation
software were wary of modifying the user-agent: turning generic WWW
content into a form suitable for mobile devices is so fraught with
difficulties that one would take every chance to let a server return
mobile optimized content (based on the user-agent) if it could. It is
only fairly recently that, without much justification, transcoders
have started in a systematic way to overwrite the user-agent field.

I think I have said everything I wanted regarding the CTG. The
document  requires quite some rework -- nothing exceptional, since it
is a draft. I will lean back and wait for the results of this round of
revisions. Till then, readers of the WMLprogramming and W3C BPWG lists
can rejoice in the knowledge that my long-winded posts are abating at
last.


E. Casais
LC-2054 from casays <casays@yahoo.com>
The styleguide should spell out very clearly "The Transcoder is NOT
allowed to change the User-Agent String".
LC-2005 from EdPimentl <edpimentl@gmail.com>
- the styleguide should spell out very clearly "The Transcoder is NOT 
allowed to change the User-Agent String".
  I understand that the current document says "do not change headers", but
at the same time, there are clauses ("the user has specifically requested a
restructured desktop experience") which would allow abusive transcoders to
find an excuse and keep being abusive of the rights of content owners.
Preventing transcoders from changing the UA string is an effective way to
avoid this abuse.
LC-1996 from Luca Passani <passani@eunet.no>
* Section 4.1.5 Bullet points one and 3 are get-out-of-jail-free cards  
for non-transparent proxies to ignore no-transform and do other anti- 
social things. They should either be tightened up considerably, or  
removed.
LC-2071 from Mark Nottingham <mnot@mnot.net>
* Section 4.1.5 What is a "restructured desktop experience"?
LC-2072 from Mark Nottingham <mnot@mnot.net>
* Section 4.1.5 "proxies should use heuristics including comparisons  
of domain name to assess whether resources form part of the same "Web  
site."  I don't think the W3C should be encouraging vendors to  
implement yet more undefined heuristics for this task; there are  
several approaches already in use (e.g., in cookies, HTTP, security  
context, etc.); please pick one and refer to it specifically.
LC-2073 from Mark Nottingham <mnot@mnot.net>
5) Section 4.1.5.

Statement to be added:

"The request MUST NOT be altered whenever the URI of the
request indicates that the resource being accessed is able
to provide mobile-optimized content, e.g. the domain is
*.mobi, wap.*, m.*, mobile.*, pda.*, imode.*, iphone.*,
or the leading portion of the path is /m/ or /mobile/."

Rationale: The guidelines make the assumption that all
requests may first undergo a transformation before possibly
falling back on a transformationless mode of operation. 
This is unwarranted, and does not correspond to the way
many deployed proxies operate. 

Obviously, it is rather pointless to go all the way to 
send a request to the server and wait for its response
in order to detect whether the resources accessed are for
mobile use, when it is already possible to do this by
inspecting the request of the client. The addition to the
guidelines covers this situation, and corresponds to the
state of the art in transformation proxies. It is also
consistent with the heuristic serving to determine whether
a response is already mobile-optimized. Following this
new guideline improves the performance of the entire 
content delivery chain without loss of functionality,
and is congruent with the stated objective of the 
guidelines of not disturbing mobile-optimized content.
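
The URI test proposed above could be sketched roughly as follows. This is only an illustration: the pattern lists are copied from the proposed statement, while the function and constant names are hypothetical and not taken from any deployed proxy.

```python
from urllib.parse import urlsplit

# Patterns copied from the proposed statement; names are illustrative only.
MOBILE_HOST_PREFIXES = ("wap.", "m.", "mobile.", "pda.", "imode.", "iphone.")
MOBILE_HOST_SUFFIXES = (".mobi",)
MOBILE_PATH_PREFIXES = ("/m/", "/mobile/")


def looks_mobile_optimized(uri: str) -> bool:
    """Return True if the request URI matches one of the mobile patterns."""
    parts = urlsplit(uri)
    host = (parts.hostname or "").lower()
    path = parts.path.lower()
    return (
        host.startswith(MOBILE_HOST_PREFIXES)
        or host.endswith(MOBILE_HOST_SUFFIXES)
        or path.startswith(MOBILE_PATH_PREFIXES)
    )
```

A proxy following the proposal would forward the request unaltered whenever this check succeeds, without first contacting the server.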
LC-2049 from casays <casays@yahoo.com>
Hello,

As the technical lead for SingleClick Systems mobile development, I'm 
writing to protest the W3C's failure to provide a clear rule against the 
modification of the User-Agent header.

As mobile developers, my team spends a lot of time creating the mobile 
experience *we* want our users to see. If our users are subjected to the 
confusing experience of transcoding, we lose money.

I urge the W3C to adopt the standards set forth by Luca Passani's 
Manifesto, of which I'm sure you're aware.

Sincerely,
Terren Suydam
LC-2017 from Terren Suydam <terren@singleclicksystems.com>
4.1.5 Alteration of HTTP Header Values

RFC 2616 already says a lot about this. See sec 13.5.2 for example.

"The theoretical idempotency of GET requests is not always respected
by servers. In order, as far as possible, to avoid mis-operation of
such content, proxies should avoid issuing duplicate requests and
specifically should not issue duplicate requests for comparison
purposes."

First of all, do you mean "safe" or "idempotent"?  That you refer only
to GET suggests safety, but the second sentence suggests you are
referring to idempotency.  So please straighten that out.  Oh, and
there's nothing "theoretical" about GET's safety or idempotency; it's
by definition, in fact.

Secondly, if the server changes something important because it
received a GET request, then that's its problem.  Likewise, if it
changes something non-idempotently because it received a PUT request,
that's also something it has to deal with.  In both cases though, the
request itself is idempotent (and safe with GET), so I see no merit to
that advice that you offer ... unless of course the problem you refer
to is pervasive which clearly isn't the case.

I also wonder if most of 4.1.5 shouldn't just defer to 2616.  As is,
large chunks of this section (as well as others) specify a protocol
which is a subset of HTTP 1.1.  (see also the RFC 2119 comment above)
LC-2036 from Mark Baker <distobj@acm.org>
5) Section 4.1.5.

Statement to be added:

"In so far as the transformation carried out by the proxies is
to make content intended for a certain class A of devices 
available to devices of another class B, then requests MUST NOT
be modified whenever a client of a certain class is accessing
content intended for its class. 

If the class of request (either mobile-optimized or full-Web) is
not unambiguously determined from the URI pattern, the proxy
MUST take into account the original user-agent to avoid 
unnecessary transformations."

Rationale: It is obviously pointless to transform full-Web 
content accessed by full-Web capable devices (or vice-versa, 
transforming mobile-optimized content for devices with mobile 
browsers). Two cases illustrate the situation.
a) When full-Web devices such as advanced HTC PDAs, iPhones 
or tablets access the Web, there is no guarantee that an 
established server will include a no-transform directive; in
fact, it might explicitly leave it out to allow transformation
to cater for non-full-Web capable devices. Further, the 
proposed heuristics will not work: the MIME types of returned
content will indicate full-Web content (e.g. text/html), as
well as the DOCTYPE (e.g. -//W3C//DTD HTML 4.01//EN).
b) When i-Mode terminals access i-Mode applications, there is
no guarantee that the corresponding servers return a no-transform
directive (since it is irrelevant for i-Mode applications). 
Heuristics may not work either, since content is largely returned
as text/html, and without any DOCTYPE declaration.
LC-2053 from casays <casays@yahoo.com>
4.1.5 Alteration of HTTP Header Values

Other than to comply with transparent HTTP operation, proxies should not modify request headers unless:

  1. the user would be prohibited from accessing content as a result of the server responding that the request is "unacceptable" (see 4.3.3 Server Rejection of HTTP Request);

  2. the user has specifically requested a restructured desktop experience;

  3. the request is part of a sequence of requests to the same Web site and either it is technically infeasible not to adjust the request because of earlier interaction, or because doing so preserves consistency of user experience.

These circumstances are detailed in the following sections.

Note:

In this section, the concept of "Web site" is used (rather than "origin server") as some origin servers host many different Web sites. Since the concept of "Web site" is not strictly defined, proxies should use heuristics including comparisons of domain name to assess whether resources form part of the same "Web site".

* Section 4.1.5.1 Proxies (and other clients) are allowed to and do  
reissue requests; by disallowing it, you're profiling HTTP, not  
providing guidelines.
LC-2074 from Mark Nottingham <mnot@mnot.net>
4.1.5.1 Content Tasting

The theoretical idempotency of GET requests is not always respected by servers. In order, as far as possible, to avoid mis-operation of such content, proxies should avoid issuing duplicate requests and specifically should not issue duplicate requests for comparison purposes.

* Section 4.1.5.2 Again, not specifying the heuristics is going to  
lead to differences in behaviour, which will cause content authors to  
have to account for this as well.

* Section 4.1.5.2 "A proxy must not re-issue a POST/PUT request..." Is  
this specific to POST and PUT, or all requests with bodies, or...?
LC-2075 from Mark Nottingham <mnot@mnot.net>
I don't understand the need for 4.1.5.2.  The second paragraph in
particular seems overly specific, as proxies should obviously not be
retrying POST requests unless an error - any error - was received.
PUT messages can be retried because they're idempotent.
LC-2037 from Mark Baker <distobj@acm.org>
4.1.5.2 Avoiding "Request Unacceptable" Responses

A proxy may reissue a request with altered HTTP header values if a previous request with unaltered values resulted in the origin server rejecting the request as "unacceptable" (see 4.3.3 Server Rejection of HTTP Request). A proxy may apply heuristics of various kinds to assess, in advance of sending unaltered header values, whether the request is likely to cause a "request unacceptable" response. If it determines that this is likely then it may alter header values without sending unaltered values in advance, providing that it subsequently assesses the response as described under 4.3.4 Receipt of Vary HTTP Header below, and is prepared to reissue the request with unaltered headers, and alter its subsequent behavior in respect of the Web site so that unaltered headers are sent.

A proxy must not re-issue a POST/PUT request with altered headers when the response to the unaltered POST/PUT request has HTTP status code 200 (in other words, it may only send the altered request for a POST/PUT request when the unaltered one resulted in an HTTP 406 response, and not a "request unacceptable" response).
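
One possible reading of the two paragraphs above, sketched as a decision function. The function name and the `looks_like_rejection` flag (standing for the outcome of whatever heuristic the proxy applies under 4.3.3) are assumptions made for this illustration, not part of the guidelines.

```python
def may_reissue_with_altered_headers(method: str, status: int,
                                     looks_like_rejection: bool = False) -> bool:
    """Decide whether a request may be re-sent with altered headers.

    status is the response status to the unaltered request;
    looks_like_rejection is the (hypothetical) result of a heuristic that
    judged a 200 response to be a "request unacceptable" page.
    """
    if status == 406:
        # An explicit rejection permits reissuing, for any method.
        return True
    if status == 200 and looks_like_rejection:
        # POST/PUT must not be re-issued once the server answered 200.
        return method.upper() not in ("POST", "PUT")
    return False
```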

4.1.5.3 User Selection of Restructured Experience

Proxies may offer users an option to choose to view a restructured experience even when a Web site offers a choice of user experience. If a user has made such a choice then proxies may alter header values when requesting resources in order to reflect that choice, but must, on receipt of an indication from a Web site that it offers alternative representations (see 4.2.3.2 Indication of Intended Presentation Media Type of Representation), inform the user of that and allow them to select an alternative representation.

Proxies should assume that by default users will wish to receive a representation prepared by the Web site. Proxies must assess whether a user's expressed preference for a restructured representation is still valid if a Web site changes its choice of representations (see 4.3.4 Receipt of Vary HTTP Header).

From 4.1.5.4, "When requesting resources that form part of the
representation of a resource (e.g. style sheets, images), proxies
should make the request for such resources with the same headers as
the request for the resource from which they are referenced.".  Why?
There may be lots of reasons for using different headers on these
requests.  For example, I'd expect the Accept header to be different
for a stylesheet than for an image.  What are you trying to accomplish
with this restriction?
LC-2039 from Mark Baker <distobj@acm.org>
* Section 4.1.5.4 Use of the term 'representation' is confusing here;  
please pick another one.

* Section 4.1.5.4 Using the same headers is often not a good idea.  
More specific, per-header advice would be more helpful.
LC-2076 from Mark Nottingham <mnot@mnot.net>
4.1.5.4 Sequence of Requests

When requesting resources that form part of the representation of a resource (e.g. style sheets, images), proxies should make the request for such resources with the same headers as the request for the resource from which they are referenced.

For the purpose of consistency of representation, proxies may request linked resources (e.g. those referenced using the a element) that form part of the same Web site as a previously requested resource with the same headers as the resource from which they are referenced.

When requesting linked resources that do not form part of the same Web site as the resource from which they are linked, proxies should not base their choice of headers on a consistency of presentation premise.

Original headers MUST not be changed (User-Agent string has a special
place, but also the UAProf x-wap-profile is very very relevant).
LC-2006 from EdPimentl <edpimentl@gmail.com>
4.1.5.5 defines a protocol.  This should be in an Internet Draft, not
in a guidelines document.
LC-2040 from Mark Baker <distobj@acm.org>
2) Section 4.1.5.5

Statement to be inserted:

"Except when explicitly provided for by RFC2616 to comply
with HTTP operations, a proxy MUST NOT delete HTTP header 
fields received upstream from the client or downstream
from the server."

Rationale: deployed transcoders have been known to filter
out entire HTTP fields, preventing servers from performing
adequate content delivery. In some environments, this 
behaviour seems to have affected x-wap-profile in particular.
The statement makes it clear that deleting HTTP header fields
is in violation of the Web standards.
LC-2046 from casays <casays@yahoo.com>
- original headers MUST not be changed (User-Agent string has a special 
place, but also the UAProf x-wap-profile is very very relevant). This makes
it unnecessary to explain how original header values are recast to
different headers (this is not supposed to happen in any case). In short,
4.1.5.5 should be removed.
LC-1997 from Luca Passani <passani@eunet.no>
4.1.5.5 Since User-Agent has been the topic of some controversy in
comments, just wanted to voice support for the recommendation as
written here. While it is vital to preserve information about the
mobile device, this does not imply that User-Agent cannot be changed
if that information is otherwise preserved. Preserving the User-Agent
through a transforming proxy is misleading; the request is *not*
coming from a mobile device, but through a proxy. The origin server
should be aware of this.
LC-2014 from Sean Owen <srowen@google.com>
* Section 4.1.5.5 This is specifying new protocol elements; this is  
becoming a protocol, not guidelines.
LC-2077 from Mark Nottingham <mnot@mnot.net>
4.1.5.5 Original Headers

When forwarding an HTTP request with altered HTTP headers, proxies must include in the request copies of the unaltered header values in the form "X-Device-"<original header name>. For example, if the User-Agent header has been altered, an X-Device-User-Agent header must be added with the value of the received User-Agent header.

Note:

The X-Device- prefix was chosen primarily on the basis that this is an already existing convention. It is noted that the values encoded in such headers may not ultimately derive from a device; they are merely received headers. The treatment of received X-Device headers, which may happen where there are multiple transforming proxies, is undefined (see D Scope for Future Work).
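
A minimal sketch of the X-Device- rule, assuming headers are held in a plain dictionary with consistently cased names (real HTTP header names are case-insensitive). The function name and the sample User-Agent strings are hypothetical.

```python
from typing import Dict


def forward_headers(received: Dict[str, str],
                    altered: Dict[str, str]) -> Dict[str, str]:
    """Apply altered values and keep a copy of each original value.

    For every header whose value actually changes, the received value is
    preserved under "X-Device-<original header name>".
    """
    outgoing = dict(received)
    for name, new_value in altered.items():
        if name in received and received[name] != new_value:
            outgoing["X-Device-" + name] = received[name]
        outgoing[name] = new_value
    return outgoing


out = forward_headers(
    {"User-Agent": "PhoneBrowser/1.0", "Accept": "text/html"},
    {"User-Agent": "DesktopBrowser/5.0"},
)
```

Here `out` carries the altered User-Agent together with an X-Device-User-Agent header holding the received value; the untouched Accept header gets no X-Device- copy.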

4.1.6 Additional HTTP Headers

Irrespective of the presence of a no-transform directive:

  • proxies should add the IP address of the initiator of the request to the end of a comma separated list in an X-Forwarded-For HTTP header;

  • proxies must include a Via HTTP header (see 4.1.6.1 Proxy Treatment of Via Header).

* Section 4.1.6.1 When a proxy inserts the URI to make a claim of  
conformance, exactly what are they claiming -- all must-level  
requirements are met? Should-level? What is the use case for this  
information?
LC-2078 from Mark Nottingham <mnot@mnot.net>
4.1.6.1 Proxy Treatment of Via Header

Proxies must (in accordance with compliance to RFC 2616) include a Via HTTP header indicating their presence and should indicate their conformance to this Recommendation by including a comment in the Via HTTP header consisting of the URI "http://www.w3.org/ns/ct".

When forwarding Via headers proxies should not alter them in any way.

Note:

According to [RFC 2616 HTTP] Section 14.45, Via header comments "may be removed by any recipient prior to forwarding the message". However, the justification for removing such comments is based on memory limitations of early proxies; most modern proxies do not suffer from such limitations.
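
The header additions described in 4.1.6 and 4.1.6.1 might look like the following sketch. It assumes headers are held in a dictionary; the function names, the proxy host name and the IP address are placeholders chosen for the example.

```python
from typing import Dict

CT_CONFORMANCE_URI = "http://www.w3.org/ns/ct"  # URI named in 4.1.6.1


def _append(headers: Dict[str, str], name: str, item: str) -> None:
    """Append item to a comma-separated header, creating it if absent."""
    existing = headers.get(name, "").strip()
    headers[name] = f"{existing}, {item}" if existing else item


def add_proxy_headers(received: Dict[str, str], client_ip: str,
                      proxy_name: str, protocol: str = "1.1") -> Dict[str, str]:
    """Add X-Forwarded-For and Via entries without touching earlier ones."""
    outgoing = dict(received)  # received Via entries are kept unaltered
    _append(outgoing, "X-Forwarded-For", client_ip)
    _append(outgoing, "Via", f"{protocol} {proxy_name} ({CT_CONFORMANCE_URI})")
    return outgoing


out = add_proxy_headers({"Via": "1.0 gw.example.net"},
                        "192.0.2.1", "ct.example.org")
```

The conformance URI travels as a parenthesised comment after the proxy's own Via entry, and the entry received from the earlier hop is left as it was.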

The use of MUST in the CTG when referring to the role of the server
should not be allowed, since irresponsible transcoding companies will use
this to disrupt service, destroy the user experience and set us back many
years.
We can accept RECOMMENDED, and only RECOMMENDED.
LC-2007 from EdPimentl <edpimentl@gmail.com>
4.2 Server Response to Proxy

* Section 4.2.1 Requiring servers to respond with 406 is profiling  
HTTP; HTTP currently allows the server to send a 'default'  
representation even when the headers say that the client doesn't  
prefer it.
LC-2079 from Mark Nottingham <mnot@mnot.net>
4.2.1 Use of HTTP 406 Status

Servers should respond with an HTTP 406 Status (and not an HTTP 200 Status) if a request cannot be satisfied with content that meets the criteria specified by values of the HTTP request headers.

* Section 4.2.2 "Servers must include a Cache-Control: no-transform  
directive if one is received in the HTTP request." Why?
LC-2080 from Mark Nottingham <mnot@mnot.net>
4.2.2 "Servers must include a Cache-Control: no-transform directive if
one is received in the HTTP request."  Why?  What does the
transformability of a request body have to do with the
transformability of the associated response body?
LC-2041 from Mark Baker <distobj@acm.org>
4.2.2 Server Origination of Cache-Control: no-transform

Servers must include a Cache-Control: no-transform directive if one is received in the HTTP request.

Servers should include a Cache-Control: no-transform directive if, for any reason, they wish to inhibit transformation of the response.

Note:

Including a Cache-Control: no-transform directive can disrupt the behavior of WAP/WML proxies, because it can inhibit such proxies from converting WML to WMLC.

4.2.3 Varying Representations

Servers should take account of user agent capabilities and formulate an appropriate experience according to those capabilities. Servers should provide a means for users to select among available representations, should default to the last selected representation and should provide a means of changing the selection.

* Section 4.2.3.1 "Servers may base their actions on knowledge... but  
should not choose an Internet content type for a response based on an  
assumption or heuristics about behaviour of any intermediaries." Why not?
LC-2081 from Mark Nottingham <mnot@mnot.net>
4.2.3.1

If a server varies its representation according to examination of received
HTTP headers then it must include a Vary HTTP header indicating this to be
the case. If, in addition to, or instead of HTTP headers, a server varies
its representation based on other factors (e.g. source IP Address) then it
must, in accordance with [RFC 2616
HTTP]<http://www.w3.org/TR/2008/WD-ct-guidelines-20080801/#ref-HTTP>,
include a Vary header containing the value '*'.

What should the Vary HTTP header contain when a server varies its
representation according to examination of received HTTP headers?

Best Regards
LC-2008 from JOSE MANUEL CANTERA FONSECA <jmcf@tid.es>
4.2.3.1 Use of Vary HTTP Header

If a server varies its representation according to examination of received HTTP headers then it must include a Vary HTTP header indicating this to be the case. If, in addition to, or instead of HTTP headers, a server varies its representation based on other factors (e.g. source IP Address) then it must, in accordance with [RFC 2616 HTTP], include a Vary header containing the value '*'.

Servers may base their actions on knowledge of behavior of specific transforming proxies, as identified in a Via header, but should not choose an Internet content type for a response based on an assumption or heuristics about behavior of any intermediaries. (e.g. a server should not choose Content-Type: application/vnd.wap.xhtml+xml solely on the basis that it suspects that proxies will not transform content of this type).
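
As an illustration of the first paragraph, a server could derive its Vary value along these lines. Listing the names of the examined request headers follows the general definition of Vary in RFC 2616; the function and parameter names are assumptions made for the sketch.

```python
from typing import Iterable, Optional


def vary_header_value(headers_examined: Iterable[str],
                      used_other_factors: bool) -> Optional[str]:
    """Return the Vary value for a response, or None if no Vary is needed.

    headers_examined: request header names the server looked at when
    choosing the representation.
    used_other_factors: True if anything else (e.g. source IP address)
    influenced the choice.
    """
    if used_other_factors:
        return "*"
    names = list(dict.fromkeys(headers_examined))  # de-duplicate, keep order
    return ", ".join(names) if names else None
```

For a server that selects content by User-Agent and Accept only, this yields `User-Agent, Accept`.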

4.2.3.2

" In HTML content it should indicate the medium for which the
representation is intended by including a link element identifying in its
media attribute the target presentation media types of this representation
and setting the href attribute to a valid local reference (i.e. use the
fragment identifier (see [RFC
3986]<http://www.w3.org/TR/2008/WD-ct-guidelines-20080801/#ref-rfc-3986>
section 3.5<http://www.tools.ietf.org/html/rfc3986.html#section-3.5>) added
to the URI of the document being served to point to a valid target within
the document)."

Why does it have to be a fragment identifier within the page? If you do so,
strictly speaking you are saying that a specific fragment of the current
page is an alternative representation for the media handheld for the
current page. That's not true, as the whole page is such a representation.

Proposed Amendment:


As per RFC 3986 section 4.4 [Amended: was RFC 1808 initially] an empty
relative URI href="" resolves to the complete base URL, so it is suggested to
use this mechanism to point to the current resource:

<link rel="alternate" media="handheld" type="text/html" href="" />

(another option is to suggest the usage of the URI that points to the
current resource. )

Best Regards
LC-2009 from JOSE MANUEL CANTERA FONSECA <jmcf@tid.es>
4.2.3.2

Note:

"The presence of link elements which do not contain a valid local reference
does not indicate one way or another whether this representation is
formatted for the presentation media types listed."
This note is useless, it says nothing.

Proposed Amendment:

As per my previous comments on this, the note should be dropped.

Best Regards
LC-2011 from JOSE MANUEL CANTERA FONSECA <jmcf@tid.es>
4.2.3.2

"In addition it should include link elements identifying the target
presentation media types of other available representations by setting the
media attribute to indicate those representations and the href attribute to
a URI without a fragment identifier."

This is totally wrong as I may have other representations of the current
resource (for example in RDF or as text) in specific sections of my page
and in that case I could use a fragment identifier.

Proposed Amendment: Avoid the suggestions about fragment identifiers as
they are misleading

Best Regards
LC-2010 from JOSE MANUEL CANTERA FONSECA <jmcf@tid.es>
4.2.3.2 Indication of Intended Presentation Media Type of Representation

If a server has distinct representations that vary according to the target presentation media type, it should inhibit transformation of the response by including a Cache-Control: no-transform directive (see 4.2.2 Server Origination of Cache-Control: no-transform).

In HTML content it should indicate the medium for which the representation is intended by including a link element identifying in its media attribute the target presentation media types of this representation and setting the href attribute to a valid local reference (i.e. use the fragment identifier (see [RFC 3986] section 3.5) added to the URI of the document being served to point to a valid target within the document).

In addition it should include link elements identifying the target presentation media types of other available representations by setting the media attribute to indicate those representations and the href attribute to a URI without a fragment identifier.

Note:

The presence of link elements which do not contain a valid local reference does not indicate one way or another whether this representation is formatted for the presentation media types listed.

Note:

Some examples of the use of the link element are included below in B Example Transformation Interactions.
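
The distinction this section draws between a "valid local reference" (a fragment within the document being served) and a URI pointing to another representation can be checked roughly as follows. The function name and example URIs are hypothetical, and the sketch does not verify that the fragment names an existing target in the document.

```python
from urllib.parse import urldefrag, urljoin


def is_local_reference(document_uri: str, href: str) -> bool:
    """True if href resolves to document_uri plus a non-empty fragment."""
    resolved, fragment = urldefrag(urljoin(document_uri, href))
    base, _ = urldefrag(document_uri)
    return resolved == base and fragment != ""
```

Under this reading, `href="#top"` marks the current representation as the one intended for the listed media, whereas an absolute URI without a fragment identifies some other available representation.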

2) Section 4.3

Following item to be added to the guidelines:

A Content Transformation Proxy receiving a response that contains a
non-empty meta-tag "Copyright" MUST NOT restructure or recode the
content, nor dependent resources (such as pictures or videos, or other
textual content).

Rationale: The following content alterations may constitute willful
copyright violations:
a) Insertion of additional links, advertisements, navigation elements
(e.g. scroll bars), or extraneous content.
b) Modification of the look and feel (e.g. through reorganization of
the page, filtering out elements such as pictograms). 

The following alterations may also constitute defacement of trade
marks and other registered elements:
c) Changing the representation of trademarks, logos and other
registered design elements, as a change in the representation may
affect its legibility, the colours or the colour palette.

The following alterations may also constitute deactivation or
bypassing of IPR protections:
d) Re-encoding images or videos which embed steganographic IPR
protection information may eliminate or render ineffective these
mechanisms.

A copyright meta-tag, in the absence of any other indication, is
enough to signal that the content must not be transformed. This
meta-tag is widely used in the WWW, and its inclusion in content as a
meta-tag (invisible to end-users) is precisely intended to control
automatic processing in the delivery chain.
LC-2020 from casays <casays@yahoo.com>
4.3 Proxy Forwarding of Response to User Agent

1) Section 4.2.2:

Statement to be inserted:

"As per sections 13.5.2 and 14.9.5 of RFC2616, proxies MUST
NOT modify either the header fields Content-Encoding,
Content-Range and Content-Type, or the body originating
from a server that includes a no-transform directive in
its response. Proxies that do not follow this rule do not
conform to the HTTP protocol."

Rationale: there actually are transcoders which, despite 
no-transform directives, modify the body of the responses 
from server. The statement reminds the users of such proxies 
that they must be configured so as not to violate IETF 
standards.
LC-2045 from casays <casays@yahoo.com>
Consistently with my other comment that no extra content should be added 
to transcoded web sites, I think that this should apply even more 
strongly to mobile-optimised sites. Unfortunately, I see a lot of 
transcoder deployments where operators and/or transcoder vendors feel 
entitled to add advertisement and extra navigation bars to existing 
mobile optimised content. Because of this, I suggest the following 
addition as a note to "4.3.1":

"Note: It should be stressed that, in case of a |Cache-Control: 
no-transform| directive,  adding  any extra content (such as banners, 
navigation bars and links not available in the original application) is 
not admissible"

Thank you

Luca Passani
LC-2091 from Luca Passani <passani@eunet.no>
Section 4.3.6 mentions the possibility of also checking for the
existence of meta HTTP-Equiv directives in the HTML response in addition to
standard HTTP headers. However, this should be explicitly clarified and, if
so, should apply to any server-generated header.

Proposed Amendment:

Clarify explicitly whether proxies should check standard HTTP headers, meta
HTTP-Equiv headers, or both.

Best Regards
LC-2013 from JOSE MANUEL CANTERA FONSECA <jmcf@tid.es>
4.3.1 Receipt of Cache-Control: no-transform

If the response includes a Cache-Control: no-transform directive then proxies must not alter it other than to comply with transparent HTTP behavior and other than as follows.

If a proxy determines that a resource as currently represented is likely to cause serious mis-operation of the user agent then it may advise the user that this is the case and must provide the option for the user to continue with unaltered content.

* Section 4.3.2 Why can't proxies transform something that has already  
been transformed?
LC-2082 from Mark Nottingham <mnot@mnot.net>
4.3.2 "If the response includes a Warning: 214 Transformation Applied
HTTP header, proxies must not apply further transformation. "  Why?
The transformation indicated by the warning may have been the result
of a server-side transformation which a client-side proxy may deem
suboptimal, and so want to retransform.  I see no problem with that.
LC-2042 from Mark Baker <distobj@acm.org>
4.3.2 Receipt of Warning: 214 Transformation Applied

If the response includes a Warning: 214 Transformation Applied HTTP header, proxies must not apply further transformation.
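
Taken together, 4.3.1 and 4.3.2 amount to a gate in front of any transformation. The sketch below assumes response headers in a dictionary and uses simple comma splitting, which ignores the possibility of commas inside quoted warning text; the function name is hypothetical.

```python
from typing import Dict


def may_transform(response_headers: Dict[str, str]) -> bool:
    """Return False if the response must be passed through untransformed."""
    cache_control = response_headers.get("Cache-Control", "")
    directives = [d.strip().lower() for d in cache_control.split(",")]
    if "no-transform" in directives:
        return False  # 4.3.1: server has inhibited transformation

    warning = response_headers.get("Warning", "")
    for value in warning.split(","):
        # The warn-code is the first token of each warning value.
        if value.strip().split(" ", 1)[0] == "214":
            return False  # 4.3.2: a transformation was already applied
    return True
```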

* Section 4.3.3 Sniffing content for error messages is dangerous, and  
also unlikely to work. E.g., will you sniff for all languages and all  
possible phrases? How will you avoid false positives? Remove this  
section and require content providers to get it right. People may  
still do this in their products, but there's no reason to codify it.
LC-2083 from Mark Nottingham <mnot@mnot.net>
4.3.3 Server Rejection of HTTP Request

For compatibility with servers that do not implement this Recommendation (see 4.2.1 Use of HTTP 406 Status), a proxy may treat responses with an HTTP 200 Status as though they were responses with an HTTP 406 Status if it has determined that the content (e.g. "Your browser is not supported") is equivalent to a response with an HTTP 406 Status.

* Section 4.3.4 What's the purpose behind this behaviour?
LC-2084 from Mark Nottingham <mnot@mnot.net>
4.3.4 Receipt of Vary HTTP Header

If, in response to an HTTP request with altered headers that was not preceded by an HTTP request with unaltered headers, a proxy receives a response containing a Vary header referring to one of the altered headers then it should request the resource again with unaltered headers, it should update whatever heuristics it uses so that unaltered headers are presented first in subsequent requests for this resource and it should resume the behavior described under 4.1.5.2 Avoiding "Request Unacceptable" Responses to avoid rejection of subsequent requests.
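
The trigger condition in this paragraph can be expressed compactly. In this sketch, a Vary value of `*` is not treated as referring to an altered header, which is an interpretation rather than something the text states; the function name is hypothetical.

```python
from typing import Iterable


def should_retry_unaltered(vary_value: str,
                           altered_header_names: Iterable[str]) -> bool:
    """True if the Vary header names any header the proxy altered."""
    listed = {n.strip().lower() for n in vary_value.split(",") if n.strip()}
    altered = {n.lower() for n in altered_header_names}
    return bool(listed & altered)
```

When this returns True for a request that was sent with altered headers first, the proxy would request the resource again with the original headers and remember to send unaltered headers first for that resource in future.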

4.3.5 Link to "handheld" Representation

If the response is an HTML response and it contains a <link rel="alternate" media="handheld" /> element, the CT-proxy should request and process the referenced resource, unless the resource referenced is the current resource as determined by the presence of link elements as discussed under 4.2.3.2 Indication of Intended Presentation Media Type of Representation.
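
Locating such link elements could be done with the standard library HTML parser, as in this sketch. The class and function names are hypothetical; deciding whether a found href refers to the current resource is left to the caller.

```python
from html.parser import HTMLParser
from typing import List


class HandheldLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate" media="handheld">."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = {name: (value or "") for name, value in attrs}
        rels = a.get("rel", "").lower().split()
        media = [m.strip().lower() for m in a.get("media", "").split(",")]
        if "alternate" in rels and "handheld" in media and "href" in a:
            self.hrefs.append(a["href"])


def find_handheld_links(html: str) -> List[str]:
    finder = HandheldLinkFinder()
    finder.feed(html)
    return finder.hrefs
```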

3) Section 4.3.6

Under "Examples of mobile specific DOCTYPEs:", add:

	-//WAPFORUM//DTD WML 1.3//EN
	-//WAPFORUM//DTD WML 1.1//EN

Rationale: WML is still in use in the mobile Web. Responses of this
type are precisely the kind that should not be transformed, as WML is
intrinsically targeted at mobile devices only. WML can also be
delivered over HTTP, so the draft applies to this content as well.
LC-2021 from casays <casays@yahoo.com>
4) Section 4.3.6

The proposed heuristics seem to fail completely for i-mode sites, because:
a) i-Mode sites often do not have a distinctive URL distinguishing them
from (non-mobile) sites, and in any case the prefix imode.* is not
included in the list of URI patterns to check for.
b) i-Mode sites return mostly their content as text/html.
c) i-Mode sites do not include a DOCTYPE.
d) The markup for i-Mode does not cater for the utilization of the
link element as proposed in the draft, which is therefore not included
in i-Mode content.
e) i-Mode servers do not have much use for the "no-transform"
directive, and hence do not necessarily implement it.
LC-2022 from casays <casays@yahoo.com>
Hi, I think that CTG should mention the fact that, in case of 
transcoding, no extra content should be injected without the consent of 
the original content owner. The idea is to prevent W3C 
protocols/guidelines from implicitly endorsing attempts by those who 
manage the transcoder to monetize the effort/investment of other 
people. Of course, there is also the point that injecting extra content 
will invariably affect usability negatively and as such should be avoided.

I suggest the following addition:

"4.3.6.3 Injection of external content
In its effort to optimise the user experience of non-mobile optimised 
sites, a proxy *should not* inject extra content into the transcoded 
pages, where the term 'extra content' refers to text, links, banners  
and other multimedia content which is not available on the original 
untranscoded page. Addition of links aimed at implementing pagination 
and navigational shortcuts is admissible.

Note: For clarity, it is emphasised that W3C does not endorse injection 
of third-party content into a transcoded page without the explicit 
consent of the content owner"

Can this comment be added to the tracking system?

Thank you

Luca Passani
LC-2090 from Luca Passani <passani@eunet.no>
- the "application/xhtml+xml" MIME type should be the basis for a 
heuristic that informs transcoders that no transcoding must be applied. 
The rationale for this is obvious: this MIME type is being used for 
mobile content virtually exclusively these days
LC-1998 from Luca Passani <passani@eunet.no>
4) Section 4.3.6.

Complete the statement:

	"the URI of the response (following redirection 
	or as indicated by the Content-Location HTTP header) 
	indicates that the resource is intended for mobile 
	use (e.g. the domain is *.mobi, wap.*, m.*, mobile.*"

ADD: 	, pda.*, imode.*, iphone.*,

	"or the leading portion of the path is /m/ or /mobile/);"

Rationale: URLs with the patterns imode.* and pda.* have been in
use for many years, and unambiguously indicate sites that
are optimized for i-Mode devices or for PDAs (Palm, PPC, 
IEMobile). URLs of the form iphone.* have started to appear,
providing experiences specifically tailored to iPhones; they
do not need to be, and should not be, transformed either.
LC-2048 from casays <casays@yahoo.com>
- There should be restrictions over how short a  page transcoders are 
allowed to reformat.  In no case should a page smaller than 10kb be 
reformatted (ideally this threshold should be higher, but 10kb will make 
it consistent with BT, so it would be a step in the right direction)
LC-1999 from Luca Passani <passani@eunet.no>
- Navigation bars: this is something that I would like to introduce in 
the Manifesto too. In no event should a transcoder add extra footers or 
headers (logos, extra navbars, advertisement and similar) without the 
consent of the content owner.
LC-2000 from Luca Passani <passani@eunet.no>
- The list of "safe" URL patterns should be improved to support iphone.*
and  */iphone/
LC-2002 from Luca Passani <passani@eunet.no>
3) Section 4.3.6

The third bullet under "examples of heuristics" is to be split 
into two points:

"the Content-Type of the response is known to be specific to the 
device or class of device. At a minimum, the following MIME types
intended for mobile Web browsers MUST represent mobile-optimized 
content:

Browsing XHTML-related
	application/vnd.wap.xhtml+xml
	application/xhtml+xml
Browsing WML-related
	text/vnd.wap.wml
	application/vnd.wap.wmlc
	text/vnd.wap.wml+xml
	text/vnd.wap.wmlscript
	application/vnd.wap.wmlscriptc
	image/vnd.wap.wbmp
	application/vnd.wap.wbxml
Browsing and downloading
	application/vnd.wap.multipart.mixed
	application/vnd.wap.multipart.related
	application/vnd.wap.multipart.alternative
	application/vnd.wap.multipart.form-data

In addition, the following MIME types of the form */x-up-* 
SHOULD be considered as representing mobile-optimized 
content, at a minimum:

Legacy Openwave
	image/x-up-wpng
	image/x-up-bmp

The range of MIME types is intended to cover typical mobile
browsing applications.

Transformations specified by the relevant standards are allowed
(WAP-236 WAE specifications 19.12.2001, WAP-192 WBXML specifications 
25.7.2001, WAP-191 WML specifications 19.2.2000 and predecessors, 
WAP-193 WMLScript specifications 25.10.2000).

In accordance with Internet standards and practices, a proxy 
SHOULD determine whether a content is mobile-optimized FIRST by 
examining the HTTP header field content-type, before inspecting 
the XML declaration and its associated DOCTYPE."

Rationale: Inspection of the HTTP field Content-type is a
usual mode of operation amongst transcoders. It is also simpler
and safer than applying heuristics on DOCTYPE, because inspecting 
the content of a body requires one to deal with character encoding 
issues (see RFC3023, XML 1.1 sections 4.3.3 and E), or parsing 
multipart-structured content; these are unnecessary when handling 
HTTP fields. Finally, specifying a minimum set of required MIME 
types to take into account helps ensure that proxies will exhibit a
standard behaviour, and that non-textual content types for which 
there is no DOCTYPE (notably mobile-specific image formats) are 
properly dealt with. A normative document cannot leave full freedom 
to implementors to select whichever subset of content types are to 
be considered mobile-optimized or not.

4) Section 4.3.6

The second part of the bullet split as described in (b) is to contain
the following:

"other aspects of the response such as the DOCTYPE are known to be 
specific to the device or class of device.

At a minimum, the following DOCTYPEs MUST be considered as 
mobile-specific:

XHTML mobile profile
	-//OMA//DTD XHTML Mobile 1.2//EN
	-//WAPFORUM//DTD XHTML Mobile 1.1//EN 
	-//WAPFORUM//DTD XHTML Mobile 1.0//EN
XHTML basic
	-//W3C//DTD XHTML Basic 1.1//EN
	-//W3C//DTD XHTML Basic 1.0//EN
	-//OPENWAVE//DTD XHTML 1.0//EN
	-//OPENWAVE//DTD XHTML Mobile 1.0//EN
XHTML i-Mode
	-//i-mode group (ja)//DTD XHTML i-XHTML (Locale/Ver.=ja/1.0) 1.0//EN
	-//i-mode group (ja)//DTD XHTML i-XHTML (Locale/Ver.=ja/1.1) 1.0//EN
	-//i-mode group (ja)//DTD XHTML i-XHTML (Locale/Ver.=ja/2.0) 1.0//EN
	-//i-mode group (ja)//DTD XHTML i-XHTML (Locale/Ver.=ja/2.1) 1.0//EN
	-//i-mode group (ja)//DTD XHTML i-XHTML (Locale/Ver.=ja/2.2) 1.0//EN
	[list completed in
http://lists.w3.org/Archives/Public/public-bpwg-comments/2008JulSep/0150.html
with:
	-//i-mode group (ja)//DTD XHTML i-XHTML (Locale/Ver.=ja/2.3) 1.0//EN]
Compact HTML
	-//W3C//DTD Compact HTML 1.0 Draft//EN
	-//BBSW//DTD Compact HTML 2.0//EN

The following DOCTYPEs MUST be considered as mobile-specific. 
Transformations explicitly provided for by the relevant standards 
are allowed (WAP-192 WBXML specifications 25.7.2001, WAP-236 WAE 
specifications 19.12.2001, WAP-191 WML specifications 19.2.2000 
and predecessors, WAP-193 WMLScript specifications 25.10.2000).

WML
	-//WAPFORUM//DTD WML 1.0//EN
	-//WAPFORUM//DTD WML 1.1//EN
	-//WAPFORUM//DTD WML 1.2//EN
	-//WAPFORUM//DTD WML 1.3//EN
	-//WAPFORUM//DTD WML 2.0//EN
	-//PHONE.COM//DTD WML 1.1//EN
	-//OPENWAVE.COM//DTD WML 1.3//EN

The range of MIME types is intended to cover typical mobile
browsing applications."

Rationale: A normative document cannot leave full freedom to 
implementors to select whichever subset of DOCTYPEs are 
considered mobile-optimized or not. This helps ensure that
transformation proxies exhibit a standard behaviour.
LC-2052 from casays <casays@yahoo.com>
4.3.6 Proxy Decision to Transform

In the absence of a Vary or no-transform directive (or a meta HTTP-Equiv element containing Cache-Control: no-transform) proxies should apply heuristics to the response to determine whether it is appropriate to restructure or recode it (in the presence of such directives, heuristics should not be used.)

Examples of heuristics:

  • The Web site (see note) has previously shown that it is contextually aware, even if the present response does not indicate this;

  • a claim of mobileOK Basic [mobileOK Basic Tests] conformance is indicated;

  • the Content-Type or other aspects of the response (such as the DOCTYPE) are known to be specific to the device or class of device;

    Examples of mobile specific DOCTYPEs:

    -//OMA//DTD XHTML Mobile 1.2//EN
    -//WAPFORUM//DTD XHTML Mobile 1.1//EN 
    -//WAPFORUM//DTD XHTML Mobile 1.0//EN
    -//W3C//DTD XHTML Basic 1.1//EN
    -//W3C//DTD XHTML Basic 1.0//EN

  • the user agent has linearization or zoom capabilities or other features which allow it to present the content unaltered;

  • the URI of the response (following redirection or as indicated by the Content-Location HTTP header) indicates that the resource is intended for mobile use (e.g. the domain is *.mobi, wap.*, m.*, mobile.* or the leading portion of the path is /m/ or /mobile/);

  • the response contains client-side scripts that may mis-operate if the resource is restructured;

  • the response is an HTML response and it includes <link> elements specifying alternatives according to presentation media type.
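Two of the heuristics above (mobile-specific DOCTYPEs and URI patterns) lend themselves to a mechanical check. The following is a minimal sketch in Python, not a normative algorithm. It assumes that the proxy has the final response URI and the first part of the response body available as strings, and the function names are hypothetical. The DOCTYPE identifiers and URI patterns are those given as examples in this section.

```python
import re
from urllib.parse import urlsplit

# DOCTYPE public identifiers given as examples in section 4.3.6.
MOBILE_DOCTYPES = (
    "-//OMA//DTD XHTML Mobile 1.2//EN",
    "-//WAPFORUM//DTD XHTML Mobile 1.1//EN",
    "-//WAPFORUM//DTD XHTML Mobile 1.0//EN",
    "-//W3C//DTD XHTML Basic 1.1//EN",
    "-//W3C//DTD XHTML Basic 1.0//EN",
)

# Host patterns wap.*, m.*, mobile.* and *.mobi; path prefixes /m/ and /mobile/.
MOBILE_HOST = re.compile(r"^(wap|m|mobile)\.|\.mobi$", re.IGNORECASE)
MOBILE_PATH = re.compile(r"^/(m|mobile)/", re.IGNORECASE)


def uri_looks_mobile(uri: str) -> bool:
    """Apply the URI heuristic to the URI of the response."""
    parts = urlsplit(uri)
    host = parts.hostname or ""
    return bool(MOBILE_HOST.search(host) or MOBILE_PATH.match(parts.path))


def doctype_looks_mobile(body_prefix: str) -> bool:
    """Apply the DOCTYPE heuristic to the start of the response body."""
    return any(doctype in body_prefix for doctype in MOBILE_DOCTYPES)


def looks_mobile_specific(uri: str, body_prefix: str) -> bool:
    """True if either heuristic suggests the response should be left alone."""
    return uri_looks_mobile(uri) or doctype_looks_mobile(body_prefix)
```

A host such as mail.example.com or a path such as /music/ is not matched, because the patterns require the full label or path segment.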

5) Section 4.3.6.1

I miss any discussion or reference in the document about the issue of
character encodings.

Transforming content across different charsets is a mine-field and
affects a number of aspects:
a) Content may rely upon widely different character encodings,
depending on the targeted devices and markets. In particular, the
trio China - Japan - Korea (CJK) continues to rely on a number of
encodings (such as Shift_JIS, BIG5, etc) whose handling is a complex
matter; for instance, there are not necessarily bijective mappings
between these encodings and others, including UTF-8. 
b) Documents may have multi-encoding representations. Different
encodings may be associated with external entities through the charset
attribute (see HTML 4.01). How transformation proxies deal with such
a situation is left undefined.
c) Similarly, the draft does not explain what happens when a server
associates an attribute accept-charset to a form, and whether proxies
respect or manipulate such information.
d) In i-Mode, and at least in the Softbank environment (Japan),
unreserved character points in the character encoding space are used
to represent pictograms. Any attempt to convert these characters
directly will fail; they should therefore not be transformed, but
preserved, taking into account the fact that the character points thus
referred to differ between Unicode and Shift_JIS, and that DoCoMo and
Softbank do not use the same code points for the same pictograms.

A consequence of all this is that if a proxy does not operate natively
with the character encoding of the content returned by the server, or
is not able to ensure a bijective mapping between this encoding and
other encodings it deals with, recurrent and irrecoverable problems
will creep in.
A simple way that could go some way towards alleviating this risk
would be to forbid any transformation if the server announces (either
via the HTTP field Content-type: charset=..., the XML declaration, or
a meta-tag) an encoding different from ASCII or perhaps UTF-8.
LC-2023 from casays <casays@yahoo.com>
4.3.6.1 Alteration of Response

A proxy should strive for the best possible user experience that the user agent supports. It should only alter the format, layout, dimensions etc. to match the specific capabilities of the user agent. For example, images should only be reduced in size so that they are suitable for the specific user agent, and this should not be done on a generic basis.

If a proxy alters the response then:

  1. It must add a Warning 214 Transformation Applied HTTP header;

  2. The altered content should validate according to an appropriate published formal grammar;

  3. It should indicate to the user that the content has been transformed for mobile presentation and provide an option to view the original, unmodified content.
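Item 1 above can be illustrated with a small sketch in Python. It assumes that response headers are held as a list of (name, value) pairs and that the proxy identifies itself with the hypothetical warn-agent ct-proxy.example. Following section 14.46 of RFC 2616, the warning is not added again if a 214 code already appears in the response. The comma split used to detect an existing warning is deliberately simplistic and is not a full header parser.

```python
from typing import List, Tuple

Header = Tuple[str, str]


def add_transformation_warning(headers: List[Header],
                               agent: str = "ct-proxy.example") -> List[Header]:
    """Return a copy of headers carrying a 214 warning.

    The agent name is a hypothetical placeholder for the proxy's host
    name or pseudonym.
    """
    result = list(headers)
    for name, value in result:
        if name.lower() != "warning":
            continue
        # Simplistic detection of an existing 214 warn-code.
        if any(part.strip().startswith("214 ") for part in value.split(",")):
            return result
    result.append(("Warning", f'214 {agent} "Transformation applied"'))
    return result
```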

I am the founder of Goowallet, a mobile banking and payment private-label
service provider.

After reading the Last Call comments, we are very concerned that many of these
recommendations will seriously impact security, privacy and trust.

We are therefore 100% opposed to allowing this. Disrupting HTTPS the way
transcoders do today is probably illegal and certainly unethical. HTTPS is
built to guarantee end2end security.
Breaking end2end security is probably illegal.
Man-in-the-middle interference with HTTPS should not be permissible under
any circumstances.
Allowing an operator to attempt to dismantle
the security of the Internet in favor of transcoding will seriously,
significantly and negatively impact the banking and financial industry, as
well as data protection rules and regulations. If allowed, this will also
impact the national security of all law-abiding nations.
LC-2004 from EdPimentl <edpimentl@gmail.com>
6) Section 4.3.6.2

The possibility to break the end-to-end security of an HTTPS
connection is unacceptable and must be forbidden. This jeopardizes the
set-up of mobile e-commerce, which had difficulties to get established
in part because of the point-to-point, hop-wise secure connection with
WTLS, and makes a sham of security for other applications that require it.

Besides, there is no guarantee that transformations performed by a
proxy preserve the content being exchanged between client and server
to a point that does not further disturb the secure exchange. As an
example, there is no explicit prohibition in the draft against turning
POST requests into GET ones, the resizing of images may make visual
captchas unreadable, and reordering elements may make forms or
security information difficult to figure out at the client side.
LC-2024 from casays <casays@yahoo.com>
Dom,

thanks for your request for review.

With respect to the guidelines regarding the rewriting of HTTPS
URIs, we notice that any such rewriting will break any use of TLS
for authenticating the client to the server (e.g., use of TLS client
certificates). Similarly, any applications on top of HTTPS that rely
on TLS channel bindings would detect the proxy's intervention as an
attack, and lead to a broken user experience; see RFC 5056 for more
details about channel bindings.

We recommend that you discuss this aspect with the IETF TLS Working
Group.

Regards,
LC-2085 from Thomas Roessler <tlr@w3.org>
c) Similarly, the guidelines leave completely open the way of how
to "provide the option to avoid decryption...", and do not 
require it to be an OSTENSIBLE one. If the users miss the
alternate (or rather, the original) link, they may unwillingly
and unconsciously access the server without the expected
security.

As an example, a small icon (perhaps representing a key) in a
corner of the first page accessed via HTTPS, and linking to the
end-to-end HTTPS link, fully conforms to the guidelines. How 
many users would notice it and understand its significance?
LC-2028 from casays <casays@yahoo.com>
d) Informing the user that there are security implications in 
the way he chooses to access the server, and providing him with 
an alternative link to it risks causing the following reactions:
i. WWW-beginners may simply not bother reading the advice and 
always take the default action, which according to the
guidelines seems to correspond to taking the less safe,
point-to-point HTTPS connection.
ii. Somewhat WWW-knowledgeable users, aware of the existence of
Trojan horses and phishing, may reel at the invitation to try 
alternative links. If they are curious and examine the URI of
the current page, they may further suspect foul play, as the
rewritten URI may not match the one they accessed originally.
iii. Expert WWW-users will understand the implications of the
proxy set-up, but may be wary at using its services for HTTPS
links -- after all, what is the guarantee that the proxy will
not misuse or unintentionally disclose private information in
a point-to-point connection? And if there is a proxy acting as
middle-man, what is the guarantee that the end-to-end HTTPS
link is actually an end-to-end one and the proxy is not just
performing some other tricky manipulations?

Overall, fiddling with HTTPS connections risks reducing, rather
than increasing, the willingness of end-users to access the
mobile Web. A relevant point is that these end-users may 
actually assign the fault with the untrustworthy connections
to the content or application provider, rather than to the 
operator of the proxy.
LC-2029 from casays <casays@yahoo.com>
e) The guidelines allow the client to go through a first
point-to-point session establishment with the proxy, and if so
desired, through a second end-to-end session establishment with the
server. 

Establishing an HTTPS connection is a somewhat heavy process for
wireless devices, requiring the delivery and possibly acceptance of
certificates. A double initiation procedure reduces, rather than
increases, usability at session start.
LC-2030 from casays <casays@yahoo.com>
4.3.6.2 I think the Note here is a good one, but may be worth
expanding, since it is apparently already unclear to some how HTTPS
works here. The very purpose of HTTPS is to ensure that content is not
modified or read by third parties in transit, which means a
transforming proxy cannot jump into an HTTPS conversation between
mobile device and origin server. So there's not actually a question of
whether it's illegal or unethical -- it's simply not possible (unless
you have cracked SSL). It can only create a secure connection between
the mobile device and itself, and between itself and the origin
server. This is indeed a situation that the end user needs to
understand:

I suggest wording along these lines, take it or leave it as you see fit --

URIs which begin with the https scheme, when accessed, are secured
against eavesdropping and modification by third parties by the SSL
protocol. It is therefore not possible for a third-party transforming
proxy to participate directly in such a connection between mobile
device and origin server. Transforming proxies may still transform
content of https resources, but at best, it involves creating a
separate secure connection between device and proxy, and between proxy
and origin server. These communications are secure but the secured
content is of course visible to the transforming proxy. This may of
course be undesirable to an end user.

Therefore if a proxy rewrites https links, replacements links MUST at
least use the https scheme as well, and the proxy MUST use https to
communicate with the origin server. In addition the proxy MUST clearly
advise the user that the potentially sensitive contents of the
communication will be visible to the proxy, and must give the user an
option to opt out.
LC-2015 from Sean Owen <srowen@google.com>
f) In the absence of any requirements regarding the reliability
of proxies and their operating environment, one can only 
wonder why anybody would choose a point-to-point connection
through an uncontrollable middle-man over an end-to-end one
if the intent is to access private information safely over 
the mobile Web. The experience of WTLS taught some hard lessons 
there.
LC-2031 from casays <casays@yahoo.com>
Having looked at the conversation you are having here, I think there is 
conflicting information about how HTTPS is handled by transcoding 
servers. I understand that not all transcoders work the same, but some 
do perform a man-in-the-middle-attack, and IMO this should not be 
endorsed by the W3C guidelines.

The way many transcoders work is that they run instances of real web 
browsers (talking about tens or hundreds of Internet Explorer instances 
running in the memory of the server here). This means that there is no 
way for content owners to protect against transcoders simply because the 
server is talking to a legitimate web browser, exchanging real 
certificates, logging-in with real passwords, establishing secure SSL 
connections and all the rest.

The point of the Content Transformation Guidelines seems to be "some users
may want to continue using the service at the cost of degrading 
security". Well, this is not up to the user to decide, I am afraid. 
HTTPS is also about non-repudiation and the fact that users must not be
able to say "I did not do it" at a later stage. The fact that 
transcoders have found a technical way to by-pass HTTPS security does not
mean that they have the right to do it. Nor does it mean that 
end-users can take advantage of it.

Luca
LC-2016 from Luca Passani <passani@eunet.no>
g) The guidelines rely upon a fundamentally flawed assumption:
in the HTTPS connection, the client is the only party concerned 
with security, and which must take a decision as to whether to 
access resources over a point-to-point or end-to-end link. 

This is incorrect: there are actually two parties to the secure
connection, client and server, both with legitimate security 
concerns. The server has thus as much a right to determine whether 
it wants to provide services over a point-to-point connection 
as the client. I can very well imagine that for instance 
banking, electronic commerce or social networking application
servers may decide to sever point-to-point connections rather
than providing services over them, and inform the end-user 
about the reasons.

Unfortunately, because of the flawed assumption of the guidelines,
there is strictly no way a server may reliably detect whether
it is communicating over a point-to-point link or not. 

Consider:
i. The proxy rewrites links but the replacement links must have
HTTPS; hence for the server communication obviously takes place 
over HTTPS.
ii. If the proxy preserves the HTTP header fields (such as 
user-agent, accept, accept-charset, etc), which is actually
encouraged by the guidelines, then the server cannot detect
that transformations may be taking place.
iii. Further, the "via" HTTP header field does not constitute
a proper mechanism to detect the presence of a transformation
proxy, and whether HTTPS is point-to-point or end-to-end.
First, the comment "http://www.w3.org/ns/ct" indicating the
presence of a transformation proxy is not mandatory, as per
the guidelines. Secondly, RFC2616 authorizes proxies to use 
a pseudonym instead of a domain name for the "received-by" 
part of their hop, which does not necessarily have a meaning 
for servers.

The server is therefore not in a position to take educated
decisions as to its secure communications with clients through
a transformation proxy.
LC-2032 from casays <casays@yahoo.com>
Overall, the guidelines follow the rule that accessing the
WWW is the prime intent of the end-user, and that security
comes only second. Hence the approach of defaulting to the
transformation chain, with the possibility of opting out of
it. This is a questionable assumption precisely in the
context of secure transactions. There, secure access is the
paramount requirement, and must therefore be fulfilled by
any proxy set-up, with the possibility to opt-in to the
unsecure transformation chain.
LC-2033 from casays <casays@yahoo.com>
- Messing with HTTPS should not be permissible under any circumstances. 
Disrupting HTTPS the way transcoders do today is probably illegal and 
certainly unethical. HTTPS is built to guarantee end2end security. 
Breaking end2end security is probably illegal and certainly not an 
activity that W3C should endorse in any way.
LC-2001 from Luca Passani <passani@eunet.no>
a) The guidelines state: "[...] and must provide the
option to avoid decryption and transformation of the 
resources the links refer to."

This stipulation theoretically allows manipulations of the
HTTPS stream that are not strictly related to decryption and 
transformation of the content. 

What is required is that the client may establish an HTTPS 
connection with the server in the exact, undisturbed context
as if the proxy were a transparent one, performing no
transformations whatsoever.
LC-2026 from casays <casays@yahoo.com>
b) The guidelines do not state that the users "must be advised
of the security implications of rewriting HTTPS links" BEFORE
they have a chance to perform any operation with the target site.
If the advice takes place after an operation, then users may 
unknowingly access the server through the point-to-point HTTPS
connection instead of the end-to-end one.

As an example, a small icon (perhaps representing a question
mark) in a corner of the first page accessed via HTTPS, and
pointing to a description of the consequences of the rewritten
HTTPS links, fully conforms to the guidelines. How many users
would notice it? How many would click on it, take the time to
read its content fully (and understand it), before performing
any further action?
LC-2027 from casays <casays@yahoo.com>
4.3.6.2 HTTPS Link Re-writing

If the response contains links whose URIs have the scheme https, proxies may rewrite them (so that they can transform the content of the linked resources) only if the following provision is met. If a proxy does rewrite such links, it must advise the user of the security implications of doing so and must provide the option to avoid decryption and transformation of the resources the links refer to.

If a proxy re-writes HTTPS links, replacement links must have the scheme https.

Note:

For clarity it is emphasized that it is not possible for a transforming proxy to transform content accessed via an HTTPS link without breaking end to end security.
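The two provisions above (replacement links keep the https scheme, and the user can avoid decryption and transformation) can be sketched as follows in Python. This is an illustration under stated assumptions, not a recommended design: the proxy endpoint https://ct-proxy.example/fetch?u= and the user_accepted_rewriting flag are hypothetical, and how the user is advised and how the choice is recorded is left open here, as it is in this section.

```python
from urllib.parse import quote, urlsplit

# Hypothetical proxy endpoint; it uses the https scheme itself, so that
# replacement links have the scheme https.
PROXY_PREFIX = "https://ct-proxy.example/fetch?u="


def rewrite_https_link(href: str, user_accepted_rewriting: bool) -> str:
    """Return the link to place in the transformed page."""
    if urlsplit(href).scheme.lower() != "https":
        return href  # only https links are considered in this sketch
    if not user_accepted_rewriting:
        return href  # opt-out: keep the original end-to-end link
    return PROXY_PREFIX + quote(href, safe="")
```

When the user has not accepted rewriting, the original link is left untouched, so the connection is made to the origin server without decryption by the proxy.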

5 Testing (Normative)

Operators of transforming proxies should make available interfaces that facilitate testing of Web sites accessed through them and should make such interfaces available through normal Internet access paths.

2) Sections A and D

Since 2005, the Open Mobile Alliance has been working on a 
Standard Transcoding Interface, and has published specifications
for it. The usage scenario is different: the STI is meant for 
servers offering transformation services on demand via a Web
services interface, whereas the usage scenario of the CTG is 
for proxies that intercept all HTTP flows between clients and 
servers. However, there are several aspects that may overlap 
-- in the requirements or the definition of the acceptable 
limits during transcoding (e.g. content size). A reference 
to this standard, and a discussion of the relation between 
the CTG and the OMA specification is in order.
LC-2051 from casays <casays@yahoo.com>
A References

CT Landscape
Content Transformation Landscape 1.0, Jo Rabin, Andrew Swainston (eds), W3C Working Draft 25 October 2007 (See http://www.w3.org/TR/ct-landscape/)
RFC 2119
Key words for use in RFCs to Indicate Requirement Levels, Request for Comments: 2119, S. Bradner, March 1997 (See http://www.ietf.org/rfc/rfc2119.txt)
RFC 2616 HTTP
Hypertext Transfer Protocol -- HTTP/1.1 Request for Comments: 2616, R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee, June 1999 (See http://tools.ietf.org/html/rfc2616)
RFC 3986
Uniform Resource Identifier (URI): Generic Syntax, Request for Comments: 3986, T. Berners-Lee, R. Fielding, L. Masinter, January 2005 (See http://tools.ietf.org/html/rfc3986)
Device Independence Glossary
W3C Glossary of Terms for Device Independence, Rhys Lewis (ed), W3C Working Draft 18 January 2005
Best Practices
Mobile Web Best Practices 1.0 Basic Guidelines, Jo Rabin, Charles McCathieNevile (eds), W3C Recommendation, 29 July 2008 (See http://www.w3.org/TR/2008/REC-mobile-bp-20080729/)
mobileOK Basic Tests
W3C mobileOK Basic Tests, Sean Owen, Jo Rabin (eds), W3C Working Draft, 10 June 2008 (See http://www.w3.org/TR/mobileOK-basic10-tests/)
XHR
The XMLHttpRequest Object, Anne van Kesteren (ed), W3C Working Draft, 15 April 2008 (See http://www.w3.org/TR/XMLHttpRequest/)

B Example Transformation Interactions (Non-Normative)

Note:

The following examples refer to requests with the GET method.

B.1 Basic Content Tasting by Proxy

Request resource with original headers

If the response is a 406 response:

If the response contains Cache-Control: no-transform, forward it

Otherwise re-request with altered headers

If the response is a 200 response:

If the response contains Vary: User-Agent, an appropriate link element or header, or Cache-Control: no-transform, forward it

Otherwise assess whether the 200 response is a form of "Request Unacceptable"

If it is not, forward it

If it is, re-request with altered headers
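The decision flow of B.1 can be sketched as follows in Python. The Response type, the fetch callable (which issues the request with the given headers) and the looks_unacceptable callable (which judges whether a 200 response is a form of "Request Unacceptable") are hypothetical names introduced for illustration. Status codes other than 200 and 406 are not covered by B.1 and are simply forwarded in this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Response:
    status: int
    headers: Dict[str, str] = field(default_factory=dict)  # lower-case names
    body: str = ""
    has_handheld_link: bool = False  # an appropriate link element or header


def has_no_transform(resp: Response) -> bool:
    return "no-transform" in resp.headers.get("cache-control", "").lower()


def varies_on_user_agent(resp: Response) -> bool:
    return "user-agent" in resp.headers.get("vary", "").lower()


def taste(fetch: Callable[[Dict[str, str]], Response],
          original: Dict[str, str],
          altered: Dict[str, str],
          looks_unacceptable: Callable[[Response], bool]) -> Response:
    """Return the response that the proxy goes on to process or forward."""
    resp = fetch(original)  # request resource with original headers
    if resp.status == 406:
        return resp if has_no_transform(resp) else fetch(altered)
    if resp.status == 200:
        if (varies_on_user_agent(resp) or resp.has_handheld_link
                or has_no_transform(resp)):
            return resp
        return fetch(altered) if looks_unacceptable(resp) else resp
    return resp  # not covered by B.1; forwarded unchanged in this sketch
```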

B.2 Optimization based on Previous Server Interaction

Proxy receives a request for resource P that it has not encountered before

Proxy forwards this request

Response is 200 OK containing the text "Unsupported browser. Please get a different one or use a CT proxy."

Proxy determines that this equates to a 406 Status and re-requests the resource from the origin server with altered headers (emulating a well known desktop browser)

Response is a desktop oriented representation of the resource

Proxy transforms this response into content that the user agent can display well and forwards it

Proxy receives a further request for the resource P

Based on evidence from the previous interaction (e.g. that there was no Vary header, that the response was not targeted at only the previous user in that there was no Cache-Control: private directive) the CT proxy forwards the request with altered headers

Response is a desktop oriented representation of the resource

Proxy transforms this response into content that the user agent can display well and forwards it

B.3 Optimization based on Previous Server Interaction, Server has Changed its Operation

Proxy receives a request for resource P, that it has previously encountered as in B.2 Optimization based on Previous Server Interaction

Proxy forwards request with altered headers

Response is 200 OK containing a Vary: User-Agent header

Proxy notices that behavior has changed and re-issues request with original headers

Response is 200 OK and proxy forwards it

B.4 Server Response Indicating that this Representation is Intended for the Target Device

Proxy receives a request for resource P

Proxy forwards request with original headers

Response is 200 OK with Vary: User-Agent and <link rel="alternate" media="handheld" href="P#id" /> where id is a document local reference

Proxy forwards response as designed specifically for the requesting device

B.5 Server Response Indicating that another Representation is Intended for the Target Device

Proxy receives a request for resource P

Proxy forwards request with original headers

Response is 200 OK with <link rel="alternate" media="handheld" href="Q" /> and Q is not P

Proxy requests Q with original headers

Response is 200 OK and proxy forwards it
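The distinction between B.4 and B.5 depends on whether the handheld link refers to the current resource or to a different one. The following Python sketch shows one way a proxy might make that distinction, assuming the link uses rel="alternate" and media="handheld" as in the normative text, and assuming that a reference is "document local" when it differs from the current URI only by its fragment. The class and function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple
from urllib.parse import urldefrag, urljoin


class HandheldLinkFinder(HTMLParser):
    """Collect href values of link elements for the handheld media type."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = {name: (value or "") for name, value in attrs}
        rels = attributes.get("rel", "").lower().split()
        media = [m.strip() for m in attributes.get("media", "").lower().split(",")]
        if "alternate" in rels and "handheld" in media and "href" in attributes:
            self.hrefs.append(attributes["href"])


def classify_handheld_link(current_uri: str, html: str) -> Tuple[str, Optional[str]]:
    """Return ("none", None), ("self", None) or ("other", absolute_uri)."""
    finder = HandheldLinkFinder()
    finder.feed(html)
    finder.close()
    if not finder.hrefs:
        return ("none", None)
    current = urldefrag(current_uri).url
    for href in finder.hrefs:
        target = urljoin(current_uri, href)
        if urldefrag(target).url == current:
            return ("self", None)  # B.4: forward the response as it is
    return ("other", urljoin(current_uri, finder.hrefs[0]))  # B.5: request it
```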

C Applicability to Transforming Solutions which are Out of Scope (Non-Normative)

There are a number of well-known examples of solutions that appear to their users to be browsers. However, because the client software communicates with an in-network component using proprietary protocols and techniques, it is the combination of the client and the in-network component that is regarded as the HTTP User Agent. The communication between the client and the in-network component is therefore out of scope of this document.

Additionally, where some kind of administrative arrangement exists between a transforming proxy and an origin server for the purposes of transforming content on the origin server's behalf, this is also out of scope of this document.

In both of the above cases, it is recommended that, when forwarding requests to origin servers, proxies adhere to the provisions of this document in respect of providing information about the device and the original IP address.

D Scope for Future Work (Non-Normative)

D.1 POWDER

The BPWG believes that POWDER will represent a powerful mechanism by which a server may express transformation preferences. Future work in this area may recommend the use of POWDER to provide a mechanism for origin servers to indicate more precisely which alternatives they have and what transformation they are willing to allow on them, and in addition to provide for Content Transformation proxies to indicate which services they are able to perform.

"D.2 link HTTP Header

The BPWG believes that the link HTTP header which was removed from 
recent drafts of HTTP, and which is under discussion for 
re-introduction, would represent a more general and flexible mechanism 
than use of the HTML link element, as discussed in this recommendation."

This is totally misleading.

The link header was removed in RFC2616 (RFC, not a draft), and that was 
in 1999 (so, not "recent").

BR, Julian
LC-1995 from Julian Reschke <julian.reschke@gmx.de>
D.2 link HTTP Header

The BPWG believes that the link HTTP header which was removed from recent drafts of HTTP, and which is under discussion for re-introduction, would represent a more general and flexible mechanism than use of the HTML link element, as discussed in this recommendation.

D.3 Sources of Device Information

The process of adapting content at the origin server, or transforming it in a proxy, is likely to depend on a repository of device descriptions. An origin server's willingness to allow a transforming proxy to transform content may depend on its evaluation of the trustworthiness of the device description data that is being used. There is scope for enhancement of the trust relationship by some means of indicating this.

3) Cascaded proxies.

a. Section 4.1.3

Statement to be inserted:

"Whenever the requester is another transformation proxy, the
receiving proxy MUST treat it as a non-browser agent. The
receiving proxy SHOULD rely upon the presence of alternative
X-Device- HTTP fields and the values in the via HTTP field as
per 4.1.6.1 to detect that it is placed downstream from a 
chain of proxies."



b. Section 4.3.2

Statement to be inserted:

"As per section 14.46 of RFC2616, 214 Transformation applied
MUST be added by an intermediate cache or proxy if it applies 
any transformation changing the content-coding (as specified 
in the Content-Encoding header) or media-type (as specified in 
the Content-Type header) of the response, or the entity-body 
of the response, unless this Warning code already appears in 
the response. 

A proxy receiving a 214 code MUST NOT change it."


c. Section D.4

Eliminate entirely.

Rationale: together with 4.3.1, 4.3.2, 4.1.2, 4.1.5.5, 4.1.6.1, 
4.3.6.1, these changed sections entirely solve the issue of cascading 
proxies in a standards-compliant way.
LC-2047 from casays <casays@yahoo.com>
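The Warning rule proposed for section 4.3.2 in the comment above might be sketched as follows. This is only an illustration of the commenter's proposal, not text from the guidelines: the function name, the `proxy_name` parameter, and the representation of headers as a list of `(name, value)` tuples are assumptions. The header value follows the RFC 2616 form of warn-code, warn-agent, and quoted warn-text.

```python
def add_transformation_warning(headers, proxy_name):
    """Return a copy of `headers` carrying a 214 Warning.

    `headers` is a list of (name, value) tuples (an assumption of this
    sketch). An existing 214 warning is left untouched, as the proposal
    requires; otherwise one is appended.
    """
    for name, value in headers:
        if name.lower() != "warning":
            continue
        # Simplification: split on commas, ignoring the possibility of a
        # comma inside a quoted warn-text.
        if any(part.strip().startswith("214 ") for part in value.split(",")):
            return list(headers)
    warning = '214 %s "Transformation applied"' % proxy_name
    return list(headers) + [("Warning", warning)]
```

A second proxy in a chain calling the same function would find the existing 214 code and return the headers unchanged.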
D.4 Inter Proxy Communication

There is scope for further work to define how multiple proxies may inter-operate. A common case of multiple proxies is where a network provider transforming proxy and a search engine transforming proxy are both present.

D.5 Amendment to and Refinement of HTTP

The BPWG believes that amendments to HTTP are needed to improve the interoperability of transforming proxies. For example, HTTP does not provide a way to distinguish between prohibition of any kind of transformation and the prohibition only of restructuring (and not recoding or compression).

At present HTTP does not provide a mechanism for communicating original header values (hence the use of X-Device- headers as discussed under 4.1.5 Alteration of HTTP Header Values).
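As a rough illustration of the X-Device- convention mentioned above, the following sketch copies each original request header value into a corresponding X-Device- header before overwriting it. The function name, the set of alterable headers, and the use of a plain dictionary with canonically capitalized keys are assumptions made for the example; 4.1.5 remains the authoritative description.

```python
# Request headers a proxy might alter; this set is an assumption of the sketch.
ALTERABLE = ("User-Agent", "Accept", "Accept-Charset",
             "Accept-Encoding", "Accept-Language")


def alter_headers(original, replacements):
    """Return a copy of `original` with `replacements` applied.

    Each value that is actually changed is preserved in an X-Device-
    header so that the origin server can still see what the device sent.
    """
    forwarded = dict(original)
    for name, new_value in replacements.items():
        if name not in ALTERABLE:
            raise ValueError("not an alterable header: " + name)
        old_value = original.get(name)
        if old_value is not None and old_value != new_value:
            # Keep any X-Device- value already present, which an upstream
            # proxy may have set (an assumption of this sketch).
            forwarded.setdefault("X-Device-" + name, old_value)
        forwarded[name] = new_value
    return forwarded
```

For example, replacing a User-Agent value of `PhoneBrowser/1.0` (a made-up string) leaves that value available to the origin server as `X-Device-User-Agent`.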

A number of mechanisms exist in HTTP which might be exploited given more precise definition of their operation - for example the OPTIONS method and the HTTP 300 (Multiple Choices) Status.

E Administrative Arrangements (Non-Normative)

It is noted that there are means which fall outside of the scope of this document for establishing user preferences with content transformation proxies. It is anticipated that proxies will maintain preferences on a user by user and Web site by Web site basis, and will change their behavior in the light of changing circumstances as discussed under 4.3.4 Receipt of Vary HTTP Header.

F Acknowledgments (Non-Normative)

The editor acknowledges contributions of various kinds from members of the Mobile Web Best Practices Working Group Content Transformation Task Force.

The editor acknowledges significant written contributions from:

Unbound annotations:

In general, I must state unequivocally that our experience with
current transformation proxies deployed throughout the world has
always been negative, since all proxies seem to transform original
mobile content regardless, with results ranging from passable to
outrageously unusable.  The draft, while an interesting attempt to
bring some order in the wild practices that abound in the mobile Web,
is still vague and incomplete in several points, and thus, in its
present form, may not stem some of the more egregious forms of
transcoding we have witnessed so far.
LC-2025 from casays <casays@yahoo.com>
Here's my comments.  In summary, the group really needs to decide
whether this is a guidelines document, or a protocol.  It can't be
both.  A lot of work remains.
LC-2043 from Mark Baker <distobj@acm.org>
To the W3C Mobile Web Best Practices Working Group:

The Internet Architecture Board has reviewed the subject document, and
notes that it has previously reviewed related work done in the IETF in
the Open Pluggable Edge Services (OPES) Working Group.  In its preview
and review of OPES work, the IAB expressed its concerns about privacy,
control, monitoring, and accountability of such services in RFC 3238
[ http://tools.ietf.org/html/rfc3238 ].

We have no specific architectural concerns with the "Content
Transformation Guidelines" document as written; it does seem to take
into account the questions raised during the OPES discussions.  We would
like, though, to make that explicit by specifically documenting that you
reviewed and considered the issues in RFC 3238.

Barry Leiba, for the Internet Architecture Board ( http://iab.org )
LC-2097 from Barry Leiba <leiba@watson.ibm.com>
My name is Dennis Bournique. I write about mobile browsing, primarily from
a user perspective, at http://wapreview.com.  I've done a little web
development, mostly mobile specific sites, but I'm by no means an expert on
the technical side of this issue.

Putting on my user hat, I'd like to make a request that the Content
Transformation Guidelines include a requirement that content transformation
proxies "must" provide end users (consumers of web content) with a way to
turn off transformation both globally and on a site by site basis.

As an end user, I’ve experienced both the joys and the frustrations of
using content transformation proxies.

In general, I believe in content transformation as a valuable tool to make
web content, which would otherwise be difficult or impossible to use,
available through the limited browsers of many mobile phones.

I have also been frustrated when a carrier or content provider unilaterally
imposes content transformation with no way for me to disable it. I've been
unable to access content through content transformation proxies that was
previously available on the same device using a direct connection. This has
happened both with installable content such as midlets and ringtones and
also with pure html and xhtml pages, including mobile optimized pages and
those that are not.  I have also seen my secure end to end HTTPS traffic
being forced through content transformation proxies, exposing it to the
potential for a "man in the middle" attack.

I understand that the Guidelines are intended to prevent these sorts of
problems by specifying when content transformation proxies must allow
content to flow directly between server and user agent without
modification. This is good, but no technical solution can ever be perfect. 
There will always be edge cases where content transformation does more harm
than good. For this reason it is important that end users have the option
to opt-out of content transformation.  

I propose that the Guidelines be amended to include the following or
similar language.

"...1. Content transformation proxies, if they are modifying traffic
between a server and a user agent in any way, MUST provide a mechanism
allowing the end user to resubmit the request and disable content
transformation for the duration of the current session."

"...2. Content transformation proxies, must provide a means for end users
of that proxy to disable all content transformation until they take
explicit action to re-enable it."
LC-2065 from Dennis Bournique <db@wapreview.com>
Old text:

Request resource with original headers

  If the response is a 406 response:

    If the response contains Cache-Control: no-transform, forward it

    Otherwise re-request with altered headers

  If the response is a 200 response:

    If the response contains Vary: User-Agent, an appropriate link element or header, or Cache-Control: no-transform, forward it

    Otherwise assess whether the 200 response is a form of "Request Unacceptable"

      If it is not, forward it

      If it is, re-request with altered headers

BUT WHERE IS THE TRANSCODING?

New Text:

Request resource with original headers

  If the response is a 406 response:

    If the response contains Cache-Control: no-transform, forward it

    Otherwise re-request with altered headers

  If the response is a 200 response:

    If the response contains Vary: User-Agent, an appropriate link element or header, or Cache-Control: no-transform, forward it

    Otherwise assess whether the 200 response is a form of "Request Unacceptable"

      If it is not, TRANSCODE it

      If it is, re-request with altered headers
LC-2089 from Heiko Gerlach <heiko.gerlach@vodafone.com>
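The "New Text" proposed in LC-2089 above reads as a small decision procedure, which might be sketched as follows. This is a minimal sketch of the commenter's proposal only: the function name, the returned action strings, and the two boolean parameters (standing in for "an appropriate link element or header" and for the "Request Unacceptable" assessment, neither of which the text specifies in detail) are assumptions.

```python
def decide(status, headers, has_link=False, looks_unacceptable=False):
    """Choose an action for the response to the first, unaltered request.

    Returns "forward", "re-request" (with altered headers) or "transcode".
    `headers` is a dict with canonically capitalized names (an assumption).
    """
    directives = headers.get("Cache-Control", "").lower().split(",")
    no_transform = "no-transform" in [d.strip() for d in directives]

    if status == 406:
        return "forward" if no_transform else "re-request"

    if status == 200:
        vary = [v.strip().lower() for v in headers.get("Vary", "").split(",")]
        if "user-agent" in vary or has_link or no_transform:
            return "forward"
        # Not explicitly protected: judge whether the 200 response is
        # really a "Request Unacceptable" page in disguise.
        return "re-request" if looks_unacceptable else "transcode"

    # Other status codes are not covered by the quoted text.
    raise ValueError("status not covered by this sketch: %r" % status)
```

Under the "Old text", the final branch would return "forward" instead of "transcode", which is the gap the commenter points out.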
A purely editorial note: your markup is reusing the ID sec-purpose 
(perhaps others) more than once. This makes the document invalid:

<a name="sec-purpose" id="sec-purpose">From the point of view of this 
document, Content Transformation is the
                     manipulation in various ways, by proxies, of 
requests made to and content
                     delivered by an origin server with a view to making 
it more suitable for mobile
                     presentation.</a></p><p><a name="sec-purpose" 
id="sec-purpose">The W3C Mobile Web Best Practices Working Group neither 
approves nor disapproves of Content Transformation, but
                     recognizes that is being deployed widely across 
mobile data access networks. The
                     deployments are widely divergent to each other, 
with many non-standard HTTP
                     implications, and no well-understood means either 
of identifying the presence of
                     such transforming proxies, nor of controlling their 
actions. This document
                     establishes a framework to allow that to 
happen.</a></p><p><a name="sec-purpose" id="sec-purpose">The overall 
objective of this document is to provide a means, as far as is
                     practical, for users to be provided with at least a 
</a><a 
href="http://www.w3.org/TR/di-gloss/#def-functional-user-experience">"functional

user experience"</a>
                     <a href="#ref-DIGLOSS">[Device Independence 
Glossary]</a> of the Web, when mobile, taking into account the
                     fact that an increasing number of content providers 
create experiences specially
                     tailored to the mobile context which they do not 
wish to be altered by third
                     parties. Equally it takes into account the fact 
that there remain a very large
                     number of Web sites that do not provide a 
<em>functional user
                     experience</em> when perceived on many mobile 
devices.</p>
LC-2064 from Elliotte Harold <elharo@metalab.unc.edu>