[AVT] Comments on draft-ietf-avt-rtp-3gpp-timed-text-01.txt

Magnus Westerlund <magnus.westerlund@ericsson.com> Tue, 18 May 2004 14:04 UTC

Message-ID: <40AA11EF.4020407@ericsson.com>
Date: Tue, 18 May 2004 15:38:55 +0200
X-Sybari-Trust: ef7f4035 08d63d2e 06b283dd 00000138
From: Magnus Westerlund <magnus.westerlund@ericsson.com>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)
X-Accept-Language: sv, en-us, en
MIME-Version: 1.0
To: Jose Rey <rey@panasonic.de>, IETF AVT WG <avt@ietf.org>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
X-OriginalArrivalTime: 18 May 2004 13:38:55.0832 (UTC) FILETIME=[77307580:01C43CDD]
Content-Transfer-Encoding: quoted-printable
X-MIME-Autoconverted: from 8bit to quoted-printable by albatross.ericsson.se id i4IDctWR019206
X-Spam-Checker-Version: SpamAssassin 2.60 (1.212-2003-09-23-exp) on ietf-mx.ietf.org
X-Spam-Status: No, hits=0.0 required=5.0 tests=AWL autolearn=no version=2.60
Subject: [AVT] Comments on draft-ietf-avt-rtp-3gpp-timed-text-01.txt
Sender: avt-admin@ietf.org
Errors-To: avt-admin@ietf.org
X-BeenThere: avt@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/avt>, <mailto:avt-request@ietf.org?subject=unsubscribe>
List-Id: Audio/Video Transport Working Group <avt.ietf.org>
List-Post: <mailto:avt@ietf.org>
List-Help: <mailto:avt-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/avt>, <mailto:avt-request@ietf.org?subject=subscribe>

Hi,

Here are my comments on the draft.

1. Why is there an empty line in the header between the author's company
and the publication date? Also, the expiry date should be on the line
after the filename to keep things together.

2. The "Status of this Memo" section is not correct. Section 10 of RFC
2026 has been replaced. With the publication of RFC 3667 and RFC 3668 it
should, as I understand it, now read:

Status of this Memo

     By submitting this Internet-Draft, I (we) certify that any
     applicable patent or other IPR claims of which I am (we are) aware
     have been disclosed, and any of which I (we) become aware will be
     disclosed, in accordance with RFC 3668 (BCP 79).

     By submitting this Internet-Draft, I (we) accept the provisions of
     Section 3 of RFC 3667 (BCP 78).

     Internet-Drafts are working documents of the Internet Engineering
     Task Force (IETF), its areas, and its working groups.  Note that
     other groups may also distribute working documents as Internet-
     Drafts.

     Internet-Drafts are draft documents valid for a maximum of six
     months and may be updated, replaced, or obsoleted by other documents
     at any time.  It is inappropriate to use Internet-Drafts as
     reference material or to cite them other than as "work in progress."

     The list of current Internet-Drafts can be accessed at
     http://www.ietf.org/1id-abstracts.txt

     The list of Internet-Draft Shadow Directories can be accessed at
     http://www.ietf.org/shadow.html

------ end of quote -------
Please note that certain things in the declaration depend on the
document type and on what the author is willing to give away. For a WG
document going for standards track this will be the most common form.
These details will become clearer when the IETF has published new
Internet-Draft guidelines on its homepage.

One can also add the following paragraph to clarify things further, as
it is a WG document:

    This document is a submission of the IETF AVT WG.  Comments should
    be directed to the AVT WG mailing list, avt@ietf.org.

3. The copyright notice on the first page: I am not certain it is
compatible with the one on the last page. Please remove it.

4. Change log: Can you please restrict the change log to the
differences since the previous version? It becomes quite long otherwise,
and I don't find it that useful.

5. Section 1. "Unit or Access Unit": Shouldn't the last sentence include
"so building what is called an (access) unit."?

6. Section 3. Marker bit: I think there should be a normative word in
there, like "the marker bit SHALL be set to 1 if the RTP packet
includes one or more whole text samples or the last fragment of a text
sample; otherwise set to 0."
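
To make the rule I am proposing in item 6 concrete, here is a rough
sketch in Python. The function and parameter names are my own and are
not taken from the draft; it only illustrates the wording above.

```python
def marker_bit(whole_samples: int, has_last_fragment: bool) -> int:
    """Return the marker bit for one RTP packet.

    whole_samples     -- number of complete text samples in the packet
                         (hypothetical parameter, not a draft field)
    has_last_fragment -- True if the packet carries the last fragment
                         of a fragmented text sample
    """
    # M=1 for one or more whole samples, or for a final fragment;
    # M=0 in every other case (e.g. a first or middle fragment).
    return 1 if whole_samples >= 1 or has_last_fragment else 0
```

A packet carrying only a first or middle fragment would thus get M=0.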

7. Section 3. "
    Timestamp: the timestamp MUST indicate the sampling instant of the
    earliest (or unique) text sample contained in the RTP packet."
What is a "unique" text sample? Or do you simply mean the "only" text
sample?

8. Section 3. Timestamp: I think there are some improvements one can
make to the timestamp derivation text for the case when multiple samples
or fragments are present.
- First, I think the normative definition of the order of the samples or
fragments should be given in another section.
- Second, the normative definition of how to calculate the TS for a
subsequent sample does not seem to properly take into account that other
text samples may not carry a duration.
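
The second point of item 8 can be illustrated with a small Python
sketch. It assumes (my assumption, not a statement of what the draft
specifies) that the timestamp of a subsequent unit is the previous
timestamp plus the previous unit's duration. All names are hypothetical.
The only hard fact used is that RTP timestamps are 32 bits and wrap.

```python
from typing import List, Optional

RTP_TS_MODULUS = 2 ** 32  # RTP timestamps are 32-bit and wrap around


def derive_timestamps(first_ts: int,
                      durations: List[Optional[int]]) -> List[Optional[int]]:
    """Derive one timestamp per unit in a packet.

    durations[i] is the duration of unit i in timestamp ticks, or None
    if that unit carries no duration. A None in the result means the
    timestamp cannot be derived under the assumed additive rule.
    """
    timestamps: List[Optional[int]] = []
    current: Optional[int] = first_ts % RTP_TS_MODULUS
    for duration in durations:
        timestamps.append(current)
        if current is None or duration is None:
            # A missing duration breaks the chain for all later units.
            current = None
        else:
            current = (current + duration) % RTP_TS_MODULUS
    return timestamps
```

With durations [500, None, 200] the third unit ends up without a
derivable timestamp, which is the gap I am pointing at.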


9. Section 3:
   "If the timed text is streamed from a 3GP file, the timestamp
    clockrate MUST be copied directly from the value of "timescale" in
    the Media Header Box for that text track."
Why is this a MUST? I understand that one must maintain correct
synchronization; however, if a sender decides to rewrite everything
upon transmission, is there anything that will not work?

10. Section 3.1.2:

   "In order to transport these larger text samples using RTP, it could
    be argued that a careful encoding be used to transform the original
    large sample into smaller self-contained text samples that fit into
    the given transport MTU.  This would comply with the ALF principle,
    as described in the guidelines for RTP payload formats, RFC 2736
    [14].  It would also need additional pre-processing previous to RTP
    encapsulation and that senders understand the modifiers format.
    However, given the low probability of fragmentation, it is believed
    that the overhead of this pre-processing is not worth and it is more
    appropriate to encode text samples without taking the path MTU into
    account.  In this manner, this payload format meets a trade-off by
    intentionally leaving out this pre-processing and making the
    fragmented samples less robust to packet losses."

I would recommend that one does in fact try to ensure that samples fit
the network. The fragmentation mechanism should then be present to
ensure functionality in those cases where this is difficult, or not
possible due to other constraints. However, one SHOULD take the MTU
into account.

11. Section 3.1.2: "The most important consequence of this design choice 
is that while text string fragments can be displayed in the absence of a 
previous text fragment, modifiers for that text string are useless if 
they are not completely received."
I think it is a bit unclear what "they" is intended to refer to: the
modifiers or the text string.

12. Section 3.1.2: Second bullet: I think one can extend the explanation
of why this is good to read as follows:
"This reduces complexity by minimizing the number of fragments and thus
also improves the error robustness."

13. Section 3.1.2, third bullet:
     "in order to fill up the remaining bits of a packet, piggybacking
      of sample descriptions MAY be performed.  Also fragments of past
      samples MAY be piggybacked.  For this purpose the server MAY
      reserve a certain amount of buffer to store already sent units for
      piggybacking."

What is meant by "In order to fill up the remaining bits ..."? I guess
you mean "If there is space left in the packet up to the MTU, and there
is available bit-rate, then sample descriptions MAY be piggybacked."
Also, is "piggybacked" really the right word? Doesn't a sample
description need to have a specific place in the packet, so that it
can't simply be added at the end?

14. Section 3.1.4:
"        o Although units containing modifier boxes or fragments thereof
            do not include a duration field, they make use of the RTP
            timestamp to group together.  Therefore, they SHOULD be
            transmitted in the same order as they appear in the sample
            and be placed as near as possible to the text to which they
            apply.  Logically, this does not apply for retransmitted or
            redundant packets or for units that are piggybacked into
            other packets."

Is it really correct that the modifier boxes do not need a duration?
As I understand it, any modifier box is connected to its text sample by
having the same timestamp. Thus a modifier box would need a duration
field if it is to be possible to aggregate modifier boxes separately
from the text strings.

15. Section 3.1.4: The concept of unknown duration is a bit loosely
defined. I haven't checked 26.245, but is it really defined there? I
understand the desire to have functionality to define a sample as being
active until replaced. However, if this is not defined in the 3GPP spec,
I think we must be cautious about defining such a mechanism. If it is
defined, does it use a special explicit indicator that this is the case?
Otherwise I think it should use such an indicator in the SDUR field.

16. Section 3.1.5: I think the buffer concept is somewhat garbled in
this paragraph. To me it seems to talk about both sender-side buffers
and receiver buffers, without making any real differentiation.

17. Section 3.2: I think this section, before going into the definition
of all types, should give a clear definition of how different blocks can
be aggregated and how one derives the necessary information, for example
how the packets must be built to be able to recover which TYPE 1 unit a
TYPE 3 unit belongs to. I think there is an oversight here in not
providing SDUR also for TYPE 3 and 4. I think the timestamp derivation
for all packet types must be made much clearer.

18. Section 3.2.1: Definition of "R" bits. Please also state that they
SHALL be ignored by the receiver. Otherwise it will not be possible to
use them without external signalling.

19. Section 3.2.1: Type list: I think each type is in all cases a single
fragment, not multiple ones as the plural form indicates.

20. Section 3.2.1:
        "Two TYPEs (1 & 2) are defined for units containing text strings
         another two (3 & 4) for units not containing text strings (thus
         no timing attributes) and a final TYPE 5 for sample descriptions
         (also lacking timing attributes)."
This sentence is missing a "," after "text strings".

21. Section 3.2.2: I think the following sentences have some language
problems:
   "In this case, TYPE 1 units MUST not
    have contents.  This means that the LEN field MUST have a value of 6
    (0x0006).  Otherwise, the LEN field MUST be always greater than 6
    (0x0006)."

I think it should read more like:
"In this case, the TYPE 1 unit MUST NOT have contents, and the LEN field
SHALL have a value of 6. Otherwise the LEN field SHALL always be greater
than 6."
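
Read this way, the LEN rule in item 21 becomes easy to check. A minimal
Python sketch, where the constant comes from the quoted draft text and
the function and parameter names are only my own illustration:

```python
EMPTY_TYPE1_LEN = 6  # 0x0006, the value quoted from the draft


def type1_len_is_valid(length: int, has_contents: bool) -> bool:
    """Check the LEN field of a TYPE 1 unit against the proposed wording.

    has_contents is a hypothetical flag saying whether the unit carries
    any contents.
    """
    if has_contents:
        # A unit with contents must have LEN strictly greater than 6.
        return length > EMPTY_TYPE1_LEN
    # A unit without contents must have LEN exactly 6.
    return length == EMPTY_TYPE1_LEN
```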

22. Section 3.2.4:
"Note that the SLEN, SIDX and SDUR fields are not present.  This is
    because: a) these fragments do not contain text strings and b) these
    types of fragments are applied over text string fragments, which
    already contain this information. "

I think this would be the place, unless it is better clarified
elsewhere, to actually define how this TYPE 3 unit is bound to its
TYPE 1 or TYPE 2 units.

23. Section 3.2.4:
"   o The TOTAL/THIS field indicates whether the unit contains a part of
      or the whole of the modifiers: if TOTAL=THIS, then all modifiers
      are included here.  In this case, TOTAL=THIS MUST be greater than
      one, because there cannot be a sample of modifiers without text
      strings.  Otherwise, this unit just contains the first fragment."

Why must the value of TOTAL and THIS be larger than 1 even though all
modifiers are contained in one sample?

24. Section 3.2.6: I think it is appropriate that the TYPE 5 header is
mandatory to implement; otherwise we will have problems with
interoperability. The text also claims that the effort of implementing
this format is rather minor. If that is true, then there is no reason
for having it optional.

25. Section 4.
"... the simplest option for packet loss resilient transport is to
    send the same RTP packet or the same text samples (or fragments)
    again."

This sentence contains an error: "the same RTP packet" SHALL NOT be
repeated. Repeating a packet implies sending two packets with the same
RTP sequence number, which prevents RTP's reporting mechanism from
distinguishing the two packets. So please remove this part; it is
fully sufficient to repeat the payloads.

26. Section 4.
"  o in repeated packets, all RTP header fields MUST keep their
      original values except the sequence number that MUST be increased
      to comply with RTP. "
I think this should talk about repeating the RTP payload, and thus
require all payload-specified fields (TS, M, PT) to be the same. I don't
see a reason to demand that any other fields are actually the same. If
one over-specifies things, any future usage of an RTP profile will be
extremely problematic.
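
What I have in mind for items 25 and 26 can be sketched in Python as
below. The class and function names are hypothetical and only a subset
of the RTP header is modelled. The repeated packet carries the same
payload, TS, M and PT, but a fresh sequence number (16 bits, wrapping).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RtpPacket:
    """Only the fields relevant to this discussion (illustrative)."""
    sequence_number: int  # 16-bit, wraps at 65536
    timestamp: int        # TS
    marker: int           # M
    payload_type: int     # PT
    payload: bytes


def repeat_payload(original: RtpPacket, last_sent_seq: int) -> RtpPacket:
    """Build a new packet that repeats the payload of `original`.

    TS, M, PT and the payload are copied unchanged; the sequence number
    continues from the last one sent, so RTCP reporting still sees two
    distinct packets.
    """
    return replace(original,
                   sequence_number=(last_sent_seq + 1) % 65536)
```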

27. Section 4.
"If single units are repeated in packets different from their
    originals, care SHALL be taken to preserve their original timing."
What is meant here by "original timing"?

28. Section 4:
" A possible solution MAY be that the encoder provides a static default 
sample description to be used for these cases."
I think one should clarify that this would need to be an
application-specific definition.

29. Section 5. I would recommend either stating a stricter requirement
for congestion control, or otherwise simply using the following
sentence:

     Congestion control for RTP SHALL be used in accordance with RFC 3550
     [?], and any applicable RTP profile, e.g. RFC 3551 [?].

30. Section 6.2: I think there should be a SMIL reference at the first
use of the name.

31. Section 7.1: Please fix the indentation so it is easier to read:

parameter: aslödsaösödlkskaaölsldsaööl
            adsösaödsdsadslaösaöldsa

32. Section 7.1: The definition of how the value should be written is
unclear. At first one thinks that z should be multiplied by the sum
x*256+y, while it appears that the numbers should be concatenated.
Please clarify.

33. Section 7.1: Spldesc: There are two possibilities. Also, I think the
possible values "both" and "out" should be mutually exclusive.

34. Section 7.1, tx3g: Please add a reference to the BASE64 RFC. It is 
number 3548.

35. Section 7.1, tx3g: I think this is an optional parameter, as it is
allowed to be empty. Define the lack of it as equal to the empty string.
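
For items 34 and 35, a minimal Python sketch of how a tx3g value could
be produced and read back, using the standard library base64 module. The
function names and the sample bytes are made up for illustration, and
how the value is embedded in the SDP line is not shown.

```python
import base64
from typing import Optional


def encode_tx3g(sample_description: bytes) -> str:
    """Base64-encode the raw sample description bytes for the parameter."""
    return base64.b64encode(sample_description).decode("ascii")


def decode_tx3g(value: Optional[str]) -> bytes:
    """Decode the parameter value.

    A missing parameter (None) is treated like the empty string, as
    suggested in item 35.
    """
    if value is None:
        value = ""
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(value, validate=True)
```

With this, an absent parameter and an empty one both decode to zero
bytes, so a receiver does not need two code paths.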

36. Section 7.1, layer, tx, ty, height, width: What is the allowed value
range here? If it is a 32-bit integer, please be explicit about it.
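
As an example of the kind of explicitness I am asking for in item 36, a
receiver-side range check could look like the Python sketch below.
Whether the draft intends signed or unsigned 32-bit values is exactly
what I cannot tell, so the `signed` switch is my own assumption.

```python
def fits_32_bit(value: int, signed: bool = True) -> bool:
    """Return True if value fits in a 32-bit integer.

    signed=True  -> range -2**31 .. 2**31 - 1
    signed=False -> range 0 .. 2**32 - 1
    """
    if signed:
        return -2**31 <= value <= 2**31 - 1
    return 0 <= value <= 2**32 - 1
```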

37. Section 7.1: I think that you need to be a bit clearer about how
the values are defined. Instead of a half-formal definition of the
value, either write it clearly in words or use a formal syntax notation.
I think writing it in words is often sufficient.

38. Section 8.1: I think we have two bullets that are equivalent to two
others. The third and fourth seem possible to replace with the fifth
alone.

39. Section 8.2: I think there are a few problems with the offer/answer
section. The first major one is the sample descriptions. How are they
handled in a bi-directional session? Are they defined per direction, and
how does unicast then differ from multicast? Or are they shared
between all participants? But is the answerer then allowed to add his
own to the list the offerer provides? These are hard problems, but they
need to be spelled out.

40. Section 8.2: How is the version handled in an interoperable way? Is
an answerer allowed to downgrade the version he accepts to receive? This
would enable the offerer to state that he is ready to receive and send
version X, while the answerer declares what it accepts to receive, and
thus most probably also sends.

41. Section 8.2: The section uses the term "server", while the
participants in an offer/answer session rarely have these designations.

42. Section 8.2: Bullet 2 contains a forbidden character, which is not
within the 7-bit ASCII range.

43. Section 8.2: Last paragraph: Offer/Answer defines its reject
procedure based on the external signalling protocol (SIP). So avoid
describing the setting of a certain value as a "reject". A value can
signal or indicate, but a reject is really something that affects the
whole session. Also, I think that one can avoid having spldesc result
in a rejected setup in most cases.

44. Section 10.
"These types of attack may easily be avoided by using authentication."
I think the right means to avoid these attacks are "integrity protection
and source authentication". You need the integrity protection to ensure
that nobody can change the content en route, and source authentication
to ensure that the sender really is who he claims to be. In many cases
implicit source authentication is sufficient.

45. Section 11.2: Reference 10 is now available as RFC 3711.


Cheers

Magnus Westerlund

Multimedia Technologies, Ericsson Research EAB/TVA/A
----------------------------------------------------------------------
Ericsson AB                | Phone +46 8 4048287
Torshamsgatan 23           | Fax   +46 8 7575550
S-164 80 Stockholm, Sweden | mailto: magnus.westerlund@ericsson.com


