Last update: 1997-05-20
9945-2-25

Class: Defect situation

The standard states what it states, and conforming implementations must conform to this. However, concerns have been raised about this which are being referred to the Sponsors of the standard for consideration as a future amendment.

Topic: tr
Relevant Sections: 4.64.3, 4.64.7

Defect Report:
-----------------------

In Section 4.64.3 - Options {of tr}, the standard states that the -c option means ``complement the set of characters specified by string1. See 4.64.7.'' [Draft 12 of ISO/IEC 9945-2:1993 (July 1992), p. 482, line 10472], and in Section 4.64.7 - Extended Description {of tr}, the standard states that [:class:] ``[r]epresents the range of collating elements between the range endpoints, inclusive, as defined by the current setting of the LC_COLLATE locale category.'' [Ibid., p. 484, lines 10544-10546]

In Section 4.64.7 - Extended Description {of tr}, the standard states that

    [i]f the -c option is specified, the complement of the characters specified by string1 -- the set of all characters in the current character set, as defined by the setting of LC_CTYPE, except for those actually specified in the string1 operand -- shall be placed in the array in ascending collation sequence, as defined by the current setting of LC_COLLATE. [Ibid., p. 485, lines 10590-10594]

However, if the character set is ISO 646, for example, then the command

    tr -c '[:print:]' '?'

which should translate all unprintable characters to question-mark characters, will pass bytes with the high bit set through unchanged. This is clearly wrong, not historical practice, and violates the principle of least astonishment. The tr utility is a binary file manipulator.

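Editorial illustration (not part of the defect report or of the WG15 response): the difference between the two readings of "complement" can be modelled in a few lines of Python. This is a minimal sketch under stated assumptions, not an implementation of tr: ISO 646 is modelled as the code points 0-127, [:print:] as the code points 0x20-0x7E, and every member of the complement is mapped to the single replacement character, which sidesteps the question of how string2 is extended. The name tr_complement is hypothetical.

```python
# Illustrative model only; tr_complement is a hypothetical name, not a real API.

# Assumption: ISO 646 is modelled as the 7-bit code points 0..127.
ISO646 = frozenset(range(128))
# Assumption: [:print:] is modelled as the printable code points 0x20..0x7E.
PRINT = frozenset(range(0x20, 0x7F))
# Every possible 8-bit byte pattern.
ALL_BYTES = frozenset(range(256))


def tr_complement(data: bytes, string1: frozenset, universe: frozenset, repl: int) -> bytes:
    """Model `tr -c string1 repl`: map each member of (universe - string1) to repl."""
    complement = universe - string1
    # Bytes outside the complement (including bytes outside the universe) pass through.
    return bytes(repl if b in complement else b for b in data)


sample = b"ok\x01\x80\xff\n"

# Complement taken within the character set, as the quoted wording requires.
by_charset = tr_complement(sample, PRINT, ISO646, ord("?"))

# Complement taken over all byte patterns, the reading asked about in this report.
by_bytes = tr_complement(sample, PRINT, ALL_BYTES, ord("?"))

print(by_charset)  # high-bit bytes 0x80 and 0xff are left unchanged
print(by_bytes)    # every non-printable byte becomes '?'
```

Under these assumptions the first call leaves the bytes 0x80 and 0xff untouched, because they are not members of the modelled character set and so cannot be in its complement; the second call replaces them.
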
May we interpret the wording of lines 10590-10594 as

    If the -c option is specified, the complement of the characters specified by string1 -- the set of all possible machine byte patterns, except for those actually specified in the string1 operand -- shall be placed in the array in ascending collation sequence, as defined by the current setting of LC_COLLATE.

to fix this problem?

WG15 response for 9945-2:1993
-----------------------------------

The standard is clear in its requirement that the -c option in this case will place the complement of the characters specified in string1 in the array. It is also clear that the definition of the complement is "the set of all current characters in the current character set, except for those actually specified in string1". This precludes an interpretation allowing the set of all possible machine byte patterns to be added to the complement set. The implementation must follow these requirements. Concern over the wording of this area of this standard has been forwarded to the sponsors.

Rationale for Interpretation:
-----------------------------

None.
_____________________________________________________________________________