Last update: 1997-05-20
9945-2-85
_____________________________________________________________________________

Topic:              EREs
Relevant Sections:  2.8.4.1.2

Defect Report:
-----------------------

From: [email protected] (Jeff Hendrikse)
Date: Tue, 15 Nov 1994 14:53:57 -0500

Sections: 2.8.4.1.2, ERE Special Characters, lines 3069-3072, and
B.5.3, Returns, line 424.

Problem:

Section 2.8.4.1.2 states, with respect to the repeat characters *, +, ?,
and {, that

  "Any of the following uses produces undefined results:
   - If these characters appear first in an ERE, or immediately
     following a vertical line, circumflex, or left parenthesis."

This implies that, for instance, the RE "*foo" has undefined results.

In section B.5.3, discussing the return codes from the regexec and
regcomp C APIs, table B-10 includes the error:

  "REG_BADRPT   ?, *, or + not preceded by a valid RE"

This text seems to overlap with and contradict the previous text. If the
repeater is at the beginning of an RE, then it is not preceded by a valid
regular expression, which then results in the error. This section implies
that the same RE, "*foo", would result in the error REG_BADRPT, since the
NULL character preceding the repeat character is not a valid RE.

We would like to see clarification of these two points.

Recommendation:

It is requested that the implementation be allowed undefined results if
the repeat character appears first in the regular expression.
Historically, this condition would either be treated as an error, or the
repeat character would not be treated specially, as is the case with
BREs. If the repeat character appears after a regular expression which is
not a valid expression, this condition should trigger the error. So, the
expression "*foo" will produce undefined results, while the expression
"f+*oo" would cause a REG_BADRPT (or REG_BADPAT) error condition.
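The condition quoted from the standard above (a repeat character first in
an ERE, or immediately following a vertical line, circumflex, or left
parenthesis) can be checked by an application before a pattern is handed
to regcomp(). The following is a minimal sketch only, not text from the
standard: the function names ere_has_undefined_repeat() and
skip_bracket() are hypothetical, and the handling of backslash escapes
and bracket expressions is a simplifying assumption about how such a scan
might be written.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: pat[i] is '['. Returns the index of the ']' that
 * closes the bracket expression, or the index of the terminating NUL if
 * the expression is unterminated. */
static size_t skip_bracket(const char *pat, size_t i)
{
    size_t j = i + 1;

    if (pat[j] == '^')
        j++;
    if (pat[j] == ']')
        j++;                      /* a leading ']' is an ordinary member */

    while (pat[j] != '\0' && pat[j] != ']') {
        if (pat[j] == '[' &&
            (pat[j + 1] == ':' || pat[j + 1] == '.' || pat[j + 1] == '=')) {
            /* Skip a [:class:], [.symbol.] or [=equiv=] element. */
            char kind = pat[j + 1];
            j += 2;
            while (pat[j] != '\0' && !(pat[j] == kind && pat[j + 1] == ']'))
                j++;
            if (pat[j] != '\0')
                j += 2;
        } else {
            j++;
        }
    }
    return j;
}

/* Hypothetical helper: returns true if '*', '+', '?' or '{' appears first
 * in the ERE, or immediately after '|', '^' or '('. Only the single
 * condition quoted in the defect report is checked. */
bool ere_has_undefined_repeat(const char *pat)
{
    bool at_start = true;   /* nothing yet, or previous char was | ^ ( */

    for (size_t i = 0; pat[i] != '\0'; i++) {
        char c = pat[i];

        if (c == '\\') {
            /* Escaped character: treated as ordinary. */
            if (pat[i + 1] != '\0')
                i++;
            at_start = false;
        } else if (c == '[') {
            i = skip_bracket(pat, i);
            if (pat[i] == '\0')
                break;            /* unterminated bracket expression */
            at_start = false;
        } else if (strchr("*+?{", c) != NULL) {
            if (at_start)
                return true;
            at_start = false;
        } else {
            at_start = (c == '|' || c == '^' || c == '(');
        }
    }
    return false;
}
```

With this sketch, "*foo" is flagged and "f+*oo" is not, because the
second pattern does not fall under the quoted condition; whether "f+*oo"
is otherwise acceptable is a separate question that this check does not
attempt to answer.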
WG15 response for 9945-2:1993
-----------------------------------

The standard does not require the implementation to detect any
particular error, nor to return an error in any particular situation. It
only requires that the listed errors be returned only when the indicated
error is detected by the implementation.

So, regcomp() may return REG_BADRPT if given the pattern "*foo", since
the '*' certainly isn't preceded by a valid ERE specified by the
standard. It may also do just about anything else, since the
interpretation of this ERE is undefined.

The interpretation request is based on the conclusion that

    regcomp (&preg, "*foo", 0);

could reasonably dump core, because the interpretation of "*foo" is
undefined.

The behavior of regcomp() with a pattern such as '*foo' produces
undefined results. A conforming application shall not expect the return
code REG_BADRPT from regcomp() if it uses an ERE with a repeat character
appearing first or following any of the characters mentioned in section
2.8.4.1.2.

The standard clearly states behavior for regular expressions and
conforming implementations must conform to this.

Rationale
-------------

None.

Forwarded to Interpretations group: 16 Nov 94
Response received: Feb 10 1995
Proposed Resoln forwarded: 13th Feb 1995
Finalised: March 28th 1995
_____________________________________________________________________________
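As an illustration of the response above, an application that compiles
EREs might treat every nonzero return from regcomp() the same way rather
than branching on REG_BADRPT. The sketch below is an assumption about how
that could look; the wrapper name compile_ere() is hypothetical. It uses
only regcomp() and regerror() with the REG_EXTENDED and REG_NOSUB flags.
It does not make an undefined pattern such as "*foo" safe to pass in; it
only avoids depending on which error code, if any, the implementation
chooses to return.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical wrapper: compiles `pattern` as an ERE. Returns 0 on
 * success and -1 on any failure, placing the implementation's message in
 * errbuf. The specific REG_* code is deliberately not exposed, because a
 * conforming application cannot rely on a particular one. */
int compile_ere(regex_t *preg, const char *pattern,
                char *errbuf, size_t errlen)
{
    int rc = regcomp(preg, pattern, REG_EXTENDED | REG_NOSUB);

    if (rc != 0) {
        /* Any nonzero value is a failure; fetch the message text only. */
        regerror(rc, preg, errbuf, errlen);
        return -1;
    }
    return 0;
}
```

A caller would check only for 0 versus -1, call regexec() on success, and
release the compiled pattern with regfree() when done.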