[MDSAL-661] Generated enforcers fail to interpret length restriction Created: 23/Feb/21  Updated: 19/Apr/21  Resolved: 24/Feb/21

Status: Resolved
Project: mdsal
Component/s: Binding codegen
Affects Version/s: None
Fix Version/s: 8.0.0, 6.0.9, 5.0.17, 7.0.6

Type: Bug Priority: Medium
Reporter: Robert Varga Assignee: Robert Varga
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to YANGTOOLS-1224 Data codecs may mis-enforce String le... Resolved

 Description   

Currently this YANG:

 

  typedef foo { 
    type string { 
      length 1; 
    } 
  } 

 

translates to this enforcement code:

 

    private static void check_valueLength(final String value) { 
        final int length = value.length(); 
        if (length == 1) { 
            return; 
        } 
        CodeHelpers.throwInvalidLength("[[1..1]]", value); 
    } 

which looks good on the surface, but has a subtle flaw. RFC 7950 specifies that:

 

 

A "length" statement restricts the number of Unicode characters in
the string.

What we are checking is the number of UTF-16 code units (String.length()), not the number of code points. The difference becomes obvious when characters outside the Basic Multilingual Plane are encountered:

 

// U+1F31E
final String str = "🌞";
// Encodes in UTF-16 as "\uD83C\uDF1E", i.e. two code units
assertEquals(2, str.length());
// The two code units together yield a single code point
assertEquals(1, str.codePointCount(0, str.length()));
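
For illustration, a check that counts code points rather than code units could look roughly like the sketch below. This is an assumption about the shape of a fix, not the change that was actually merged: the class name LengthCheckSketch and the method name checkValueLength are hypothetical, and a plain IllegalArgumentException stands in for CodeHelpers.throwInvalidLength(), whose behaviour is not shown in this issue.

```java
final class LengthCheckSketch {
    private LengthCheckSketch() {
        // Utility class, not meant to be instantiated
    }

    // Hypothetical variant of the generated check for "length 1":
    // counts Unicode code points instead of UTF-16 code units.
    public static void checkValueLength(final String value) {
        final int length = value.codePointCount(0, value.length());
        if (length == 1) {
            return;
        }
        // Stand-in for CodeHelpers.throwInvalidLength("[[1..1]]", value)
        throw new IllegalArgumentException(
            "Invalid length: expected [[1..1]], got " + length + " code points");
    }

    public static void main(final String[] args) {
        // One BMP character: one code unit, one code point, accepted
        checkValueLength("a");
        // U+1F31E: two code units but one code point, accepted
        checkValueLength("\uD83C\uDF1E");
        try {
            // Two code points, rejected
            checkValueLength("ab");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this variant, the U+1F31E string from the example above satisfies "length 1", whereas the code-unit comparison would reject it.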

 

 

 


Generated at Wed Feb 07 20:10:27 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.