PORTING.SIGNED.CHAR

Use of 'char' without explicitly specifying signedness

The PORTING checkers identify code that might rely on implementation details that differ between compilers. The PORTING.SIGNED.CHAR checker detects C code in which 'char' is used without an explicitly specified sign. (This checker applies only to C, since these signedness issues are picked up by C++ compilers.)

Vulnerability and risk

The C standard leaves the signedness of plain 'char' implementation-defined, so a given compiler may treat it as either signed or unsigned. Some compilers allow the sign of 'char' to be switched using compiler options, but best practice is for developers to write unambiguous code at all times to avoid problems in porting code.
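As a minimal sketch of how this difference shows up, the following helpers report whether plain 'char' is signed on the current compiler and show what happens when a byte above 0x7F is widened to int. The function names are illustrative and not part of any checker or library; the sketch assumes the usual 8-bit char.

```c
#include <limits.h>

/* Returns 1 if plain 'char' is signed on this compiler, 0 if unsigned.
   CHAR_MIN is 0 when plain char is unsigned and negative when it is signed. */
int plain_char_is_signed(void) {
    return CHAR_MIN < 0;
}

/* Widens a plain char to int; the result depends on the compiler's choice. */
int widen_plain(char c) {
    return c;
}

/* Widens an unsigned char to int; the result is never negative. */
int widen_unsigned(unsigned char c) {
    return c;
}
```

With these helpers, widen_unsigned(0xAB) yields 171 on every compiler, while widen_plain('\xAB') is typically negative where plain char is signed and 171 where it is unsigned.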

Mitigation and prevention

Always specify the sign of the 'char' type. This is best done by using a typedef or #define definition that is then rigorously used everywhere.
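One possible way to apply this at an API boundary is sketched below. The UCHAR alias and the count_high_bytes function are hypothetical names chosen for illustration; the idea is that text arriving as plain 'char' is converted once to the explicitly unsigned type, and all comparisons are done on that type.

```c
#include <stddef.h>

typedef unsigned char UCHAR;  /* hypothetical project-wide alias */

/* Counts bytes with the high bit set (0x80 and above) in a string.
   The comparison is done on UCHAR; with a signed plain char the
   test (*p >= 0x80) could never be true. */
size_t count_high_bytes(const char *text) {
    const UCHAR *p = (const UCHAR *)text;
    size_t n = 0;
    while (*p != 0) {
        if (*p >= 0x80) {
            n++;
        }
        p++;
    }
    return n;
}
```

Called on the string "Hello, \xABWorld\xBB!\n", this sketch counts the two angle-quote bytes regardless of how the compiler treats plain char.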

Vulnerable code example

   #include <stdio.h>

   static const char *s = "Hello, \xABWorld\xBB!\n";
   /* return next char, or -1 upon end of stream */
   int get_next_char(void) {
     /* plain char: '\xAB' becomes negative here if char is signed */
     return *s ? *s++ : -1;
   }
   int main(void) {
     int ch;
     while ((ch = get_next_char()) > 0) {
       putchar(ch);
     }
     return 0;
   }

When char is unsigned in this example, it works as expected, and the string is printed on standard output with 'World' enclosed in angle quotes (in Latin-1 encoding):

Hello, «World»!

When char is signed, '\xAB' is returned as a negative value and fails the loop's > 0 test, so the code prints only part of the string, up to the opening angle quote:

Hello,

Fixed code example

   #include <stdio.h>

   typedef unsigned char UCHAR;
   /* the cast is needed because a string literal has plain char elements */
   static const UCHAR *s = (const UCHAR *)"Hello, \xABWorld\xBB!\n";
   /* return next char, or -1 upon end of stream */
   int get_next_char(void) {
     return *s ? *s++ : -1;
   }
   int main(void) {
     int ch;
     while ((ch = get_next_char()) > 0) {
       putchar(ch);
     }
     return 0;
   }

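In the fixed code, unsigned char is used instead of char, so every character is returned as a non-negative value and only the end-of-stream marker is negative. Changing only the condition of the while loop in the original code to ((ch = get_next_char()) != -1) wouldn't fix the problem: when char is signed, the '\xFF' character also converts to -1, so the stream would be terminated on it anyway.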
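The '\xFF' collision can be sketched as follows. The two reader functions mirror get_next_char but take the stream as a parameter, and the counting helpers use the != -1 loop condition; all four names are illustrative assumptions rather than code from the checker, and an 8-bit char is assumed.

```c
#include <stddef.h>

/* Reader over plain char: '\xFF' may widen to -1 if char is signed. */
int next_plain(const char **s) {
    return **s ? *(*s)++ : -1;
}

/* Reader over unsigned char: every character widens to a non-negative int. */
int next_unsigned(const unsigned char **s) {
    return **s ? *(*s)++ : -1;
}

/* Counts characters read before the reader returns the -1 sentinel. */
size_t count_plain(const char *text) {
    size_t n = 0;
    while (next_plain(&text) != -1) {
        n++;
    }
    return n;
}

size_t count_unsigned(const char *text) {
    const unsigned char *p = (const unsigned char *)text;
    size_t n = 0;
    while (next_unsigned(&p) != -1) {
        n++;
    }
    return n;
}
```

For the input "a\xFFz", count_unsigned reads all three characters on any compiler, whereas count_plain stops after the first character where plain char is signed, because '\xFF' is indistinguishable from the end-of-stream value.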