C/C++ knowledge base reference

The C/C++ knowledge base file is a text file with the extension .kb. During every analysis of your code, Klocwork automatically generates a knowledge base, which contains a record for each function that operates in the code. To change the way Klocwork understands your code, you can create your own knowledge base records, place them in a .kb file, and import them into the Klocwork analysis of your code. To see examples of how to create knowledge base records to tune Klocwork analysis, see Tuning C/C++ analysis.

Knowledge base syntax

As well as knowledge base records, a .kb file can contain blank lines and comments (preceded by #).

A knowledge base record contains the following fields separated by spaces:

<function_name> <function_key> <record_kind> <specification>

where

  • <function_name> is the fully qualified name of the function or class method to which this knowledge base record applies. You can use wildcards (*) in class or namespace names, but not in method names.

    You can also use wildcards to represent types in templates and template functions, up to one level. For example: std::vector<*>::insert (templates) or mytemplatefunction<*> (template functions)

  • <function_key> is the function key; it is based on function arguments (signature) and is used to distinguish C++ overloaded functions. The following patterns may also be used:
    • "-" (hyphen) for all C functions and any C++ functions that are not overloaded
    • nothrow for nothrow operators
    • placement for new and delete operators. Note that to avoid confusing the Klocwork analyzer, when manually editing the knowledge base, you must use a special function key for functions named "new" or "delete": a double hyphen (--). Otherwise, the analyzer will interpret these calls as calls to the built-in C++ operators new and delete.

    You can also use wildcards for the <function_key> field. You can use a single wildcard (*) to match any signature.

    You can specify the number of arguments in a function key by using the syntax "@args(N)". For example:
    foo @args(2) RET 1 : $$ EQ(0)
    This KB record will match any function "foo" in the global namespace that has two arguments of any type.

    Examples

    # <function_name> <function_key> <record_kind> <specification>
    1. read_data - DMEM ,MRF,1
    2. MyClass::fail - NORET  
    3. MyClass::new nothrow RET env:EQ(0)
    4. MyClass::new placement DMEM ,MRF,2
    5. *::read * DMEM ,MRF,1

    The easiest way to write a knowledge base record with an explicit function key is to take an automatically generated knowledge base record for the appropriate function or method and modify its <record_kind> and <specification> fields accordingly.

  • <record_kind> is the name of the knowledge base record, such as ALLOC or XMRF
  • <specification> defines the characteristics you want to apply to the knowledge base record
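
The four-field layout can be illustrated with a small C sketch that splits one line of a .kb file into its fields. This is only an illustration of the format described above, not how Klocwork itself reads knowledge bases. The names kb_record and kb_split_record are hypothetical, and the sketch assumes that fields are separated by unescaped spaces.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the four fields of a knowledge base record. */
struct kb_record {
    char function_name[128];
    char function_key[128];
    char record_kind[32];
    char specification[256];
};

/* Splits one line of a .kb file into its fields.
 * Returns 1 for a record, 0 for a blank line or comment, -1 if malformed.
 * Assumption: fields are separated by unescaped spaces, so function keys
 * that contain escaped spaces are not handled by this sketch. */
int kb_split_record(const char *line, struct kb_record *out)
{
    assert(line != NULL && out != NULL);
    memset(out, 0, sizeof *out);

    /* Skip leading whitespace, then filter out blank lines and comments. */
    line += strspn(line, " \t");
    if (*line == '\0' || *line == '\n' || *line == '#')
        return 0;

    /* The specification is everything after the third field. It may be
     * empty, as in a NORET record. */
    int fields = sscanf(line, "%127s %127s %31s %255[^\n]",
                        out->function_name, out->function_key,
                        out->record_kind, out->specification);
    return fields >= 3 ? 1 : -1;
}
```

For the record "read_data - DMEM ,MRF,1", the sketch yields the function name read_data, the function key "-", the record kind DMEM, and the specification ",MRF,1".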

The <specification> field differs for every knowledge base record, and is defined as a socket expression or a conditional socket expression.

A <socket_expression> is typically defined in the form:

$<arg_number> : <value> | $$ | [1->] .<field_name>

where

  • <arg_number> is the number of the argument, or parameter, in the function. Zero (0) designates the 'this' argument for class methods.
  • <value> is the value you're assigning to the argument, which can be a single value or a range in square brackets, with some records using range operators EQ, NE, GE, and LE, and others using symbols =, !=, ==, <, >, <=, and >=. For example, $1 EQ(0) $2 [4,16] specifies that argument 1 must be equal to zero, and argument 2 must be between 4 and 16. A socket expression can also include the usual logical (&&, |) and arithmetic (+, -, *) operators.
  • $$ specifies the return value of the function.
  • <field_name> is the simple or qualified name of a field in a class or structure, and -> indicates a pointer to the field name. For example, $1->x designates field x of the structure pointed to by the first argument ($1) of the function.

Socket expressions can also use properties:

  • charlength - string length in characters
  • arraysize - allocation size of buffer in array elements
  • bytesize - allocation size of buffer in bytes

and functions:

  • if (a,b,c) - logical expression a is evaluated, and the result of the function is b if a is true, and c if a is false
  • min(a,b) - minimum of two values
  • max(a,b) - maximum of two values
  • formatted_printf(a, b) - returns the length of the string that would be created by a printf-like function if the format string is an argument a, and printed arguments start from argument b
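
The quantity that formatted_printf describes can be approximated in plain C. The sketch below is an illustration only: the helper name formatted_length is hypothetical, and whether Klocwork counts the terminating null byte is not stated here, so the helper follows the C99 vsnprintf convention and excludes it.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Returns the number of characters a printf-like call would produce,
 * not counting the terminating null byte, or a negative value on a
 * formatting error. */
int formatted_length(const char *format, ...)
{
    assert(format != NULL);

    va_list args;
    va_start(args, format);
    /* With a null buffer and size 0, vsnprintf writes nothing and only
     * reports the length that would have been written (C99). */
    int length = vsnprintf(NULL, 0, format, args);
    va_end(args);
    return length;
}
```

A destination buffer for the equivalent sprintf call needs at least formatted_length(...) + 1 bytes.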

A conditional socket is expressed in the form:

<precondition> : <socket_expression> : <postcondition>

where

  • <precondition> is a condition that must be met for the operation designated by the socket expression, or 1 to indicate that there is no precondition and the operation always happens
  • <postcondition> is a condition that is met if the operation succeeds

Using a KB for a virtual method

For virtual methods, you can apply the KB record to a base class and all its children classes, but only through the interface of the base class specified. For example, if you want to apply the KB record:

Base::func * RET 1 : $$ EQ(5)

to the following code:

class Base
{
public:
    virtual int func();
};
 
class Child : public Base
{
public:
    virtual int func();
};
 
void foo()
{
    Child *c = new Child();
    Base *b = c;
 
    int x = b->func(); // Since this is a pointer of class Base, then the KB will be applied here even if the object is of type Child. Klocwork will know that x == 5 after the call.
    int y = c->func(); // Since this is a pointer of class Child, then the KB will NOT be applied here even if the class derives from Base. So, Klocwork will not know anything about the value of y.
}

Note that the KB is applied only to the Base interface, not the Child interface. To specify the same behavior (or a different one) for the Child class, you can add a separate KB record for the virtual method of the Child class; however, that record is applied only to Child class objects.

Specification fields for record kinds

ACQUIRE and RELEASE records

ACQUIRE and RELEASE records specify rules for resource handling. An ACQUIRE record specifies that a function can acquire (allocate) a resource object of the kind identified by its descriptor. A RELEASE record specifies that a function can release (free) the identified resource object.

Specification field syntax

<resource_kind> ':' <conditional_socket> | 'ignore'

where

  • <resource_kind> denotes the kind of resource handling to match, such as FILE or pthread_mutex
  • <conditional_socket> identifies the value that is used as a resource descriptor for the acquired resource
  • ignore can be used to skip the acquire or release function in the code

ACQUIRE and RELEASE function pairs

The RH.LEAK checker searches for the following pairs of functions as ACQUIRE/RELEASE pairs:

  • pthread_attr_init / pthread_attr_destroy
  • pthread_mutexattr_init / pthread_mutexattr_destroy
  • pthread_condattr_init / pthread_condattr_destroy
  • pthread_barrierattr_init / pthread_barrierattr_destroy
  • pthread_rwlockattr_init / pthread_rwlockattr_destroy
  • posix_trace_attr_init / posix_trace_attr_destroy

These functions relate mainly to the pthread library (pthread.h) and manipulate "attribute" objects of different kinds which are used to initialize corresponding objects such as "thread", "mutex", and "barrier".

Example 1

In the following code snippet, the RH.LEAK checker would normally report two issues at line 37:

1   #include <sys/types.h>
2   #include <sys/stat.h>
3   #include <sys/mman.h>
4   #include <fcntl.h>
5   #include <pthread.h>
6  
7   struct semaphore {
8   pthread_mutex_t lock;
9   pthread_cond_t nonzero;
10  unsigned count;
11  };
12  typedef struct semaphore semaphore_t;
13 
14  semaphore_t *
15  semaphore_create(char *semaphore_name)
16  {
17  int fd;
18  semaphore_t *semap;
19  pthread_mutexattr_t psharedm;
20  pthread_condattr_t psharedc;
21  
22  fd = open(semaphore_name, O_RDWR | O_CREAT | O_EXCL, 0666);
23  if (fd < 0)
24  return NULL;
25  ftruncate(fd, sizeof(semaphore_t));
26  pthread_mutexattr_init(&psharedm);
27  pthread_mutexattr_setpshared(&psharedm, PTHREAD_PROCESS_SHARED);
28  pthread_condattr_init(&psharedc);
29  pthread_condattr_setpshared(&psharedc, PTHREAD_PROCESS_SHARED);
30  semap = (semaphore_t *) mmap(NULL, sizeof(semaphore_t),
31  PROT_READ | PROT_WRITE, MAP_SHARED,
32  fd, 0);
33  close(fd);
34  pthread_mutex_init(&semap->lock, &psharedm);
35  pthread_cond_init(&semap->nonzero, &psharedc);
36  semap->count = 0;
37  return semap;
38  }

37:Resource acquired to 'psharedm' at line 26 may be lost here.
37:Resource acquired to 'psharedc' at line 28 may be lost here.

To suppress these issues in the Klocwork analysis, you can add the following records to the knowledge base:

   pthread_mutexattr_init - ACQUIRE ignore
   pthread_mutexattr_destroy - RELEASE ignore
   pthread_condattr_init - ACQUIRE ignore
   pthread_condattr_destroy - RELEASE ignore

Example 2

File stream manipulations by fopen and fclose functions from the standard C library (stdio.h) can be described by the following records:

   fopen - ACQUIRE FILE : 1 : $$ : $$ NE (0)
   fclose - RELEASE FILE : 1 : $1 : 1

In this example, the first record says that the value returned by a call to 'fopen' is the descriptor of a resource of kind 'FILE'. The function opens the file without any preconditions (as indicated by 1), and the resource is acquired only if the returned value is not null. The second record says that the 'fclose' function always closes the file whose descriptor is passed as its first argument.
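
As a minimal sketch of calling code that follows these two records, the hypothetical helper count_bytes below treats the stream as acquired only when fopen returns a non-null value and releases it with fclose on the path where it was acquired.

```c
#include <assert.h>
#include <stdio.h>

/* Counts the bytes in a file. Returns -1 if the file cannot be opened. */
long count_bytes(const char *path)
{
    assert(path != NULL);

    FILE *stream = fopen(path, "rb");
    if (stream == NULL)
        return -1;      /* postcondition $$ NE(0) not met: nothing was acquired */

    long count = 0;
    while (fgetc(stream) != EOF)
        count++;

    fclose(stream);     /* releases the resource passed as $1 */
    return count;
}
```

Returning early between the fopen and fclose calls without closing the stream is the kind of path on which the acquired resource would be lost.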

Example 3

The following records can be used to describe pthread mutex manipulations by 'pthread_mutex_init' and 'pthread_mutex_destroy' functions from the pthread.h library:

   pthread_mutex_init - ACQUIRE pthread_mutex : 1 : *$1 : $$ EQ(0)
   pthread_mutex_destroy - RELEASE pthread_mutex : 1 : *$1 : 1

These records specify that the 'pthread_mutex_init' function initializes the mutex object pointed to by its first argument when the return value is equal to 0, and that the 'pthread_mutex_destroy' function always destroys the mutex pointed to by its first argument.

ALLOC records

ALLOC records specify that a function allocates memory only under some conditions on its parameters (preconditions), or sets the returned result to specific values if memory allocation was successful (postconditions). To see examples of how to create ALLOC records to tune Klocwork analysis, see Tuning C/C++ analysis.

Specification field syntax

   <group> ':' <conditional_socket> | ignore

where

  • <group> specifies a memory-function group
  • ignore can be used to skip the allocation function in the code

Example 1

The following knowledge base record:

   search_data - ALLOC stdc : $1 GE(0), $2 GE(0) : *$4 : $$ GE(0)

says that function 'search_data' allocates memory belonging to the 'stdc' allocation group when its first and second parameters are greater than or equal to zero. The allocated memory is passed through the dereferenced fourth parameter, and if the allocation was successful, the function return value is greater than or equal to zero.
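
The record above does not show the function itself. The following sketch is one hypothetical implementation whose behavior is consistent with the record; the parameter names and the copying logic are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical function matching the record: copies `length` bytes that
 * start at `offset` in `data` into newly allocated memory stored in *result.
 * The caller guarantees that offset + length does not exceed strlen(data).
 * Returns the number of bytes copied, or -1 if nothing was allocated. */
int search_data(int offset, int length, const char *data, char **result)
{
    assert(data != NULL && result != NULL);

    if (offset < 0 || length < 0)       /* precondition: $1 GE(0), $2 GE(0) */
        return -1;

    char *copy = malloc((size_t)length + 1);    /* 'stdc' group allocation */
    if (copy == NULL)
        return -1;

    memcpy(copy, data + offset, (size_t)length);
    copy[length] = '\0';
    *result = copy;                     /* memory is passed out through *$4 */
    return length;                      /* postcondition: $$ GE(0) */
}
```

A caller that receives a non-negative return value owns the memory in *result and is expected to release it with free.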

Example 2

In the following snippet, Klocwork would normally report a false positive memory leak for the variable 'p' in the function 'use_precond':

#include <stdlib.h>

void alloc_precond(int a, void** p)
{
    if (a>0)
        *p = malloc(10);
}

void use_precond(int input_data)
{
    void* p;
    if (input_data==0) {
        alloc_precond(input_data, &p);    // process it
    }
}

To avoid this false positive, the ALLOC record automatically generated for the function 'alloc_precond' is:

    alloc_precond - ALLOC stdc : $1 GE(1) : *$2 : 1

This record says that the new memory is allocated if the parameter 'a' is greater than or equal to 1. When the function 'alloc_precond' is called in the 'use_precond' example, the value of the argument 'input_data' is zero, so no memory is allocated.

Example 3

In the following snippet, Klocwork would normally report a memory leak for the variable 'ptr' in the function 'use_postcond':

#include <stdlib.h>

int alloc_postcond(int** p)
{
    int* q = malloc(sizeof(int));

    if (q) {
        *p = q;
        return 0; // success
    }
    else {
        return -1; // fail
    }
}

int* use_postcond()
{
    int* ptr;
    int res = alloc_postcond(&ptr);

    if (res == 0)
        return ptr; // return for further processing

    return 0;
}

The ALLOC record automatically generated for the function 'alloc_postcond' is as follows:

   alloc_postcond - ALLOC stdc : 1 : *$1 : *$1 NE(0), $$ EQ(0)

The record says that if new memory was allocated in 'alloc_postcond', the return value of this function is zero and *p is non-zero. When this condition is satisfied ('res == 0'), the newly allocated memory is returned for further processing, and no memory leak occurs in 'use_postcond'.

BAA and IAA records

Bounds of array access (BAA) records describe how a function accesses arrays through pointers passed to it: read, write, read/write, access interval, and access unit size. Internal Array Access (IAA) records describe the name and size of local arrays, as well as how functions access these arrays.

Specification field syntax

BAA:

   <ReadWrite> ':' <precondition> ':' <socket_expression> ':' <Interval>[','<UnitSize>] ':' <postcondition>

IAA:

   <ReadWrite> ':' <precondition> ':' <ArrayName> ':' <ArraySize> ':' <Interval>[','<UnitSize>]

where

  • <ReadWrite> is R, W, or RW
  • <Interval> is '['<boundary_specification>','<boundary_specification>']'
  • <boundary_specification> is a socket expression that specifies the array boundary
  • <UnitSize> is a positive number
  • <ArrayName> is the identifier of the array
  • <ArraySize> is a positive number

Example 1

strdup - BAA R:1:$1:[0,charlength($1)]:1

This record says that strdup reads its first argument in the range from 0 to the string length of its first argument. Essentially, this record specifies that strdup expects a zero-terminated string as its first argument.

Example 2

send_data - BAA R:1:$1:[0,if($2==0,charlength($1),$3+1)]:1

This record says that the function send_data reads from the buffer pointed to by its first argument. If the second argument is zero, the first argument is treated as a string, and the upper bound of the read is evaluated automatically as the string length of the first argument. Otherwise, the upper bound is the value of the third argument plus one.

Example 3

store_and_eval - IAA W:1:temp_buffer:4:[0,$1]

This record says that the function store_and_eval accesses an internal buffer named temp_buffer, and the range of this access is from 0 to the value of the first argument.
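
A hypothetical function with this shape is sketched below. The body is an assumption made for illustration; only the array name, its size of 4, and the access interval [0,$1] come from the record. The sketch uses a char array so that the size is the same in elements and in bytes, and it also reads the values back to compute a result.

```c
#include <assert.h>

/* Hypothetical function matching the record: writes elements 0..last_index
 * of a 4-element local array and returns the sum of the stored values. */
int store_and_eval(int last_index)
{
    char temp_buffer[4];

    /* The access interval [0,$1] stays inside the array only for 0..3. */
    assert(last_index >= 0 && last_index <= 3);

    int sum = 0;
    for (int i = 0; i <= last_index; i++) {
        temp_buffer[i] = (char)(i + 1);     /* write access to temp_buffer[i] */
        sum += temp_buffer[i];
    }
    return sum;
}
```

Without the range check, a call such as store_and_eval(4) would write past the end of temp_buffer, which is the situation the access interval in the record makes visible to a caller-side analysis.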

Example 4

sprintf - BAA W:1:$1:[0,formatted_printf(2,3)]

This record says that sprintf writes as many bytes to the buffer pointed to by its first argument as needed to print its arguments according to formatted output rules.

Byte-order records

The byte-order records specify host-to-network or network-to-host byte-order conversion functions that receive their input data or return their output data through a variable. The byte-order records describing each type of function are:

  • BO.HTON.I - host-to-network conversion in (I)
  • BO.HTON.O - host-to-network conversion out (O)
  • BO.NTOH.I - network-to-host conversion in (I)
  • BO.NTOH.O - network-to-host conversion out (O)
  • BO.READ - file read
  • BO.RECV - network receive
  • BO.SEND - network send
  • BO.WRITE - file write

Specification field syntax

<conditional_socket>

where <conditional_socket> describes the variable used to carry the input or output data

Example 1

hton - BO.HTON.I 1 : $1 : 1

This record says that conversion function hton gets its first argument as input data to convert and expects it to be in host byte order.

Example 2

hton - BO.HTON.O 1 : $$ : 1

This record says that conversion function hton returns a converted value in network byte order.
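
To make the "in" and "out" sides of a conversion function concrete, here is a minimal sketch of a 16-bit conversion pair. The names hton16, ntoh16, and write_port are hypothetical. By analogy with the records above, hton16 would take host-order data through $1 and return network-order data through $$.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Converts a 16-bit value from host byte order to network (big-endian)
 * byte order without depending on the endianness of the host. */
uint16_t hton16(uint16_t host_value)
{
    unsigned char bytes[2];
    bytes[0] = (unsigned char)(host_value >> 8);    /* most significant byte first */
    bytes[1] = (unsigned char)(host_value & 0xFF);

    uint16_t network_value;
    memcpy(&network_value, bytes, sizeof network_value);
    return network_value;
}

/* Inverse conversion: network byte order to host byte order. */
uint16_t ntoh16(uint16_t network_value)
{
    unsigned char bytes[2];
    memcpy(bytes, &network_value, sizeof bytes);
    return (uint16_t)((bytes[0] << 8) | bytes[1]);
}

/* Example use: a port number is converted before it is stored in a header. */
void write_port(unsigned char header[2], uint16_t port)
{
    assert(header != NULL);
    uint16_t network_port = hton16(port);
    memcpy(header, &network_port, sizeof network_port);
}
```

Storing the unconverted host-order value in the header, or converting the same value twice, are the kinds of mistakes that byte-order records let an analysis reason about.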

Example 3

read - BO.READ 1 : *$2 : 1

This record says that function read returns a value read from a file through a variable pointed to by the second argument.

BPS records

Buffer property settings (BPS) records describe how a function changes buffer properties.

Specification field syntax

   <property_name> '=' <boundary_specification>

Example 1

strdup - BPS charlength($$)=charlength($1)
strdup - BPS bytesize($$)=charlength($1)+1

These records for string-duplication functions say that

  • the length of the string in the returned buffer is equal to the length of the string passed as the first argument
  • the size in bytes of the newly allocated buffer is the string length of the first argument plus one (for a zero-byte terminator)

Example 2

strcpy - BPS charlength($1)=charlength($2)

This record says that the buffer passed as the first argument to strcpy has a string of the same length as the string passed as the second argument.
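
The strdup properties from Example 1 can be seen directly in a hand-written equivalent. The helper name dup_string is hypothetical; the comments map each step to the property it corresponds to.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* strdup-like helper written out so that both buffer properties are visible. */
char *dup_string(const char *source)
{
    assert(source != NULL);

    size_t length = strlen(source);         /* charlength($1) */
    char *copy = malloc(length + 1);        /* bytesize($$) = charlength($1) + 1 */
    if (copy == NULL)
        return NULL;

    memcpy(copy, source, length + 1);       /* copies the terminator as well */
    return copy;                            /* charlength($$) = charlength($1) */
}
```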

CONC and LOCK records

The CONC and LOCK records relate to functions that lock and unlock variables, threads, mutexes, and handlers, under certain conditions. The CONC and LOCK records are:

  • CONC.CONDSIGNAL - specifies that a function unlocks threads that have been locked on a condition variable
  • CONC.CONDWAIT - specifies that a function locks the calling thread on a condition variable, and releases the locked mutex
  • CONC.LOCK - specifies that a function locks a handler
  • CONC.LOCK.TRY - specifies that a function tries to lock a handler
  • CONC.UNLOCK - specifies that a function unlocks a handler
  • CREATETHREAD - specifies functions that create threads, for example, pthread_create in Linux, or CreateThread in Windows
  • LOCK - specifies that a function locks a variable
  • LOCK_START - specifies that a function locks a variable that corresponds to a single lock on the current thread
  • LOCK_UNLOCK - specifies that a function locks and unlocks a variable
  • UNLOCK - specifies that a function unlocks a variable
  • UNLOCK_START - specifies that a function unlocks a variable that corresponds to a single unlock for a function that creates a thread

Specification field syntax

CONC.CONDSIGNAL, CONC.LOCK, CONC.LOCK.TRY, CONC.UNLOCK, LOCK, UNLOCK

   <conditional_socket>

where <conditional_socket> specifies the conditions for the lock or unlock

CONC.CONDWAIT

   <precondition> ':' <socket_expression> ':' <socket_expression> ':' <postcondition>

where

  • <precondition> and <postcondition> specify the conditions for the operation
  • the <socket_expression> fields define the condition variable and the mutex to be unlocked

Example 1

pthread_cond_signal - CONC.CONDSIGNAL 1 : *$1 : 1

This record says that the function pthread_cond_signal always unlocks at least one of the threads that are blocked on the specified condition variable pointed to by its first argument, and returns 0 if the operation is successful.

Example 2

pthread_cond_wait - CONC.CONDWAIT 1 : *$1 : *$2 : 1

This record says that the function always releases the mutex pointed to by its second argument, and causes the calling thread to lock on the condition variable pointed to by its first argument.

Example 3

pthread_mutex_lock - CONC.LOCK 1 : *$1 : $$ EQ(0)

This record says that the function pthread_mutex_lock always tries to lock the object pointed to by its first argument, and returns 0 if the operation is successful. If the mutex is already locked, the calling thread blocks until the mutex becomes available.

DBZ records

DBZ records relate to functions that can cause division by zero, either by returning a zero constant value, by writing a zero constant value to a variable pointed to by their arguments, or by performing a division operation that depends on input arguments. The DBZ records are:

  • DBZ.SRC - specifies a function call used as a source for division by zero. This means that this function can return a zero constant value either directly or indirectly through output arguments.
  • xDBZ - specifies a function that uses an argument as a divisor for a division operation without checking it for the zero constant value.

Specification field syntax

DBZ.SRC

<conditional_socket>

where

  • <conditional_socket> identifies the variable that can be assigned a zero constant value and the conditions for the assignment.

xDBZ

<conditional_socket>

where

  • <conditional_socket> specifies the conditions for doing a division by zero.

Example 1

foo - DBZ.SRC $1 EQ(0) : $$ : 1
bar - DBZ.SRC $1 LE(-1) : *$2 : 1

The first record of this example shows that the function 'foo' returns a zero constant value as its return value if its first argument is equal to 0.

The second record shows that the function 'bar' writes a zero constant value into the memory referenced by the second argument if its first argument is not greater than -1.

In this code snippet:

int dbz_01(int total) {
    int x = 0;
    int count = foo(x);
    return total / count;
}

Klocwork would detect a division by zero when the variable 'count' is used as a divisor, because 'x' is equal to 0 when it is passed as the first parameter in the call to the function 'foo'. In that case, 'foo' returns a zero constant value, which is assigned to the variable 'count'.

In this code snippet:

int dbz_02(int x) {
      int may_be_zero;
      bar(x, &may_be_zero);
      if (x > 0) {
           return x / may_be_zero;
      }
      return (-x) / may_be_zero;
}

Klocwork would detect a division by zero for the last use of the variable 'may_be_zero', because it would be reached if variable 'x' is less than 0, and if 'x' is less than zero, a call to function 'bar' would write a zero constant value into variable 'may_be_zero'.

Example 2

blah - xDBZ $3 GE(1): $1 : 1

This record says that function 'blah' uses the first argument as a divisor of a division or modulo operation without checking it for the zero constant value if the third argument is greater than or equal to 1.
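
The record does not show the function body. The sketch below is a hypothetical implementation that is consistent with it, together with a caller-side guard; the parameter names and the blah_checked wrapper are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical function matching the record: divides by its first argument
 * only when the third argument is at least 1, and never checks for zero. */
int blah(int divisor, int value, int mode)
{
    if (mode >= 1)
        return value / divisor;     /* unchecked use of $1 as a divisor */
    return value;
}

/* Caller-side guard: rules out a zero divisor before the risky call. */
int blah_checked(int divisor, int value, int mode, int fallback)
{
    if (mode >= 1 && divisor == 0)
        return fallback;
    return blah(divisor, value, mode);
}
```

Passing a zero divisor is harmless when the third argument is less than 1, which is why the record carries the precondition $3 GE(1).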

FREE and SAFE_FREE records

The FREE and SAFE_FREE records relate to functions that free memory. Klocwork normally issues a report if functions from different groups are used to allocate and then free memory, for example, mixing C and C++ memory management functions, or mixing scalar and vector memory management functions. The FREE record is used to define specific allocation and freeing behavior.

Specification field syntax

   <alloc_group> <expression> [post: <postcondition>]

where

  • <alloc_group> identifies a memory-function group
  • <expression> specifies which argument is freed

Example 1

realloc - FREE stdc $1 post: $$ NE(0)

This record says that realloc is an stdc function that frees memory passed with the first argument, and returns a non-null result when memory is freed.
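
The postcondition matters to callers: the old block is released only when realloc returns a non-null result. A minimal sketch of a caller that respects this follows; the helper name resize_buffer is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Resizes *buffer to new_size bytes. Returns 0 on success and -1 on failure.
 * On failure, *buffer is left untouched and still owns the old block. */
int resize_buffer(char **buffer, size_t new_size)
{
    assert(buffer != NULL && new_size > 0);

    char *resized = realloc(*buffer, new_size);
    if (resized == NULL)
        return -1;          /* $$ is null: the old block was not freed */

    *buffer = resized;      /* $$ NE(0): the old pointer must not be used again */
    return 0;
}
```

Assigning the result of realloc directly to *buffer would lose the only pointer to the old block when the call fails.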

Example 2

hsplit - SAFE_FREE stdc $1->tbl_array

This record says that the first argument of the function hsplit points to a structure. Memory referenced by the field tbl_array of this structure is released, and a new value is assigned to this field.

Hash Salt records

The following hash salt records relate to functions that compute hash or hash-based derived key values so that the salt can be provided as an argument. They can be used to extend the RCA.HASH.SALT.EMPTY checker.

  • RCA.HASH.SALT records specify the salt argument of these functions.
  • RCA.HASH.SALT.SIZE records specify the salt size or the salt length argument of these functions.

Specification field syntax

RCA.HASH.SALT

<conditional_socket>

where

  • <conditional_socket> identifies the argument intended for providing the salt to the function.

RCA.HASH.SALT.SIZE

<conditional_socket>

where

  • <conditional_socket> identifies the argument intended for providing the salt size or length to the function.

Example

generateHash - RCA.HASH.SALT 1 : $2 : 1
generateHashWithSaltSize - RCA.HASH.SALT 1 : $2 : 1
generateHashWithSaltSize - RCA.HASH.SALT.SIZE 1 : $3 : 1

These records specify that the generateHash and generateHashWithSaltSize functions use the second argument as a salt for hash computation, and that generateHashWithSaltSize uses the third argument to determine the size of the array or string that is passed as a salt.

NNTS.SRC records

NNTS.SRC records relate to the detection of a non null-terminated string problem, and designate functions that can return non null-terminated strings.

Specification field syntax

   <conditional_socket> ':' <subtype> ':' <size> ':' <src_expression>

where

  • <conditional_socket> identifies a variable that can be non null-terminated and its conditions
  • <subtype> is ncpy for a function that copies memory buffers from a source location (strncpy, memcpy), or size for read-type functions (read, fread)
  • for the size subtype, <size> is a <range_condition> that specifies the new size of the possible non null-terminated buffer
  • for the ncpy subtype, <src_expression> is a socket expression that identifies variables used as source buffers

Example

strncpy - NNTS.SRC 1 : $1 : 1 : ncpy : [$3] : $2
read - NNTS.SRC 1 : $2 : 1 : size : [$3]

The first record of this example shows that the function 'strncpy' can leave a non null-terminated string in its first parameter if the value of its third parameter is less than or equal to the length of the string in its second parameter. The second record shows that the function 'read' can return a non null-terminated string through its second parameter; the size of the buffer is passed in its third argument.
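
A common way to avoid the strncpy case is to write the terminator explicitly after the copy. The helper below is a minimal sketch of that idiom; the name copy_terminated is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Copies source into destination and always null-terminates the result,
 * truncating if necessary. Returns the length of the stored string. */
size_t copy_terminated(char *destination, size_t destination_size,
                       const char *source)
{
    assert(destination != NULL && source != NULL && destination_size > 0);

    /* strncpy writes no terminator when source holds destination_size - 1
     * or more characters ... */
    strncpy(destination, source, destination_size - 1);
    /* ... so the last byte of the buffer is set explicitly. */
    destination[destination_size - 1] = '\0';
    return strlen(destination);
}
```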

NPD records

NPD records relate to functions that can cause null pointer dereferencing, either returning a null value or writing a null value to a variable pointed to by their arguments. The NPD records are:

  • NPD - specifies a function that dereferences an argument without checking it for null (and if null is passed to the function, a runtime error occurs)
  • NPD.SRC - specifies a function call used as a source
  • xNPD - specifies a function call used as a sink

Specification field syntax

NPD

   <arg_number>

where <arg_number> is the parameter number to dereference

NPD.SRC

   <conditional_socket>

where <conditional_socket> identifies the variable that can be assigned the null value and the conditions for the assignment

xNPD

   <range_condition> ':' <arg_number>

where <range_condition> specifies the conditions for dereferencing

Example 1

myElemCopy - NPD 1
myElemCopy - NPD 2

The records in this example show that the function myElemCopy dereferences the first two arguments without checking them for null.

In this code:

tElem *bar(tElem *e1, tElem *e2)
{
  if (!e1 && !e2) return NULL;
  if (!e1) e1 = createElem();
  myElemCopy(e1,e2);
  return e1;
}

Klocwork would detect a possible null pointer dereference for a call to myElemCopy, because if e2 is null and e1 isn't null, the second argument passed to myElemCopy would be null, causing the application to fail.
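
One possible way to remove the defect is to rule out null for both arguments before the call. The sketch below is self-contained, so it supplies hypothetical definitions for tElem, createElem, and myElemCopy, which the snippet above leaves undefined. Returning e1 unchanged when e2 is null is a design choice made for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical element type; the original snippet does not define tElem. */
typedef struct { int value; } tElem;

/* Hypothetical allocator standing in for createElem(). */
tElem *createElem(void)
{
    return calloc(1, sizeof(tElem));
}

/* Dereferences both arguments without checking them, as the NPD records say. */
void myElemCopy(tElem *destination, const tElem *source)
{
    destination->value = source->value;
}

/* Variant of bar() that rules out null for both arguments before the call. */
tElem *bar_checked(tElem *e1, tElem *e2)
{
    if (!e2) return e1;             /* nothing to copy from */
    if (!e1) e1 = createElem();
    if (!e1) return NULL;           /* allocation failed */
    myElemCopy(e1, e2);
    return e1;
}
```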

Example 2

foo - NPD.SRC $1 EQ(0) : $$ : 1
xff - NPD.SRC $2 LE(-1) : *$1 : 1

The first record of this example shows that the function 'foo' returns a null pointer value as its return value if its first argument is equal to 0. The second record shows that the function 'xff' writes a null pointer value into the memory referenced by the first argument if its second argument is not greater than -1.

In this code snippet:

int npd_01(int t) {
      if (!t) { 
            char *s = foo(t);
            return *s != '\0';
      }
      return 0;
}

Klocwork would detect a null pointer dereference where variable 's' is dereferenced, because 't' is equal to 0 when it's passed as the first parameter in the call to the function foo. In that case, foo returns null, which is assigned to variable 's'.

In this code snippet:

int npd_02(int w) {
      int *p;
      xff(&p, w);
      if (w > 0) {
           return *p;
      }
      return -*p;
}

Klocwork would detect a null pointer dereference for the second dereference of variable 'p', because it would be reached if variable 'w' is less than 0, and if 'w' is less than zero, a call to function xff would write a null value into variable 'p'.

Example 3

bcopy - xNPD $3 GE(1): 1

This record says that function bcopy dereferences the first argument without checking it for null if the third argument is greater than or equal to 1.

PWD_INPUT records

PWD_INPUT records relate to functions that identify a password field and take the password as input from the user.

  • PWD_INPUT.SRC records specify the functions that identify a password field.
  • PWD_INPUT.SINK records specify the functions that accept a password as input from the user.

Specification field syntax

<socket_expression>

where <socket_expression> defines the source functions PWD_INPUT.SRC or the sink functions PWD_INPUT.SINK

Example 1:

QLineEdit::setEchoMode  QLineEdit*,QLineEdit::EchoMode, PWD_INPUT.SRC $1[1,3] : *$0 :1
QLineEdit::text const\ QLineEdit*, PWD_INPUT.SINK 1: *$0 :1

These records describe Qt framework functions. Argument 1 of setEchoMode is an enum that must have a value between 1 and 3 for the widget to be a password field, so this range is added as a precondition in the PWD_INPUT.SRC record.

Example 2:

gtk_entry_set_visibility -         PWD_INPUT.SRC $2 EQ(0): *$1 :1
gtk_entry_set_input_purpose -      PWD_INPUT.SRC $2[8,9] : *$1 :1
gtk_entry_get_text -               PWD_INPUT.SINK 1: *$1 :1

These records describe GTK framework functions. Argument 2 of gtk_entry_set_visibility must have a value of 0 for the entry to be a password field, so it is added as a precondition in the PWD_INPUT.SRC record. Similarly, argument 2 of gtk_entry_set_input_purpose is an enum that must have a value of either 8 or 9 for the entry to be a password field, so it is also added as a precondition in the PWD_INPUT.SRC record.

Example 3:

wxTextCtrl::\#constructor   *      PWD_INPUT.SRC $6 EQ(2048) : *$0 : 1
wxTextCtrl::GetValue *             PWD_INPUT.SINK 1: *$0 :1

These records describe wxWidgets framework functions. Argument 6 of the constructor must have a decimal value of 2048 for the control to be a password field, so it is added as a precondition in the PWD_INPUT.SRC record.

R and W records

R records specify that a function reads the memory of an argument, part of an argument, or the memory pointed to by an argument or part of an argument. W records specify that a function writes the memory.

Specification field syntax

   <simple_condition> ':' <socket_expression> | 'dummy'

where

  • <simple_condition> says that the function either always reads or writes it ('1') or might read or write it ('env')
  • <socket_expression> designates the value which is read or written
  • 'dummy' means that the called function does not read any passed values or memory reachable by them

Example 1

As shown in this code snippet, function point_getXY reads the x and y fields of the structure pointed to by its first argument:

struct Point { int x, y; };
int point_getXY(struct Point * p) {
    return p->x * p->y;
}

Here are some corresponding knowledge base records:

point_getXY - R 1:$1
point_getXY - R 1:$1->Point::x
point_getXY - R 1:$1->Point::y

Example 2

In this code snippet, function check_buf is a dummy function:

int check_buf(char * buf) {
    return 1;
}

Here is a corresponding knowledge base record:

check_buf - R dummy

RET records

All the RET records specify characteristics related to return values for functions. The RET records include:

  • RET - specifies values and symbolic conditions returned by a function either directly (via the return value) or indirectly (via a pointer passed as an argument) whenever the specified preconditions are met
  • CHECKRET - specifies the number of times a result was checked for null before the value was returned
  • CONDNORET - specifies functions that terminate the execution of a process or thread depending on the value of their parameters
  • NORET - specifies functions that never return, such as exit and abort functions
  • RETARG - specifies functions that return the value of their arguments, either when an argument is always the return value of a function or when it's returned through another argument
  • xRET - specifies a dependency between a returned value and values modified by a function through its arguments

For examples of using NORET and CONDNORET in tuning Klocwork analysis, see Tuning C/C++ analysis.

Specification field syntax

RET

     <precondition> : <postcondition>  

CHECKRET

   <checked> ',' <total>

CONDNORET

   <socket_expression>

RETARG

   'RETARG' '1' ':' <socket_expression> '=' <socket_expression>

xRET

   <range_value-1> ':' <socket_expression> <range_value-2>

where

  • <range_value-1> specifies the return value
  • <range_value-2> specifies a value returned by reference

Example 1

foo - RET 1: $$ NE(0)  
foo - RET $1 GE(1), $2 EQ(0) : $$ GE(1), $$==$1, *$3 EQ (-1)  

Each RET record declares that if the function receives parameters that match <precondition>, it returns the values and conditions specified by the <postcondition>. The first record says that for any given input this function returns (directly) a non-zero value. The second record says that if the 1st argument is greater than or equal to 1 and the 2nd argument is equal to 0, the function ‘foo’ returns a value that is equal to the value of the 1st argument and is also known to be greater than or equal to 1. In that case, ‘foo’ also sets the value of the variable pointed to by the 3rd argument to -1.
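
To make the two records concrete, here is one possible body for foo that satisfies both of them. The signature and the fallback return value are assumptions chosen for illustration; the records themselves do not prescribe them.

```c
/* Hypothetical implementation consistent with both RET records for foo.
 * Record 1: the direct return value is never 0.
 * Record 2: if a >= 1 and b == 0, foo returns a and stores -1 in *out.
 */
int foo(int a, int b, int *out) {
    if (a >= 1 && b == 0) {
        *out = -1;      /* *$3 EQ(-1) */
        return a;       /* $$ == $1, and therefore $$ >= 1 */
    }
    return -1;          /* any other input: still non-zero, *out untouched */
}
```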

Example 2

kwapi_cfgparam_getParameterValue - CHECKRET 31,43

This CHECKRET record says that the result was checked for null in 31 calls to the function, out of a total of 43 calls.

Example 3

check_status - CONDNORET ($1!=0)

This CONDNORET record says that check_status aborts program execution if its first argument is not equal to zero.
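
A function with this behavior might look like the following sketch. The body of check_status and the helper run_step are hypothetical; only the record above comes from this reference.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical implementation: terminates the process when status != 0,
 * which is the behavior that "check_status - CONDNORET ($1!=0)" describes. */
void check_status(int status) {
    if (status != 0) {
        fprintf(stderr, "fatal: status %d\n", status);
        abort();
    }
}

/* Hypothetical caller: once check_status(rc) has returned, rc must be 0. */
int run_step(int rc) {
    check_status(rc);
    return rc;   /* reached only when rc == 0 */
}
```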

Example 4

It is important for Klocwork to know that control flow never returns from some calls. For example, in this snippet:

if (p==NULL) myAssertFunction("p == NULL");
strcpy(p,"Some string");

If Klocwork knows that myAssertFunction never returns, then it knows that the following call to strcpy never passes NULL as the first argument. Otherwise, Klocwork issues a warning about a possible null pointer dereference.

myAssertFunction - NORET

This NORET record identifies myAssertFunction as a function that doesn't return.
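
For context, the snippet above could sit in code like the following sketch. The body of myAssertFunction and the wrapper function fill are assumptions; the point is only that the handler ends in abort and therefore never returns to its caller.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical implementation of the assertion handler: it never returns. */
void myAssertFunction(const char *msg) {
    fprintf(stderr, "assertion failed: %s\n", msg);
    abort();
}

/* Hypothetical wrapper around the snippet from this example.
 * With the NORET record in place, p is known to be non-NULL at strcpy. */
void fill(char *p) {
    if (p == NULL) myAssertFunction("p == NULL");
    strcpy(p, "Some string");
}
```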

Example 5

RETARG records specify that a function returns the value of its argument. Both of the following scenarios are supported:

  • when a value passed through an argument is always returned as the return value of the function
  • when a value is returned through another argument (written into pointed or referenced memory)

A simple example of the first type of return is shown in the following code, which uses these definitions:

typedef struct Point { int x, y, z; } Point;
int point_getX(Point* p_p) {
    return p_p->x;
}
Point p;

In the following statement:

   y = point_getX(&p);

the value of p.x is assigned to y.

point_getX - RETARG 1:$$=$1->Point::x
reassign_filestream_1 - RETARG 1:*$1=*$2

The first record in this RETARG example says that point_getX returns field x of its first argument. The second record says that reassign_filestream_1 writes the value of the variable pointed to by the second argument into the variable pointed to by the first.
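
The second scenario, where the value is returned through another argument, can be sketched as follows. The signature of reassign_filestream_1 is not given in this reference, so the one below is an assumption.

```c
#include <stdio.h>

/* Hypothetical signature: copies the stream handle *src into *dst,
 * matching "reassign_filestream_1 - RETARG 1:*$1=*$2". */
void reassign_filestream_1(FILE **dst, FILE **src) {
    *dst = *src;   /* value of *$2 is written into *$1 */
}
```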

Example 6

acpi_ex_get_object_reference - XRET  EQ(0): *$2 NE(0)

In this example, the xRET record says that acpi_ex_get_object_reference returns 0 if it sets the variable pointed to by the second argument to a value other than zero.
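
The same dependency can be illustrated with a small stand-in function. The name get_object_reference, its signature, and the record in the comment are hypothetical and are not the ACPI function from the example.

```c
#include <stddef.h>

static int g_object = 42;   /* stand-in for a real object */

/* Hypothetical function: returns 0 on success and, in that case,
 * stores a non-NULL pointer in *out.
 * Assumed record: get_object_reference - xRET EQ(0): *$2 NE(0)
 */
int get_object_reference(int key, int **out) {
    if (key < 0) {
        *out = NULL;
        return -1;          /* failure: non-zero return, *out is NULL */
    }
    *out = &g_object;
    return 0;               /* success: zero return, *out is non-NULL */
}
```

With such a record, a caller that checks the return value for 0 does not need a separate null check on the output pointer.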

SETZERO records

SETZERO records specify that a function writes null bytes to the memory pointed to by an argument or part of an argument.

Specification field syntax

   <socket_expression> ':' <Interval>

where

  • <socket_expression> identifies the memory that is written by the function
  • <Interval> is '['<boundary specification>','<boundary_specification>']'
  • <boundary_specification> is a socket expression that specifies the range of written memory that is filled with null bytes

Example

ResetName - SETZERO $1 : [0,$2]
initNode - SETZERO $1->Node::name : [160,160]

The first record of this example shows that the function 'ResetName' fills the first bytes of the memory area pointed to by the first parameter with null bytes. The number of written bytes is passed as the second argument. The second record shows that the function 'initNode' writes one null byte to the buffer 'name' (member of the structure passed through the first argument), using the constant 160 as an index.
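
Functions matching these two records might be written as in the sketch below. The bodies, the parameter types, and the size of the name buffer are assumptions made for illustration.

```c
#include <string.h>

struct Node { char name[161]; };   /* assumed size: index 160 is the last byte */

/* Hypothetical implementation: fills the first len bytes of name with null bytes. */
void ResetName(char *name, size_t len) {
    memset(name, 0, len);
}

/* Hypothetical implementation: writes a single null byte at index 160. */
void initNode(struct Node *node) {
    node->name[160] = '\0';
}
```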

SLEEP records

SLEEP records specify that a function may block program execution for a significant amount of time.

Specification field syntax

   <precondition>

where

  • <precondition> is a condition that must be met for the function to block, or 1 to indicate that there is no precondition and the function can block on any call

Example

read - SLEEP 1
WaitForSingleObject - SLEEP $2 NE(0)

The first record says that the function 'read' may suspend process execution for a period of time. The second record shows that the function 'WaitForSingleObject' may block process execution if its second argument is not equal to 0.
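
A conditional SLEEP record fits functions like the one sketched here. The name wait_for_event and its record are hypothetical, and a busy-wait on clock() stands in for a real blocking call only to keep the sketch within standard C.

```c
#include <time.h>

/* Hypothetical wait routine: returns immediately when timeout_ms is 0,
 * otherwise blocks for roughly timeout_ms milliseconds of processor time.
 * Assumed record: wait_for_event - SLEEP $1 NE(0)
 */
void wait_for_event(unsigned timeout_ms) {
    if (timeout_ms == 0) {
        return;                       /* precondition not met: no blocking */
    }
    clock_t ticks = (clock_t)((double)timeout_ms * CLOCKS_PER_SEC / 1000.0);
    clock_t start = clock();
    while (clock() - start < ticks) {
        /* spin until the timeout elapses */
    }
}
```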

SQL Injection data records

The SQL Injection data records relate to functions that execute SQL, or prepare statement objects that denote SQL commands. The data records are:

  • SQLExec specifies a function that executes a SQL query

  • SQLProp specifies a function that propagates SQL queries from one parameter to another (strings, or prepared SQL statement objects)

  • SQLValidate specifies a function that checks a SQL query

Specification field syntax for SQLExec and SQLValidate

  <socket_expression>

where

<socket_expression> defines the characteristics of the SQL query.

The following examples use the SQLite C/C++ API to illustrate use of the data records:

Example 1

  sqlite3_exec - SQLExec $2

This record says that the second argument passed to sqlite3_exec denotes a SQL command, and will be executed in the call to sqlite3_exec.

Example 2

sqlite3_reset - SQLValidate $1

This record says that the first argument passed to sqlite3_reset denotes a SQL command, and will be validated in the call to sqlite3_reset. The validation, if successful, guarantees that the SQL command denoted by the first argument will be safe for execution subsequently (for example, using sqlite3_exec).

Specification field syntax for SQLProp

  <socket_expression> : <socket_expression>

where

<socket_expression> defines the characteristics of the SQL query and the object to which it is propagated.

The following example uses the SQLite C/C++ API to illustrate the use of this data record:

Example 3

  sqlite3_prepare_v2 - SQLProp $2 : *$4

This record says that the second argument of sqlite3_prepare_v2 denotes a SQL command which is used to construct the object pointed to by the fourth argument. Effectively, the SQL command stored in the second argument is propagated to the object pointed to by the fourth argument.
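
The three record kinds can also be applied to project-specific wrappers. The sketch below uses hypothetical functions (my_sql_validate, my_sql_prepare, my_sql_exec) that do not depend on SQLite. The records in the comments are assumptions modeled on the examples above, and the validation rule is deliberately naive.

```c
#include <string.h>

/* Hypothetical validator: accepts a query only if it has no quotes or semicolons.
 * Assumed record: my_sql_validate - SQLValidate $1 */
int my_sql_validate(const char *sql) {
    return strpbrk(sql, "';\"") == NULL;
}

/* Hypothetical propagation: the statement handle *stmt refers to the query text.
 * Assumed record: my_sql_prepare - SQLProp $1 : *$2 */
void my_sql_prepare(const char *sql, const char **stmt) {
    *stmt = sql;
}

/* Hypothetical executor stub: a real one would send stmt to the database.
 * Assumed record: my_sql_exec - SQLExec $1 */
int my_sql_exec(const char *stmt) {
    return stmt[0] != '\0' ? 0 : -1;
}
```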

Tainted and unsafe data records

The tainted data records relate to functions that may return or use unvalidated data. The tainted data records are:

  • TaintedIntData specifies a function that returns an integer
  • TSCheckPT specifies a function that checks a string
  • TSCheckXSS specifies a function that cleans the query string. This record uses one socket expression and is used to list, in the KB, the functions on which the analysis should be aborted without reporting the defect.

  • TSFMTsink specifies a function that returns a format string
  • TSprop and TSsrc specify a function that returns a string
  • TSSinkXSS specifies a function that can write to the stdout stream. This record uses two socket expressions: the first gives the index of the function parameter through which the tainted string can be passed, and the second (optional) gives the index of the parameter that accepts FILE* input. You cannot use $0 (object) or $$ (return variable). If the expression has one socket (for example, TSSinkXSS $3), there is no sub-selective condition. If the expression has two sockets (for example, TSSinkXSS $3 : $1), you cannot create KB records for the functions "stdout", "_iob[1]", "&_iob[1]", "__acrt_iob_func(1)", "&__iob_func()[1]", "__iob_func()[1]", and "(__getreent())->_stdout".

  • UnsafeAllocSizeAccepter specifies a function that uses data for memory allocation size
  • UnsafeArrayIndexAccepter specifies a function that uses data as an array index
  • UnsafeBinopAccepter specifies a function that uses data in an arithmetic binary operation
  • UnsafeLoopBoundAccepter specifies a function that uses data as a loop bound

Specification field syntax

   <socket_expression>

where <socket_expression> defines the characteristics of the tainted data

Example 1

win32_fread - TaintedIntData *$1

This record says that the variable pointed to by the first argument of win32_fread has a tainted value after the call.

Example 2

<function name> - TSCheckPT 1 : $<parameter number> : 1

In this record <function name> is the name of the function used to neutralize the file path and <parameter number> is the number of the parameter that receives the file path that requires neutralization.

Example 3

strcat - TSprop $1 : $1 , $2

This record says that the buffer passed to strcat as the first argument is tainted after returning from the call if the first or second arguments point to buffers with tainted data.

Example 4

str_new - UnsafeAllocSizeAccepter $1

This record says that the function str_new uses the first argument to calculate the size of memory that needs to be allocated, but does not check if the value is valid.
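
A function that would warrant this record could look like the following sketch. The body of str_new is an assumption; what matters is that the argument flows into the allocation size without a range check.

```c
#include <stdlib.h>

/* Hypothetical implementation: allocates len + 1 bytes without validating len,
 * which is what "str_new - UnsafeAllocSizeAccepter $1" tells the analysis. */
char *str_new(size_t len) {
    char *s = malloc(len + 1);   /* len is used unchecked; a tainted value could be huge or wrap */
    if (s != NULL) {
        s[0] = '\0';             /* start as an empty string */
    }
    return s;
}
```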

Example 5

puts - TSSinkXSS $1

In the puts() function, the first parameter can be assigned a tainted string. It does not have any parameter of FILE* type.

Example 6

fprintf - TSSinkXSS $3 : $1

In the fprintf() function, the third parameter can be assigned a tainted string. Its first parameter is FILE* type, which should be checked against stdout.

Example 7

func - TSCheckXSS $1

If a tainted string is passed in the first parameter in the func() function, the analysis should be aborted without reporting the defect.

Username and password records

Username and password records relate to functions that perform some kind of authentication and take the specified data as input parameters. These records extend the HCC checker.

  • HCC.SINK.USER records specify the username argument of such functions.
  • HCC.SINK.PWD records specify the password argument of such functions.

Specification field syntax

   <socket_expression>

where <socket_expression> defines the username argument of the function (for HCC.SINK.USER) or the password argument of the function (for HCC.SINK.PWD)

Example 1

db_connect - HCC.SINK.USER 1 : $2 : 1
db_connect - HCC.SINK.PWD 1 : $3 : 1 

These records specify that the db_connect function uses the second argument as the username and the third argument as the password for authentication.
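
The sketch below shows a function with the argument layout these records describe, along with the kind of call they are meant to expose. The body of db_connect and the helper connect_with_literals are hypothetical.

```c
#include <string.h>

/* Hypothetical authentication routine: the second argument is the username and
 * the third is the password, as the two HCC.SINK records declare. */
int db_connect(const char *host, const char *user, const char *pwd) {
    (void)host;
    return (user != NULL && pwd != NULL && strlen(pwd) > 0) ? 0 : -1;
}

/* Hypothetical caller that passes string literals to both sinks. With the
 * records in place, this is the kind of call the HCC checker can examine. */
int connect_with_literals(void) {
    return db_connect("localhost", "admin", "secret");
}
```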

xERRNO records

xERRNO specifies whether a function sets the ERRNO value. ERRNO (errno in C source code) is a global variable defined in the errno.h header file.

The value of ERRNO is set automatically for some function calls in C. The value of the ERRNO variable can be used to identify which error was encountered.

Specification field syntax

   <flag>

where

  • <flag> defines the value that indicates whether the function sets the value of ERRNO

  • <flag> value 0 means that the function call does not set the value of ERRNO, as documented by the C standard

  • <flag> value 1 means that the function call sets the value of ERRNO, as documented by the C standard

Example 1

fopen - xERRNO 0
ftell - xERRNO 0

These records specify the C functions that do not set the value of ERRNO. Therefore, flag value 0 is used.

Example 2

fgetwc - xERRNO 1
strtoull - xERRNO 1

These records specify the C functions that set the value of ERRNO. Therefore, flag value 1 is used.
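
As an illustration of a function with flag value 1, the following sketch relies on the standard behavior of strtoull, which returns ULLONG_MAX and sets errno (written ERRNO in this reference) to ERANGE when the input is out of range. The wrapper parse_ull is a hypothetical helper.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parses text as an unsigned long long; returns 0 on success, -1 on range error.
 * errno is cleared first so that a stale value is not mistaken for a new error. */
int parse_ull(const char *text, unsigned long long *out) {
    errno = 0;
    *out = strtoull(text, NULL, 10);
    return (errno == ERANGE) ? -1 : 0;
}
```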

XMRF records

XMRF records specify whether ownership of allocated memory, referenced by pointers passed into a function, is retained by the caller or transferred to the function. To see examples of how to create XMRF records to tune Klocwork analysis, see Tuning C/C++ analysis.

Notes on backwards compatibility

Previously, the knowledge base record for specifying ownership transfer was DMEM MRF/NMRF. Now, the record is called XMRF, and the syntax of the record has changed from the DMEM syntax. However, Klocwork still supports existing DMEM MRF/NMRF records.

Specification field syntax

   <socket_expression> ':' <retention_flag>

where <retention_flag> is either 0 to signal that the caller does not retain ownership, or 1 to show that the caller does retain ownership

Example

f_act - XMRF $3 : 1
f_test - XMRF $2 : 0

These records indicate that the function calling f_act retains ownership of the memory passed as the third argument, and that the function calling f_test transfers ownership of the memory passed as the second argument to f_test.
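
The difference matters to the caller's cleanup code, as the following sketch suggests. The signatures and bodies of f_act and f_test, the global g_saved, and the function caller are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

static char *g_saved;   /* memory owned by f_test after the call */

/* Hypothetical: only reads buf, so the caller keeps ownership (XMRF $3 : 1). */
size_t f_act(int a, int b, char *buf) {
    (void)a; (void)b;
    return strlen(buf);
}

/* Hypothetical: stores buf and releases the previously stored buffer,
 * so ownership moves to f_test (XMRF $2 : 0). */
void f_test(int a, char *buf) {
    (void)a;
    free(g_saved);
    g_saved = buf;
}

/* Hypothetical caller: frees after f_act, but not after f_test. */
size_t caller(void) {
    char *kept = malloc(8);
    char *given = malloc(8);
    if (kept == NULL || given == NULL) { free(kept); free(given); return 0; }
    strcpy(kept, "abc");
    strcpy(given, "xyz");
    size_t n = f_act(0, 0, kept);
    free(kept);              /* caller still owns kept */
    f_test(0, given);        /* ownership of given is transferred */
    return n;
}
```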