4. Configuring semantic analyses

Different stylecheck rule categories

  1. Syntactic checks

    For example, “are goto statements used?” or compliance with naming conventions. Most of the MISRA C / MISRA C++ / AUTOSAR rules fall into this category.

  2. Semantic checks

    • Detect runtime errors like division by zero, uninitialized variables, unused definitions, over- or underflows, null dereferences, etc.

    • Require deep semantic analysis.

    • Used e.g. by MisraC2012Directive-4.1.

  3. Rules that belong to both categories.

Rules that belong to the “semantic” category usually

  • require more resources for analysis (time and memory),

  • can produce false positives (unavoidable due to theoretical limits),

  • have more extensive settings.

Hence, the default settings deactivate semantic checks.

4.1. How to recognize expensive rules?

Attention

You still have to activate StaticSemanticAnalysis for these rules to work.

4.2. Workflow

Todo

Add picture of workflow

  1. Run base analysis.

  2. Run all semantic checks using information from the base analysis (each check possibly invoking its own expensive analysis).

Some configuration options influence both workflow stages, e.g., EntryPoints, Externals, or Resources.

4.3. Configuring StaticSemanticAnalysis

This rule triggers the basic semantic analysis. It computes

  • data flow,

  • control flow, e.g. call graph, dead code sections,

  • pointer analysis, and

  • numerical values.

Pointer analysis computes targets for dereferencing pointers, for example:

int x;
int y;
extern int get(void); /* condition unknown to the analysis */
int *p = get() ? &x : NULL;
*p = 42; // p == NULL or p points to x (never to y)

The basic semantic analysis can execute an additional module (“Abstract Interpretation”) to improve precision for

  • information where a pointer can possibly be null, and

  • numerical information (useful for checks like division by zero, overflow, or out-of-bounds accesses).

Its configuration contains options that influence the base analysis as well as all subsequent semantic checks.

4.3.1. Top-level options

Todo

This section might be out-of-date

global_classic_options

Controls the precision of the base analysis, e.g. the pointer analysis, and the activation of the Abstract Interpretation module.

The name anticipates a future analysis mode, FSPTA, which is not yet available. Once it is, the current analysis mode will be called “global classic mode”.

abstract_interpretation_options

Controls precision of the Abstract Interpretation module.

callgraph_options

General assumptions about calls and the structure of the call graph.

exception_options

Assumptions regarding the analysis of exceptions (only relevant for C++ code).

enable_filtering_heuristics

If active, rules are allowed to make unsound assumptions in order to exclude improbable findings. This can reduce false positives, but true findings might then no longer be reported either.

logging_level

Controls the detail level of the logging output.

4.3.2. Options for increasing precision

Todo

This section might be out-of-date

/Analysis/AnalysisControl/StaticSemanticAnalysis/global_classic_options.routine_context_granularity = N

Callers of a function are separated into at most N contexts (N = 0, 1, …, 8). A larger number of contexts potentially yields more precision, but is more expensive (set N = 1 for large projects).

/Analysis/AnalysisControl/StaticSemanticAnalysis/global_classic_options.abstract_interpretation

Activates the Abstract Interpretation module for the division-by-zero, overflow, and null-dereference checks.

/Analysis/AnalysisControl/StaticSemanticAnalysis/global_classic_options.process_subobject_casts

TODO

4.3.3. Options for increasing speed

Todo

This section might be out-of-date

/Analysis/AnalysisControl/StaticSemanticAnalysis/global_classic_options.use_pointer_filter

Assumes that character, floating-point, or boolean types cannot store a proper pointer. This should hold in almost all projects and helps speed up the analysis.

/Analysis/AnalysisControl/StaticSemanticAnalysis/global_classic_options.restrict_bit_size_for_integers_storing_pointers

Assumes that a proper pointer is at least as large as void*. This should be valid in almost all projects.

4.4. Configuring EntryPoints

Checks use EntryPoints as starting points for their analyses. We can distinguish two main scenarios for analysed projects.

  1. It has one entry point, namely the main() function.

  2. It has multiple entry points that have to be configured. This might be the case if

    • it consists of multiple executables,

    • it uses interrupt service routines,

    • it uses invocation mechanisms provided by the OS or framework, or

    • it is actually a library and has a public API.

To configure multiple entry points, you have three options.

Option 1: Set /Analysis/AnalysisControl/StaticSemanticAnalysis/callgraph_options.treat_uncalled_functions_as_alive to true

All functions that are never called by any other function are automatically treated as entry points. In the following example, h() and f1(...) will be treated as entry points.

void g(void) {}
void h(void) {}            // entry
void f2(void);             // forward declaration
void f1(int* x) { f2(); }  // entry
void f2(void) { g(); }

Pro:

  • Easy configuration.

Con:

  • Can lead to many entry points and large analysis times.

  • The resulting entry-point set is usually not the correct one.

Option 2: Use /Analysis/AnalysisControl/Environment/EntryPoints rules

Explicitly specify entry points with the following rules.

  • By name: EntryPoints-EntriesByName.

  • By specific keywords in the signature (e.g. IRQ_HANDLER): EntryPoints-EntriesByLinkname.

  • By file locations: EntryPoints-EntriesByPath.

Pro:

  • More efficient and more precise than the treat_uncalled_functions_as_alive option.

Con:

  • Requires more effort for maintaining configuration.

Option 3: Manually maintain a “driver” module that non-deterministically calls relevant entry points.

You can write a separate program, similar to a test suite, that prepares the initial state of your project and then calls functions either completely non-deterministically or under the constraints you choose.

int main(void)
{
   init_library();           /* prepare the initial state */
   while (1) {
      switch (random()) {    /* non-deterministic choice of entry */
         case 0: lib_func1(); break;
         case 1: lib_func2(); break;
         /* ... further entry points ... */
      }
   }
}

Pro:

  • Most precise way of specifying actual program behaviour.

  • Can be re-used for testing or other QA methods.

  • Can easily be kept under version control.

Con:

  • Requires the most effort.

  • Difficult to ensure that it represents the actual behaviour of the project.

4.5. Configuring Externals

Todo

Not yet in slides

4.6. Configuring Resources

Todo

Not yet in slides