ASN1SCC

An open source ASN.1 compiler

ASN1SCC (https://github.com/ttsiodras/asn1scc) is an open source ASN.1 compiler that was built to meet the requirements of embedded/space platforms. It generates C and SPARK/Ada code and supports the uPER (Unaligned Packed Encoding Rules) and ACN (ASN.1 Control Notation) encodings.

No dynamic memory

The code generated by ASN1SCC, as well as its run-time library, never uses dynamic memory functions (such as malloc). All memory requirements (i.e. the size of the encoded and decoded buffers for each ASN.1 message) are calculated at compile time. All required memory can therefore be statically reserved at compile time, guaranteeing that there will be no failure due to lack of memory at run time.
For example, an ASN.1 type describing an array of integers…
…triggers the generation of the following macro definition and C structure from ASN1SCC:
Notice that the maximum size of the encoded data is available at compile-time, via the macro AnArray_REQUIRED_BYTES_FOR_ENCODING. This means that the user code can statically reserve the necessary space at compile-time, in global or static variables, thus guaranteeing the availability of the necessary space:
The variable-length arrays of ASN.1 (i.e. SEQUENCE SIZE(Smin .. Smax) OF T) are mapped by ASN1SCC to C structures. These structures contain an inline fixed-size array of T with size Smax (i.e. the maximum possible extent of the array) as well as an integer field that stores the number of elements actually in use. Because the data are stored as an inline array and not as a heap-allocated block reached through a pointer (or, similarly, a linked list with pointers to heap-allocated objects), sizeof(AnArray) always represents the maximum memory needs of the target type. This is true regardless of the complexity of the type (e.g. arrays of sequences containing arrays, etc.) – and allows all the necessary memory space to be reserved at compile time.
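The mapping described above can be sketched in C as follows. This is only an illustration and not actual ASN1SCC output: the bound AN_ARRAY_MAX, the field names, the value chosen for AnArray_REQUIRED_BYTES_FOR_ENCODING and the helper AnArray_append are all assumptions made for the example. Only the type name AnArray and the macro name come from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical Smax, as in SEQUENCE (SIZE(1..10)) OF INTEGER. */
#define AN_ARRAY_MAX 10

/* Illustrative value only: the real figure is computed by ASN1SCC
   from the grammar and the chosen encoding. */
#define AnArray_REQUIRED_BYTES_FOR_ENCODING 16

/* Variable-length array mapped to an inline, fixed-size array plus a
   count. There are no pointers, so sizeof(AnArray) is the worst case. */
typedef struct {
    int32_t nCount;             /* number of elements actually in use */
    int32_t arr[AN_ARRAY_MAX];  /* storage reserved for Smax elements */
} AnArray;

/* Everything is reserved statically: no malloc anywhere. */
AnArray g_message;
uint8_t g_encoded[AnArray_REQUIRED_BYTES_FOR_ENCODING];

/* Hypothetical helper: appends a value, returns 0 on success or -1
   when the array already holds AN_ARRAY_MAX elements. */
int AnArray_append(AnArray *a, int32_t value) {
    assert(a != NULL);
    if (a->nCount >= AN_ARRAY_MAX) {
        return -1;
    }
    a->arr[a->nCount] = value;
    a->nCount++;
    return 0;
}
```

Since both g_message and g_encoded have static storage duration, the linker accounts for their full size, and running out of memory at run time is not possible for these objects.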

Automatic Statement Coverage

Critical software, such as space applications, must meet a set of guidelines – such as specific coding conventions, or thresholds for branch and statement coverage levels. ASN1SCC itself has its own regression checking suite, where thousands of ASN.1 grammars are used to drive the following sequence (for each ASN.1 grammar):

  • ASN1SCC processes the grammar and generates encoders and decoders for its types.
  • A template “main” function is created that encodes the main (root) grammar message, and then decodes it from the encoded data.
  • The generated “main” combined with the ASN1SCC generated encoders and decoders is compiled, and the generated binary is executed.
  • Checks are performed to verify that the message field data remained the same across encoding and decoding.
  • Further checks are done to see that no internal abnormal state was encountered in the encoder and the decoder during this “round trip” (even if this internal abnormality did not result in any externally visible errors). This includes memory access validations from Valgrind*.

* Valgrind: an instrumentation framework for dynamic analysis of executables ( http://valgrind.org/ )

This test suite provides a baseline of confidence for the correctness of ASN1SCC and the code it generates. However, a valid question posed by a number of early users of ASN1SCC (space companies and ESA) was…

“How can we have increased confidence that code generated by ASN1SCC will correctly process all the – theoretically infinite! – messages that are possible under a particular ASN.1 grammar?”

The caveat here being that, obviously, a project-specific grammar cannot be tested as part of the ASN1SCC quality assurance process – since ASN1SCC is not conceived with any particular grammar in mind.

To address this issue, ASN1SCC was enhanced further: it can now automatically generate a set of unit tests for a given input grammar. These unit tests are in fact sets of ASN.1 variable assignments – i.e. data assignments for the grammar’s types. The key point is that these data assignments, upon encoding and decoding, exercise the ASN1SCC-generated encoders and decoders to 100% statement coverage. At the end of this test, the user knows that (a) the automatically generated unit tests executed all the code of the encoders and decoders, without any abnormal state arising at run time, and (b) the message data were perfectly preserved in the round trip (original data => encoded message => decoded message => same data).

Assume for example that a simple ASN.1 grammar describes messages containing double precision numbers:

In the course of encoding, decoding and otherwise handling these messages, the code generated by ASN1SCC has to consider all the possible “states” for this number:
When fed with this grammar, ASN1SCC generates a number of test cases, declared as ASN.1 variable assignments:

ASN1SCC compiles these test cases and generates code for the target languages. The generated code, in tandem with a specially made test harness, is subsequently compiled – and the resulting executable is then run under a coverage checking tool**. For each individual message (“test1”, “test2”, etc.) the message content is encoded, decoded and verified to survive the round trip. At the end of the execution, the coverage checker reports the statement coverage in the encoders and decoders, and the test harness verifies that this is 100%.
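The round-trip check at the heart of this process can be sketched in C. The sketch makes several assumptions: the encoder below is a stand-in that simply writes the IEEE 754 bit pattern in big-endian order (it is not the uPER encoding of REAL), all function names are hypothetical, and the list of test values is only a guess at the kinds of “states” such tests would cover (zeros, positive and negative values, extremes and infinities).

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Stand-in encoder, NOT uPER: stores the 64-bit IEEE 754 pattern of v
   in big-endian byte order. Assumes double is 64 bits wide. */
void encode_double(double v, uint8_t out[8]) {
    uint64_t bits;
    assert(sizeof(double) == sizeof(uint64_t));
    memcpy(&bits, &v, sizeof bits);
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(bits >> (56 - 8 * i));
    }
}

/* Inverse of encode_double. */
double decode_double(const uint8_t in[8]) {
    uint64_t bits = 0;
    double v;
    for (int i = 0; i < 8; i++) {
        bits = (bits << 8) | in[i];
    }
    memcpy(&v, &bits, sizeof v);
    return v;
}

/* Returns 1 when the value is bit-for-bit identical after
   encode -> decode, 0 otherwise. */
int round_trip_ok(double v) {
    uint8_t buf[8];
    double back;
    encode_double(v, buf);
    back = decode_double(buf);
    return memcmp(&v, &back, sizeof v) == 0;
}

/* Hypothetical test values, one per "state" of interest. */
static const double k_test_values[] = {
    0.0, -0.0, 1.5, -1.5, 1e308, 5e-324, INFINITY, -INFINITY
};

/* Runs every test value through the round trip and returns the
   number of failures (0 means all values survived). */
int run_round_trip_tests(void) {
    int failures = 0;
    size_t n = sizeof k_test_values / sizeof k_test_values[0];
    for (size_t i = 0; i < n; i++) {
        if (!round_trip_ok(k_test_values[i])) {
            failures++;
        }
    }
    return failures;
}
```

Comparing bit patterns with memcmp rather than with == is a deliberate choice in this sketch: it distinguishes 0.0 from -0.0, which compare equal as numbers but are different encoded values.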

Notice that the process is automated – there’s no human involvement. If, for example, the generated test messages do not exercise a part of the code, the coverage checker will notify the user that a part of the encoder/decoder is not tested. In that case, the user can report this to the ASN1SCC developers (so that the required additional test case is generated) – or identify the parts in question and manually create additional tests for these cases, to reach 100% coverage. This feature is not currently available in any other ASN.1 compiler.

SPARK/Ada support

To increase the quality of the generated sources in environments where very high reliability is expected, the Ada versions of the encoders/decoders generated by ASN1SCC use SPARK/Ada annotations.

The SPARK language consists of a restricted, well-defined subset of the Ada language that uses annotated meta-information – in the form of Ada comments. This meta-information describes the desired component behavior and the individual runtime requirements. It therefore allows static analysis to be performed at compile-time as a further, automated check on program correctness. The static analysis verifies code invariants, preconditions and postconditions – that is, it applies Design by Contract principles to accurately formalize and validate the expected runtime behavior of the ASN1SCC-generated encoders and decoders.

Assume for example the following Ada function definition, which divides two integers:

The programmer implicitly knows that the divisor must never be zero. However, this knowledge is available only to the programmer, and not to the compiler. Therefore, if this function is called by user code that calculates the arguments at run time, it may end up being called with a divisor value of zero – and an error will then occur and potentially crash the application. To address this, SPARK allows programmers to enrich their interfaces with preconditions and postconditions. For example, the above definition in SPARK can be annotated as follows:

The line starting with --# is a SPARK annotation which explicitly informs the SPARK examiner that the DIV function requires the divisor parameter to never be zero. This means that the SPARK examiner must check, at compile time, all the places where the DIV function is called and make sure that it is never called with a divisor equal to zero. If this is not the case, the examiner reports an error.
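For contrast, C has no counterpart to the SPARK examiner, so the same contract can only be checked while the program runs. The following is a minimal sketch of that run-time alternative; the function name checked_div and its signature are hypothetical. It also rejects INT_MIN / -1, which overflows in C, a case that goes beyond the divide-by-zero precondition discussed in the text.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Run-time version of the contract "divisor must not be zero".
   Returns true and stores the result in *quotient when the
   precondition holds, false (leaving *quotient untouched) otherwise. */
bool checked_div(int dividend, int divisor, int *quotient) {
    assert(quotient != NULL);
    if (divisor == 0) {
        return false;               /* precondition violated */
    }
    if (dividend == INT_MIN && divisor == -1) {
        return false;               /* result would overflow int */
    }
    *quotient = dividend / divisor;
    return true;
}
```

The difference from the SPARK approach is when the violation is found: here every caller has to test the return value at run time, whereas the SPARK examiner rejects an offending call before the program is ever executed.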

** Gcov: a coverage checking tool ( http://gcc.gnu.org/onlinedocs/gcc/Gcov.html )

ASN1SCC’s SPARK backend emits encoders and decoders that are annotated with SPARK annotations, thus allowing the static analysis to detect invalid usage (data-wise) of the API. In fact, PER-visible ASN.1 constraints are transformed into semantically equivalent SPARK annotations, and will therefore be thoroughly checked by static analysis, guaranteeing that a buffer overflow during encoding/decoding is impossible.

For example, the following procedure is used in the encoding of a Boolean type in uPER. If the encoded value is true, then bit value 1 is written in the uPER stream – otherwise bit value 0 is written.

The derives annotation informs SPARK about how the outputs depend on the input values. The pre annotation says that whenever this procedure is called, the value of the curPos counter plus one must be within the bounds of the outputStream array. Finally, the post annotation says that the value of the curPos counter will have been increased by one by the end of this procedure.

The important impact of these annotations is that SPARK will be able to enforce these restrictions at compile time and thus an out of range exception in lines 13 and 15 is impossible.
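The same contract can be illustrated in C, where it has to be enforced by a run-time check rather than proven statically. This is a sketch under assumptions: the BitStream structure and the function name encode_boolean are hypothetical, the buffer is addressed in bytes with the most significant bit written first, and only the names curPos and the pre/post conditions are taken from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical output stream: a byte buffer addressed bit by bit. */
typedef struct {
    uint8_t *data;      /* output buffer                    */
    size_t   sizeBits;  /* capacity of the buffer, in bits  */
    size_t   curPos;    /* position of the next bit to write */
} BitStream;

/* Writes one bit: 1 for true, 0 for false.
   Pre:  curPos + 1 must not exceed sizeBits.
   Post: curPos has been increased by exactly one.
   Returns false, without modifying the stream, if the precondition
   does not hold. */
bool encode_boolean(BitStream *s, bool value) {
    uint8_t mask;
    assert(s != NULL && s->data != NULL);
    if (s->curPos + 1 > s->sizeBits) {
        return false;
    }
    mask = (uint8_t)(0x80u >> (s->curPos % 8));  /* MSB first */
    if (value) {
        s->data[s->curPos / 8] |= mask;
    } else {
        s->data[s->curPos / 8] &= (uint8_t)~mask;
    }
    s->curPos++;
    return true;
}
```

In the SPARK version the bounds test inside the function is unnecessary, because the examiner has already shown that no caller can violate the precondition.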

Integration with legacy systems – the ACN encoding

The major benefit of ASN.1 is that the encoding and decoding process is independent of the programming language, the hardware platform and the operating system. Moreover, the standardized ASN.1 encoding schemes offer additional and significant benefits, such as speed and compactness for PER or decoding robustness for BER. However, the existing ASN.1 encodings provide no means for the protocol designer to control the final encoding (i.e. the binary format at the bit level). This is a problem in situations where a new ASN.1-based system has to communicate over a binary protocol with an existing legacy system. For example, the PUS (Packet Utilization Standard), which is used in space missions to encode Telemetry/Telecommands, cannot be specified in ASN.1 using any of the existing encodings (BER, PER, etc.).

To address this issue, we designed and implemented a new ASN.1 encoding, known as ACN (ASN.1 Control Notation). ACN allows protocol designers to control the format of the encoded messages at the bit level. In ACN, users can specify how each ASN.1 type will be encoded. Attributes can be set, such as the bit length of an integer, its endianness (big/little), its alignment etc. Moreover, for aggregate fields such as SEQUENCE, CHOICE and SEQUENCE OF the user can define optionality patterns, choice determinants, length fields etc.

For example, to encode an unsigned integer in 16 bits and align it to the next octet start (i.e. a stream offset in bits which is a multiple of 8), you would…

ASN.1:  MyInt1 ::= INTEGER (0..2000)

ACN:    MyInt1 [encoding pos-int, size 16, align-to-next byte]

In the above example, the protocol designer specified via ACN that the ASN.1 type MyInt1 will be encoded as a positive integer, using 16 bits, and will always be byte aligned in the encoded buffer.
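To make the effect of these three attributes concrete, here is a minimal C sketch of an encoder that behaves as the ACN line above requests. It is not ASN1SCC output: the function name, its signature and the choice of big-endian byte order are assumptions made for the example, and the bits skipped by the alignment are simply left as they are in the buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoder for MyInt1 ::= INTEGER (0..2000) with
   [encoding pos-int, size 16, align-to-next byte].
   buf/sizeBytes describe the output buffer, *posBits is the current
   stream offset in bits. Returns false, leaving *posBits unchanged,
   if the value is out of range or the buffer is too small. */
bool encode_MyInt1(uint8_t *buf, size_t sizeBytes,
                   size_t *posBits, uint16_t value) {
    size_t aligned;
    size_t byteIndex;
    assert(buf != NULL && posBits != NULL);

    if (value > 2000) {
        return false;                    /* ASN.1 constraint (0..2000) */
    }
    aligned = (*posBits + 7) / 8 * 8;    /* align-to-next byte */
    byteIndex = aligned / 8;
    if (byteIndex + 2 > sizeBytes) {
        return false;                    /* 16 bits do not fit */
    }
    buf[byteIndex]     = (uint8_t)(value >> 8);    /* assumed big-endian */
    buf[byteIndex + 1] = (uint8_t)(value & 0xFF);
    *posBits = aligned + 16;             /* size 16 */
    return true;
}
```

For instance, starting from bit offset 3, the value 2000 (0x07D0) is written to bytes 1 and 2 of the buffer and the offset advances to 24.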

A more advanced example of ACN, based on PUS:

Figure 1: Automatic PUS encoding via ACN

Figure 1 shows a PUS packet. It consists of two main parts: (a) the Packet Header (area with the white background) and (b) the Packet Data Field (top right grey area).

The Packet Data Field part is actually a composite field consisting of multiple sub-fields. The Data Field Header has two enumerated fields (service type and service subtype) which determine the actual form of the “Application Data” field.

In other words, the “Application Data” field is a CHOICE type where the active alternative is determined by the combination of the Data Field Header fields service type and service subtype. In ASN.1 this can be modeled as follows:

With the appropriate ACN syntax, one can bind the elements of this ASN.1 grammar into a valid PUS specification:

The ACN present-when keyword is used to match the specific combinations of values with the type that the ApplicationData CHOICE is carrying. Notice that:

1. The ApplicationData has a parameterized ACN encoding. This is shown with the angle brackets ('<', '>'). Parameterized encoding means that this type cannot be decoded independently – it needs two extra values. In other words, the ACN decoder function for the ApplicationData type will have two extra parameters that must be passed by the caller.

2. Likewise, the msg field in the PacketDataField, which is a reference to the parameterized type ApplicationData, has two additional arguments, dataFieldHeader.service-type and dataFieldHeader.service-subtype. A two-way binding has thus been established between the two fields in the DataFieldHeader and the CHOICE type ApplicationData. Two-way binding means:

  • during the decoding process, the CHOICE ApplicationData will read the values of service-type and service-subtype in order to be decoded correctly
  • during the encoding process, the values of service-type and service-subtype will be updated automatically based on which of the alternatives a1, a2, a3 is present in the CHOICE ApplicationData.

3. The present-when syntax within the CHOICE ApplicationData expects a boolean expression (in fact, a boolean AND expression), i.e. when all comparisons match, the selected CHOICE target is used.
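The two-way binding described in the points above can be sketched in C as a lookup table consulted in both directions. Everything here is hypothetical except the alternative names a1, a2 and a3: the numeric service type/subtype pairs are made-up placeholders, and the function names do not correspond to anything ASN1SCC generates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The alternatives of the CHOICE ApplicationData. */
typedef enum { APP_NONE, APP_A1, APP_A2, APP_A3 } AppDataKind;

/* One present-when rule: the alternative is active when BOTH
   determinants match (a boolean AND). */
typedef struct {
    uint8_t     serviceType;
    uint8_t     serviceSubtype;
    AppDataKind kind;
} PresentWhenRule;

/* Placeholder values, for illustration only. */
static const PresentWhenRule k_rules[] = {
    { 1,  1, APP_A1 },
    { 1,  2, APP_A2 },
    { 3, 25, APP_A3 },
};
static const size_t k_ruleCount = sizeof k_rules / sizeof k_rules[0];

/* Decoding direction: the two extra parameters select the
   alternative. Returns APP_NONE when no rule matches. */
AppDataKind select_alternative(uint8_t serviceType, uint8_t serviceSubtype) {
    for (size_t i = 0; i < k_ruleCount; i++) {
        if (k_rules[i].serviceType == serviceType &&
            k_rules[i].serviceSubtype == serviceSubtype) {
            return k_rules[i].kind;
        }
    }
    return APP_NONE;
}

/* Encoding direction: the active alternative determines the values
   written to the header fields. Returns false for an unknown kind. */
bool determinants_for(AppDataKind kind,
                      uint8_t *serviceType, uint8_t *serviceSubtype) {
    assert(serviceType != NULL && serviceSubtype != NULL);
    for (size_t i = 0; i < k_ruleCount; i++) {
        if (k_rules[i].kind == kind) {
            *serviceType = k_rules[i].serviceType;
            *serviceSubtype = k_rules[i].serviceSubtype;
            return true;
        }
    }
    return false;
}
```

Keeping both directions on a single table mirrors the point of the binding: the header fields and the CHOICE alternative cannot drift out of sync, because neither is set independently of the other.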

Automatic ICDs

Interface Control Documents (ICDs) describe the binary format of messages exchanged between entities (the “wire format”), and are widely used in the space domain e.g. in the specification of space/ground interfaces.

ASN1SCC can automatically create ICDs for a given ASN.1 grammar. This allows users who are not familiar with ASN.1 to easily understand the structure of the encoded messages, and if they so wish, manually implement the required encoding in their target environment, and interoperate. The ICD generator supports the uPER and ACN encodings.

For example in the following ASN.1 message…

… the compiler will create an Interface Control Document that will contain the following table:

Figure 2: ICD generator output

The above is an example of the visual layout of the generated document. It demonstrates why this tabular way of defining the data structures is easier to understand for people who are not familiar with ASN.1.