CIDF APIs: Their Care and Feeding

Brian Tung

1. Introduction

The common currency of information exchange between components participating in the Common Intrusion Detection Framework (CIDF) is the General Intrusion Detection Object, or GIDO. The construction and encoding of GIDOs are defined in the Common Intrusion Specification Language (CISL) draft (q.v.). Some of the terminology in this draft (such as sentence and SID) is defined in that document.

This document describes the API for encoding and decoding GIDOs, as well as for transporting them "on the wire." The aim of these APIs is to allow programmers to construct GIDOs in a straightforward manner and to send them to interested parties, without needing to know the details of the encoding or transmission process.

2. Overview

GIDOs can be thought of in two different forms: a logical form and an encoded form. The logical form is a single S-expression, which usually contains sub-S-expressions, and which carries human-readable information about an intrusion, a prescribed response, or other attack-related data.

Because different machines represent data in different ways, it is not recommended that CIDF components exchange GIDOs in their logical form (encoded in ASCII, say). Instead, the CISL draft defines a canonical encoding that takes the logical form of a GIDO and produces an encoded form: an unambiguous sequence of bytes that represents the same information.

This API specification describes the calls that an application programmer may make to construct the logical form of a GIDO, to encode it for transmission to another component, and to decode encoded GIDOs back into their logical form. Auxiliary functions are also provided that "pretty-print" logical GIDOs.

3. GIDO Construction

Producing a GIDO for shipment (i.e., transmission) is a two-step process. First, you produce a tree structure representing the GIDO. Second, you encode the structure into a sequence of bytes.

Consider the process of constructing a tree. SIDs can be divided into two groups: those that take one or more S-expressions as arguments (i.e., verbs, roles, adverbs, conjunctions), and those that take a single datum or array of data as argument (i.e., atoms). We can thus represent an entire sentence as a tree, where each SID is a node and the top-level SID is the root of the tree. Any S-expression must eventually contain some actual data (!), so each branch of our tree eventually ends in one or more leaves representing atom SIDs.

To make this more concrete, if V is a verb SID, R1 and R2 are role SIDs, and A1 through A3 are atom SIDs, then the S-expression

    (V
        (R1
            (A1 data1)
            (A2 data2)
        )
        (R2
            (A3 data3)
        )
    )

can be represented by the following tree.

               V
              / \
             /   \
           R1     R2
          /  \      \
         /    \      \
       A1      A2     A3
       |       |      |
       |       |      |
     data1   data2  data3

Because of the way the encoding rules are defined, encoding a tree is simply a depth-first traversal, with each node encoded in turn as it is visited. In this case, we encode the tree by encoding V, then the subtree rooted at R1, then the subtree rooted at R2.

A consequence of this is that if the above sentence were part of a conjoined sentence, such as

    (InSequence
        (...)
        (...)
        (...)
    )

then each of the component sentences could be extracted from the encoding intact and treated on its own. Put another way, if a sentence has already been encoded on its own, it does not need to be re-encoded for insertion into such a conjoined sentence.

Decoding the byte sequence back into a tree structure simply reverses the above procedure. Each SID code indicates, in a bitfield in its first byte, what kind of argument that SID takes: an elementary data type, an array, or a sequence of S-expressions. The parser then interprets the succeeding bytes accordingly.
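
The depth-first traversal described above, and the fact that an encoded subtree appears intact inside its parent's encoding, can be illustrated with a minimal sketch. It does not use the CISL byte encoding: as a stand-in, it writes each node in S-expression text form. The names toy_node, toy_encode, and toy_demo are hypothetical and are not part of this API, and toy_node simplifies the tree node type by holding atom data as a plain string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical node type (not part of the API): atom data is a plain
 * string here rather than an encoded data block. */
typedef struct toy_node {
    const char *sid;            /* SID name, e.g. "V" or "A1" */
    const char *data;           /* atom payload; NULL for non-atom SIDs */
    struct toy_node **child;    /* NULL-terminated array; NULL for atoms */
} toy_node;

/* Append the text form of the subtree rooted at n to the NUL-terminated
 * string in buf (capacity cap), visiting nodes depth-first.
 * Returns 0 on success, -1 if buf is too small. */
int toy_encode(const toy_node *n, char *buf, size_t cap)
{
    size_t used;
    int w;
    toy_node **c;

    assert(n != NULL && buf != NULL);
    used = strlen(buf);
    if (n->data != NULL) {
        /* Atom SID: a leaf carrying its datum. */
        w = snprintf(buf + used, cap - used, "(%s %s)", n->sid, n->data);
        return (w < 0 || (size_t)w >= cap - used) ? -1 : 0;
    }
    /* Non-atom SID: emit the node, then each child subtree in order. */
    w = snprintf(buf + used, cap - used, "(%s", n->sid);
    if (w < 0 || (size_t)w >= cap - used)
        return -1;
    for (c = n->child; c != NULL && *c != NULL; c++) {
        if (strlen(buf) + 2 > cap)
            return -1;
        strcat(buf, " ");
        if (toy_encode(*c, buf, cap) != 0)
            return -1;
    }
    if (strlen(buf) + 2 > cap)
        return -1;
    strcat(buf, ")");
    return 0;
}

/* Build the example sentence (V (R1 ...) (R2 ...)) and encode both the
 * whole tree and the subtree rooted at R1. Both buffers have size cap. */
int toy_demo(char *whole, char *sub, size_t cap)
{
    toy_node a1 = { "A1", "data1", NULL };
    toy_node a2 = { "A2", "data2", NULL };
    toy_node a3 = { "A3", "data3", NULL };
    toy_node *r1_kids[] = { &a1, &a2, NULL };
    toy_node *r2_kids[] = { &a3, NULL };
    toy_node r1 = { "R1", NULL, r1_kids };
    toy_node r2 = { "R2", NULL, r2_kids };
    toy_node *v_kids[] = { &r1, &r2, NULL };
    toy_node v = { "V", NULL, v_kids };

    whole[0] = '\0';
    sub[0] = '\0';
    if (toy_encode(&v, whole, cap) != 0)
        return -1;
    return toy_encode(&r1, sub, cap);
}
```

In this sketch, the text produced for the subtree rooted at R1 occurs unchanged as a contiguous run inside the text produced for V, which is the property that lets an already-encoded sentence be inserted into a conjoined sentence without re-encoding.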

This API does not construct the "logical GIDO" from the tree, but functions are provided that print the tree in ordinary GIDO S-expression format.

4. GIDO Encoding/Decoding API Specification

In this section, we describe each of the exported calls provided by the GIDO Encoding/Decoding API. This is for the benefit of both application programmers who will use these calls and implementors who want to provide this API to others.

Unless otherwise noted, calls return 0 on success. Error code definitions are given in Section 4.8.

4.1. Type Definitions

The following type definitions are used in the syntax for the API calls.

    typedef struct {
        int length;
        unsigned char *data;
    } cisl_data;

    typedef struct {
        char *sid;
        cisl_data *sid_code;
    } cisl_sid_table_entry;

    typedef struct _cisl_node {
        char *sid;
        cisl_data *sid_data;
        struct _cisl_node **child;
    } cisl_node;

    typedef cisl_node *cisl_tree;

    typedef int cisl_error;

4.2. Start-Up Calls

4.2.1. cisl_init_sid_table

Syntax:

    cisl_error cisl_init_sid_table (char *filename);

This call reads in a table of SIDs, their codes, and other (optional) information from the named file. The encoding and decoding calls use this table to interpret the SIDs used in a GIDO.

The format of this table is as follows. Each SID occupies one line, and includes text in the format: No specific fields and/or formatting have yet been defined for .

Return values: Returns CISL_NO_TABLE if the named file either does not exist or is unreadable by the component, and returns CISL_INVALID_TABLE if the table is not well-formatted.

4.3. GIDO Construction Calls

4.3.1. cisl_init_tree

Syntax:

    cisl_error cisl_init_tree (cisl_tree *tree);

This call allocates the space for a cisl_tree structure and initializes it.

Return values: Returns CISL_OUT_OF_MEMORY if space cannot be allocated.

4.3.2. cisl_attach_sid

Syntax:

    cisl_error cisl_attach_sid (cisl_tree tree, char *sid);

This call attaches the named SID to the root of the given tree.

Return values: Returns CISL_NO_TREE if the tree does not exist. If the tree already has a SID at its root, then cisl_attach_sid clobbers the existing SID with the new one and returns 0.

4.3.3. cisl_attach_data

Syntax:

    cisl_error cisl_attach_data (cisl_tree tree, data_type type, char *data);

As described in Section 3, any given SID may take as argument either a single datum or array of data, or a sequence of one or more S-expressions. Only the former kind of SID, atom SIDs, may use this call, which attaches data with the named type, pointed to by the given location, to the root of the tree.

Return values: Returns CISL_NO_TREE if the named tree does not exist, returns CISL_NO_SID if the tree does not have a SID attached to its root (using cisl_attach_sid), and returns CISL_INVALID_SID if the attached SID should take a sequence of S-expressions rather than data.

4.3.4. cisl_attach_child

Syntax:

    cisl_error cisl_attach_child (cisl_tree parent, cisl_tree child);

Only SIDs which take S-expressions as argument(s) may use this call, which attaches a child tree representing a single S-expression to the root of the parent. See Sections 3 and 4.3.3 for more information.

Return values: Returns CISL_NO_TREE if the named tree does not exist, returns CISL_NO_SID if the tree does not have a SID attached to its root (using cisl_attach_sid), and returns CISL_INVALID_SID if the attached SID should take data rather than S-expressions.

N.B.: The behavior of this call if the parent and child are incompatible (e.g., both are headed by role SIDs) is currently undefined.

4.4. GIDO Encoding/Decoding Calls

4.4.1. cisl_encode_tree

Syntax:

    cisl_error cisl_encode_tree (cisl_tree tree, cisl_data **block);

Encodes a completed tree into a sequence of bytes in the named block.

Return value: Returns CISL_NO_TREE if the named tree does not exist, and currently returns only CISL_INVALID_TREE if a problem is encountered in encoding the tree.

4.4.2.
cisl_decode_tree

Syntax:

    cisl_error cisl_decode_tree (cisl_data *block, cisl_tree *tree);

Decodes a sequence of bytes (which presumably was the result of a call to cisl_encode_tree) back into a GIDO tree.

Return value: Returns CISL_NO_BLOCK if the named block does not exist, and currently returns only CISL_INVALID_BLOCK if a problem is encountered in decoding the block.

4.5. Documentary Calls

4.5.1. cisl_print_tree

Syntax:

    cisl_error cisl_print_tree (cisl_tree *tree);

Pretty-prints a tree into its normal S-expression form. (In other words, it does the reverse of the transformation given in Section 3.)

Return value: Returns CISL_NO_TREE if the named tree does not exist, and currently returns only CISL_INVALID_TREE if a problem is encountered in printing the tree.

4.6. Search Calls

4.6.1. cisl_search_tree

Syntax:

    cisl_error cisl_search_tree (cisl_tree tree, cisl_tree path, int index, cisl_tree *return_path);

Searches the given tree for the index-th path matching the specification given in path, and returns it in the given return_path. A path is simply a unary tree; the programmer constructs it by successively adding nodes depth-first, until an atom SID is reached. No data need be attached to that atom SID node; any data that is present will be ignored.

Return value: Returns CISL_NO_MORE_MATCHES if there are fewer than index matches of path inside tree.

4.7. Miscellaneous Calls

4.7.1. cisl_free_tree

Syntax:

    cisl_error cisl_free_tree (cisl_tree tree);

Recursively zeroes out the tree structure, and frees the associated pointers.

Return value: Returns CISL_NO_TREE if the named tree does not exist, and currently returns only CISL_INVALID_TREE if a problem is encountered in freeing the tree.

4.7.2. cisl_free_data

Syntax:

    cisl_error cisl_free_data (cisl_data *block);

Zeroes out the named data structure, and frees the associated pointers. In addition to being an exported call, it may also be used by the preceding call, cisl_free_tree.

Return value: Returns CISL_NO_BLOCK if the named data structure does not exist.

4.8. Error Codes

The following error code values are to be used by API callers and implementors.

    #define CISL_SUCCESS          0
    #define CISL_NO_TABLE         1
    #define CISL_INVALID_TABLE    2
    #define CISL_NO_TREE          3
    #define CISL_INVALID_TREE     4
    #define CISL_NO_SID           5
    #define CISL_INVALID_SID      6
    #define CISL_NO_BLOCK         7
    #define CISL_INVALID_BLOCK    8
    #define CISL_OUT_OF_MEMORY    9
    #define CISL_NO_MORE_MATCHES  10

5. Message Layer API Specification

In this section, we describe each of the exported calls provided by the CIDF Message Layer API. This is for the benefit of both application programmers who will use these calls and implementors who want to provide this API to others.

Unless otherwise noted, calls return 0 on success.

5.1. Definitions

The following definitions specify maximum values for message layer functions.

    #define MAXCIDFGROUPS   10     /* Maximum number of multicast groups */
    #define MAXCIDFMSGLEN   65400  /* Maximum application message size */
    #define MAXCIDFCLASSES  16     /* Maximum number of classes to which
                                      an application can subscribe */

The following global variable is used to store human-readable error messages from the Message Layer. When the Message Layer returns an error code, it also writes to this string.

    char CIDFerrstr[80];

The CIDF Message Layer requires a configuration file that holds multicast group information. This file will eventually be replaced by a CIDF directory service. The format of the configuration file is as follows:

    #
    # This is the configuration file.  It must be global across all
    # CIDF components.  (lines starting with "#" are comments)
    #
    # Note: Hx is either a hostname or an IPv4 unicast address
    #
    group all: \
        multicast = 224.1.2.3: \
        member = H1, \
        member = H2, \
        member = ..., \
        member = HN
    group response: \
        multicast = 224.1.2.4: \
        member = H2, \
        member = H4, \
        member = H6
    # etc...

5.2. Start-Up Calls

5.2.1.
cidfml_init

Syntax:

    int cidfml_init (char* filename)

This call reads in the specified CIDF configuration file containing the multicast group information and initializes the Message Layer. If filename is NULL, the default file "./cidf-ml.ini" is used. Applications must call cidfml_init prior to calling other CIDF Message Layer functions.

Return values:

    CIDFML_PARAM_ERROR is returned under the following conditions:
        filename is NULL and "./cidf-ml.ini" does not exist
    CIDFML_RESOURCE_ERROR is returned on insufficient resources to initialize or socket call failure

5.2.2. cidfml_bind

Syntax:

    int cidfml_bind (char * groups[], unsigned int groupcount)

Subscribe to a CIDF group or list of groups (specified in the configuration file) and return a handle to be used in subsequent calls referencing this list of groups.

Return values:

    handle (if return is >= 0)
    CIDFML_PARAM_ERROR is returned under the following conditions:
        groupcount > MAXCIDFGROUPS
        groups is NULL
    CIDFML_RESOURCE_ERROR is returned on a generic resource error

5.2.3. cidfml_restrict

Syntax:

    int cidfml_restrict (int handle, ushort * classid, unsigned int classcount)

Restrict delivery of messages to only those messages that match one of the class IDs specified in classid. These class IDs are defined in CISL section 6.3. cidfml_restrict may be applied to the handle at any time; however, the restrictions do not apply to messages that are already in the process of being delivered. That is, there is no guarantee that only these classes will be delivered until currently queued messages have been delivered. Only the most recent cidfml_restrict is applied to the handle.

Return values:

    CIDFML_PARAM_ERROR is returned under the following conditions:
        handle is not valid
        classcount > MAXCIDFCLASSES
        classid is NULL
    CIDFML_RESOURCE_ERROR is returned on a generic resource error

5.3. Communication Calls

5.3.1.
cidfml_recvfrom

Syntax:

    int cidfml_recvfrom (int handle, void* data, unsigned int max_data, unsigned int timeout, char * group)

Receive a CIDF message. Using a handle returned from cidfml_bind, wait up to timeout seconds for a message to be copied (up to max_data bytes) into data. If group is non-NULL, the ASCII name of the group on which the message was received will be copied into group. A timeout of 0xFFFFFFFF causes the application to block until the next message arrives. Only messages from the groups associated with the handle will be delivered, and only if they match the constraints specified in cidfml_restrict. Messages queued for delivery prior to a cidfml_restrict call may not match these constraints.

This function strips off the Message Layer header and applies the cryptographic mechanisms prior to message delivery.

Return values:

    actual bytes written to data (if return is >= 0)
    CIDFML_PARAM_ERROR is returned under the following conditions:
        handle is not valid
        data is NULL
        max_data > MAXCIDFMSGLEN
    CIDFML_RESOURCE_ERROR is returned on a generic resource error

5.3.2. cidfml_sendto

Syntax:

    int cidfml_sendto (char * group[], unsigned int groupcount, void* data, unsigned int length)

Send a CIDF message consisting of length octets from data to the group (or group list) specified. The message will be transmitted to each host that is a member of the group (or group list).

This function adds the Message Layer header and applies the cryptographic mechanisms prior to message transmission.

Return values:

    actual bytes copied from data (if return is >= 0)
    CIDFML_PARAM_ERROR is returned under the following conditions:
        group is NULL
        groupcount > MAXCIDFGROUPS
        data is NULL
        length > MAXCIDFMSGLEN
    CIDFML_RESOURCE_ERROR is returned on a generic resource error

5.4. Termination Calls

5.4.1. cidfml_close

Syntax:

    int cidfml_close (int handle)

Unsubscribe from the CIDF group or list of groups using the handle returned from cidfml_bind.
The application may still receive messages from other groups using other valid handles.

Return values:

    CIDFML_PARAM_ERROR is returned under the following conditions:
        handle is not valid
    CIDFML_RESOURCE_ERROR is returned on a generic resource error

5.4.2. cidfml_exit

Syntax:

    void cidfml_exit(void)

Release CIDF Message Layer resources. CIDF Message Layer calls from this process will fail following this function.

Return values: This call does not return any value.

5.5. Error Codes

The following error code values are to be used by API callers and implementors.

    #define CIDFML_SUCCESS         0
    #define CIDFML_PARAM_ERROR     -1
    #define CIDFML_RESOURCE_ERROR  -2
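
The parameter checks documented for cidfml_sendto in Section 5.3.2 can be sketched as a small validation helper. This is only an illustration of the documented CIDFML_PARAM_ERROR conditions, not the Message Layer implementation: the names check_sendto_params and check_sendto_params_demo are hypothetical, and the constants are copied from Sections 5.1 and 5.5.

```c
#include <assert.h>
#include <stddef.h>

/* Values copied from Sections 5.1 and 5.5. */
#define MAXCIDFGROUPS      10
#define MAXCIDFMSGLEN      65400
#define CIDFML_SUCCESS     0
#define CIDFML_PARAM_ERROR -1

/* Hypothetical helper: apply the parameter checks listed for
 * cidfml_sendto, without sending anything. Returns CIDFML_SUCCESS if
 * the arguments pass, CIDFML_PARAM_ERROR otherwise. */
int check_sendto_params(char *group[], unsigned int groupcount,
                        void *data, unsigned int length)
{
    if (group == NULL || data == NULL)
        return CIDFML_PARAM_ERROR;
    if (groupcount > MAXCIDFGROUPS)
        return CIDFML_PARAM_ERROR;
    if (length > MAXCIDFMSGLEN)
        return CIDFML_PARAM_ERROR;
    return CIDFML_SUCCESS;
}

/* Exercise the checks with values on both sides of each limit. */
void check_sendto_params_demo(void)
{
    char *groups[] = { "all", "response" };
    char msg[] = "hello";

    assert(check_sendto_params(groups, 2, msg, sizeof msg) == CIDFML_SUCCESS);
    assert(check_sendto_params(NULL, 2, msg, sizeof msg) == CIDFML_PARAM_ERROR);
    assert(check_sendto_params(groups, 2, NULL, sizeof msg) == CIDFML_PARAM_ERROR);
    assert(check_sendto_params(groups, MAXCIDFGROUPS + 1, msg, sizeof msg)
           == CIDFML_PARAM_ERROR);
    assert(check_sendto_params(groups, 2, msg, MAXCIDFMSGLEN + 1)
           == CIDFML_PARAM_ERROR);
}
```

Because the success values of cidfml_sendto are byte counts (>= 0) and the error codes are negative, a caller can distinguish the two cases with a single sign test on the return value.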