2007-03-28

Implementing Polymorphism in C

A few months ago I was doing some work on Linux /proc fs. As you may already know, procfs (just like any other filesystem in Linux) implements the VFS API. The VFS API groups together common concepts between filesystems (e.g. retrieving an i-node from a pathname, assigning permissions to an i-node etc.). This allows for some polymorphism in the kernel code: a module requiring a VFS operation upon a file, does not need to know the actual type or inner-workings of the filesystem hosting this file.

In order to apply Objected Oriented Programming concepts to C code, one could map classes to structs, attributes to struct members and methods to function pointers. Having said this, there exists a small issue here that can be quite disturbing (other than the OOP->C mapping :-). A function pointer restricts (via the function prototype) the type of parameters passed to the function it is pointing to.

One can use void pointer parameters to allow for varying parameter types, but this decreases the code readability (i.e. function prototypes include void pointer arguments instead of meaningful types) and disables parameter type checking.

Another approach is to introduce a common structure, which will be passed as parameter to the polymorphic function, hiding all type-dependant data in a special field of this structure. This special field is usually a void pointer or a union and is used as private storage space for each implementation of the polymorphic function. This is common practice and can be found in many TCP/IP stack implementations. Should a void pointer technique be used, there would be some code readability issues regarding the structure definition. The union approach, although more verbose in nature, ties the definition of the special field with all possible types of private data and in this way, forces an update on the struct definition each time a new type is introduced.

So, how can we keep the function prototype as generic as possible and at the same time have descriptive parameter types without changing too much code?

Instead of employing a one-level inheritance scheme, where a base struct contains a special field for use by "sub-classes", we can reverse this model to allow for arbitrary inheritance levels (and get multiple inheritance as a bonus!):

"struct mydriver" is a struct used by an imaginary device driver. This driver adheres to some driver API and as such inherits "struct driver_API" from this API. The driver makes use of a character device and hence inherits some utility functions from "struct char_driver". Thus, "struct mydriver" would look something like this:

struct mydriver {
struct driver_API d_api;
struct char_driver char_dev;
time_t timeout;
int fd;
};

struct members "timeout" and "fd" are mydriver-specific. Now, let's say that we wish to override a function found in the "struct driver_API" with a mydriver-specific one which makes use of the "timeout" field. The function pointer "update_pci_timeout" yields this function's prototype as follows:

struct driver_API {
...
time_t (*update_pci_timeout)(struct driver_API *d_api,
time_t new_timeout);
...
};


The mydriver implementation of this function would look something like this:

time_t mydriver_update_pci_timeout(struct driver_API *d_api,
time_t new_timeout)
{
struct mydriver *mydriver_data;
...
mydriver_data = get_mydriver_ref(d_api);
if ((mydriver_data->timeout) < 1900){
...
}
...
}


Hmmm, what is this magical "get_mydriver_ref"? It returns a pointer to the child (outer "struct mydriver") given a pointer to the parent (inner "struct driver_API"). Of course this only works if "struct driver_API" was previously instantiated as a member of a "struct mydriver", but this is a safe assumption since the scope of our code is limited to the "mydriver" driver.

What follows is the code for "get_mydriver_ref". It uses a bit of GCC magic (macro "container_of") to get the address of the outer struct.

#define container_of(ptr, type, member) ({ \
const typeof( ((type *)0)->member ) *__mptr = (ptr); \
(type *)( (char *)__mptr - offsetof(type,member) );})

struct mydriver *get_mydriver_ref(struct driver_API *api){
return container_of(api, struct mydriver, d_api);
}


"container_of" is defined in <linux/kernel.h> of the kernel sources. You can see an example of its use with PROC_I (returns procfs-specific data for an i-node) in linux/fs/proc/inode.c