Post History

Q&A How do you implement polymorphism in C?

posted 8mo ago by Lundin‭

Answer

#1: Initial revision by

Lundin‭ · 2024-10-17T10:07:18Z (8 months ago)

First of all, please note that polymorphism in C is _clunky_. It won't be pretty. We don't have `this` pointers, we don't have RAII, we don't have automatic constructors/destructors. Just accept that the code won't look as pretty as in languages that have all of these features.

Second, we really ought to implement things like this properly, with private encapsulation through "opaque types" as described at [How to do private encapsulation in C?](https://software.codidact.com/posts/283888). But if we ignore that important part and just come up with an example of polymorphism, it might go like this...

We cook up some manner of base class with a member variable and also a member function:

```c
typedef struct parent_t
{
  size_t size;
  void (*print)(const struct parent_t* this);
} parent_t;
```

Using function pointers like this to achieve "C++-like member functions" isn't really recommended, because they take up extra memory in every object instance. It is better practice to make an external virtual table ("vtable") for that, but for now let's keep things simple and use this for illustration purposes.
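For comparison, here is a minimal sketch of what the external vtable variant could look like. The names `parent_vtable_t`, `vptr`, `parent_init`, `parent_print` and `parent_name` are made up for this illustration and aren't used in the rest of this answer; the `name` slot is only there to show that the table can grow without making the objects bigger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct parent_t parent_t;

/* Hypothetical vtable type: one read-only table per class, shared by all objects. */
typedef struct
{
  void (*print)(const parent_t* this);
  const char* (*name)(const parent_t* this);
} parent_vtable_t;

struct parent_t
{
  size_t size;
  const parent_vtable_t* vptr; /* a single pointer per object, no matter how many methods */
};

static void print_base (const parent_t* this)
{
  printf("parent with size %zu\n", this->size);
}

static const char* name_base (const parent_t* this)
{
  (void)this; /* not needed for this method */
  return "parent";
}

/* The table lives in static storage; an inherited class would supply its own table. */
static const parent_vtable_t parent_vtable =
{
  .print = print_base,
  .name  = name_base,
};

void parent_init (parent_t* obj, size_t size)
{
  assert(obj != NULL);
  obj->size = size;
  obj->vptr = &parent_vtable;
}

/* Dispatch helpers, so callers don't have to spell out the double indirection. */
void parent_print (const parent_t* obj)
{
  obj->vptr->print(obj);
}

const char* parent_name (const parent_t* obj)
{
  return obj->vptr->name(obj);
}
```

A caller would then write `parent_print(obj)` rather than `obj->print(obj)`. The price is one extra pointer dereference per call.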

Digging into the semantics of what C structs allow in terms of conversions, we might find this little rule in C17 6.7.2.1, emphasis mine:

> Within a structure object, the non-bit-field members and the units in which bit-fields reside have
addresses that increase in the order in which they are declared. **A pointer to a structure object,
suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in
which it resides), and vice versa.** There may be unnamed padding within a structure object, but not
at its beginning.

What this means is that a pointer to any structure type can be converted to a pointer to the type of its first member, and back again - this is guaranteed to be well-defined behavior. "Suitably converted" means an explicit cast to the correct type.
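A small sketch of that rule in isolation. The types `base_t` and `derived_t` and the function names are hypothetical and only exist for this illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { int id; } base_t;
typedef struct { base_t base; double extra; } derived_t;

/* Struct pointer -> pointer to its first member. Same address as &d->base. */
base_t* derived_to_base (derived_t* d)
{
  assert(d != NULL);
  return (base_t*)d;
}

/* The way back. Only valid if b really points at the first member of a derived_t. */
derived_t* base_to_derived (base_t* b)
{
  assert(b != NULL);
  return (derived_t*)b;
}

/* Checks that the round trip ends up at the same object. */
bool round_trip_ok (void)
{
  derived_t d = { .base = { .id = 42 }, .extra = 1.5 };
  base_t* b = derived_to_base(&d);
  return b == &d.base && b->id == 42 && base_to_derived(b) == &d;
}
```

Note that the conversion back is only fine because the `base_t` in question really is the first member of a `derived_t`. Casting a pointer to a stand-alone `base_t` the same way and then accessing `extra` would not be.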

So when inheriting a struct by declaring an inherited struct, we may take advantage of this by always allocating an instance of the parent as first member. This allows casts between the base class and the inherited class, while at the same time providing all the base class members through that member.

Let's say we want to inherit `parent_t` from above, both with some manner of "`int` behavior" and some "`double` behavior" - maybe we are implementing an array class or something:

```c
typedef struct
{
  parent_t parent;
  int data[];
} child_int_t;

typedef struct
{
  parent_t parent;
  double data[];
} child_double_t;
```

First thing is to come up with something to do with the `print` member of the parent. That's where the polymorphism will take place - we want to call `print` no matter what kind of inherited class we are pointing at. That function takes a `parent_t*` as parameter, so we can implement a child method which actually passes along for example a `child_int_t*`, because as mentioned, we can convert between these two types seamlessly.

```c
void print_int (const struct parent_t* this)
{
  const child_int_t* ithis = (const child_int_t*)this;
  for(size_t i=0; i<this->size; i++)    
    printf("%d ", ithis->data[i]);
  puts("");
}
```

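In this function we can access both the parent's members and the inherited class' members. But naturally this will only work if `this` actually points at a `child_int_t`. Otherwise accessing `data` through the cast pointer is undefined behavior, which might show up as a crash or as garbage output.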

So when writing a constructor for `child_int_t` etc, we have to fill in the function pointer member of `parent_t` and set it to point to the correct function.

```c
child_int_t* child_int_create (const int* data, size_t size)
{
  size_t byte_size = sizeof(int[size]);
  child_int_t* obj = malloc(sizeof *obj + byte_size);
  if(obj==NULL) { return NULL; }
  
  memcpy(obj->data, data, byte_size);
  obj->parent.size = size;
  obj->parent.print = print_int;
  return obj;
}
```

In this example I used flexible array members and `malloc`, but neither is by any means mandatory. With the `malloc` call we have allocated the child + the array in the child, and along with it an instance of `parent_t`. And that's it.
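To illustrate that point, here is a sketch of the same idea without a flexible array member and without `malloc`. It assumes a fixed upper capacity and lets the caller provide the storage. `CHILD_INT_MAX`, `child_int_fixed_t`, `print_int_fixed` and `child_int_fixed_init` are names invented for this sketch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define CHILD_INT_MAX 8 /* hypothetical fixed capacity */

typedef struct parent_t
{
  size_t size;
  void (*print)(const struct parent_t* this);
} parent_t;

typedef struct
{
  parent_t parent;          /* still the first member */
  int data[CHILD_INT_MAX];  /* fixed-size array instead of a flexible array member */
} child_int_fixed_t;

static void print_int_fixed (const parent_t* this)
{
  const child_int_fixed_t* ithis = (const child_int_fixed_t*)this;
  for(size_t i=0; i<this->size; i++)
    printf("%d ", ithis->data[i]);
  puts("");
}

/* Initializes caller-provided storage. Returns 0 on success, -1 if the data won't fit. */
int child_int_fixed_init (child_int_fixed_t* obj, const int* data, size_t size)
{
  assert(obj != NULL && data != NULL);
  if(size > CHILD_INT_MAX) { return -1; }

  memcpy(obj->data, data, size * sizeof *data);
  obj->parent.size = size;
  obj->parent.print = print_int_fixed;
  return 0;
}
```

The object can then live on the stack or in static storage, and the polymorphic call through a `parent_t*` works exactly as before.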

Full code example:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct parent_t
{
  size_t size;
  void (*print)(const struct parent_t* this);
} parent_t;

typedef struct
{
  parent_t parent;
  int data[];
} child_int_t;

typedef struct
{
  parent_t parent;
  double data[];
} child_double_t;

typedef union
{
  child_int_t child_int;
  child_double_t child_double;
} common_initial_sequence;

void print_int (const struct parent_t* this)
{
  const child_int_t* ithis = (const child_int_t*)this;
  for(size_t i=0; i<this->size; i++)    
    printf("%d ", ithis->data[i]);
  puts("");
}

void print_double (const struct parent_t* this)
{
  const child_double_t* dthis = (const child_double_t*)this;
  for(size_t i=0; i<this->size; i++)    
    printf("%lf ", dthis->data[i]);
  puts("");
}

child_int_t* child_int_create (const int* data, size_t size)
{
  size_t byte_size = sizeof(int[size]);
  child_int_t* obj = malloc(sizeof *obj + byte_size);
  if(obj==NULL) { return NULL; }
  
  memcpy(obj->data, data, byte_size);
  obj->parent.size = size;
  obj->parent.print = print_int;
  return obj;
}

child_double_t* child_double_create (const double* data, size_t size)
{
  size_t byte_size = sizeof(double[size]);
  child_double_t* obj = malloc(sizeof *obj + byte_size);
  if(obj==NULL) { return NULL; }

  memcpy(obj->data, data, byte_size);
  obj->parent.size = size;
  obj->parent.print = print_double;
  return obj;
}

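int main (void)
{
  int int_data[4] = {1,2,3,4};
  double double_data[4] = {1.0, 2.0, 3.0, 4.0};

  child_int_t* int_array = child_int_create(int_data,4);
  child_double_t* double_array = child_double_create(double_data, 4);
  if(int_array==NULL || double_array==NULL) // allocation failed
  {
    free(int_array);
    free(double_array);
    return EXIT_FAILURE;
  }
  
  parent_t* baseptr;
  baseptr = (parent_t*)int_array;
  baseptr->print(baseptr);    // calls print_int
  baseptr = (parent_t*)double_array;
  baseptr->print(baseptr);    // calls print_double

  free(int_array);
  free(double_array);
}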
```

From there on we can add private encapsulation with opaque types. But then we'll soon run into the problem of truly `private` (only available to the parent) versus `protected` (only available to parent + inherited) versus `public` (available to anyone).

These can be solved by chopping up the parent class into a private and a protected section. The private one will remain truly private, as per the opaque type. The definition of the protected part we have to put in a header `parent_protected.h`, which should only be used by inherited classes. Public access is only as per the provided API to the opaque type - in OO we shouldn't really have any public member variables anyway, but only provide access through setter/getter functions.
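One possible layout for such a split, squeezed into a single file so that it compiles on its own. The comments mark which file each part would go in. Apart from `parent_protected.h`, every file, type and function name here is an assumption made for this sketch, and keeping the private part behind a pointer is just one way to do it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* ---- parent.h: public API, the type is opaque to ordinary callers ---- */
typedef struct parent_t parent_t;
size_t parent_get_size (const parent_t* obj);
int    parent_get_id   (const parent_t* obj);

/* ---- parent_protected.h: only included by inherited classes ---- */
typedef struct parent_private_t parent_private_t; /* incomplete type here */
struct parent_t
{
  size_t size;             /* protected: inherited classes may access this */
  parent_private_t* priv;  /* private: inherited classes can't look inside */
};
int  parent_init    (parent_t* obj, size_t size); /* 0 on success, -1 on failure */
void parent_cleanup (parent_t* obj);

/* ---- parent.c: the only file that sees the private definition ---- */
struct parent_private_t
{
  int id;
};

int parent_init (parent_t* obj, size_t size)
{
  static int next_id = 1;
  assert(obj != NULL);
  obj->priv = malloc(sizeof *obj->priv);
  if(obj->priv==NULL) { return -1; }
  obj->priv->id = next_id++;
  obj->size = size;
  return 0;
}

void parent_cleanup (parent_t* obj)
{
  free(obj->priv);
  obj->priv = NULL;
}

size_t parent_get_size (const parent_t* obj) { return obj->size; }
int    parent_get_id   (const parent_t* obj) { return obj->priv->id; }

/* ---- child.c: includes parent_protected.h, parent goes first ---- */
typedef struct
{
  parent_t parent;
  int value;
} child_t;

child_t* child_create (size_t size, int value)
{
  child_t* obj = malloc(sizeof *obj);
  if(obj==NULL) { return NULL; }
  if(parent_init(&obj->parent, size) != 0) { free(obj); return NULL; }
  obj->value = value;
  return obj;
}

void child_destroy (child_t* obj)
{
  if(obj==NULL) { return; }
  parent_cleanup(&obj->parent);
  free(obj);
}
```

In a real multi-file build, `child.c` would only ever see `parent_private_t` as an incomplete type, so it could not touch `id` even by accident. The cost of this particular layout is one extra allocation per object.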

Another thing we might observe in my example is that there's a lot of code repetition. We _could_ actually shove all of this into an "X-macro" list and generate template-like functions based on X-macros. But that's a topic for another post...
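Just to give a taste of it, a rough sketch of how the two child classes from the example might be generated from one X-macro list. The macro names `CHILD_LIST`, `CHILD_TYPE`, `CHILD_PRINT` and `CHILD_CREATE` are invented here. This version also uses `%f` for `double` and `size * sizeof(type)` for the byte count, which are my choices for the sketch rather than a copy of the code above:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct parent_t
{
  size_t size;
  void (*print)(const struct parent_t* this);
} parent_t;

/* One row per inherited class: name, element type, printf format. */
#define CHILD_LIST(X)      \
  X(int,    int,    "%d ") \
  X(double, double, "%f ")

/* Generates child_int_t, child_double_t, ... */
#define CHILD_TYPE(name, type, fmt) \
  typedef struct                    \
  {                                 \
    parent_t parent;                \
    type data[];                    \
  } child_##name##_t;

/* Generates print_int, print_double, ... */
#define CHILD_PRINT(name, type, fmt)                               \
  void print_##name (const parent_t* this)                         \
  {                                                                \
    const child_##name##_t* cthis = (const child_##name##_t*)this; \
    for(size_t i=0; i<this->size; i++)                             \
      printf(fmt, cthis->data[i]);                                 \
    puts("");                                                      \
  }

/* Generates child_int_create, child_double_create, ... */
#define CHILD_CREATE(name, type, fmt)                                     \
  child_##name##_t* child_##name##_create (const type* data, size_t size) \
  {                                                                       \
    assert(data != NULL);                                                 \
    size_t byte_size = size * sizeof(type);                               \
    child_##name##_t* obj = malloc(sizeof *obj + byte_size);              \
    if(obj==NULL) { return NULL; }                                        \
    memcpy(obj->data, data, byte_size);                                   \
    obj->parent.size = size;                                              \
    obj->parent.print = print_##name;                                     \
    return obj;                                                           \
  }

/* Expand the list once per generator. */
CHILD_LIST(CHILD_TYPE)
CHILD_LIST(CHILD_PRINT)
CHILD_LIST(CHILD_CREATE)
```

Adding a third class, say for `float`, would then be a matter of adding one more row to `CHILD_LIST`. The downside is that macro-generated code like this is harder to read and to step through in a debugger.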
