Webcam streaming, signals, and Computer Vision

In my previous blog post, I wrote about my webcam library, which I recently used in an SDL application to show streaming pictures in an SDL frame, and it works pretty well. The code can be found in a branch on my GitHub.

When I posted about my webcam library on a social network, someone pointed me to a C++ library that covers all the computer vision stuff I mentioned in the blog post: OpenCV. I took a quick look, and it seems very feature-rich; it is supposed to have a set of functions for accessing the webcam as well. I guess I should read more on motion analysis and object tracking, but first I'll play with some of OpenCV's basic features, such as image processing and image analysis.

Before I continue, there is another thing that stirred my interest in computer vision: Mercedes' self-driving car in their new S series. This technology is based on a bunch of sensors built into the car, feeding the car software with lots of data to process, so that the software can hopefully decide what to do with the car. This is another interesting area of IT I would like to explore!

But first a few things I learned when using the webcam library with the SDL application.

Signal handling

When I test my application, there is always a chance of the program puking out a segmentation fault. This happened quite often while the webcam was streaming: the program aborted mid-execution, leaving the webcam resource open and locked, which meant I could not access it on the next run. I couldn't find a way to reset the locked webcam. I tried unplugging it and plugging it back in, and removing the webcam kernel module and loading it again, but nothing helped; a reboot was all I could do to get access to my webcam again.

So I thought it would be wise to add a handler that closes the webcam instance cleanly, so I can use it again on the next run without having to reboot. This is where signal handling comes into play.

Programs receive signals from the OS when the program is misbehaving, such as the segmentation fault (SIGSEGV) in my example, or when the user hits Ctrl-C, which generates an interrupt signal (SIGINT), or when another process sends a signal to the program using the kill system call. The latter can also easily be done from the command line using the kill command. This command sends a termination signal (SIGTERM) by default, but you can specify the kind of signal you want to send. The program can then react to the received signal, which is done by a signal handler.

A signal handler is registered in the program using a struct sigaction. As I am implementing this in a library, it needs its own private struct sigaction, which is initialized only once, during webcam_open. Here is the relevant snippet:

#include <signal.h>
#include <stdlib.h> /* calloc */
 
static struct sigaction *sa;
 
struct webcam *webcam_open(const char *dev)
{
    /* ... */
 
    /* Install the handler only once, on the first webcam_open */
    if (sa == NULL) {
        sa = calloc(1, sizeof(struct sigaction));
        sa->sa_flags = SA_SIGINFO;
        sigemptyset(&sa->sa_mask);
        sa->sa_sigaction = handler;
        sigaction(SIGSEGV, sa, NULL);
    }
 
    /* ... */
}

The handler assigned to sa_sigaction is the function that will be called when the program receives a SIGSEGV. It is defined as follows:

static void handler(int sig, siginfo_t *si, void *unused)
{
    int i = 0;
    fprintf(stderr, "A segmentation fault occurred. Cleaning up...\n");
 
    for (i = 0; i < 16; i++) {
        if (_w[i] == NULL) continue;
 
        // If webcam is streaming, unlock the mutex and stop streaming
        if (_w[i]->streaming) {
            pthread_mutex_unlock(&_w[i]->mtx_frame);
            webcam_stream(_w[i], false);
        }
        webcam_close(_w[i]);
    }
 
    exit(EXIT_FAILURE);
}

The handler function must have exactly this signature when SA_SIGINFO is set. As you can see, the function needs a way to reach all initialized webcam instances, so I keep them in an array _w, which is private to the library itself. I have initialized this array as an array of 16 NULL entries, which should be enough for most applications. Strictly speaking, only async-signal-safe functions should be called from a signal handler (fprintf and pthread_mutex_unlock are not on that list), but since the process is about to exit anyway, this best-effort cleanup will do. The webcam_open function also needs to put the instances in this array, so that the signal handler can access them.

static webcam_t *_w[16] = {
    NULL, NULL, NULL, NULL,
    NULL, NULL, NULL, NULL,
    NULL, NULL, NULL, NULL,
    NULL, NULL, NULL, NULL
};
 
struct webcam *webcam_open(const char *dev)
{
    /* ... */
 
    int i = 0;
    for(; i < 16; i++) {
        if (_w[i] == NULL) {
            _w[i] = w;
            break;
        }
    }
 
    /* ... */
}

I know this code is not flexible at all, but as most installations will have only one webcam, maybe two (3D vision, woo!), this array of 16 entries will suffice. The only thing that can go wrong here is that a 17th opened webcam would not be registered, so the signal handler would not be able to close it cleanly. That means you won't be able to access the 17th webcam after a segmentation fault.

But who needs 17 webcams? Oh, right, the Mercedes Benz…

What are your thoughts?