byuu's homepage

libco
About

libco is a library written in c++ and assembler to allow cooperative multithreading within otherwise pure c/c++ programs.


Download
libco v0.12 + source (2008-01-07)
License

libco is released to the public domain. Meaning, it has no license. You are free to do whatever you like with libco. Use it in your project, rename it, fork it, sell it if you think you can; whatever you like. No credit is necessary, neither in binary nor source code form, however it is always appreciated. I am especially interested in knowing of any projects that use this library, though this is only a courtesy and is definitely not a requirement.


Help Wanted

I would really like this to be a versatile library that is compatible on as many systems as possible. But to do that, I need help from anyone who is capable of porting this library to any other platforms other than the ones listed above. Since cooperative multithreading is not an inherent feature of c++, this generally means the library will have to be written in assembler on the target architecture. However, if the target operating system has its own cooperative multithreading API, a simple wrapper will be better than nothing.

Most importantly, I am looking for ports to x86-64 and the PowerPC G5 processors. However, I will happily take contributions for any new operating system or processor. I would even appreciate alternate ports to already-supported architectures, such as using GNU pthreads or somesuch API to implement this library on Linux.

I'm also not certain if libco_win32 will work on Windows 64-bit edition, or if libco_x86 will work on Mac OSx86. Confirmation would be very helpful.

If you're interested in helping, please e-mail me at setsunakun0 at hotmail.com.

Basically, the archive above contains one file, libco.h, and one test program, main.cpp. If you can implement the eight required functions in libco.h, and get main.cpp's tests to pass, then I would very much like to add your work into libco. Please note that I cannot accept GPL nor LGPL code, it is incompatible with my license.


Basic Concept of Multithreading

This is a crash course for those totally unfamiliar with cooperative multithreading, or what libco is for.


In computer science, typically each program being run is dubbed a process. However, these programs can optionally contain subprocesses (or other programs) inside of the main program. These are called threads. The term multithread refers to a process with more than one thread. Of threads, there are two basic ways of running a process' threads: pre-emptive multithreading, and cooperative multithreading.


Pre-emptive multithreading is presently the most common type of multithreading. In this setup, two or more "subprograms" run independantly of each other inside of a single program. These subprograms communicate with each other through various forms of messages, and it is up to the operating system to switch between the running threads as needed. The program itself is also capable of controlling when threads resume and pause by sending requests to the operating system. Depending on how many CPU cores exist within a system, the operating system will either run multiple threads at the same time on different CPU cores, or will pause a running thread on a given core, and resume a different thread on it. While a multi-core CPU system allows multiple threads to execute simultaneously, allowing for a major speed advantage, it also adds a significant overhead to program development by requiring programs to carefully avoid conflicts such as both threads accessing the same data at the same time. It also requires a tremendous overhead to switch between threads, as the operating system has to handle this for the application.


A common argument against cooperative multithreading is that the cost of switching threads is actually very small. And in relative terms to modern CPUs, a single thread switch is indeed fast enough to be irrelevant. However, the overhead continues to increase as more and more thread switches occur per second.
The common rebuttal here is that programs should be written to not require more than a few thread switches per second. In an ideal world, sure, this would be no problem. But shouldn't we be designing our programs around what we want them to do, and not around the programming language being used? For programs that must switch threads millions of times a second, we simply must eliminate the need for the operating system to get involved. When the OS gets involved, it forces the programming code to switch between user mode and kernel mode, which is a very slow operation and is ultimately unnecessary for certain types of applications. Luckily, another form of multithreading exists: cooperative multithreading.


Cooperative multithreading shifts the focus of controlling each thread from the operating system to the program. In other words, the OS is completely unaware of cooperative threads within a program, and thusly does not attempt to switch between such threads itself. The program has to do this manually. This also makes running more than one cooperative thread at the same time on a multi-core CPU system impossible. However, it has its advantages. By shifting control to the program, rather than the OS, you gain the ability to fine-tune threading to your programs' particular needs. No longer is it necessary to switch between user mode and kernel mode, for example. A thread switch for cooperative multithreading is several times faster as a result.


Now, why would you want to switch threads millions of times a second? I'm sure there are lots of reasons. Here's mine, and the reason I wrote libco: bsnes, my Super NES emulator, breaks apart the SNES system into logical components. There is a main processor (CPU) and an audio processor (APU), among other things. Basically, on a real SNES system, these two processors run at the exact same time, in parallel, communicating over shared memory. But in c++, natively you are only allowed to have one single thread, so in order to emulate multiple processors, one must "break apart" the code behind each processor to allow small parts of each processor to run at a time, before returning to allow another processor to run a little as well, trying to keep the two in synchronization as closely as possible. A program designed like this is often referred to as a state machine. The problem is that this technique has a lot of overhead. The reason is because these state machines are implemented in software, whereas by using threading, you are basically using the host CPU hardware to handle this. Another drawback is that the accuracy of emulating the two processors is heavily dependant upon how small you break apart each processors' tasks. The smaller you break apart each "subprogram", the slower the program gets, as the state machine grows in complexity. The less you break apart the program, the faster the program gets, and the less accurate communication between the two processors becomes.


Cooperative multithreading solves this problem beautifully. By emulating each processor as a separate program, it is no longer necessary to "break apart" each processor's emulation code. Instead, you simply switch threads immediately, and only, when needed for synchronization. This approach is faster, allows for any level of accuracy desired, and results in much cleaner code. It is also a far more logical way to emulate a system. You are now breaking the program apart into separate threads, just as the SNES breaks apart the entire system into separate processors. The only drawback to cooperative multithreading, is that unlike a state machine, the state cannot be saved between separate instances of the process. For emulation, this means that savestates are virtually impossible. One must weigh their design goals to decide if the tradeoff is worth it. Personally, I thought it was worth giving up savestates to allow for increased accuracy, and cleaner code.


Well, that's the basic synopsis. Please see wikipedia.org for more information on any specific topic mentioned above that is unclear.