...making Linux just a little more fun!
By Pramode C.E
It had been some time since I had wired up a few circuits and watched them on my old 20MHz oscilloscope. I thought it might be interesting to observe how the complex, dynamic nature of a multitasking operating system influences the working of timing sensitive code by viewing signals generated by such programs on the scope. This article describes a few experiments which I did, first with a `normal' 2.4.18 kernel and then with a kernel patched with `real time extensions' provided by the RTAI project. The reader is assumed to have some background in simple kernel programming.
I converted an old Cyrix CPU based system which was lying around unused to my `embedded linux' experimentation platform. The motherboard was taken out of the cabinet - HDD, monitor, keyboard etc were removed - only the Ethernet card with a boot ROM remained - together with an ISA protoboard. This machine boots from a full fledged Linux system situated just a few feet away. This way, I can conduct hardware experiments without worrying too much about damaging expensive hardware. I have the option of booting either a plain 2.4.18 kernel or an RTAI patched one.
Here is a little user space program which, when executed as the superuser, generates a waveform on the parallel port output pins - I can view this on the scope.
    #include <asm/io.h>

    #define ON  100000
    #define OFF (ON*10)

    /* Busy-wait by counting down; the delay depends on CPU speed. */
    void delay(unsigned int i)
    {
        while(i--);
    }

    int main()
    {
        iopl(3);                    /* needs superuser privileges */
        while(1) {
            outb(0xff, 0x378);      /* all output pins high */
            delay(ON);
            outb(0x0, 0x378);       /* all output pins low */
            delay(OFF);
        }
    }

The working of the program is simple. Parallel port pins 2 to 9 act as output pins - they can be accessed through an I/O port whose address is 0x378. When you write 0xff to 0x378, you are turning on (ie, putting about 5V on) all these pins; when you write 0x0, you are turning off the voltage on these pins. The program has to be compiled with the -O2 option and executed as the superuser: the outb works only if the iopl call, which is concerned with setting some privilege levels, succeeds, and for iopl to succeed you have to be the superuser.
On my system, I observe a waveform with an on time of about 2.5 to 2.7ms with my scope set at 1ms/division. The result will surely vary depending on the speed of your processor.
Anybody who has done a basic course in microprocessors will know how to generate `delays' by writing loops. That's exactly what we have done here - absolute kid's stuff.
Just being curious, I log on to another console and run the `yes' command, which generates a continuous stream of the character `y' on the screen. I watch the scope and see that my nice looking signal has gone haywire. The ON and OFF periods have been so lengthened that what I see is mostly a continuous line which keeps on jumping from 0V to 5V.
I do another experiment. I `flood ping' (the ping command with the -f option) the system from a faster machine - again, I notice that the signal on the scope gets wildly disturbed.
The reason behind this behaviour is not at all difficult to see. My program is now competing with another one for CPU cycles. In between executing the delay loop, control can switch to the other program, thereby lengthening the delay perceived by the first program. Flood pinging results in lots of activity within the OS kernel; this too has a detrimental effect on the timing of my program.
The solution to the problem is simple - just don't disturb the program which generates the waveform. Let it have full control of the CPU. Then the question is why have a complex multitasking OS at all? Let's see.
I call the program which generates the signal a `realtime' program. Let's visualize the program as a `task' whose job is to `toggle' the parallel port pins at specified intervals. Suppose the generated waveform is used to control a physical appliance like, say, a servo motor. The rotation of the servo is controlled by the length of the `on period' of a pulse whose total on+off period is somewhere around 20ms; when the ON period varies from 1ms to 2ms, the servo rotates by about 180 degrees. Variation in pulse length can therefore have dramatic effects. My Futaba S2003 servo swings wildly when it is controlled by a program like the one above and that program is perturbed by some other process. A real time program has timing deadlines which it HAS to meet for correct operation. The classical solution to designing control applications has been to use dedicated microcontrollers and digital signal processors. But with PC hardware becoming so cheap, a very wide range of applications is cropping up where we require the ability to run programs with sensitive timing requirements correctly and, at the same time, also do things like communicate over the network, visualize data with graphical interfaces, log data on to secondary storage etc. These are jobs where timing deadlines are not an issue, so called `non-realtime' jobs.
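To get a feel for how sensitive the servo is, here is a minimal sketch of the pulse arithmetic described above. It assumes the angle varies linearly with the on time between 1ms and 2ms, which is a simplification; the helper names are made up for illustration and the constants are just the rough figures quoted in the text, not values from the servo's data sheet.

```c
#define SERVO_MIN_ON_NS   1000000L  /* 1ms on time: one end of travel */
#define SERVO_MAX_ON_NS   2000000L  /* 2ms on time: about 180 degrees away */
#define SERVO_FRAME_NS   20000000L  /* total on+off period, roughly 20ms */

/* Hypothetical helper: on time in ns for an angle of 0..180 degrees.
   Out-of-range angles are clamped. Assumes a linear servo response. */
long servo_on_time_ns(int angle_deg)
{
    if (angle_deg < 0)
        angle_deg = 0;
    if (angle_deg > 180)
        angle_deg = 180;
    return SERVO_MIN_ON_NS +
           (SERVO_MAX_ON_NS - SERVO_MIN_ON_NS) * angle_deg / 180;
}

/* Off time that completes the 20ms frame for the same angle. */
long servo_off_time_ns(int angle_deg)
{
    return SERVO_FRAME_NS - servo_on_time_ns(angle_deg);
}

/* Angle error in degrees if the on period is stretched by late_ns. */
double servo_angle_error_deg(long late_ns)
{
    return 180.0 * late_ns / (SERVO_MAX_ON_NS - SERVO_MIN_ON_NS);
}
```

Under these assumptions, an on period that is stretched by only 100 microseconds (100000ns) already corresponds to an error of about 18 degrees, which is why even small scheduling delays show up as visible twitching.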
If it is possible to modify the Linux kernel in some way so that the timing constraints imposed on some tasks (which are created and executed in some special manner) are always met, even in the presence of other `non-realtime' tasks, then we have an ideal situation. We will see a bit later in this article that not one, but many such solutions are available.
Besides the fact that the timing of the program depends a lot on other activities going on in the system, we are burning up CPU cycles by executing a tight loop (also, on a complex microprocessor like the Pentium, it is difficult to compute delays by counting instructions). Why not let the program sleep?
By using functions like `nanosleep', we instruct the Operating System to put our process to sleep, to be woken up at a specified time. But, here again, there is a possibility that our process does not wake up and execute at the desired time because the Operating System was too busy executing some action in kernel mode (say, processing TCP/IP packets, or doing disk I/O) or another process got scheduled just before the kernel woke up our process.
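The late wake-up is easy to measure from user space. The following is a minimal sketch, not part of the original experiments: the helper name `oversleep_ns` is made up, and it assumes a system that provides `clock_gettime` with `CLOCK_MONOTONIC`. It sleeps for the requested time with `nanosleep` and reports how much later than requested the process actually resumed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <time.h>

/* Hypothetical helper: sleep for request_ns (must be >= 0) and return
   how many nanoseconds later than requested we resumed execution.
   Returns -1 if a system call fails. */
long long oversleep_ns(long request_ns)
{
    struct timespec start, end, req, rem;

    req.tv_sec = request_ns / 1000000000L;
    req.tv_nsec = request_ns % 1000000000L;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;

    while (nanosleep(&req, &rem) != 0) {
        if (errno != EINTR)
            return -1;
        req = rem;      /* interrupted by a signal: sleep the remainder */
    }

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;

    return (end.tv_sec - start.tv_sec) * 1000000000LL
         + (end.tv_nsec - start.tv_nsec) - request_ns;
}
```

Calling `oversleep_ns(1000000)` in a loop, first on an idle machine and then while `yes` or a flood ping is running, should show the overshoot growing and becoming erratic under load. The exact figures depend entirely on the kernel and the hardware.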
What if we implement our signal generation code as a kernel space module?
    #include <linux/module.h>
    #include <linux/fs.h>
    #include <linux/param.h>
    #include <asm/uaccess.h>
    #include <asm/io.h>

    static char *name = "foo";
    static int major;

    #define ON  100000
    #define OFF (ON*10)

    void delay(unsigned int i)
    {
        while(i--);
    }

    /* Reading from the device never returns: it loops in kernel mode. */
    static int foo_read(struct file* filp, char *buf, size_t count, loff_t *f_pos)
    {
        while(1) {
            outb(0xff, 0x378);
            delay(ON);
            outb(0x0, 0x378);
            delay(OFF);
        }
        return 0;
    }

    static struct file_operations fops = {
        read: foo_read,
    };

    int init_module(void)
    {
        major = register_chrdev(0, name, &fops);
        printk("Registered, got major = %d\n", major);
        return 0;
    }

    void cleanup_module(void)
    {
        printk("Cleaning up...\n");
        unregister_chrdev(major, name);
    }

Executing an infinite loop in the kernel has disastrous consequences as far as user processes are concerned. No user process will be able to execute until control comes out of kernel mode (this is the way the OS is designed). What we would like to have is a situation where realtime as well as nonrealtime processes coexist.
Although user space processes now can't disturb our program, it is still possible to generate interrupts on the network card by flood pinging. As interrupts are serviced even when kernel code is executing, the waveform displayed on the scope starts jumping around as usual.
It is possible to go to sleep within the kernel - this prevents the system from getting locked up - but then it does not solve our problem of peaceful coexistence of realtime as well as non realtime code.
What if we slide in a `nano kernel' between Linux and our hardware? This kernel would be in control of both Linux as well as a set of `real time tasks'. Linux will be treated as a very low priority task which will be executed only when no other higher priority `real time' tasks are executing. The control of interrupts would be in the hands of this specialized kernel - requests by Linux to disable interrupts will be treated in such a way that interrupts don't really get disabled - only Linux won't be able to see those interrupts - the real time tasks will still be able to execute their interrupt handlers without too much delay.
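The trick of letting Linux `disable' interrupts without really disabling them can be illustrated with a toy user space model. This is only a sketch of the idea and not how RTAI or RTLinux is actually implemented; every name in it is invented for the illustration. Linux's request to disable interrupts merely clears a software flag, real time handlers always run at once, and interrupts meant for Linux are remembered and replayed when Linux enables interrupts again.

```c
#define NR_IRQS 16

static int linux_irqs_enabled = 1;      /* the flag Linux thinks it controls */
static unsigned int pending_for_linux;  /* bitmask of IRQs held back */
static int rt_handled[NR_IRQS];         /* times the real time handler ran */
static int linux_handled[NR_IRQS];      /* times Linux saw the interrupt */

/* Hand every held-back interrupt to Linux and clear it from the mask. */
static void deliver_pending_to_linux(void)
{
    int irq;

    for (irq = 0; irq < NR_IRQS; irq++) {
        if (pending_for_linux & (1u << irq)) {
            pending_for_linux &= ~(1u << irq);
            linux_handled[irq]++;
        }
    }
}

/* Linux asks to disable interrupts: only the software flag changes. */
void soft_cli(void)
{
    linux_irqs_enabled = 0;
}

/* Linux enables interrupts again: replay whatever was held back. */
void soft_sti(void)
{
    linux_irqs_enabled = 1;
    deliver_pending_to_linux();
}

/* A hardware interrupt arrives at the nano kernel. */
void hardware_irq(int irq)
{
    if (irq < 0 || irq >= NR_IRQS)
        return;
    rt_handled[irq]++;                  /* real time handler runs at once */
    pending_for_linux |= 1u << irq;     /* remember it for Linux */
    if (linux_irqs_enabled)
        deliver_pending_to_linux();
}
```

In this model, an interrupt that arrives between `soft_cli()` and `soft_sti()` is counted immediately in `rt_handled` but shows up in `linux_handled` only after `soft_sti()`. Because the pending set is a bitmask, repeated interrupts on the same line while Linux has them `disabled' are delivered to Linux only once.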
This novel concept, introduced by Dr. Victor Yodaiken, led to the birth of RTLinux. Many other universities and research institutions have attempted their own implementations - one of the most promising (and completely non proprietary) being RTAI, developed by researchers at Dipartimento di Ingegneria Aerospaziale - Politecnico di Milano (DIAPM).
RTAI can be obtained from here. There are two major components:
Let's look at an RTAI program which creates a waveform on the parallel port output pins:
    #include <linux/module.h>
    #include <rtai.h>
    #include <rtai_sched.h>

    #define LPT1_BASE 0x378
    #define STACK_SIZE 4096
    #define TIMERTICKS 1000000 /* 1 milli second */

    static RT_TASK my_task;

    static void fun(int t)
    {
        unsigned char c = 0x0;
        while(1) {
            outb(c, LPT1_BASE);
            c = ~c;
            rt_task_wait_period();
        }
    }

    int init_module(void)
    {
        RTIME tick_period, now;

        rt_set_periodic_mode();
        rt_task_init(&my_task, fun, 0, STACK_SIZE, 0, 0, 0);
        tick_period = start_rt_timer(nano2count(TIMERTICKS));
        now = rt_get_time();
        rt_task_make_periodic(&my_task, now + tick_period, 2*tick_period);
        return 0;
    }

    void cleanup_module(void)
    {
        stop_rt_timer();
        rt_busy_sleep(10000000);
        rt_task_delete(&my_task);
    }
Let's look at the general idea before we examine specific details. First, we need a `task' to do anything useful. The `task' is simply a C function. The structure of most of our tasks would be something like this - perform some action, sleep for some time, perform some action again, repeat. One way to sleep is to call `rt_task_wait_period' - the question is how long do we sleep? We sleep for a certain fixed `period', which will be a multiple of a base `tick'. The system 8254 timer can be programmed to generate interrupts at a rate of say 1KHz (ie, 1000 times a second). The RTAI scheduler takes scheduling decisions at each tick - if we set the period of our task to be `2 ticks' and if the interval between each tick is 1ms, then the scheduler will wake up our task after 2ms.
We start with `init_module'. We first configure the timer as a `periodic timer' (another mode is available). The `rt_task_init' function accepts the address of an object of type RT_TASK, the address of our function and a stack size, besides some other values. Some kind of `initialization' is performed and information is stored in the object of type RT_TASK which can be later used for identifying this particular task.
Our TIMERTICKS value is 1000000 nano seconds (1 milli second). The nano2count function converts this time into internal `count units'. The timer is started with a tick period equal to 1ms (which is what the `start_rt_timer' function does).
What remains is to start our task and set its period (remember, the period is used by rt_task_wait_period to set the time at which the task is to be awakened). We set the period to 2 ticks and instruct the scheduler to start it at the next tick itself.
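To make the idea of `count units' concrete, here is a small sketch of the kind of conversion involved. It is an illustration under stated assumptions, not RTAI's code: it assumes that in periodic mode the count units are ticks of the 8254 timer, whose nominal input clock is about 1.19318MHz, and the function names are made up. The real nano2count may round differently, and other timer modes may use a different time base.

```c
#define PIT_HZ 1193180ULL   /* nominal 8254 input clock in Hz (assumed) */

/* Hypothetical stand-in for the idea behind nano2count:
   nanoseconds to 8254 counts, rounded down. */
unsigned long long ns_to_pit_counts(unsigned long long ns)
{
    return ns * PIT_HZ / 1000000000ULL;
}

/* The reverse conversion: 8254 counts to nanoseconds, rounded down. */
unsigned long long pit_counts_to_ns(unsigned long long counts)
{
    return counts * 1000000000ULL / PIT_HZ;
}
```

With these assumptions, a request for 1000000ns becomes 1193 counts, and 1193 counts correspond to slightly less than 1ms. A tick period expressed in counts can therefore differ a little from the nanosecond value that was asked for.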
The body of our task is very simple - it simply writes a value to the parallel port output pins, complements the variable which stores that value and waits for the next period (which will be 2ms). After waking up, it performs the same sequence. Again and again and again... The end result is we observe a waveform on the scope whose on time is 2ms and off time also is 2ms.
I observed the waveform first on an unloaded system. I then resorted to flood pinging the system. The waveform on the scope remained steady. The promise that RTAI gives us is that it will always run Linux as a very low priority task - Linux will execute only when no real time tasks are to be serviced. A real time task waking up will result in control getting transferred to it immediately (of course, there are delays involved in preempting whatever is being done now, activating the real time scheduler and transferring control back to the task which just woke up - these delays also need not be constant). That is why we are able to observe a fairly steady signal even under load.
Here is a code segment which demonstrates the use of a function - `rt_sleep':
    #define LPT1_BASE 0x378
    #define STACK_SIZE 4096
    #define TIMERTICKS 1000000 /* 1 milli second */
    #define ON_TIME 3000000    /* 3 milli seconds */
    #define OFF_TIME 1000000   /* 1 milli second */

    static RT_TASK my_task;
    RTIME on_time, off_time;

    static void fun(int t)
    {
        while(1) {
            outb(0xff, LPT1_BASE);
            rt_sleep(on_time);
            outb(0x0, LPT1_BASE);
            rt_sleep(off_time);
        }
    }

    int init_module(void)
    {
        RTIME tick_period, now;

        rt_set_periodic_mode();
        rt_task_init(&my_task, fun, 0, STACK_SIZE, 0, 0, 0);
        tick_period = start_rt_timer(nano2count(TIMERTICKS));
        on_time = nano2count(ON_TIME);
        off_time = nano2count(OFF_TIME);
        now = rt_get_time();
        rt_task_make_periodic(&my_task, now + tick_period, 2*tick_period);
        return 0;
    }

The basic tick period is 1ms. Our on and off times are integral multiples of this period (3ms and 1ms respectively). An invocation of `rt_sleep(on_time)' will put the task to sleep - it gets woken up after 3 tick periods. It does some action and again goes to sleep for one tick period.
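The waveform to expect on the scope follows directly from the two sleep times. The short sketch below works out the arithmetic; the struct and function names are invented for the illustration, and it ignores the small time spent executing the outb calls and any scheduling latency.

```c
/* Ideal shape of a pulse train with the given on and off times. */
struct waveform {
    long period_ns;         /* on time plus off time */
    double frequency_hz;    /* pulses per second */
    double duty_percent;    /* share of the period spent high */
};

/* Hypothetical helper; both times must be positive. */
struct waveform describe_waveform(long on_ns, long off_ns)
{
    struct waveform w;

    w.period_ns = on_ns + off_ns;
    w.frequency_hz = 1e9 / w.period_ns;
    w.duty_percent = 100.0 * on_ns / w.period_ns;
    return w;
}
```

For an on time of 3ms and an off time of 1ms, `describe_waveform(3000000, 1000000)` gives a period of 4ms, a frequency of 250Hz and a duty cycle of 75 percent.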
It may be required to transmit data from a user space non realtime program to an RTAI task (and back). This is very easily done with the use of fifos. For example, an RTAI task may be generating a PWM (pulse width modulated) signal and you may have to control the width from user space. The RTAI installation creates several device files under /dev/ going by the name rtf0, rtf1 etc. The user program identifies each fifo by its name while the RTAI task does it with numbers 0, 1, 2 etc.
    #include <linux/module.h>
    #include <linux/errno.h>
    #include <rtai.h>
    #include <rtai_sched.h>
    #include <rtai_fifos.h>

    #define STACK_SIZE 4096
    #define COMMAND_FIFO 0
    #define FIFO_SIZE 1024

    int fifo_handler(unsigned int fifo)
    {
        char buf[100];
        int r;

        r = rtf_get(COMMAND_FIFO, buf, sizeof(buf)-1);
        if (r <= 0)
            return r;
        rt_printk("handler called for fifo %d, get = %d\n", fifo, r);
        buf[r] = 0;
        rt_printk("data = %s\n", buf);
        return 0;
    }

    int init_module(void)
    {
        /* Create fifo, set handler */
        rtf_create(COMMAND_FIFO, FIFO_SIZE);
        rtf_create_handler(COMMAND_FIFO, fifo_handler);
        return 0;
    }

    void cleanup_module(void)
    {
        printk("cleaning up...\n");
    }

In `init_module', we create a fifo and set `fifo_handler' as a function to be invoked when somebody writes to the fifo. The `rtf_get' function reads data from the fifo. After compiling and loading the module, if we do something like:
    echo hello > /dev/rtf0

we will see the handler getting invoked and reading data from the fifo.
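A user space program can do the same thing as the echo command with ordinary file operations. Here is a minimal sketch; the function name `send_command' is made up, and the only thing it assumes about the fifo is that it can be opened and written like any other device file.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write the string msg to the file at path.
   Returns the number of bytes written, or -1 on failure. */
long send_command(const char *path, const char *msg)
{
    size_t len = strlen(msg);
    ssize_t n;
    int fd = open(path, O_WRONLY);

    if (fd < 0)
        return -1;
    n = write(fd, msg, len);
    close(fd);
    return (long)n;
}
```

Calling `send_command("/dev/rtf0", "hello\n")` should have the same effect as the echo command shown earlier. A real program controlling a PWM width would more likely send a number or a small structure than a text string, but the mechanism is the same.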
If you are interested in general real time programming issues, you should start with the excellent Real Time and Embedded Guide written by Herman Bruyninckx. RTAI programming is explained in detail in the RTAI manual and RTAI programming guide available for download from the project home page.
An Operating System which provides support for deterministic execution of tasks with stringent timing requirements is just one part of the realtime system design landscape. After playing with RTAI for a few days, I realized that this (realtime design) is something which can't be done as a hobby by a novice like me - you have to invest a lot of time, effort and patience in understanding your system thoroughly (hardware as well as software) and using the tools well. But then, that shouldn't stop you from experimenting and having a little bit of fun!
I am an instructor working for IC Software in Kerala, India. I would have loved
becoming an organic chemist, but I do the second best thing possible, which is
play with Linux and teach programming!