We assume that the reader is familiar with Linux as a user and, possibly, as a system administrator, especially of individual systems. In this article, we outline how this operating system is designed.
One can be interested in the design of Linux for four reasons: out of intellectual curiosity, to understand how an operating system is designed, to participate in the development of the Linux kernel, or to draw inspiration from it to develop another system. Our goal here is above all to satisfy the first two motivations.
The advantage of a Linux-based operating system is that its sources are public: beyond the main principles, we can examine how the system's functionality is implemented from these sources, and experiment by changing that implementation.
Basically, an operating system performs three independent tasks: it loads programs one after another, it emulates a virtual machine, and it manages resources. Let us look at each of these tasks in turn.
The very first microcomputers were supplied without an operating system; they had only one program, a BASIC interpreter contained in ROM. With the appearance of cassette players and then, more reliably, floppy drives, this began to change: if a bootable floppy was placed in the drive, the program it contained was executed; otherwise the BASIC interpreter took over.
With this approach, each program change required restarting the microcomputer with the desired program's floppy disk in the drive. This was the case with the Apple II, for example.
Microcomputers were later supplied, optionally, with an operating system. This system, contained on floppy disk or in ROM, displayed a prompt on the screen. The boot diskette could then be replaced by a diskette containing the desired program: by typing the name of the program on the command line and pressing the Return key, the program was loaded and executed. When it finished, a new program could be loaded without restarting the system. This made it possible, for example, to write a text with a word processor and then to call another program to print it.
Managing a given computer system, for example the IBM PC, is a priori done in machine language. This is primitive and cumbersome, especially when it comes to input/output. Very few programs would be developed if each programmer had to know the workings of, say, a hard disk and all the errors that can appear while reading a block. A way therefore had to be found to free programmers from the complexity of the hardware: wrap the hardware in a layer of software that manages the entire system. The programmer is presented with an API that corresponds to a virtual machine, easier to understand and to program.
Consider for example the programming of hard-disk I/O using the IDE controller of the IBM PC.
The IDE controller has 8 main commands which all consist of loading between 1 and 5 bytes in its registers. These commands read and write data, move the drive arm, format the drive, and initialize, test, restore, and recalibrate the controller and drives.
The fundamental commands are read and write, each requiring seven parameters grouped into six bytes. These parameters specify such things as the address of the first sector to read or write, the number of sectors to read or write, or whether to attempt to correct errors. At the end of the operation, the controller returns 14 status and error fields grouped into 7 bytes.
Most programmers don't want to worry about programming hard drives. They want a simple, high-level abstraction: the disk contains named files; each file can be opened for reading or writing, read or written, and finally closed. The virtual-machine part of an operating system hides the hardware from the programmer's view and provides a simple, pleasant view of named files that can be read and written.
Modern computers consist of processors, memories, clocks, disks, monitors, network interfaces, printers, and other peripherals that can be used by several users at the same time. The work of the operating system consists of ordering and controlling the allocation of processors, memory, and peripherals among the programs that request them.
Imagine what would happen if three programs running on a computer simultaneously tried to print their results to the same printer. The first printed lines might come from program 1, the following ones from program 2, then from program 3, and so on: the result would be complete disorder. The operating system avoids this potential chaos by redirecting the print output to a buffer file on disk. When a program finishes printing, the operating system can then print one of the buffered files. Meanwhile, another program may continue to generate output without realizing that it is not (yet) sending it to the printer.
Most modern operating systems allow multitasking: a computer can, while running a user's program, read data from a disk or display results on a terminal or a printer. We speak of a multi-tasking, or multi-programmed, operating system in this case.
The fundamental notion of multitasking operating systems is that of the process. The notion of program does not suffice: nothing prevents the same program from being executed several times at once; you may want, for example, two emacs windows or two gv windows in order to compare texts.
A process is a running instance of a program.
A process is represented by a program (the code), but also by its data and by the parameters indicating where it stands, allowing it to continue if it is interrupted (execution stack, program counter, etc.). We speak of the program's environment.
A process is also called a task in the case of Linux.
Most multitasking operating systems run on a computer with a single microprocessor. At any given moment the processor is actually executing only one program, but the system switches from one program to another, executing each for a few tens of milliseconds; this gives users the impression that all the programs are running at the same time. This is called a time-sharing system.
Some people refer to this very rapid switching of the processor from one program to another as pseudo-parallelism, to differentiate it from the true parallelism that occurs in hardware when the processor works at the same time as certain I/O devices.
Conceptually, each process has its own virtual processor. Of course, the real processor switches between several processes. But, to fully understand the system, it is better to think of a set of processes that run in (pseudo-) parallelism rather than the allocation of the processor between different processes. This fast switching is called multi-programming.
The graph below shows four processes running at the same time. The central graph shows an abstraction of this situation: the four programs become four independent processes, each with its own flow of control (i.e. its own program counter). In the last graph, it can be seen that over a fairly long time interval all the processes have progressed, but at any given moment only one process is active.
As we have already mentioned, the program's data is not sufficient to determine a process. A whole series of environment variables must be provided as well: the files on which it operates, where the program counter stands, and so on. These environment variables are needed for two reasons:
· The first is that two processes can use the same code (two emacs windows, for example) while their open files differ and their program counters are not in the same place …
· The second is due to the multitasking character, implemented as pseudo-parallelism.
Periodically, the operating system decides to interrupt a running process in order to start the execution of another process. When a process is temporarily suspended in this way, it should later be able to return to exactly the state it was in when it was suspended. All the information it needs must therefore be saved somewhere while it is being put on hold. If it has, for example, several open files, the positions in these files must be stored.
The list of environment variables depends on the operating system in question, and even its version. It is found in the process descriptor.
In many operating systems, each process has its own memory space or allocation area, which is not accessible to other processes. We are talking about the address space of the process.
Since the processor switches between processes, the speed of execution of a process is not uniform and will likely vary if the same processes are executed again. Processes should therefore make no assumptions about timing.
Consider an I/O process that turns on the motor of a floppy drive, executes a loop 1000 times to let the floppy's speed stabilize, and then requests that the first record be read. If the processor was also allocated to another process during the execution of the loop, the I/O process may be reactivated too late, i.e. after the first record has already passed under the read head.
When a process needs to measure durations with precision, that is, when certain events absolutely must occur after a few milliseconds, special measures must be taken to ensure this. We then use timers, as we will see later in this article.
However, most processes are unaffected by the multi-programming of the processor and the differences in execution speed that exist between them.
A multi-user system is capable of executing, in a (pseudo-)concurrent and independent manner, applications belonging to several users.
“Concurrent” means that applications can be active at the same time and compete for access to different resources such as processor, memory, hard drives … “Independent” means that each application can do its job without worrying about what other users’ applications are doing.
A multi-user system is necessarily multi-tasking, but the converse is false: the MS-DOS operating system is single-user and single-tasking; MacOS 6.1 and Windows 3.1, for instance, are single-user but multi-tasking; Unix and Windows NT are multi-user.
As with multi-tasking, multi-user operation is obtained by assigning time slices to each user. Naturally, switching between applications slows each one down and affects the response time perceived by users.
When allowing multi-use, operating systems must put in place a certain number of mechanisms:
· An authentication mechanism to verify the identity of the user;
· A protection mechanism against user programs that are erroneous, and could block other applications running on the system, or malicious, and could disrupt or spy on the activities of other users;
· An accounting mechanism to limit the volume of resources allocated to each user.
In a multi-user system, each user has a private space on the machine: typically a quota of disk space for saving files, a private mailbox, and so on. The operating system must ensure that the private part of a user's space can only be seen by its owner. It must, in particular, ensure that no user can exploit a system application to violate the private space of another user.
Each user is identified by a unique number, called the User ID, or UID. Usually, only a limited number of people are allowed to use a computer system. When one of these users starts a work session, the operating system prompts them for a username and password. If the user does not respond with valid information, access is denied.
In order to share material selectively with others, each user can be a member of one or more user groups. A group is likewise identified by a unique number, called the group identifier, or GID. Each file, for example, is associated with exactly one group. Under Unix, it is possible to grant read and write access to the owner of a file alone, read access to its group, and no access at all to other users.
A multi-user operating system has a particular user called the superuser, or supervisor. The system administrator logs in as the superuser to manage user accounts and to perform maintenance tasks such as backups and program updates. The superuser can do almost anything, because the operating system never applies the protection mechanisms to him; these only affect the other users, called ordinary users. The superuser can, in particular, access all the files on the system and interfere with the activity of any running process. He cannot, however, access I/O ports for which the kernel makes no provision.
The operating system has a number of routines (subroutines). The most important ones belong to the kernel. This is loaded into RAM when the system is initialized and contains many procedures necessary for the proper functioning of the system. The other, less critical routines are called utilities.
The kernel of an operating system consists of four main parts: the task (or process) manager, the memory manager, the file manager, and the I / O device manager. It also has two auxiliary parts: the operating system loader and the command interpreter.
On a time-sharing system, one of the most important parts of the operating system is the scheduler. On a single-processor system, it divides time into slices. Periodically, the task manager decides to interrupt the current process and to start (or resume) the execution of another, either because the first has exhausted its time slice or because it is blocked (waiting for data from one of the peripherals).
Controlling several parallel activities is hard work. This is why the designers of operating systems have constantly, over the years, improved the parallelism model to make it easier to use.
Some operating systems only allow non-preemptible processes, which means that the task manager is only invoked when a process voluntarily gives up the processor. But the processes of a multi-user system must be preemptible.
Memory is an important resource that must be managed with care. By the end of the 1980s, the smallest microcomputer had ten times more memory than the IBM 7094, the most powerful computer of the early sixties. But the size of programs is growing just as fast as that of memories.
Memory management is the responsibility of the memory manager. This must know the free parts and the occupied parts of the memory, allocate memory to the processes that need it, recover the memory used by a process when it ends and handle the back and forth (swapping or paging) between the disk and the main memory when the latter cannot contain all the processes.
As we have already said, a fundamental task of the operating system is to hide the specifics of disks and other input/output devices and to provide the programmer with a pleasant, easy-to-use model. This is done through the notion of file.
Controlling input/output (I/O) devices is one of the primary functions of an operating system. It must send commands to the peripherals, catch interrupts, and handle errors. It must also provide a simple, easy-to-use interface between the peripherals and the rest of the system, one that is, as far as possible, the same for all peripherals, i.e. independent of the device used. The I/O code makes up a significant part of a complete operating system.
Many operating systems provide a level of abstraction that allows users to perform input/output without going into hardware detail. This level of abstraction makes each device appear as a special file, which allows input/output devices to be treated like files. This is the case with Unix.
Nowadays, in general, when the computer (PC compatible or Mac) is turned on, it executes software called the BIOS (Basic Input Output System), located at a well-determined address and contained in ROM. This software initializes the devices, loads a sector from a disk, and executes what is placed there. When designing an operating system, one places on this sector the operating-system loader or, more exactly, the pre-loader, since the content of a single sector is insufficient for the loader itself.
The design of the loader and pre-loader is essential, even if these are not explicitly part of the operating system.
The operating system itself is the code that defines system calls. System programs such as text editors, compilers, assemblers, linkers, and command interpreters are not part of the operating system. However, the command interpreter (shell) is often considered to be part of it.
In its most rudimentary form, the command interpreter executes an infinite loop that displays a prompt (thereby showing that something is expected), reads the name of the program entered by the user, and executes it.
After looking at an operating system from the outside (from the point of view of the interface presented to the user and the programmer), we will examine its internal workings.
Andrew Tanenbaum calls a monolithic (self-contained) system an operating system that is a collection of procedures, each of which can call any other procedure at any time, noting that it is the most common (rather chaotic) organization.
To build the object code of the operating system, one compiles all the procedures, or the files that contain them, then binds them together with a link editor. In a monolithic system there is no information hiding: every procedure is visible to all the others, in contrast to structures made up of modules or program units, in which information is local to the modules and which can only be reached through well-defined entry points.
MS-DOS is an example of such a system.
In many operating systems, there are two modes: kernel mode and user mode. The operating system boots in kernel mode, which allows it to initialize devices and to set up the service routines for system calls, then switches to user mode. In the latter, direct access to the peripherals is impossible: one must use what are called system calls to access what the system provides. The kernel receives such a call, checks that it is a valid request (in particular from the point of view of access rights), executes it, then returns to user mode. The kernel-mode code can only be changed by recompiling the kernel; even the superuser acts in user mode.
Unix and Windows (at least since Windows 95) are such systems. This explains why we cannot program everything on such a system.
Modern microprocessors help in setting up such systems. This is the origin of the protected mode of Intel microprocessors since the 80286: there are several privilege levels, with hardware-checked, and no longer merely software-checked, rules for moving from one level to another.
The preceding systems can be considered as two-layer systems and can be generalized into multi-layer systems: each layer is based on the one immediately below it.
The first system to use this technique was the THE system, developed at the Technische Hogeschool Eindhoven (hence its name) in the Netherlands by Dijkstra (1968) and his students. Multics, the operating system from which Unix drew its inspiration, was also a layered system.
Tanenbaum’s Minix operating system, shown below, which inspired Linux, is a four-layer system:
Layer 1, the lowest, handles interrupts and traps and provides the upper layers with a model made up of independent sequential processes that communicate by means of messages. The code in this layer has two major functions: the first is dealing with interrupts and traps; the second is related to the message mechanism. The part of this layer that deals with interrupts is written in assembly language; the other functions of the layer, as well as the upper layers, are written in C language.
Layer 2 contains device drivers, one for each type of device (disk, clock, terminal, etc.). It also contains a specific task, the system task. All of the Layer 2 tasks and all of the Layer 1 code form a single binary program, called the kernel. Layer 2 tasks are completely independent although they are part of the same object program: they are selected independently of each other and communicate by sending messages. They are grouped together in a single binary code to facilitate the integration of Minix into two-mode machines.
Layer 3 contains two managers that provide services to user processes. The memory manager (MM) handles all the Minix system calls related to memory management, such as fork(), exec(), and brk(). The file system (FS) takes care of the file-system calls, such as read(), mount(), and chdir().
Layer 4 finally contains all user processes: command interpreters, text editors, compilers, and user-written programs.
Linux takes inspiration from this layering, although it officially has only two layers: kernel mode and user mode.
Microkernel based operating systems have only a few functions, usually a few synchronization primitives, a simple task manager, and a process-to-process communication mechanism. System processes run on top of the microkernel to implement other functions of an operating system, such as memory allocation, device managers, system call managers, etc.
Tanenbaum’s Amoeba operating system was one of the first microkernel systems.
This type of operating system promised a lot; unfortunately, such systems have proved slower than monolithic ones, because of the cost of passing messages between the various layers of the operating system.
However, microkernels have theoretical advantages over monolithic systems. For example, they require a modular approach from their designers, since each layer of the system is a relatively independent program which must interact with the other layers via a clean and well-established software interface. In addition, a microkernel based system can be ported fairly easily to other architectures since all the hardware-dependent components are generally located in the code of the micro-kernel. Finally, microkernel based systems tend to use RAM better than monolithic systems.
A module is an object file whose code can be linked to (and removed from) the kernel during execution. This object code is usually made up of a set of functions that implements a file system, a device driver, or some other high level functionality of an operating system. The module, unlike the outer layers of a microkernel-based system, does not run in a specific process. Rather, it is executed in kernel mode on behalf of the current process, like any statically linked function in the kernel.
The notion of module is a feature of the kernel that offers many of the theoretical advantages of a microkernel without penalizing performance. Some of the benefits of modules are:
· A modular approach: since each module can be linked and unlinked while running the system, programmers had to introduce very clear software interfaces allowing access to the data structures managed by the modules. This makes the development of new modules easier.
· Platform independence: even if it must be based on well-defined characteristics of the hardware, a module does not depend on a particular platform. Thus, a disk driver based on the SCSI standard works as well on an IBM compatible computer as on an Alpha.
· Economical use of memory: a module can be inserted into the kernel when the functionalities it provides are required and removed from it when they are no longer required. Moreover, this mechanism can be made transparent to the user since it can be carried out automatically by the kernel.
· No performance loss: once inserted into the kernel, the code of a module is equivalent to the code statically linked to the kernel. Therefore, no message passing is necessary when the functions of the module are invoked. Of course, a small performance loss is caused by loading and removing modules. However, this loss is comparable to that caused by the creation and destruction of the process of a microkernel-based system.
The interface between the operating system and user programs is made up of a set of “extended instructions” provided by the operating system, known as system calls.
System calls create, destroy, and use various software objects managed by the operating system, the most important of which are processes and files.
Since processes run independently of each other under pseudo-parallelism, it is sometimes necessary to provide information to a running process. How does the operating system do this? A mechanism similar to the software interrupts of microprocessors, called the signal, has been devised.
Consider, for example, the sending of a message. To prevent message loss, the receiver should send an acknowledgment as soon as it receives a part of the message (of a determined size); that part is sent again if the acknowledgment does not arrive within a determined time. To set up such a transmission, a process proceeds as follows: it sends a part of the message, asks its operating system to warn it when a certain time has elapsed, then checks whether it has received the acknowledgment and, if not, sends the part again.
When the operating system sends a signal to a process, the signal causes the temporary suspension of the work in progress, the saving of the registers on the stack, and the execution of a particular procedure for processing the received signal. At the end of the signal-processing procedure, the process is restarted in the state it was in just before the signal was received.