Abstract: Many people believe that software engineering is impossible in the Linux kernel, or not needed at all. Sophisticated software architecture can be built in C, but that alone is not enough: drivers, too, need to be properly designed as software.

Original link: https://mairacanal.github.io/does-the-linux-kernel-need-software-engineering/

Disclaimer: This article is a CSDN translation, please indicate the source for reprinting.


Author |  MAÍRA CANAL

Translator | Sun Ruonan Editor | Tu Min

Produced | CSDN (ID: CSDNnews)

Does the Linux kernel require software engineering? For those looking for the short answer: yes, it does.

Now, we can dive into a more elaborate answer.

Software engineering is a systematic approach to software development that covers the definition, implementation, testing, management, optimization, and improvement of software across its life cycle. Thinking about software from this perspective means considering requirements, design, construction, testing, and maintenance.

Software engineering improves the maintainability, scalability, and security of software. It also makes it easier to add tests to the software stack. The result is more robust software.

A glossary of some software engineering terms:

Maintainability: A measure of how easy it is to fix or improve a software artifact. Once the product is complete, it is imperative to continue fixing bugs, optimizing features, and refactoring code to avoid future problems.

Scalability: A measure of how easy it is to expand or shrink a software artifact.

Testability: A measure of how easy it is to test a software artifact.

Many people may think that software engineering is impossible in the Linux kernel, or not needed at all. It seems to me that these beliefs come from two assumptions:

  • Software engineering cannot be applied to C: software engineering is sometimes associated only with object-oriented programming languages.

  • Drivers do not need software engineering: since a driver is theoretically finite in scope, we don't have to worry about its scalability and maintainability.

If we follow these beliefs, we may end up with poorly designed code. As the amount of poorly designed code grows, we will see duplicated code, dead code, oversized functions, and bugs.

Worst of all: when a huge codebase contains a lot of bad code, maintenance becomes difficult and software quality keeps getting worse.

So, let's first understand why these two beliefs are false.

Software Engineering in C

You might ask: How do you use fancy design patterns to avoid code duplication and achieve nice polymorphism when there are no classes?

Design patterns may be easier to understand and implement in C++, where you can build a class hierarchy to separate the different blocks of functionality, with language support available out of the box. But these concepts can be translated into C.

Although C is a structured language, it can be used to write object-oriented programs. In this sense, libraries and structs are your main allies for software engineering in C. Additionally, function pointers let you implement polymorphism in C.

For example, if you want to write a simple queue in C, you could declare it like this:

#ifndef QUEUE_H_
#define QUEUE_H_

typedef struct Queue Queue;
struct Queue {
    int *buffer;
    int head;
    int size;
    int tail;
    /* "virtual" functions, dispatched through pointers */
    int (*isFull)(Queue* const me);
    int (*isEmpty)(Queue* const me);
    int (*getSize)(Queue* const me);
    void (*insert)(Queue* const me, int k);
    int (*remove)(Queue* const me);
};

/* Constructor and destructors */
void Queue_Init(Queue* const me,
                int (*isFullFunction)(Queue* const me),
                int (*isEmptyFunction)(Queue* const me),
                int (*getSizeFunction)(Queue* const me),
                void (*insertFunction)(Queue* const me, int k),
                int (*removeFunction)(Queue* const me));
void Queue_Cleanup(Queue* const me);

/* Operations */
int Queue_isFull(Queue* const me);
int Queue_isEmpty(Queue* const me);
int Queue_getSize(Queue* const me);
void Queue_insert(Queue* const me, int k);
int Queue_remove(Queue* const me);

Queue *Queue_Create(void);
void Queue_Destroy(Queue* const me);

#endif

Note that this layout enables polymorphism, because I can create a new struct that inherits from the queue, like:

typedef struct CachedQueue CachedQueue;
struct CachedQueue {
    Queue *queue;

    /* new attributes */
    char name[80];
    int numberElementsOnDisk;

    /* aggregation in subclass */
    Queue *outputQueue;

    /* inherited virtual function */
    int (*isFull)(CachedQueue* const me);
    int (*isEmpty)(CachedQueue* const me);
    int (*getSize)(CachedQueue* const me);
    void (*insert)(CachedQueue* const me, int k);
    int (*remove)(CachedQueue* const me);

    /* new virtual functions */
    void (*flush)(CachedQueue* const me);
    int (*load)(CachedQueue* const me);
};

This is polymorphism in C. If you want to learn more about it, I recommend the book "Design Patterns for Embedded Systems in C" by Bruce Powel Douglass, a world-renowned author and speaker. It is well worth reading.
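To show how the dispatch actually happens at run time, here is a minimal, self-contained sketch. It is a simplified variation on the declarations above, not the book's code: the buffer is a fixed array, and the hypothetical CountingQueue embeds its base struct as the first member (instead of holding a pointer to it) so that a pointer to the subclass can be converted to a pointer to the base.

```c
#define QUEUE_CAPACITY 4

typedef struct Queue Queue;
struct Queue {
    int buffer[QUEUE_CAPACITY];
    int head;
    int size;
    /* "virtual" operations: callers always go through these pointers */
    int (*insert)(Queue *const me, int k);
    int (*remove)(Queue *const me, int *out);
};

/* Base behaviour: a bounded FIFO. Returns 0 on success, -1 on failure. */
static int Queue_insert(Queue *const me, int k)
{
    if (me->size == QUEUE_CAPACITY)
        return -1;
    me->buffer[(me->head + me->size) % QUEUE_CAPACITY] = k;
    me->size++;
    return 0;
}

static int Queue_remove(Queue *const me, int *out)
{
    if (me->size == 0)
        return -1;
    *out = me->buffer[me->head];
    me->head = (me->head + 1) % QUEUE_CAPACITY;
    me->size--;
    return 0;
}

static void Queue_Init(Queue *const me)
{
    me->head = 0;
    me->size = 0;
    me->insert = Queue_insert;
    me->remove = Queue_remove;
}

/* Hypothetical "subclass": counts the inserts that were refused. */
typedef struct CountingQueue {
    Queue base;   /* must stay the first member for the cast below */
    int rejected;
} CountingQueue;

static int CountingQueue_insert(Queue *const me, int k)
{
    CountingQueue *self = (CountingQueue *)me; /* valid: base is first member */
    int ret = Queue_insert(me, k);             /* reuse the parent behaviour */

    if (ret != 0)
        self->rejected++;
    return ret;
}

static void CountingQueue_Init(CountingQueue *const me)
{
    Queue_Init(&me->base);
    me->rejected = 0;
    me->base.insert = CountingQueue_insert;    /* override one operation */
}

/* Client code only knows Queue; the pointer decides which insert runs. */
static int fill(Queue *q, int n)
{
    int accepted = 0;

    for (int i = 0; i < n; i++)
        if (q->insert(q, i) == 0)
            accepted++;
    return accepted;
}
```

Calling fill() with the base of a CountingQueue runs the overriding insert without fill() knowing about the subclass, which is the same effect a virtual method gives in C++.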

As you can see, fancy software architecture can actually be done in C. Linux has some nifty abstractions built on these concepts, such as the Virtual File System (VFS). Some subsystems, such as DRM, also provide nice library-style APIs.

But sometimes this care does not carry over to driver implementations. This brings us to the next point: drivers need to be properly designed as software.


Drivers should be designed as software


Here I must admit that I am personally biased against the VBA library. Over the last month, as part of my GSoC project (GSoC is a global program run by Google that connects students with open source, free software, and technology-related organizations, allowing students to contribute code and get paid), I have been writing unit tests for this library. I was struck (and not in a good way) by the amount of code duplication and the sheer size of the functions.

This isn't meant to bash the AMDGPU code (AMDGPU is AMD's fully open-source unified kernel driver for GPUs on Linux): AMD has done a great job for the free software community, and having an open-source driver from a major graphics vendor makes an incredible difference. I'm also sure this problem exists in other parts of the kernel, so I believe it is a good topic for discussion.

Let's start with the premise that "drivers are finite": you take the datasheet, implement every function of the hardware, and the driver is complete.

However, hardware companies usually don't develop a single product with unique characteristics. They usually create a product line, and sometimes the product line has "children" (sub-products); that is, the product is upgraded iteratively.

Product lines have "children"... that sounds like a wonderful case of inheritance for OOP programmers.

So, if you have a product line, do you create a file for each product? For a product that adds a few new features, would you copy the previous driver and modify a few hundred lines of code? This doesn't seem like a good choice, for the following reasons:

  • Duplicated code: you are duplicating the bugs along with the code;
  • Test coverage: the same tests have to be repeated for every copy;
  • Maintainability: in the maintenance phase of a project, the less code the better.

You see, it all boils down to maintainability.

As a good example of code reuse, you can look at the IIO subsystem. Hardware manufacturers such as Maxim and Analog Devices Inc often have chips that share the same register map or the same functionality. Instead of creating a driver for each chip, developers write one driver and add the compatible device IDs to its device table. For example, you can check out the Maxim MAX1027 ADC driver, which is compatible with the MAX1027, MAX1029, MAX1031, MAX1227, MAX1229, and MAX1231. So we have one driver for six devices: that's great for maintainability!

In this case, if a bug is found, I can make one change and send one patch. The maintainer only has to review it once, and everything goes smoothly.
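The idea behind a device table can be sketched in plain userspace C. This is not the kernel's API or the actual MAX1027 driver code: the struct, the table, and both functions are hypothetical names, and the channel counts and resolutions are included for illustration and should be checked against the datasheets. The point is that the per-chip differences live in data, while the logic exists only once.

```c
#include <stddef.h>
#include <string.h>

/* Everything that differs between chips is described as data. */
struct chip_info {
    const char *name;
    int num_channels;
    int resolution_bits;
};

/* One table instead of six copies of the driver (illustrative values). */
static const struct chip_info chip_table[] = {
    { "max1027",  8, 10 },
    { "max1029", 12, 10 },
    { "max1031", 16, 10 },
    { "max1227",  8, 12 },
    { "max1229", 12, 12 },
    { "max1231", 16, 12 },
};

/* Look up a chip by name; returns NULL if the chip is not supported. */
static const struct chip_info *chip_lookup(const char *name)
{
    size_t n = sizeof(chip_table) / sizeof(chip_table[0]);

    for (size_t i = 0; i < n; i++)
        if (strcmp(chip_table[i].name, name) == 0)
            return &chip_table[i];
    return NULL;
}

/* A single shared code path: largest raw value the ADC can report. */
static int chip_max_raw(const struct chip_info *info)
{
    return (1 << info->resolution_bits) - 1;
}
```

A bug in chip_max_raw() is fixed once and the fix reaches all six entries; supporting a seventh compatible chip would mean adding one line to the table.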

Now compare that with the DML folder in AMD Display Core, more precisely the display_mode_vba files for DCN20 and DCN21. These product lines are very similar, so we should be able to reuse a lot of code.

However, if you examine this directory, you can see that there are three different files: display_mode_vba_20.c, display_mode_vba_20v2.c, and display_mode_vba_21.c.

You can check the differences between two of these files with:

$ diff drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20v2.c

Most of the code is the same: some functions don't differ by a single line! This can have a big impact on maintainability.

At this point, if a bug is found, I need to fix it in three places. I may not even know that the code is duplicated, so I might fix the bug in one file and leave the others untouched. Then another developer may find the same bug and send a fix to the maintainer, who has to review it all over again. That is a lot of rework!

If I had to guess why AMD copied and pasted the code multiple times, I'd point to another maintainability problem: the functions are huge! Some functions in the VBA files exceed a thousand lines.

The sheer size of the functions in a VBA file means that if you want to change just a few lines for a new product, you need to copy and paste the entire function.

Ideally, following the principles of the book Clean Code, functions should be small and do one thing. I know this doesn't work 100% of the time, but I can't find a good reason for a function to be so huge and take dozens of arguments.

Beyond readability, these huge functions also consume a lot of stack space.

Huge functions really hurt the readability, understandability, and testability of the code. And because they have many side effects, they are hard to reuse, which makes code duplication hard to avoid.

A glossary of some software engineering terms:

Readability: A measure of how easy a software artifact is to read.

Understandability: A measure of how easy a software artifact is to understand.

But this is not a dead end for the AMDGPU DML code. The AMDGPU driver works perfectly well on Linux, and refactoring is always an option.


We can think about software!
At this point, one might conclude that since the AMDGPU driver is open source, we can simply fix these issues in the code. But it is not safe to tear the code apart and rewrite it in a single patchset, because the AMDGPU driver must keep working on Linux.

One way to approach this is to use unit tests to make sure the refactoring preserves behavior. But throughout my GSoC project I have noticed that it is practically impossible to write unit tests for thousand-line functions. Huge functions have many side effects, and testing every single one of them is not feasible.

So maybe unit testing is not the only way forward for display_mode_vba. We can start by breaking the functions into smaller, independent parts, as this will help create better tests, improve readability, and reduce stack usage.

Then, with smaller functions, it becomes more feasible to share code across the DCNs and create a common interface for them.

This refactoring enables the design patterns I talked about earlier, making DML more maintainable and readable. We can consider using inheritance: a base library that DCN20 extends, with DCN21 in turn extending DCN20. This is how those three large files become much smaller ones.
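As a rough sketch of what that inheritance chain could look like in C, consider the following. Everything here is hypothetical: the struct names, the two steps, and the formulas are placeholders chosen to keep the example small, and none of it reflects the real DML calculations. The shape is what matters: the top-level routine is written once against an interface, and each generation only supplies the steps that differ.

```c
/* Hypothetical inputs; the real DML structures are far larger. */
struct mode_params {
    int width;
    int height;
    int bytes_per_pixel;
};

/* The common interface: one function pointer per overridable step. */
struct vba_ops {
    long (*frame_bytes)(const struct mode_params *p);
    long (*line_overhead)(const struct mode_params *p);
};

/* Shared top-level routine, written once for every generation. */
static long vba_total_bytes(const struct vba_ops *ops,
                            const struct mode_params *p)
{
    return ops->frame_bytes(p) + ops->line_overhead(p) * p->height;
}

/* Base library: default implementation of every step. */
static long base_frame_bytes(const struct mode_params *p)
{
    return (long)p->width * p->height * p->bytes_per_pixel;
}

static long base_line_overhead(const struct mode_params *p)
{
    (void)p;
    return 0;
}

static const struct vba_ops base_ops = {
    .frame_bytes = base_frame_bytes,
    .line_overhead = base_line_overhead,
};

/* "DCN20" extends the base: inherits one step, overrides the other. */
static long dcn20_line_overhead(const struct mode_params *p)
{
    (void)p;
    return 64; /* placeholder value */
}

static const struct vba_ops dcn20_ops = {
    .frame_bytes = base_frame_bytes,      /* inherited */
    .line_overhead = dcn20_line_overhead, /* overridden */
};

/* "DCN21" extends DCN20: reuses the parent's step and adds to it. */
static long dcn21_line_overhead(const struct mode_params *p)
{
    return dcn20_line_overhead(p) + 16; /* placeholder value */
}

static const struct vba_ops dcn21_ops = {
    .frame_bytes = base_frame_bytes,      /* inherited via DCN20 */
    .line_overhead = dcn21_line_overhead, /* overridden */
};
```

With this layout, a fix in base_frame_bytes() or vba_total_bytes() reaches every generation at once, and a new generation only has to describe how it differs from its parent.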

This refactoring can be done little by little:
  • Unify parameters: do not pass parameters by copy if they already live in a shared structure. The stack will appreciate this change;
  • Split functions: make functions smaller and more readable;
  • Write tests for the new, smaller functions;
  • Create a generic interface: this is where the design patterns come in.

This approach allows us to do safer refactoring in cases where unit testing is not feasible. This does not mean that no errors will be introduced in the process, but having a structured plan will help us avoid them.
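The first three steps can be illustrated with a small sketch. The names below are hypothetical and the calculation is the generic pixel clock formula (total width times total height times refresh rate), not code taken from DML. A plain assert stands in for whatever test framework the project uses.

```c
#include <assert.h>

/* Step 1: parameters travel together in one structure and are passed
 * by pointer, instead of being copied one by one onto the stack. */
struct timing {
    int h_active;
    int h_blank;
    int v_active;
    int v_blank;
    int refresh_hz;
};

/* Step 2: small functions that each do one thing and have no side
 * effects, so they are easy to read and to reuse. */
static int total_width(const struct timing *t)
{
    return t->h_active + t->h_blank;
}

static int total_height(const struct timing *t)
{
    return t->v_active + t->v_blank;
}

static long long pixel_clock_hz(const struct timing *t)
{
    return (long long)total_width(t) * total_height(t) * t->refresh_hz;
}

/* Step 3: once the pieces are small, each one can be tested alone. */
static void test_pixel_clock(void)
{
    const struct timing t = {
        .h_active = 1920, .h_blank = 280,
        .v_active = 1080, .v_blank = 45,
        .refresh_hz = 60,
    };

    assert(total_width(&t) == 2200);
    assert(total_height(&t) == 1125);
    assert(pixel_clock_hz(&t) == 148500000LL);
}
```

Each helper takes a single pointer regardless of how many fields the structure grows to hold, so adding an input later does not change any function signature.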
