Swoole asynchronous process service system

Now that we have covered processes, threads, and coroutines, let's look at how Swoole handles processes asynchronously, and at the role threads play in Swoole.

Two operating modes of Server

In fact, the earlier test code already used these two modes, although we did not point it out at the time. Whether the service is HTTP, TCP, or something else, the Server constructor accepts a third parameter. By default it is set to SWOOLE_PROCESS, so we usually leave it out. The other mode is SWOOLE_BASE.

SWOOLE_BASE mode

This is the traditional asynchronous non-blocking mode, and it works the same way as Nginx and Node.js. In Node.js, all requests are handled by a single main thread, and I/O operations are handed off to asynchronous threads, which avoids the cost of creating, destroying, and switching between threads. When an I/O task completes, an observer schedules the specified callback: the completed event is placed at the end of the event queue and waits for the event loop to pick it up.

Explaining this thoroughly could easily fill a long series of its own. But if you have learned a little Node.js before, it is actually easy to understand, because the Server code we wrote earlier looks much like code written in Node.js: register an event, then write the business logic in the callback that runs when the event fires. This is the asynchronous non-blocking mode implemented through callbacks, with time-consuming operations placed in the callback functions. Interested readers can pick up the basics of Node.js; with a JavaScript foundation, an introductory tutorial does not take long to get through.

Swoole's SWOOLE_BASE mode works on the same principle. When a connection request comes in, all Workers compete for it, and one Worker process ends up establishing the connection with the client directly. From then on, all data sent and received on that connection goes directly through that Worker, without being forwarded by a Reactor thread in the main process.

SWOOLE_PROCESS mode

In SWOOLE_PROCESS mode, all client connections are established with the main process. The internal implementation is more complicated and relies heavily on inter-process communication and process management mechanisms. It suits scenarios with very complex business logic, because processes can easily communicate with each other.

In SWOOLE_PROCESS mode, Workers do not compete for connections, and a connection is not tied to one fixed Worker. Instead, connections are accepted by the main process, and the remaining event handling is handed over to different Workers. Once an event reaches a Worker, it is again handled through callbacks, and from there on things are basically the same as in BASE mode. In other words, what the Worker does is the biggest difference from SWOOLE_BASE. This mode separates connections from data requests, so Workers do not become unbalanced because some connections carry a lot of data and others very little. Concretely, this mode adds a process that manages threads, along with the very important Reactor thread, both of which we explain in detail below.

The similarities and differences, advantages and disadvantages of the two modes

SWOOLE_BASE mode is a good choice when clients do not need to interact with each other, as with an ordinary HTTP service. Note, however, that in BASE mode the Server methods other than send() and close() cannot be executed across processes. Underneath, the two modes do not differ much: both use asynchronous I/O and differ only in how connections are handled. Each Worker in SWOOLE_BASE can be regarded as a combination of a Reactor thread and a Worker process from SWOOLE_PROCESS.

We can test it.

// Swap which of the next two lines is commented out to switch modes.
$http = new Swoole\Http\Server('0.0.0.0', 9501, SWOOLE_BASE);

//$http = new Swoole\Http\Server('0.0.0.0', 9501, SWOOLE_PROCESS);

$http->set([
    'worker_num'=>2
]);

$http->on('Request', function ($request, $response) {
    // Dump the request and response objects to the server console.
    var_dump(func_get_args());
    $response->end('Test started');
});

$http->start();

By swapping which of the two constructor lines is commented out, we can observe how the service runs in each mode using the pstree -p command.

In SWOOLE_BASE mode, the output is like this.

picture

We can see two child processes, 1630 and 1631, under process 1629. Now switch to SWOOLE_PROCESS mode and check the processes again.

picture

Clearly, things look different here. Under the Master process 1577 there are two entries: 1578, the Manager process, and 1579, which represents the thread group. Under the Manager process 1578 are the two Worker processes, 1580 and 1581.

Similarly, if we test with the code from [Swoole Tutorial 2.5] asynchronous task https://mp.weixin.qq.com/s/bQt9Ul-H34eUYw2-Qu-N0g, we can see that asynchronous Task workers are also started as processes, as shown in the figure. (Note that the test code sets task_worker_num but not worker_num, so there is 1 Worker plus 4 TaskWorker processes, and in PROCESS mode a thread group is added as well.)

picture

By now you have probably noticed that SWOOLE_BASE has one fewer level in its process hierarchy than SWOOLE_PROCESS. In SWOOLE_BASE mode there is no Master process, only a Manager process, and no thread group split off from a Master. We will talk about Master, Manager, Reactor, and TaskWorker in the next section.

BASE mode is simpler and therefore less error-prone, and it has no IPC overhead. PROCESS mode incurs two rounds of IPC overhead, because the Master process and the Worker processes have to communicate over Unix sockets. IPC (inter-process communication) here means communication between processes on the same host. It generally takes two forms. One is the Unix socket, the most common form, which you have seen as files like php-fcgi.sock or mysql.sock. The other is sysvmsg, a message queue provided by Linux, which is used less often.

Of course, BASE mode has its own problems, which follow from the characteristics described above. Because each connection is bound to one Worker, if that Worker dies, all of its connections are closed. In addition, because Workers compete for connections, load cannot be balanced across them: some connections carry a lot of data and others very little, so some Workers end up heavily loaded while others sit at very low load. Finally, if a callback blocks, the server degenerates into synchronous mode, and the TCP backlog queue can easily fill up. Still, as mentioned above, for stateless connections such as HTTP that need no interaction between clients, BASE works without problems and is quite efficient. That said, since Swoole uses SWOOLE_PROCESS by default, SWOOLE_PROCESS is the more recommended mode.
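To see how connections land on Workers in each mode, a handler can report which Worker served the request. The following is only a sketch: it reuses port 9501 and worker_num from the test above, and the response format is made up for illustration. It relies on the worker_id property of the Server object and the fd property of the request object.

```php
<?php
// Sketch: report which Worker process handled each request.
// Change $mode to SWOOLE_PROCESS to compare the two modes.
$mode = SWOOLE_BASE;

$http = new Swoole\Http\Server('0.0.0.0', 9501, $mode);
$http->set(['worker_num' => 2]);

$http->on('Request', function ($request, $response) use ($http) {
    $response->header('Content-Type', 'text/plain');
    $response->end(sprintf(
        "worker_id=%d pid=%d fd=%d\n",
        $http->worker_id, // index of the Worker running this callback
        getmypid(),       // operating system PID of that Worker
        $request->fd      // connection identifier assigned by Swoole
    ));
});

$http->start();
```

Requesting the server several times, each time over a new connection, and comparing the reported worker_id values gives a rough picture of how work is distributed in the chosen mode.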

Various process issues

Next, let's look at the processes and threads that have come up repeatedly above.

Master process

This is a multi-threaded process used to manage threads. It creates the Master thread and the Reactor threads, as well as the heartbeat detection thread, the UDP packet receiving thread, and so on.

Reactor thread

We have mentioned this thread more than once. It is created in the Master process and is responsible for client TCP connections, network I/O, protocol handling, and sending and receiving data. It does not execute any PHP code; it buffers, splices, and splits the data sent by TCP clients into complete request packets. Why can't we manipulate threads from Swoole code? PHP itself does not support multi-threading, and Swoole is a multi-process application framework. The threads here are wrapped in C/C++ under the hood, so Swoole does not give us an interface for manipulating them directly. But we have already learned that coroutines themselves run on top of threads, and coroutines are already the mainstream direction, so in Swoole, process management and coroutines are the focus of our study.

Worker process

A Worker accepts the request packets delivered by the Reactor thread and executes the PHP callback functions that process the data. When processing is complete, the generated response data is sent back to the Reactor thread, which sends it on to the client. Worker processes can run in asynchronous non-blocking mode or synchronous blocking mode, and they run as multiple processes.

TaskWorker process

A TaskWorker accepts tasks delivered by a Worker process and, after processing a task, returns the result to the Worker process. It works in synchronous blocking mode and also runs as multiple processes.

Manager process

This process is mainly responsible for creating and recycling Worker/TaskWorker processes. In effect, it is a process manager.
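The roles above can be observed from PHP through the Server lifecycle callbacks. The following is a minimal sketch, assuming PROCESS mode; port 9502 and the output format are arbitrary choices for illustration. It prints the PID of each process as it starts, so the output can be compared with what pstree -p shows.

```php
<?php
// Sketch: print the PID of the Master, Manager, and Worker processes.
$server = new Swoole\Http\Server('127.0.0.1', 9502, SWOOLE_PROCESS);
$server->set(['worker_num' => 2]);

// Runs in the Master process once the server has started.
$server->on('Start', function ($server) {
    echo "Master  pid={$server->master_pid}\n";
});

// Runs in the Manager process.
$server->on('ManagerStart', function ($server) {
    echo "Manager pid={$server->manager_pid}\n";
});

// Runs once in every Worker process.
$server->on('WorkerStart', function ($server, $workerId) {
    echo "Worker #{$workerId} pid=" . getmypid() . "\n";
});

// An HTTP server needs a Request callback before it can start.
$server->on('Request', function ($request, $response) {
    $response->end("ok\n");
});

$server->start();
```

The Reactor threads do not appear here because they never run PHP code; they are only visible as threads inside the Master process.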

Their relationship

First, let's look at two pictures from the official website, and then use them to walk through the examples the official documentation gives.

picture

The first picture is mainly about the functions of the Manager and the Master. We will focus on the second picture.

picture

In this picture, we can see that the Manager process creates, recycles, and manages the Worker processes and Task processes at the bottom. They are created with the operating system's fork() function. If you have studied operating systems, this should be familiar: fork() is the function that creates a child process. The child processes communicate through Unix sockets or a message queue. In BASE mode there is no Master process, so each Worker process takes on the role of the Reactor itself, receiving request data and sending responses.
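The fork() plus Unix socket pattern described above can be tried in plain PHP, without Swoole. This is only an illustration of the general technique, not Swoole's internal implementation. It assumes a Unix-like system with the pcntl extension available on the command line, and the message format is made up.

```php
<?php
// Sketch: a parent process forks a child and talks to it over a Unix socket pair.
$pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
if ($pair === false) {
    exit("could not create socket pair\n");
}
[$parentSock, $childSock] = $pair;

$pid = pcntl_fork();
if ($pid === -1) {
    exit("fork failed\n");
}

if ($pid === 0) {
    // Child: plays the part of a worker that handles one request.
    fclose($parentSock);
    $request = fgets($childSock);               // read one line from the parent
    fwrite($childSock, 'handled: ' . $request); // reply on the same socket
    fclose($childSock);
    exit(0);
}

// Parent: sends a request and waits for the reply.
fclose($childSock);
fwrite($parentSock, "hello\n");
$reply = trim(fgets($parentSock));
fclose($parentSock);
pcntl_waitpid($pid, $status); // reap the child so it does not become a zombie

echo $reply, PHP_EOL; // prints "handled: hello"
```

The socket pair is created before the fork so that both processes inherit one end each, which is why each side closes the end it does not use.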

If you are using PROCESS mode, the Master process creates various threads. Recall the thread group shown in curly braces in the pstree output, which does not appear in BASE mode. These threads handle requests and responses. Needless to say, a multi-threaded approach is more efficient for handling connection requests, which is a concrete example of the trade-offs between the two modes discussed above. The Reactor threads then communicate with the Workers through Unix sockets to forward data to the Workers and receive data from them.

We can use the example from the official website to illustrate the relationship between them: the Reactor is Nginx, and the Worker is PHP-FPM. Reactor threads process network requests asynchronously and in parallel, then forward them to the Worker processes for handling. Reactor and Worker communicate through Unix sockets.

In PHP-FPM applications, a task is often posted asynchronously to a queue such as Redis, and some PHP processes are started in the background to process these tasks. This scenario should be familiar to everyone. For example, after an order is placed we need to send notifications and emails. In a native PHP environment, we put this kind of work into a queue and let a background script consume the queue and send the messages. The TaskWorker provided by Swoole is a more complete solution that integrates task delivery, queuing, and management of the PHP processes that handle tasks. Asynchronous task processing can be implemented very simply through the API provided by the underlying layer. In addition, a TaskWorker can return a result to the Worker after the task has run.
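The order notification scenario can be sketched with the Task API as follows. This is a minimal example under stated assumptions: port 9503, the order_id and action fields, and the sleep(1) that stands in for sending an email are all hypothetical. The Task callback uses the four-argument form, which applies when task coroutines are not enabled.

```php
<?php
// Sketch: a Worker hands slow work to a TaskWorker and answers immediately.
$server = new Swoole\Http\Server('127.0.0.1', 9503);
$server->set([
    'worker_num'      => 1,
    'task_worker_num' => 2, // number of TaskWorker processes to start
]);

$server->on('Request', function ($request, $response) use ($server) {
    // Deliver the task to the TaskWorker pool; this call does not wait for it.
    $taskId = $server->task(['order_id' => 1001, 'action' => 'send_mail']);
    $response->end("queued task {$taskId}\n");
});

// Runs in a TaskWorker process, where blocking is acceptable.
$server->on('Task', function ($server, $taskId, $srcWorkerId, $data) {
    sleep(1); // placeholder for slow work such as sending an email
    // The return value is passed back to the Worker's Finish callback.
    return "order {$data['order_id']}: {$data['action']} done";
});

// Runs in the Worker that delivered the task, once the task has returned.
$server->on('Finish', function ($server, $taskId, $result) {
    echo "task {$taskId} finished: {$result}\n";
});

$server->start();
```

Compared with a Redis queue plus a background script, delivery, queuing, and the consumer processes are all handled by the server itself here.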

Here is a more everyday analogy. Suppose the Server is a factory. The Reactor is the sales department, which takes customer orders. The Worker is the factory worker: when sales receives an order, the Worker produces what the customer wants. The TaskWorker can be thought of as administrative staff, who take chores off the Worker's hands so the Worker can concentrate on production.

The content above needs to be understood well. For those of us who have worked with the traditional PHP-FPM model for years, changing our way of thinking is not easy. With the official examples, though, I believe everyone can make the shift quickly: ordinary request handling corresponds to our Nginx + PHP-FPM combination, and Tasks handle asynchronous operations much like a message queue does.

Swoole service running process

Finally, let's take a look at the running process of the overall Swoole service, which is also a picture from the official website.

picture

In fact, this flow chart closely matches the flow of our code. Define a Server object, use the set() method to set parameters, use the on() method to register the various callbacks, and finally call start() to start the service. After the service starts, the Manager process is created. In PROCESS mode, a Master process is created first, and the Manager is then created under the Master. Next, the Manager creates and manages the number of Worker processes specified by worker_num. Workers, in turn, can deliver work to asynchronous Task processes.

The Reactor thread sits at the outermost layer. It handles requests and responses, listens for the corresponding events, and communicates with the Workers. In BASE mode there is no Reactor thread; everything is handled by the Worker, and each connection is bound to a single Worker.

Summary

Another dizzying article. The most important takeaway today is really a change in thinking: we provide service applications through multiple processes. This model is not unfamiliar. Nginx + PHP-FPM works this way too, since PHP-FPM is itself a process manager, although its efficiency and implementation differ somewhat from Swoole's. The JIT introduced in PHP 8 is implemented through OPcache, and like Swoole it keeps most of the code loaded in memory, which avoids PHP-FPM's problem of loading everything again on each request and improves performance.

Please take the time to digest this. As always, if there are any mistakes or omissions above, corrections and criticism are welcome at any time; my own level is limited, after all.

Test code:

https://github.com/zhangyue0503/swoole/blob/main/3.Swoole%E8%BF%9B%E7%A8%8B/source/3.2Swoole%E5%BC%82%E6%AD%A5%E8%BF%9B%E7%A8%8B%E7%B3%BB%E7%BB%9F.php

Reference documentation:

https://wiki.swoole.com/#/server/init

https://wiki.swoole.com/#/learn?id=process-diff