Introduction to RPC Concepts
Introduction to RPC
RPC (Remote Procedure Call) is the idea of requesting a service from a program on a remote computer over a network, without needing to know the underlying network technology.
The main goal of RPC is to make distributed applications easier to build: it provides powerful remote call capabilities without giving up the semantic simplicity of local calls. To achieve this, **the RPC framework needs to provide a transparent call mechanism so that users do not have to explicitly distinguish between local and remote calls**.
RPC uses the client/server (C/S) model: the requester is the client and the service provider is the server. First, the calling process on the client sends a call message containing the procedure parameters to the server process and waits for a reply. On the server side, the process sleeps until a call message arrives. When one arrives, the server extracts the procedure parameters, computes the result, sends the reply message, and then waits for the next call message. Finally, the calling process on the client receives the reply, extracts the procedure result, and resumes execution.
Example
The concept above may still be a bit vague on its own, so let's look at the RPC demo that ships with Python.
from SimpleXMLRPCServer import SimpleXMLRPCServer  # server side (Python 2 module name)
from xmlrpclib import ServerProxy  # client side: import ServerProxy from xmlrpclib
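The module names above are from Python 2. As a rough Python 3 equivalent, the sketch below uses `xmlrpc.server` and `xmlrpc.client` from the standard library. The `add` function and the use of an ephemeral local port are assumptions made for illustration, not part of the original demo.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """Procedure exposed to remote callers (hypothetical example function)."""
    return a + b


# Server side: bind to an ephemeral local port (port 0) and register the function.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(add, "add")
port = server.server_address[1]

# Serve in a background thread so the client below can run in the same script.
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the proxy makes the remote call look like a local method call.
with ServerProxy(f"http://127.0.0.1:{port}/") as proxy:
    result = proxy.add(2, 3)
print(result)

server.shutdown()
server.server_close()
```

In a real deployment the server and client would be separate programs on different machines; they share one script here only so the example can run on its own.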
RPC architecture
A typical RPC deployment includes components such as service discovery, load balancing, fault tolerance, network transmission, and serialization. Among them, the “RPC protocol” specifies how the program performs network transmission and serialization.
RPC core functions (RPC protocol)
- Client: the service caller.
- Client Stub: stores the server's address information, packs the client's request parameters into a network message, and sends it to the server over the network.
- Server Stub: receives and unpacks the request message sent by the client, then calls the local service to process it.
- Server: the actual provider of the service.
- Network Service: the underlying transport, which can be TCP or HTTP.
RPC
- The service consumer (Client) invokes the service as if making a local call.
- After receiving the call, the Client Stub serializes (assembles) the method, parameters, and related information into a message body that can be transmitted over the network.
- The Client Stub finds the remote service address and sends the message to the server over the network.
- The Server Stub receives the message and decodes it (deserialization).
- The Server Stub calls the local service according to the decoded result.
- The Server's local service performs the business processing.
- The result is returned to the Server Stub.
- The Server Stub serializes the result.
- The Server Stub sends the result over the network to the consumer.
- The Client Stub receives the message and decodes it (deserialization).
- The service consumer gets the final result.
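The flow above can be sketched in a single process. This is only an illustration under assumptions: JSON stands in for the serialization format, `fake_network` stands in for real network transmission, and `multiply`, `SERVICES`, `client_stub`, and `server_stub` are hypothetical names.

```python
import json


def multiply(a, b):
    """The real service implementation that lives on the server."""
    return a * b


# Server side: table of callable services, keyed by method name.
SERVICES = {"multiply": multiply}


def server_stub(request_bytes):
    """Decode the request, call the local service, and encode the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = SERVICES[request["method"]](*request["params"])
    return json.dumps({"result": result}).encode("utf-8")


def fake_network(request_bytes):
    """Stand-in for the network: hands the bytes straight to the server stub."""
    return server_stub(request_bytes)


def client_stub(method, *params):
    """Encode the call, send it, and decode the reply."""
    request_bytes = json.dumps({"method": method, "params": params}).encode("utf-8")
    response_bytes = fake_network(request_bytes)
    return json.loads(response_bytes.decode("utf-8"))["result"]


# The consumer makes what looks like a local call and gets the final result.
product = client_stub("multiply", 6, 7)
print(product)
```

The consumer only ever touches `client_stub`; everything about encoding and transport is hidden behind it, which is the transparency the stubs are meant to provide.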
RPC protocol implementation
The core functionality of RPC is mainly composed of five modules. If you want to implement RPC yourself, the simplest approach is to cover three technical points:
- Service addressing
- Serialization and deserialization of data streams
- Network transmission
**Service addressing**
Service addressing can use Call ID mapping. In a local call, the function body is located directly through a function pointer. In a remote call that is not possible, because the two processes have completely different address spaces.
So in RPC, every function must have its own ID, and that ID must be unique across all processes.
The client must attach this ID when making a remote procedure call, and both the client and the server need to maintain a table mapping functions to Call IDs.
When the client needs to make a remote call, it looks up the table, finds the corresponding Call ID, and passes it to the server. The server looks up its own table to determine which function the client wants to call and then executes that function's code.
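A minimal sketch of Call ID mapping follows. The two tables, the ID values, and the functions `get_user` and `ping` are hypothetical; the network hop is replaced by a direct function call to keep the example short.

```python
def get_user(user_id):
    return {"id": user_id, "name": f"user-{user_id}"}


def ping():
    return "pong"


# Server side: Call ID -> function. Both sides must agree on these IDs.
SERVER_TABLE = {1: get_user, 2: ping}

# Client side: function name -> Call ID.
CLIENT_TABLE = {"get_user": 1, "ping": 2}


def handle_call(call_id, args):
    """Server: look up the function for a Call ID and execute it."""
    func = SERVER_TABLE.get(call_id)
    if func is None:
        raise LookupError(f"unknown Call ID: {call_id}")
    return func(*args)


def remote_call(name, *args):
    """Client: translate the name to a Call ID, then 'send' it to the server."""
    call_id = CLIENT_TABLE[name]
    # A real client would send call_id and args over the network here.
    return handle_call(call_id, args)


print(remote_call("ping"))
print(remote_call("get_user", 7))
```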
Implementation: a service registry.
To invoke a service, the caller first queries a service registry to find out which instances of that service exist.
**Serialization and deserialization**
How does the client pass parameter values to a remote function? In a local call, we just push the parameters onto the stack and let the function read them from there.
In a remote procedure call, however, the client and the server are different processes and cannot pass parameters through memory.
Instead, the client first converts the parameters into a byte stream and sends it to the server, and the server converts the byte stream back into a format it can read.
Only binary data can be transmitted over the network. Serialization and deserialization are defined as follows:
- The process of converting an object into a binary stream is called serialization.
- The process of converting a binary stream into an object is called deserialization.
Values returned from the server must likewise be serialized and deserialized.
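As a small illustration of the round trip, the sketch below uses JSON encoded as UTF-8 bytes. JSON is only one possible format, chosen here as an assumption; the helper names `serialize` and `deserialize` are hypothetical.

```python
import json


def serialize(obj):
    """Convert a Python object into bytes that can be sent over a network."""
    return json.dumps(obj).encode("utf-8")


def deserialize(data):
    """Convert received bytes back into a Python object."""
    return json.loads(data.decode("utf-8"))


params = {"user_id": 42, "tags": ["a", "b"]}
wire_bytes = serialize(params)     # what the client would send
restored = deserialize(wire_bytes)  # what the server would reconstruct

print(type(wire_bytes).__name__, len(wire_bytes))
print(restored == params)
```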
**Network transmission**
Remote calls usually happen over a network, which connects the client and the server.
All data has to travel over the network, so a network transport layer is needed. The transport layer passes the Call ID and the serialized parameter bytes to the server, and then passes the serialized call result back to the client.
Anything that can do both can serve as the transport layer. The protocol is therefore not restricted, as long as it can complete the transmission.
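One way to carry a Call ID and serialized parameters over a byte stream is a small length-prefixed frame. The sketch below is an assumption-laden illustration: the 8-byte header layout, the helper names, and the use of `socket.socketpair()` in place of a real client/server connection are all choices made for the example.

```python
import json
import socket
import struct

HEADER = struct.Struct("!II")  # Call ID, payload length (both unsigned 32-bit)


def recv_exact(sock, size):
    """Read exactly `size` bytes, since recv() may return fewer."""
    chunks = []
    while size > 0:
        chunk = sock.recv(size)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        chunks.append(chunk)
        size -= len(chunk)
    return b"".join(chunks)


def send_frame(sock, call_id, payload):
    sock.sendall(HEADER.pack(call_id, len(payload)) + payload)


def recv_frame(sock):
    call_id, length = HEADER.unpack(recv_exact(sock, HEADER.size))
    return call_id, recv_exact(sock, length)


client_sock, server_sock = socket.socketpair()

# Client -> server: Call ID plus serialized parameters.
send_frame(client_sock, 1, json.dumps([2, 3]).encode("utf-8"))
call_id, body = recv_frame(server_sock)
args = json.loads(body)

# Server -> client: serialized result, tagged with the same Call ID.
send_frame(server_sock, call_id, json.dumps(sum(args)).encode("utf-8"))
reply_id, reply_body = recv_frame(client_sock)
reply = json.loads(reply_body)
print(reply_id, reply)

client_sock.close()
server_sock.close()
```

The length prefix matters because TCP delivers a stream of bytes, not messages; without it the receiver cannot tell where one call ends and the next begins.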
Although most RPC frameworks use TCP, UDP can also be used, and gRPC uses HTTP/2.
TCP connections are the most common. A TCP connection can be on-demand (established when a call is needed and closed immediately after the call completes) or long-lived (the client and server keep the connection open after establishing it, whether or not data is being sent, and multiple remote procedure calls share the same connection). A long-lived connection can be paired with a heartbeat mechanism that periodically checks whether the connection is still alive.
RPC has several options for network transmission, including TCP, UDP, and HTTP.
Each protocol affects overall performance and efficiency differently. To choose a suitable transport protocol, it first helps to understand how each one works in RPC.
**Based on TCP**
A socket connection is established between the service caller and the service provider. The caller serializes the interface name, method name, and parameters and sends them to the provider through the socket. The provider deserializes them, uses reflection to call the corresponding method,
and returns the result to the caller. That is roughly the whole flow of a TCP-based RPC call.
In real applications, however, a series of encapsulations is applied. For example, RMI transmits serializable Java objects over TCP.
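The TCP-based flow can be sketched with the standard `socketserver` module. Everything specific here is an assumption for illustration: the `Calculator` interface, the one-JSON-object-per-line wire format, and the `call` helper. Python's `getattr` plays the role of reflection.

```python
import json
import socket
import socketserver
import threading


class Calculator:
    """Hypothetical service interface exposed over TCP."""

    def add(self, a, b):
        return a + b


INTERFACES = {"Calculator": Calculator()}


class RpcHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # One JSON request per line: interface name, method name, parameters.
        request = json.loads(self.rfile.readline())
        target = INTERFACES[request["interface"]]
        method = getattr(target, request["method"])  # reflection-style lookup
        response = {"result": method(*request["params"])}
        self.wfile.write(json.dumps(response).encode("utf-8") + b"\n")


def call(address, interface, method, *params):
    """Caller side: serialize the call, send it over a socket, read the reply."""
    with socket.create_connection(address) as sock:
        message = {"interface": interface, "method": method, "params": params}
        sock.sendall(json.dumps(message).encode("utf-8") + b"\n")
        with sock.makefile("rb") as reader:
            return json.loads(reader.readline())["result"]


server = socketserver.TCPServer(("127.0.0.1", 0), RpcHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

total = call(server.server_address, "Calculator", "add", 4, 5)
print(total)

server.shutdown()
server.server_close()
```

A production server would restrict which method names may be looked up, since resolving arbitrary attribute names from untrusted input is unsafe.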
**Based on HTTP**
This method is more like accessing a web page, except that the returned result is simpler.
The process is roughly as follows: the service caller sends a request to the service provider, using a method such as GET, POST, PUT, or DELETE. The provider may handle the request differently depending on the request method, or may allow only a particular request method for a given operation.
The method to call is selected by the URL. Its parameters may come from parsing the XML or JSON data sent by the service caller, and the result is returned as JSON or XML.
Since there are many open-source web servers available, such as Tomcat, this approach is easier to implement, much like working on a web project.
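A minimal sketch of the HTTP-based approach, using only the Python standard library rather than a server such as Tomcat. The `/greet` route, the `greet` function, and the JSON request and response shapes are assumptions made for the example.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def greet(name):
    return f"hello, {name}"


ROUTES = {"/greet": greet}  # URL path -> method to call


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        func = ROUTES.get(self.path)
        if func is None:
            self.send_error(404)
            return
        # Parse the JSON body into keyword arguments for the target method.
        length = int(self.headers.get("Content-Length", 0))
        params = json.loads(self.rfile.read(length))
        body = json.dumps({"result": func(**params)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Caller side: POST the parameters as JSON to the URL that names the method.
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
conn.request(
    "POST",
    "/greet",
    body=json.dumps({"name": "rpc"}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
answer = json.loads(conn.getresponse().read())["result"]
conn.close()
print(answer)

server.shutdown()
server.server_close()
```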
**Comparison of the two approaches**
Because TCP sits lower in the protocol stack, an RPC implementation built directly on TCP can define its own protocol fields more flexibly, reduce network overhead, improve performance, and achieve higher throughput and concurrency.
However, it has to deal with more of the complex low-level details, so it costs more to implement. In addition, a separate toolkit has to be developed for each platform, such as Android and iOS, to send requests and parse responses. This is a heavy workload and makes it hard to respond quickly to user needs.
RPC based on HTTP can use JSON or XML for request and response data.
JSON and XML are common format standards, and open-source parsing tools for them are quite mature, so building on top of them is convenient and simple. (HTTP still requires serialization and deserialization, but the protocol itself is not concerned with the content it carries, and mature web frameworks already handle content serialization well.)
However, HTTP is an upper-layer protocol that runs on top of TCP and adds its own headers, so transmitting the same content over HTTP takes more bytes than transmitting it with a custom protocol directly over TCP.
Therefore, on the same network, transmitting the same content over HTTP is less efficient than transmitting it directly over TCP, and the transmission takes longer. Compressing the data can, of course, narrow this gap.
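To make the size difference concrete, the sketch below encodes the same hypothetical call, `add(2, 3)`, in two ways: as a compact binary frame of the kind a custom TCP protocol might use, and as a minimal HTTP/1.1 request with a JSON body. The frame layout, the `/add` path, and the header set are all assumptions; real requests usually carry more headers than shown.

```python
import json
import struct

a, b = 2, 3

# Custom binary frame over TCP: Call ID (2 bytes) + two 32-bit integers.
tcp_frame = struct.pack("!Hii", 1, a, b)

# The same call as a minimal HTTP/1.1 request with a JSON body.
body = json.dumps({"a": a, "b": b}).encode("utf-8")
http_request = (
    b"POST /add HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: application/json\r\n"
    b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n"
    b"\r\n"
) + body

# Compare the number of bytes each encoding puts on the wire.
print(len(tcp_frame), len(http_request))
```

The binary frame is 10 bytes, while the HTTP request is several times larger, mostly because of the text headers.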
RabbitMQ
Benefits of using RabbitMQ:
- Synchronous to asynchronous: you can use a thread pool to turn synchronous work into asynchronous work, but the drawback is that you have to implement the thread pool yourself and the result is tightly coupled. With a message queue, synchronous requests can easily be turned into asynchronous ones.
- High cohesion and low coupling: decoupling reduces strong dependencies.
- Traffic peak clipping: the message queue caps the number of requests; requests beyond the threshold are discarded or redirected to an error page.
- Improved network communication performance: creating and destroying TCP connections is expensive, with a three-way handshake to establish and a four-way handshake to close. At peak times, thousands of connections waste a great deal of resources, and the number of TCP connections the operating system can process per second is also limited, which inevitably causes a performance bottleneck.
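The peak clipping idea can be illustrated without a broker. The sketch below uses Python's in-process `queue.Queue` purely as a stand-in for a message queue; it is not RabbitMQ's API, and `MAX_PENDING` and `submit` are hypothetical names.

```python
import queue

MAX_PENDING = 3  # hypothetical threshold for queued requests
pending = queue.Queue(maxsize=MAX_PENDING)


def submit(request):
    """Accept the request if there is room, otherwise reject it."""
    try:
        pending.put_nowait(request)
        return "accepted"
    except queue.Full:
        # Over the threshold: discard, or route the caller to an error page.
        return "rejected"


# A burst of five requests arrives before any consumer drains the queue.
outcomes = [submit(i) for i in range(5)]
print(outcomes)
```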
RPC Service Registration and Discovery
In RPC there are two roles: the service provider and the service consumer. How does the caller know which services are available to call? In other words, how do we let others use our service?
A colleague might say it is very simple: just tell the user the IP and port of the service. True, but the key question is whether the caller is informed automatically or manually.
Manual notification: suppose one machine is no longer enough for your service and you add another. You now have to tell the caller that there are two IPs and that calls must be rotated between them for load balancing; the caller grits his teeth and makes the change. Then one day a machine crashes, the caller finds that half of the calls fail, and he has to modify the code by hand to remove the crashed machine's IP. A real production environment will of course not work this way.
Is there a way to make notification automatic, so that adding and removing machines is transparent to the caller and the caller no longer hard-codes the provider's address? Yes: ZooKeeper is now widely used to implement automatic service registration and discovery.
In mature service governance frameworks there are not only these two roles but also a third, the Registry. A diagram can explain the main responsibilities of the registry.
- Registry: where servers register remote services and clients discover them.
- Server: provides services to the outside world and registers its service information with the registry.
- Client: gets the registration information of the remote service from the registry and then makes the remote procedure call.
Today, registries are mainly implemented with open-source frameworks such as ZooKeeper, Eureka, Consul, and etcd. Internet companies also develop their own to suit their business, such as Meituan Dianping’s self-developed MNS and Sina Weibo’s self-developed Vintage.
Service registration
For others to discover your service, you first need to register it with the registry.
Registering a service means adding an entry to the registry that stores the service's IP, port, and calling method (protocol, serialization method). In ZooKeeper, registering a service means creating a znode that stores the service information mentioned above. This node carries the most important responsibility: it is created by the service provider (when the service is published) so that service consumers can read the information in it, locate the provider's real position in the network topology, and learn how to call it.
Service Discovery
When a service consumer calls a service for the first time, it looks up the list of IP addresses for that service in the registry and caches the list locally for later use. On subsequent calls, the consumer does not query the registry again; it picks a provider's server from the cached IP list with a load balancing algorithm and calls it directly.
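Registration, discovery with a local cache, and load balancing can be sketched together. This is an in-memory illustration only: the `Registry` class stands in for a real registry such as ZooKeeper, round-robin is one assumed choice of load balancing algorithm, and the service name and addresses are made up.

```python
import itertools


class Registry:
    """In-memory stand-in for a registry such as ZooKeeper."""

    def __init__(self):
        self._services = {}

    def register(self, name, address):
        self._services.setdefault(name, []).append(address)

    def lookup(self, name):
        return list(self._services.get(name, []))


class Consumer:
    def __init__(self, registry):
        self._registry = registry
        self._cache = {}  # service name -> round-robin iterator over addresses

    def pick(self, name):
        # Only the first call for a service queries the registry.
        if name not in self._cache:
            addresses = self._registry.lookup(name)
            if not addresses:
                raise LookupError(f"no provider registered for {name}")
            self._cache[name] = itertools.cycle(addresses)
        return next(self._cache[name])


registry = Registry()
registry.register("order-service", ("10.0.0.1", 8080))
registry.register("order-service", ("10.0.0.2", 8080))

consumer = Consumer(registry)
picks = [consumer.pick("order-service") for _ in range(3)]
print(picks)
```

Because the consumer caches the address list, it will not notice providers that are added or removed later unless the cache is refreshed, which is why detecting offline services needs separate attention.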
Detecting service offline
When a service is published it must be registered with the registry, and when it goes offline it must also be removed from the registry. Registration is an active step and needs no special attention, but going offline is a problem worth thinking about. A service can go offline either actively or abnormally, for example through system downtime.
RPC and RESTful
Different categories
REST is short for Representational State Transfer. A representation is a snapshot of resource data at a given instant, including the content of the resource data and its representation format (XML, JSON, and so on).
REST is a software architecture style, and its typical application is HTTP. It is widely favored by developers for its simplicity and strong scalability.
RPC is short for Remote Procedure Call. It lets the client call a server’s service (method) as if calling a local service (method).
RPC can be based on TCP/UDP or on HTTP, so strictly speaking RPC and REST are not concepts at the same level and should not be compared directly. But REST is so popular, being currently the most popular set of API design conventions for Internet applications, that in a sense "REST" is often used to mean the HTTP protocol itself.
Different ways of use
In terms of use, an HTTP interface only concerns the service provider and does not care how the client makes the call; the interface only needs to return the corresponding data when a client calls it. RPC requires the client's interface to stay consistent with the server's.
With REST, the method is written on the server and the client does not know the specific method. The client just wants a resource, so it sends an HTTP request, and the server, on receiving the request, routes the URI through a series of steps to locate the method. With RPC, the server provides methods for the client to call. The client needs to know the server's specific class and method, and then calls it directly as if it were a local method.
Different orientations
From a design point of view, RPC (remote procedure call) is method-oriented, while REST (representational state transfer) is resource-oriented. There is also SOA (service-oriented architecture), which is message-oriented; it is less commonly encountered, so it is not covered further here.
Different serialization protocols
An interface call usually involves two parts: serialization and the communication protocol.
As for the communication protocol, as mentioned above, REST is based on HTTP, while RPC can be based on TCP/UDP or on HTTP.
Common serialization protocols include json, xml, hessian, protobuf, thrift, text, and bytes. REST usually uses JSON or XML, while RPC may use JSON-RPC or XML-RPC.
Application scenarios
Both REST and RPC are commonly used in microservice architectures.
- HTTP is relatively more standardized and more general, and every language supports it. If you are exposing an open API to the outside world, such as an open platform, external callers use all kinds of programming languages, and you cannot refuse to support any of them. Among current open-source middleware, RESTful is usually among the first protocols supported.
Utilization of RPC in Microservices
- As a basic component of the microservice architecture, an RPC framework can greatly reduce the cost of adopting microservices, improve the development efficiency of both callers and service providers, and hide the complex details of calling functions (services) across processes. The caller can call a remote function as if it were a local one, and the service provider can implement a service as if implementing a local function.