The overall architecture of SAE

SAE adopts a layered architecture. From top to bottom, the layers are the reverse proxy layer, the routing logic layer, and the Web computing service pool. Extending from the Web computing service layer are SAE's distributed computing services and distributed storage services, which are divided into synchronous computing services, asynchronous computing services, persistent storage services, and non-persistent storage services. All services report uniformly to the log and statistics center, as shown in the figure below:

Level7 Reverse Proxy (layer-7 reverse proxy layer): An HTTP reverse proxy at the outermost layer. It responds to users' HTTP requests, analyzes each request and forwards it to the back-end Web service pool, and provides functions such as load balancing and health checks.

Service Router (service routing layer): A logical layer that uses the unique identifier of a request to map it quickly (in O(1) time) to the corresponding Web service pool and to the corresponding hardware path. If the mapping does not exist or is wrong, the layer returns a corresponding error message. This layer hides specific address information from users, so developers do not need to care about how services are actually allocated internally.
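The constant-time mapping described above can be illustrated with a hash table lookup. This is only a minimal sketch: the table contents, the pool names, the addresses, and the assumption that the unique identifier is the first label of the host name are all hypothetical and are not taken from SAE.

```python
# Hypothetical routing table: app identifier -> (pool name, backend address).
# All names and addresses are illustrative, not SAE's real data.
ROUTES = {
    "blogapp": ("pool-standard", "10.0.0.11:8080"),
    "shopapp": ("pool-premium", "10.0.1.21:8080"),
}


class RouteError(Exception):
    """Raised when no mapping exists for an app identifier."""


def app_id_from_host(host: str) -> str:
    # Assumption: the unique identifier is the first label of the host name.
    return host.split(":")[0].split(".")[0].lower()


def route(host: str) -> tuple[str, str]:
    app_id = app_id_from_host(host)
    try:
        return ROUTES[app_id]  # average-case O(1) dictionary lookup
    except KeyError:
        raise RouteError(f"no route for application {app_id!r}") from None
```

A caller would pass the Host header of the incoming request, for example route("blogapp.example.com"), and turn a RouteError into an error response.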

Web Service Pools: A set of Web service pools with different characteristics. Each pool is composed of a group of Apache (PHP) processes, and the pools provide different levels of service according to different SLAs. Each Web service process handles users' HTTP requests. The process runs in the HTTP service sandbox and embeds the PHP parsing engine, which also runs in the SAE sandbox. The user's code ultimately calls the various services through interfaces.

Statistics Center & Log Center (log and statistics center): The statistics center handles usage statistics and resource billing for all services used by users, and sets minute quotas to detect abnormal usage. The minute quota describes the speed of resource consumption. When that speed reaches an early-warning threshold, the SAE notification system warns the user in advance that the application's use of a certain service may have a problem that requires intervention. The quota system is one of the measures SAE uses to ensure the stability of the entire platform. The log center aggregates and backs up the logs of all user services and provides retrieval and query services.
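A per-minute quota with an early-warning threshold might be sketched as follows. The class name, the warning ratio, and the three result states are assumptions made for illustration; the source does not describe how SAE implements its quota system.

```python
from collections import defaultdict


class MinuteQuota:
    """Tracks per-minute usage of one service per app (illustrative only)."""

    def __init__(self, limit_per_minute: int, warn_ratio: float = 0.8):
        self.limit = limit_per_minute
        # Hypothetical early-warning threshold, as a fraction of the limit.
        self.warn_at = limit_per_minute * warn_ratio
        self.usage = defaultdict(int)  # (app_id, minute) -> units consumed

    def record(self, app_id: str, timestamp: float, units: int = 1) -> str:
        minute = int(timestamp // 60)  # bucket usage by wall-clock minute
        self.usage[(app_id, minute)] += units
        used = self.usage[(app_id, minute)]
        if used > self.limit:
            return "exceeded"
        if used >= self.warn_at:
            return "warning"  # the point at which a notification would be sent
        return "ok"
```

Because usage is bucketed by minute, the counter effectively measures the rate of consumption rather than the total, which matches the idea of a quota on consumption speed.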

Various distributed services: SAE provides a variety of services covering the main aspects of Web application development. Users can easily call them through StdLib (which can be understood as SAE's PHP counterpart of the STL). Because Web services are so diverse, SAE's standard services cannot meet the needs of every scenario, so SAE uses a service bus to connect to third-party services (such as word segmentation and full-text search). SAE also welcomes third-party service providers who choose SAE to provide services to developers.

User code runs in the Web runtime environment provided by SAE. To provide the security that public cloud computing specifically requires, SAE designs multi-layer sandboxes to ensure isolation between user applications, as shown in the figure below:

The innermost layer is the user code. Most PHP code can run on the SAE platform without any modification. A small part of the code needs to be modified to adapt to SAE platform features. This is mainly because SAE has disabled local IO for security reasons, so code that uses functions such as fwrite needs to be modified to read and write local temporary files through TmpFD, or to read and write SAE's distributed file storage directly through the Storage service.

PHP Zend is the standard official PHP interpreter.

SAE Zend Sandbox is a logical concept that provides good isolation for running user code.

There are two levels here:

1. Through the standard php.ini, we have set some special configurations and disabled functions;

2. To implement sandbox functions that php.ini cannot provide, we have made some improvements to the Zend interpreter core to isolate resources by user ID. In addition, we have integrated some SAE-specific services into the Zend layer.

Apache is the standard Apache Web server. However, we disabled htaccess and provide our own replacement, AppConfig. Users can write AppConfig rules in a natural-language-like manner. For example, - compress: if(out_header[Content-Length] >= 500) compress enables page compression conditionally. The functions provided by AppConfig include: directory default page, custom error page, compression, page redirection, page expiration, setting the content-type of the response header, and setting page access permissions. Another reason we chose to implement AppConfig ourselves is that traditional Apache htaccess is not efficient enough for SAE, because it requires configuration files to be merged recursively, directory by directory.

The HTTP Server Sandbox provides a variety of protection functions for the safe and reliable operation of Apache, such as preventing a single user from maliciously exhausting connections and disrupting the entire Web service.

The outermost layer is the standard POSIX environment, and our services run on Linux.

The characteristics of our architectural design are discussed in detail below.

·Scalability

Scalability is one of the two main goals of a distributed system. As a public cloud computing platform, SAE also treats service scalability as an important indicator of its architecture design. When the number of users and the load increase, services must expand automatically; likewise, when the load decreases, services must contract to save resources. The entire process requires no manual intervention; SAE staff only need to handle capacity planning and management. Public cloud computing architectures outside China follow two main approaches to scalability:

Static expansion: users and resources have a strong binding relationship. The most typical examples are Amazon's EC2 and the Ruby cloud computing platform Heroku. The resources requested by a user have a strict one-to-one relationship with that user. In other words, a virtual machine requested by user A cannot be used by user B until user A returns it, even if user A's virtual machine is idle.

Dynamic expansion: there is no strong binding relationship between users and resources. The most typical example is Google App Engine. The resources requested by a user do not have a strict one-to-one relationship with that user. In other words, a process that handles user A's request can immediately go on to handle user B's request.

Both types of scalability have their own pros and cons. The advantage of static expansion is that it provides good isolation for the platform, because resources can be mapped to a given user in a fixed way; the disadvantage is that resource utilization is not high. The advantage of dynamic expansion is high resource utilization, so the cost of the entire cloud computing platform is very low; the disadvantage is that it places higher requirements on isolation, because the same resources can be used by multiple users within a short period of time. In terms of security, dynamic expansion therefore has a higher technical threshold than static expansion.

On the SAE platform, we adopt a design based mainly on dynamic expansion, supplemented by static expansion. The Web computing pool layer is a typical case of dynamic expansion: no single user monopolizes a Web service process; instead, all users share the pool of Web service processes. Through caching, popular users naturally occupy more of the resources in the cache layer.

In some SAE services, scalability takes the form of static expansion, such as the RDC (Relational DB Cluster) distributed database cluster. When a user applies for the MySQL service, the RDC backend creates a DB with one master and multiple slaves for the user according to the SLA level. The DB will not be used by anyone else until the user explicitly deletes it. Of course, thanks to RDC, users do not need to know the actual address of the back-end DB; they only need to access the unified host and port of RDC.

·High reliability

High availability (HA) is the other main goal of distributed systems. SAE also treats high reliability of services as an important indicator of its architecture design. There are two main ways to implement HA: hardware guarantees and redundant architecture design.

On the SAE platform, all servers are hardware purchased to Sina's standards. They run in the best data centers in the country, with disaster recovery across multiple data centers. In terms of network resources, they share the bandwidth environment used by the Sina portal website. In addition, all hardware is maintained by a dedicated operations department, and the response speed to faults is the same as for Sina's internal services.

In terms of architectural design, SAE provides highly reliable services through redundant design of all services. The services here can be divided into two categories, computing services and data services:

For computing services, redundant design means that the program runs on multiple nodes. This introduces consistency problems, chiefly the election problem: how to select one master node among multiple nodes to perform the execution. For example, Cron, the distributed timing service on SAE, is deployed on multiple nodes. The computing nodes are isolated from each other, and the timing tasks set by the user are triggered on all of them at the same time through the clock synchronization service, but only one node should actually execute each task. To solve this problem, SAE designed a distributed lock algorithm to provide election services. Under certain specific conditions, this algorithm provides higher reliability than the Paxos algorithm at the expense of consistency: with 3 machines, the election process still works normally even if any 2 machines fail, whereas the Paxos algorithm tolerates the failure of at most 1 machine. As of December 2012, a patent application for this algorithm is pending, and the algorithm is widely used within SAE.
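The "trigger everywhere, execute once" pattern can be illustrated with a lock keyed by task and scheduled time. The sketch below is not SAE's patented algorithm, whose details the text does not give; it is a generic lock-based election against a hypothetical in-memory lock store, and it says nothing about how the store itself would be made fault tolerant.

```python
import threading


class LockStore:
    """In-memory stand-in for a shared lock service (hypothetical)."""

    def __init__(self):
        self._owners = {}
        self._mutex = threading.Lock()

    def try_acquire(self, key: str, node: str) -> bool:
        # Atomic set-if-absent: the first caller for a key becomes its owner.
        with self._mutex:
            return self._owners.setdefault(key, node) == node


def fire(store: LockStore, node: str, task_id: str, scheduled_at: int, log: list) -> None:
    # Every node triggers at the same scheduled time, but only the node that
    # wins the lock for this (task, time) pair actually runs the task.
    if store.try_acquire(f"{task_id}@{scheduled_at}", node):
        log.append((node, task_id, scheduled_at))  # stands in for execution
```

Including the scheduled time in the lock key means each firing of a recurring task gets its own election, so a node that loses one round can still win the next.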

For data services, SAE mainly ensures high reliability through replication. Data storage services on SAE generally adopt one of two methods: passive replication or active replication. For example, the master-slave Binlog synchronization between MySQL instances on SAE is a typical case of passive replication. Services such as TaskQueue and DeferredJob also use passive replication: the user's task description is written to the master in-memory queue, and the master queue uses background threads to synchronize write operations to the slave queue. Once the master queue fails, the slave queue quickly takes over as the master. In addition, some services on SAE use active replication (double-write replication) to ensure HA, such as Cron. When the user sets a scheduled task through the application's configuration file appconfig.yaml, the task information is double-written to multiple persistent DBs for subsequent triggering.
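Passive replication of an in-memory queue can be sketched as a master that logs its write operations and a slave that replays them. The class and method names below are hypothetical, and the explicit sync() call stands in for the background thread mentioned above.

```python
from collections import deque


class ReplicatedQueue:
    """Master/slave in-memory queue with log-based passive replication."""

    def __init__(self):
        self.master = deque()
        self.slave = deque()
        self._pending = deque()  # write operations not yet applied to the slave

    def push(self, task):
        self.master.append(task)
        self._pending.append(("push", task))

    def pop(self):
        task = self.master.popleft()
        self._pending.append(("pop", None))
        return task

    def sync(self):
        # A real system would run this continuously in a background thread.
        while self._pending:
            op, task = self._pending.popleft()
            if op == "push":
                self.slave.append(task)
            else:
                self.slave.popleft()

    def fail_over(self):
        # Promote the slave to master; operations still pending are lost.
        self.master, self.slave = self.slave, deque()
        self._pending.clear()
```

The sketch makes the usual trade-off of passive replication visible: writes that have not yet been replayed on the slave are lost on failover, which is the gap that active (double-write) replication closes at the cost of writing to every copy up front.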

In addition, when designing the overall architecture, SAE fully considers "graceful degradation" between services and tries to reduce the coupling between them. We require that no service assume that other services are reliable. No service on the SAE platform has a single point of failure, and the average availability of the services is 99.95%, which means that the average annual service unavailability time is between 4 and 5 hours.
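The downtime figure follows directly from the availability percentage. As a quick check, assuming a 365-day year (the function name is only illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def annual_downtime_hours(availability_percent: float) -> float:
    """Expected unavailable hours per year for a given availability."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


print(round(annual_downtime_hours(99.95), 2))  # 4.38
```

An availability of 99.95% leaves 0.05% of 8760 hours, about 4.38 hours per year, which is consistent with the stated range of 4 to 5 hours.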

Line characteristics

·Platform egress IPs:

220.181.129.126

220.181.129.121

220.181.136.229

220.181.136.230

If an HTTP interface called by the application requires IP authorization, the provider of that interface needs to authorize the addresses above accordingly.