Today we will talk about the following:
- Problems of classic large data storage systems,
- How to organize data storage in companies with increased internal security requirements;
- How to easily migrate to a domestic solution;
- Advantages of Cloud Storage object storage from VK.
In the event program:
- Customer need;
- Presentation of the Cloud Storage product from VK;
- Product partnership program;
- Implemented cases.
Speakers: Khanetsky Dmitry, Head of Cloud Storage Sales Group, and Sorokin Georgy, Cloud Storage Architect.
The portrait of our client is the enterprise segment: large businesses in absolutely any industry, all companies that need to store unstructured content, i.e. data in different formats: files, video, scans, audio, etc. The second point is SLA requirements: there must be high availability and a high SLA. Usually this is the government segment or the banking sector, where critical-infrastructure requirements apply. Our solution is designed for highly critical SLAs for availability and storage security.
Regarding the storage security perimeter: storage is provided in more than one data center (in two or three); VK has four of them. We pay close attention to regulatory requirements, namely Government Decree No. 1236, the transition to import substitution, and certain requirements for the software to be classed as Russian.
What needs do our clients have? Long-term storage of files of various types, with a minimum storage volume of 50 TB. This is useful data, not raw storage capacity; below this threshold our solutions become less attractive in terms of pricing and functionally redundant.
Another point is constant, intensive read and write access to objects. If these three criteria are met, then you are on the right track in terms of our portfolio.
What do clients usually store? First of all, backups and archives, both hot and cold, with various types of data; also document management, and there are cases with Big Data, machine learning and multimedia content.
What should be clarified with the customer when you start a conversation about selling VK products? First of all, the willingness to use the standard S3 protocol. This is an international de facto standard created by Amazon in 2006, and we adhere to it strictly, working within the standard rules of the S3 protocol. Many of our customers already know and use S3 as their standard storage protocol, but some do not. It is therefore worth asking about this and taking their wishes into account at an early stage of the project.
The second point is the willingness to deploy the solution in the internal perimeter rather than in the cloud. In this case we are talking about an on-premise solution. The installation sits inside the customer's perimeter and is completely independent of the Internet in terms of external access. With such an installation we store no keys outside and impose no access restrictions. That is, it is a fully isolated solution that can be stretched across one, two or more data centers on the customer's side.
The third point is the availability of data-center capacity to host the hardware, because we often encounter situations where the customer has neither sufficient data-center capacity nor internal resources to place the equipment on their site. If the answers to all three questions are positive, we move on.
I want to draw your attention to how simple our implementation process is. We have a standard questionnaire of about 15 questions. The basic profile we compile from the answers gives a picture of the load profile and the volume needed to calculate the project's cost specification.
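As an illustration of how such a questionnaire might feed a cost specification, here is a hypothetical back-of-the-envelope sizing helper. The growth assumptions and the replication factor are invented for the example and are not VK's actual sizing methodology.

```python
def estimate_raw_capacity_tb(useful_tb: float,
                             replication_factor: int = 2,
                             growth_per_year: float = 0.2,
                             years: int = 3) -> float:
    """Translate useful data volume into raw disk capacity with growth headroom."""
    projected_useful = useful_tb * (1 + growth_per_year) ** years
    return projected_useful * replication_factor


# 100 TB of useful data, 20% annual growth over 3 years, 2 copies of each object
print(round(estimate_raw_capacity_tb(100), 1))  # prints 345.6
```

The point of the questionnaire is exactly this kind of translation: a handful of load and volume answers in, a concrete hardware and license specification out.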
The first step is choosing the delivery format, and several options are available. The first is software only: we transfer it to you for sale to the customer, and you can then supplement the solution with equipment yourself; we also attach a recommended hardware configuration and hardware platform.
The second option is the sale of a hardware and software complex (HSC). As a vendor, we also offer an HSC that we assemble, test and pre-configure ourselves and transfer to the customer as a ready-made product. It includes support, hardware and software: a full-fledged, ready-made vendor solution of the kind we are used to seeing on the market.
There is also a third implementation option, reserved for configurations under 50 TB: a sale within the VK public cloud.
The fourth step, before the sale and before closing the project, is producing the final configuration, either in the HSC version or in the software version. We sell it to you with the necessary prices and documentation for sale and transfer to the customer. Ultimately, if it is an HSC, we simply bring it directly to the data center and configure it ourselves. If you want to participate in this process, we can hand the reins to you, but as a rule we try to do the initial implementations ourselves to avoid the risks of a poorly assembled platform. Alternatively, we can audit your work: if you gain sufficient competence and are willing to learn in this direction, we provide such services and are ready to develop our partners. Under certain conditions we can transfer the implementation to you entirely. We already have several partners who hold the necessary competencies and certificates for implementing our solutions and have been doing this successfully on the market for some time.
These are the four main steps we define for a successful sale. I want to emphasize once again that we try to simplify the sales process and solution configuration as much as possible. This is a basic approach: we do not want to introduce complications here, and we try to make our partners' work with customers more productive, so that it takes not weeks but a day or two.
There is a small segment we would like to highlight: classic software-defined storage (SDS). For more than a year the market has faced problems with supplying foreign classic storage and with the transition to something Russian in block format. There are vendors producing such SDS, and there are customers switching from Western solutions to local alternative technologies.
As a rule, this entire setup is well supported by vendors, and any manipulation of it means wasted time plus extra costs. Stretching the storage across more than one data center also has nuances related to implementing an architecturally stretched installation, and, as you understand, those restrictions do not go away. In addition, block SDS requires creating a separate access circuit for structured content. All of these challenges are the basic things customers face before switching to the alternative S3 technology.
Now I will tell you how we deal with these challenges and what advantages we offer our clients over our colleagues in terms of approach and technology. The first thing I want to note is unlimited scaling. This is an x86 architecture on ordinary servers, and horizontally we have no ceiling, or at least we have not yet reached it in this architecture. We have installations of more than 350 PB of data; no one in Russia has installations like that except VK in our data centers. So we claim no scaling restrictions. All of this happens within a single large installation, and we scale by simply adding servers or disk shelves to the system without stopping it. That is, the customer, or you if you will be servicing it, can bring additional equipment to the customer's data center and connect it to the existing architecture online. All of this is done quickly and, most importantly, affordably: x86 servers are readily available on the market. I also want to note that we are not tied to the manufacturers of these servers. If we sell the software to the customer together with you, the servers only need to meet certain configuration requirements; in terms of the hardware vendor, its availability and other factors, there are no restrictions.
The second point is reducing TCO. We can deliver the solution as software, which almost no one else on the market does; the market mostly offers HSCs tied to the vendor's own hardware. Or it can be an HSC from us, if the customer wants everything from one window: full support with hot replacement. In the case of an HSC delivery, the customer receives a fully integrated solution. We reduce TCO relative to block storage because we are architecturally cheaper: the software itself and x86 components cost less than block alternatives. So our hands are free, and yours are untied too in terms of configuring solutions. As one business option, we will support you if, as partners, you offer to configure certain HSCs for customers on your side.
Another point is the flexibility of storage configuration, which depends on the task the customer is solving. One option is large blocks of data that need to be written, where the emphasis is not on long-term storage but on the input-output of hot data. The second option is cold data storage, where, for example, the customer has many objects of different kinds, small, medium and large, stored long term according to certain criteria. In that case we can reconfigure our solution and add storage nodes of the appropriate performance profile. Our solution is configured as flexibly as possible in terms of disk components, disk volume, processor power, etc. We are not limited in hardware configuration parameters, which is also very important for the customer.
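To make the hot-I/O vs cold-archive trade-off concrete, here is an illustrative pair of node profiles; every value below is invented for the example and is not a VK reference configuration.

```python
# Throughput-oriented profile: fast media, more CPU for request handling.
HOT_IO_NODE = {
    "media": "nvme_ssd",
    "cpu_cores": 32,
    "ram_gb": 256,
    "optimized_for": "read/write throughput of hot data",
}

# Capacity-oriented profile: dense spinning disks, minimal compute.
COLD_ARCHIVE_NODE = {
    "media": "high_capacity_hdd",
    "cpu_cores": 8,
    "ram_gb": 64,
    "optimized_for": "cost per stored terabyte",
}

print(HOT_IO_NODE["media"], COLD_ARCHIVE_NODE["media"])
```

A real deployment can mix both profiles in one installation, which is the flexibility being described.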
We provide a multi-data-center configuration, and these are not isolated installations: it is one large configuration stretched across many data centers, a system that fully covers the solution with a cluster and multi-data-center storage. And we have examples with several sites: VK has four sites where we store our data, and it is all one big installation.
And the last point is also important: this is our own development, built from scratch.
A little history. The story of the S3 protocol and technology began in 2006, when Amazon invented and released it for its data centers and cloud storage. VK joined this race in 2013, still as part of Mail, and developed its first object storage for its own cloud, storing content for our mail agent. Our mail and our Mail.ru cloud ran entirely on our VK Cloud solution. In 2016, full support for the S3 protocol was implemented in the VK object storage. In 2017, we started the first commercial sales of object storage as a separate product within our platforms. In 2019, we added the product to the Register of Russian Software as part of Mail.ru Cloud Solutions. And in 2023 we spun this product off as a standalone offering: a completely independent data storage solution with a new registry name and a new registry entry. It is now a fully independent project with almost 10 years of development history. During this time we have fixed a lot of errors, like all developers who travel this path, and we can safely say that we are as mature and polished as possible in terms of functionality. Our customers say so. We are in the Register, there is a registry entry, and you can find it on the website of the relevant state authority. All of this is publicly available, and we are ready to share this information with you if necessary.
Our solution is powerful and productive. There are no analogues on the market in terms of data volume in a single installation: today we store more than 350 PB of data across our four data centers, and we experience the operation of this solution ourselves every day. There were cases when data centers were switched off; there were fires in data centers in Russia, and one of those data centers was rented by us at that very moment. Our solution switched over seamlessly and continued to work even in an emergency. We store more than 30 billion objects in hot access, and another 90 billion objects in cold access. This does not mean you should sell equally large installations; it only means that our solution is as battle-tested as possible, both in terms of the risks of data loss and the scalability of the system.
Now let's talk about delivery options. The first option is software. In this case we sell perpetual licenses, i.e. not limited in duration: this is not a subscription but a full license, purchased once, transferred to the customer's balance sheet and fully owned by the customer with no time limit. Licensing is per terabyte of useful data. Roughly speaking, if the customer needs to store 100 TB of data, we offer a software license for 100 TB, regardless of the platform configuration or the replication factor involved. We offer support like most vendors, sold for fixed periods of one, two, three or five years; five years is the maximum. We can sell support to you as a certificate or as a service, depending on the customer's budget and the expenditure item under which they are ready to buy it.
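The per-useful-terabyte model can be sketched like this: the license size tracks the customer's data, while raw disk behind it grows with replication. The replication factors here are assumptions for illustration only.

```python
def license_size_tb(useful_tb: float) -> float:
    """Licenses are counted on useful data only."""
    return useful_tb


def raw_disk_tb(useful_tb: float, replication_factor: int) -> float:
    """Raw capacity grows with the replication factor; the license does not."""
    return useful_tb * replication_factor


# The same 100 TB license covers either platform configuration below.
print(license_size_tb(100), raw_disk_tb(100, 2), raw_disk_tb(100, 3))
```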
We sell the HSC in different variations. There are situations when the customer does not care what type of equipment is installed inside the HSC. The equipment can be either registry-listed or not. By default we do not offer registry-listed hardware, in order to reduce the cost of the configuration and give our HSC a more competitive market price. But if the customer has certain regulatory requirements, or an internal mandate to comply with the regulator's requirements on registry-listed components in the hardware complex, we can supply registry-listed components in our HSC as well. Any options are possible: we cooperate with all major vendors on the Russian market and can supply a wide variety of configurations in our HSC. At the same time, if the HSC comes from us, regardless of which components it contains, we provide centralized support for you and for our customers, for both hardware and software. And as with the software, so with the HSC, we provide commissioning and implementation services for the software, commissioning of the HSC, and support. We can also offer training: we can train both you and the customer's employees to work with our system.
The third implementation option is the public cloud, for cases when we understand that the customer really needs the product but is not ready to put the complex in their own data center, has no requirement for an isolated perimeter, and the storage volume is under 50 TB. In this case we can offer the Public Cloud, and you can resell it under your own contract and earn on it.
Then Georgy Sorokin, Cloud Storage architect, took the floor and began to explain how Cloud Storage is arranged. It is conceptually divided into three functional levels, each responsible for its own function: the Front server level, the Metadata level and the Storage server level. In large deployments we separate these levels, including physically: each level consists of entirely separate servers.
Front servers are the servers that handle all incoming requests: they process the load, calculate hashes and distribute incoming data across the Storage servers, which are the servers where the objects themselves are stored.
The Metaserver level is where meta-information about all objects is stored; it is written and scaled separately. This level runs on the in-memory Tarantool platform, and it is thanks to Tarantool that we can scale to 350 PB of data: even at these volumes, our response time does not degrade at all.
Regarding scaling, we can scale in different ways. Suppose there was initially a certain load, say 5,000 IOPS, and suddenly the customer needs to handle more: there are more clients, or the system has become more loaded. We can simply increase the number of Front servers responsible for processing incoming requests. If the amount of data grows, we usually just add Storage servers. And since more Storage servers usually means more objects, we also add Metaservers, which are responsible for object metadata, accordingly. Each of these layers can be scaled separately, which is very important: there is no need, as there would be if all these functions lived on one server, to grow all layers at once. We pick the parameter that needs boosting and add servers to match that need.
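The layer-by-layer scaling logic above can be sketched as a toy decision rule; the metric names and thresholds below are invented for illustration, not actual operational policy.

```python
def layers_to_grow(metrics: dict) -> list:
    """Decide which tier(s) to expand based on what is saturated."""
    grow = []
    if metrics["front_cpu_pct"] > 80:       # request handling is the bottleneck
        grow.append("front")
    if metrics["storage_used_pct"] > 85:    # running out of object capacity
        grow.append("storage")              # more data...
        grow.append("meta")                 # ...usually means more objects too
    return grow


print(layers_to_grow({"front_cpu_pct": 90, "storage_used_pct": 50}))
# a pure request-rate bottleneck grows only the Front tier
```

The key design property is that each condition maps to its own tier, so no tier is expanded just because another one is saturated.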
Now about multiple data centers. Here our solution works in distributed mode. This means we do not make one data center active and another passive with replication configured between them: our solution works distributed across several data centers, whether two, three, four or more. Work proceeds simultaneously with all servers and all objects located in these data centers.
Regarding the replication level: Storage servers are usually added in pairs, because the default replication factor is 2. Each object is stored in two copies, and each copy is always stored on a separate server and a separate disk. This is built into the architecture of the system, and objects cannot be placed any other way: situations where both copies of one object end up on the same server or the same disk are excluded. In a multi-data-center configuration, the copies will always automatically be placed in different data centers.
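The placement rule can be sketched as a constraint over failure domains: the two copies must never share a server or disk, and must land in different data centers when more than one exists. The data model below is invented for illustration and is not the actual placement algorithm.

```python
import itertools


def place_replicas(disks, rf=2):
    """disks: list of (dc, server, disk_id) tuples.
    Returns rf locations on pairwise-distinct servers, and in distinct
    data centers whenever at least rf data centers exist."""
    all_dcs = {dc for dc, _, _ in disks}
    for combo in itertools.combinations(disks, rf):
        servers = {srv for _, srv, _ in combo}
        dcs = {dc for dc, _, _ in combo}
        if len(servers) == rf and (len(all_dcs) < rf or len(dcs) == rf):
            return list(combo)
    raise RuntimeError("not enough failure domains for the replication factor")


disks = [("dc1", "srv1", 0), ("dc1", "srv1", 1),
         ("dc1", "srv2", 0), ("dc2", "srv3", 0)]
print(place_replicas(disks))  # two copies: different servers and different DCs
```

Note how the two same-server disks are skipped automatically: losing one server (or one whole data center) can cost at most one copy of any object.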
About meta-information. Thanks to Tarantool we can configure the replication factor, i.e. the number of copies in which meta-information is stored. By default we use 3 copies; for multiple (two or four) data centers we raise this factor to 4, so that the data centers can be arranged symmetrically. Tarantool also gives us sharding: we break one large database into several and place them on separate machines, which is why we can grow to a huge amount of information and a huge number of objects without performance degradation, since the data about them is stored on separate machines, in separate memory, handled by a separate Tarantool instance.
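Sharding by object key is a textbook technique that can be illustrated as follows; this is a generic sketch, not Tarantool's actual internal algorithm.

```python
import hashlib

N_SHARDS = 4  # number of metadata shards (illustrative)


def shard_for(object_key: str) -> int:
    """Map an object key deterministically to one metadata shard."""
    digest = hashlib.sha256(object_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % N_SHARDS


# The same key always lands on the same shard, spreading load across machines.
print(shard_for("backups/db/dump-001"))
```

Because the mapping is deterministic, any Front server can locate an object's metadata without consulting a central directory, which is what keeps lookups fast as the object count grows.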
Regarding replication and delays: the recommended standard delay between data centers is no more than 7 milliseconds. But that does not mean we do not support longer delays; we can safely work at 20 or 30 milliseconds. You just need to understand that we have a distributed system, writes are synchronous, and for an object to be considered written we must receive confirmation from every data center.
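A back-of-the-envelope consequence of synchronous writes: the write is acknowledged only after the slowest data center confirms, so inter-DC latency adds directly to write time. The numbers and the simple additive model are illustrative.

```python
def synchronous_write_ms(local_work_ms: float, inter_dc_rtt_ms: list) -> float:
    """Total write time: local processing plus the slowest inter-DC round trip."""
    return local_work_ms + max(inter_dc_rtt_ms, default=0.0)


print(synchronous_write_ms(2.0, [7.0, 7.0]))  # within the recommended 7 ms links
print(synchronous_write_ms(2.0, [30.0]))      # still works at 30 ms, just slower
```

This is why the 7 ms recommendation exists: the system keeps working on slower links, but every write pays for the worst link on the path.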
We have our own monitoring system with dashboards, all running on Grafana. We show all the basic information about the system's operation: load, RPS, the operation of individual services, bandwidth, and that is only a small fraction of the dashboards; there are actually ten times more. You can drill down into the smallest details, see what works and how, and catch and handle errors in advance, before they turn into critical events.
In terms of functionality, we support the standard S3 protocol developed by Amazon, and we support essentially all of its main functions. Some of the less critical ones are still under development. On the screen you can see what we support and a short roadmap for the coming quarters, including what is already in testing.