Bigtable, the NoSQL database service Google wants to use to dominate Big Data

Developers / 08 June 2015

BBVA API Market

Users of Google Analytics, Gmail, Google Maps, Google Earth and YouTube have been testing the capabilities of Google Cloud Bigtable for years without even knowing it. The Mountain View company says it has been running the technology for more than 10 years and that this NoSQL database service, now in beta for companies, will enable much more information to be processed quickly, efficiently and scalably.

This new solution from Google is designed for companies that need to manage and analyze data in the order of petabytes in sectors such as financial services, energy, biomedicine, telecommunications and online advertising: sectors with large volumes of data and a need to extract value from that information.

Google summarizes the benefits of Bigtable in several key points.

– Performance: Read and write latency in the order of milliseconds, and performance per dollar twice that of alternative NoSQL solutions on the market. The image below shows a comparison study of Bigtable, a generic version of HBase and Apache Cassandra, the NoSQL database written in Java and used by Twitter.

– Open source interface: Although it is a cloud service, it is accessible via the HBase API, integrates natively with any company’s Hadoop solutions and is compatible with the other Google products that work with data or in the cloud, such as Google BigQuery and Google Cloud Dataflow. Thanks to the API, data can be managed seamlessly in the industry’s standard formats.

– Cost: Google claims that Bigtable halves the cost of other solutions.

– Security: It encrypts all the data it processes and works based on a data replication strategy.

– Simplicity: Google boasts that creating or configuring a cluster in Bigtable would take less than 10 seconds thanks to its user interface.

– Maturity: Google has been testing the service with its own data solutions since 2004, the year it was launched. It is now launching a beta version.

– Price: It will depend on the use of the network, the nodes deployed and the storage used for the data.
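The three pricing components named in the last point (nodes, storage and network) can be combined into a rough estimate. The sketch below is only an illustration: the `Rates` and `Usage` structures, the `monthly_cost` function and every rate passed to it are hypothetical placeholders, not Google’s actual price list.

```cpp
#include <cassert>

// Hypothetical unit prices. These are placeholders for illustration only,
// NOT Google's real rates.
struct Rates {
    double per_node_hour;   // currency units per node per hour
    double per_gb_month;    // currency units per GB stored per month
    double per_gb_network;  // currency units per GB of network traffic
};

// Hypothetical monthly usage figures.
struct Usage {
    int nodes;          // nodes deployed
    double hours;       // hours each node ran during the month
    double stored_gb;   // data stored, in GB
    double network_gb;  // network traffic, in GB
};

// Sums the three components the article lists: nodes, storage and network.
double monthly_cost(const Rates& r, const Usage& u) {
    assert(u.nodes >= 0 && u.hours >= 0 && u.stored_gb >= 0 && u.network_gb >= 0);
    return u.nodes * u.hours * r.per_node_hour
         + u.stored_gb * r.per_gb_month
         + u.network_gb * r.per_gb_network;
}
```

Calling `monthly_cost` with a set of rates and a usage record returns a single figure, which makes it easy to compare how the bill would change when nodes are added or removed.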

Bigtable data model and API

Bigtable is a distributed, sorted, multidimensional map indexed along three dimensions: row, column and timestamp. How do they interrelate? Tables are made up of rows and columns, and the cell where a row and a column meet can hold several versions of its value, each tagged with a timestamp. Those timestamps are what allow data trends over time to be displayed.
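A minimal way to picture this three-dimensional map is a set of nested sorted maps. The sketch below is an in-memory stand-in, not Bigtable’s API: the type aliases, the `put`, `latest` and `demo_table` functions and the column name `anchor:example` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <string>

// Hypothetical in-memory stand-in for the data model; this is not Bigtable's API.
using Timestamp = std::int64_t;
// All versions of one cell, newest timestamp first.
using Versions = std::map<Timestamp, std::string, std::greater<Timestamp>>;
// Column name -> versions of that cell.
using Row = std::map<std::string, Versions>;
// Row key -> row, kept sorted by row key.
using Table = std::map<std::string, Row>;

// Writes one timestamped version of a cell.
void put(Table& table, const std::string& row, const std::string& column,
         Timestamp ts, const std::string& value) {
    table[row][column][ts] = value;
}

// Returns the newest version of a cell, or nothing if the cell is empty.
std::optional<std::string> latest(const Table& table, const std::string& row,
                                  const std::string& column) {
    auto r = table.find(row);
    if (r == table.end()) return std::nullopt;
    auto c = r->second.find(column);
    if (c == r->second.end() || c->second.empty()) return std::nullopt;
    return c->second.begin()->second;
}

// Builds a tiny table holding two versions of the same cell.
Table demo_table() {
    Table table;
    put(table, "com.cnn.www", "anchor:example", 1, "first");
    put(table, "com.cnn.www", "anchor:example", 2, "second");
    assert(table.size() == 1);  // both writes landed in the same row
    return table;
}
```

Writing twice to the same row and column does not overwrite anything: both versions stay in the cell, and `latest` simply picks the one with the highest timestamp.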

Bigtable stores its data on Google File System (GFS). Data is compressed and decompressed by two particularly fast algorithms that reach 100–200 MB/s for compression and 400–1,000 MB/s for decompression. This speed is possible because each operation works on a portion of the data rather than on the entire set.

Another benefit of this service is its free public access API. This enables any developer to carry out a specific project with it or conduct performance tests on a data sample, for example. If you are a developer and you would like to test it, you should bear in mind that the languages used are C and C++.

With Bigtable’s API, all kinds of operations can be carried out with the data tables: you can create or delete tables and column families, write or delete values, iterate on a subset of data in a table, change a cluster or the metadata for some variable, or manage access control rights.
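Iterating on a subset of a table is cheap when rows are kept sorted by key, because all rows that share a key prefix sit next to each other. The sketch below shows that idea on an ordinary `std::map`; it is not Bigtable’s API, and the `Rows` alias and the `scan_prefix` and `check_scan_prefix` functions are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a table: row key -> value, kept sorted by row key.
using Rows = std::map<std::string, std::string>;

// Returns the keys of all rows whose key starts with `prefix`.
std::vector<std::string> scan_prefix(const Rows& rows, const std::string& prefix) {
    std::vector<std::string> keys;
    // lower_bound jumps to the first key >= prefix; because the map is sorted,
    // every key with that prefix follows contiguously from there.
    for (auto it = rows.lower_bound(prefix);
         it != rows.end() && it->first.compare(0, prefix.size(), prefix) == 0;
         ++it) {
        keys.push_back(it->first);
    }
    return keys;
}

// Small self-check on three sample rows.
void check_scan_prefix() {
    Rows rows{{"com.bbc.www", "a"}, {"com.cnn.money", "b"}, {"com.cnn.www", "c"}};
    assert(scan_prefix(rows, "com.cnn.").size() == 2);
}
```

The loop never touches rows outside the requested range, which is why a row key design that groups related data under a common prefix pays off.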

One example of what can be done with the Bigtable API is the following C++ sample, which looks up one row, reads every version of the columns in its "anchor" family and prints them:

Scanner scanner(T);
ScanStream *stream;
stream = scanner.FetchColumnFamily("anchor");
stream->SetReturnAllVersions();
scanner.Lookup("com.cnn.www");
for (; !stream->Done(); stream->Next()) {
  printf("%s %s %lld %s\n",
         scanner.RowName(),
         stream->ColumnName(),
         stream->MicroTimestamp(),
         stream->Value());
}

Bigtable case study with financial services

To market Bigtable and meet the demands of companies, Google relies on several partners that help implement the service in the cloud. One of them is SunGard, a provider that enables any company in the financial sector to create a scalable platform administered in the cloud.

In fact, according to this provider, there is already a financial audit trail system capable of managing 2.5 million commercial messages per second.

These are some of Google Cloud Bigtable’s features for these services:

– Scalable service.

– Table distribution to enable simultaneous operations.

– Flexible schema and organized distribution of the data into columns to improve performance.

– The timestamps enable the data trends to be analyzed at any time, a basic feature in audit processes.

– It guarantees speed as data grows, and low costs when the volume is not so high.

– Ability to add new clusters to the structure in a few minutes.

– The client can check the data even as it is being written.
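The audit use of timestamps mentioned in the list comes down to asking what a value was at a given moment. The sketch below shows one way to answer that on an in-memory history of a single cell. It is not Bigtable’s API: the `History` alias, the `value_as_of` and `check_value_as_of` functions and the sample status values are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <optional>
#include <string>

using Timestamp = std::int64_t;
// Hypothetical history of one cell: timestamp -> value, oldest first.
using History = std::map<Timestamp, std::string>;

// Returns the value the cell held at time `when`, or nothing if it had
// not been written yet.
std::optional<std::string> value_as_of(const History& history, Timestamp when) {
    auto it = history.upper_bound(when);  // first version strictly newer than `when`
    if (it == history.begin()) return std::nullopt;  // nothing written by then
    return std::prev(it)->second;  // newest version at or before `when`
}

// Small self-check on a two-version history.
void check_value_as_of() {
    History status{{100, "NEW"}, {200, "FILLED"}};
    assert(value_as_of(status, 150).value() == "NEW");
}
```

Because old versions are kept rather than overwritten, an auditor can replay the state of a record at any past timestamp without a separate change log.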

Google, Amazon and Microsoft, the fight over the public cloud

Google’s new NoSQL database service is a part of a strategic fight that has been heating up for a long time: the battle for the public cloud. Services such as Amazon Web Services, Google Cloud Platform and Microsoft Azure are fighting to offer better performance, more speed, more scalability and, also, better prices to companies that have vast volumes of data.

Google is launching a beta version of Bigtable as a further step in its attempt to dominate this market. The rivals are powerful: DynamoDB from Amazon, Azure DocumentDB from Microsoft and Cloudant from IBM. There are also alternatives that are not backed by large companies but are widely used by the community: Aerospike, MongoDB, Couchbase, Redis, CouchDB and the aforementioned Cassandra.

Following the launch, Gartner analyzed some of the strengths of Google’s service. The key one is that companies will be able to pay for the service as they use it, which ultimately means paying for performance, so any Bigtable client will be able to estimate its usage cost in advance. On the negative side, Gartner questioned Google’s ability to build a sales organization strong enough to place its service in many companies.

