InfoSecDd

Workshop in Information Security - Distributed Databases Project

by Ainat Chervin, Yosi Barad and Ilia Oshmiansky

Distributed database systems are becoming the method of choice for massive data storage in the new generation of web applications, which store large amounts of metadata. These systems use an architecture that distributes storage and processing across multiple servers to meet the performance and scalability requirements of modern large-scale applications.

These requirements cannot all be met by traditional relational databases, which require that all requests of the same transaction be executed by a single node of the database. Thus, most popular web applications today replace relational databases with simpler but highly scalable databases, such as BigTable (Google), HBase (Yahoo) and Cassandra (Facebook). However, scalability comes at the price of weaker consistency and security guarantees.

Introduction:

This project focuses on two popular table stores, Cassandra and Accumulo. While access control in Cassandra is at the level of the column family, Accumulo offers finer-grained security and allows defining cell-level access control.
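
To make the difference in granularity concrete, the following minimal Python sketch contrasts a check at the column-family level with a check at the cell level. It is an illustration only: the class names, the in-memory dictionaries and the `can_read` and `grant` methods are hypothetical and do not correspond to the actual Cassandra or Accumulo APIs, nor to the implementation in this project.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnFamilyAcl:
    """Hypothetical coarse-grained model: one reader set per column family."""
    readers: set = field(default_factory=set)

    def can_read(self, user, row_key, column):
        # Row and column are ignored: access to the family is all-or-nothing.
        return user in self.readers


@dataclass
class CellAcl:
    """Hypothetical fine-grained model: one reader set per (row, column) cell."""
    readers_by_cell: dict = field(default_factory=dict)

    def grant(self, user, row_key, column):
        self.readers_by_cell.setdefault((row_key, column), set()).add(user)

    def can_read(self, user, row_key, column):
        # In this sketch, a cell without an ACL entry is readable by nobody.
        return user in self.readers_by_cell.get((row_key, column), set())


# Made-up users and cells, for illustration only.
cf_acl = ColumnFamilyAcl(readers={"alice"})
cell_acl = CellAcl()
cell_acl.grant("alice", "row1", "email")
cell_acl.grant("bob", "row1", "name")
```

With the coarse model, "alice" can read every cell in the column family and "bob" can read none. With the cell-level model, "bob" can read the "name" cell of "row1" but not its "email" cell.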

The main goals of this project are to add support for cell-level ACLs (Access Control Lists) to Cassandra and to compare the resulting system to Accumulo, evaluating performance and assessing security holes. The project will also attempt to improve the security of the system by increasing consistency, while measuring the performance penalty.

Documentation:

Project Plan

Milestone 1 - Report

Milestone 1 - Presentation

Milestone 2 - Report

Milestone 2 - Presentation

Final Report (docx,pdf)

Final Presentation (pptx,pdf)

Implementation:

The ACL is saved within the value in the database:

Cassandra Acl v1.1 (Code, JavaDoc).

The ACL is saved in a new column in the database:

Cassandra Acl v1.2 (Code, JavaDoc)

Tutorial and Running examples:

Cassandra Acl - Getting Started (docx,pdf)

Script to parse the results of the inconsistency tests:

Script - Parse Timelag

Tutorials:

We wrote tutorials for the systems we worked on during the workshop project.

Each of the following tutorials contains a getting-started installation guide and some running examples for the system in question.

Cassandra , Accumulo , YCSB , YCSB++.

JavaDocs:

Cassandra Acl - How it works:

Results:

Cassandra with cell-level ACL versus Accumulo:

The chart above shows the percentage of decrease in performance of the two systems in two scenarios:

The system with no ACLs compared to the same system with only one ACL entry.
The system with one ACL entry compared to the same system with eleven ACL entries.

* Note that the chart shows the decrease in performance, so smaller values are better.
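
The percentage decrease for the two scenarios can be computed as in the following sketch. It assumes that performance is measured as throughput (higher is better), which is our assumption for illustration; the function name `percent_decrease` and all the numbers are made up and are not results from the project.

```python
def percent_decrease(baseline, measured):
    """Return how much lower `measured` is than `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - measured) / baseline * 100.0


# Made-up throughput values (ops/sec), for illustration only.
no_acl, one_acl, eleven_acl = 1000.0, 900.0, 810.0

# Scenario 1: no ACLs compared to one ACL entry.
scenario_1 = percent_decrease(no_acl, one_acl)

# Scenario 2: one ACL entry compared to eleven ACL entries.
scenario_2 = percent_decrease(one_acl, eleven_acl)
```

Note that each scenario uses its own baseline, so the two percentages do not simply add up to the total decrease from no ACLs to eleven entries.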

Consistency tests:

This chart shows the average of multiple tests (we ran each test several times to iron out any instability). The y-axis is latency in milliseconds; the x-axis is the number of replicas. For example, with 2 nodes we measured an average delay of 118 ms, and the 95th percentile (call it the worst-case latency) was 542 ms.
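
For readers who want to reproduce these two statistics from raw measurements, here is a minimal sketch. It uses the nearest-rank definition of a percentile, which is one common choice and not necessarily the one used by our test tools; the sample values are made up.

```python
import math


def mean(samples):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(samples) / len(samples)


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up time-lag samples in milliseconds, for illustration only.
lags_ms = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100,
           110, 120, 130, 140, 150, 160, 170, 180, 190, 200]

average_lag = mean(lags_ms)
p95_lag = percentile(lags_ms, 95)
```

The gap between the average and the 95th percentile indicates how heavy the tail of the distribution is, which is why both numbers are reported.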

In these results we can notice a general increase in latency the more replicas we use, which is to be expected since more replicas means more network traffic.

This chart is similar to the insert chart, but for update operations. Here too we can see a clear deterioration in performance with each extra replica, but compared to the read-after-insert results these look better: the latency has dropped significantly.

Looking more closely at the difference between the two tests, we found something interesting: the throughput (operations per second) of the update test was always around 1.8k, while the throughput of the insert test was around 10k. So we decided to check whether that had something to do with it.

By adjusting the thread count in the read-after-insert test we managed to bring the throughput down from 10k to about 2k ops/sec. In the resulting measurements the delays were almost all close to 0, and the results were even better than with the update test. When running the tests at 4k ops/sec we got worse results, and the higher we went, the worse the results became.

The conclusion is, again, the same as with the Wi-Fi tests (see the Final Report): the most crucial factor for eventual consistency is the network. Fewer operations mean less traffic, and less traffic means faster propagation of data among nodes, which results in reduced latency.

Conclusions:

Cassandra with cell-level ACL versus Accumulo:

In our work we extended Cassandra to support ACLs at the cell level, which corresponds to the column level in the Cassandra database. Furthermore, we benchmarked the performance of Cassandra with our ACL implementation against the original Cassandra database and against the Accumulo table store, which also supports cell-level ACLs.

Our comparisons suggest that Cassandra with our implementation generally handles additional ACL entries better than Accumulo. However, Accumulo's overall ACL performance is still clearly superior, since the total decrease in performance from no ACLs to 11 ACL entries is much greater in Cassandra in most tests.

Improving the consistency:

Consistency can be improved in various ways. As part of our conclusions we discuss several approaches that we tested and found to work:

Increase the read/write ConsistencyLevel from ONE to a higher level:

By increasing the ConsistencyLevel of an operation we can fine-tune the desired consistency for every read/write operation. We ran the read-after-write and read-after-update tests with ConsistencyLevel.ALL, once on the read and once on the write, and in both cases the result was the same: zero time lag.
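
This result agrees with the usual replica-overlap rule: a read is guaranteed to contact at least one replica that holds the latest acknowledged write when the number of replicas read (R) plus the number of replicas written (W) exceeds the replication factor (N). The sketch below only does this arithmetic; it does not talk to Cassandra, the function names are hypothetical, and it ignores node failures and other operational details.

```python
def replicas_required(level, replication_factor):
    """Number of replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError("unsupported level in this sketch: " + level)


def read_overlaps_write(read_level, write_level, replication_factor):
    """True when R + W > N, i.e. every read set intersects every write set."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor


# With three replicas, ALL on either side forces an overlap; ONE/ONE does not.
all_on_read = read_overlaps_write("ALL", "ONE", 3)
all_on_write = read_overlaps_write("ONE", "ALL", 3)
one_one = read_overlaps_write("ONE", "ONE", 3)
```

The same rule shows that QUORUM on both reads and writes also gives an overlap, which may be a cheaper option than ALL when some latency must be saved.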

Reduce the load on the server by better load-balancing:

During our testing we noticed that latency is closely tied to throughput. Running the test several times with different thread counts (which translate to different throughputs) confirmed this. Monitoring the load on the nodes and balancing it can therefore make a real difference to consistency.

Use a third-party synchronization mechanism:

Third-party software that works at a higher level than Cassandra, as a middle layer between the user and the DB, can also help. Software that manages the reads could guarantee that the data is consistent by reading the values from multiple nodes in the cluster and comparing timestamps. We tested this method by having our "Consumer" client read the data from multiple sources. The results were significantly better.
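
The timestamp comparison described above can be sketched as follows. This is not the code of our "Consumer" client: the `Versioned` type, the `read_latest` function and the dictionaries standing in for replicas are all hypothetical, and a real client would issue network reads against individual nodes instead.

```python
from typing import NamedTuple


class Versioned(NamedTuple):
    """A value together with the timestamp of the write that produced it."""
    value: str
    timestamp: int


def read_latest(key, replica_readers):
    """Query every replica and return the value with the newest timestamp.

    Each reader is a callable that returns a Versioned or None for a key.
    Returns None when no replica has the key.
    """
    newest = None
    for read in replica_readers:
        result = read(key)
        if result is None:
            continue
        if newest is None or result.timestamp > newest.timestamp:
            newest = result
    return newest


# Three made-up replicas; the second one has not received the update yet
# and the third one has not received the key at all.
replica_a = {"user:1": Versioned("new-email", 200)}
replica_b = {"user:1": Versioned("old-email", 100)}
replica_c = {}

latest = read_latest("user:1", [replica_a.get, replica_b.get, replica_c.get])
```

Reading from more sources raises the chance of seeing the newest write, at the cost of extra read traffic per request.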

Monitor and improve the quality of your network:

As our tests clearly demonstrated, nothing is more important for data consistency than a good and stable network. Monitoring your network is a must if consistency is of the essence.

Distributed systems are highly complex and have many points of failure. When planning such a system you need to consider a variety of factors, including the performance of your servers, the stability of the network, and the bandwidth and latency between nodes. We hope the results above help in making the appropriate decisions when it comes to consistency.

You may also visit our website: http://course.cs.tau.ac.il/secws12/