Apache Accumulo: NoSQL Secure Database

Data security is a major concern for organizations building big data analytics ecosystems, and weak security is a commonly cited flaw of Hadoop/MapReduce and many NoSQL databases.

Apache Accumulo is an open-source, highly secure NoSQL database created in 2008 by the National Security Agency. It integrates with Hadoop, can securely and cost-effectively handle massive amounts of structured and unstructured data at scale, and enables users to move beyond traditional batch processing to conduct a wide variety of real-time analyses. Accumulo is a sorted, distributed key/value store based on Google's BigTable design, built on top of Hadoop, ZooKeeper, and Thrift. Written in Java, it provides cell-level access labels and server-side programming mechanisms.

Accumulo offers "cell-level security": it extends the BigTable data model by adding a new element to the key called "Column Visibility". This element stores a logical combination of security labels that must be satisfied at query time for the key and value to be returned as part of a user request. As a result, data with varying security requirements can be stored in the same table, and each user sees only the keys and values they are authorized to see.
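
To make the idea concrete, here is a minimal sketch in Python of how a visibility expression might be checked against a user's authorizations at query time. It is a simplified model for illustration, not Accumulo's actual implementation or API: the function names (tokenize, is_visible, scan), the sample labels, and the sample cells are all hypothetical. It assumes that labels are combined with & (and) and | (or), that parentheses are required when the two operators are mixed, and that an empty expression makes a cell visible to everyone.

```python
import re

# Hypothetical, simplified model of cell-level visibility checks.
# A token is either a label (letters, digits, underscore) or one of & | ( ).
TOKEN = re.compile(r"\s*([A-Za-z0-9_]+|[&|()])")


def tokenize(expr):
    """Split a visibility expression into labels and operator tokens."""
    tokens, pos = [], 0
    expr = expr.strip()
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if not match:
            raise ValueError(f"unexpected character at {pos}: {expr[pos]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_visible(expr, auths):
    """Return True if the set of authorizations satisfies the expression."""
    tokens = tokenize(expr)
    if not tokens:
        return True  # assumption: an empty visibility is readable by anyone
    pos = 0

    def parse_expr():
        nonlocal pos
        result = parse_term()
        op = None
        while pos < len(tokens) and tokens[pos] in ("&", "|"):
            # Assumption: mixing & and | at one level needs parentheses.
            if op is not None and tokens[pos] != op:
                raise ValueError("mixing & and | requires parentheses")
            op = tokens[pos]
            pos += 1
            rhs = parse_term()
            result = (result and rhs) if op == "&" else (result or rhs)
        return result

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            result = parse_expr()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if tok in ("&", "|", ")"):
            raise ValueError(f"unexpected token {tok!r}")
        return tok in auths  # a bare label is satisfied if the user holds it

    result = parse_expr()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result


# Hypothetical cells: (row, column, visibility expression, value).
cells = [
    ("row1", "name", "", "Alice"),
    ("row1", "salary", "hr&manager", "90000"),
    ("row1", "diagnosis", "doctor|(nurse&onduty)", "flu"),
]


def scan(cells, auths):
    """Return only the cells whose visibility the authorizations satisfy."""
    return [(r, c, v) for r, c, vis, v in cells if is_visible(vis, auths)]
```

In this sketch, scan(cells, {"hr", "manager"}) returns the name and salary cells but not the diagnosis cell, while a caller holding only "hr" sees just the name. The cells live side by side in one table and the filter is applied per cell, which is the property the paragraph above describes.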

Sqrrl Enterprise, developed by Sqrrl Data, is an operational data store for large amounts of structured and unstructured data. It is positioned as the only NoSQL solution that both scales elastically to tens of petabytes of data and has fine-grained security controls. Sqrrl Enterprise enables development of real-time applications on top of big data. It uses HDFS for storage, Accumulo for security and speed of access, and the Thrift API for interactivity, and it works with MapReduce, visualizations, third-party software, and existing schema explored databases.