Source: Apache HBase Wikipedia
“HBase is an open-source, non-relational, distributed database modeled after Google’s Bigtable and is written in Java. It is developed as part of Apache Software Foundation’s Apache Hadoop project and runs on top of HDFS (Hadoop Distributed File System), providing Bigtable-like capabilities for Hadoop. That is, it provides a fault-tolerant way of storing large quantities of sparse data (small amounts of information caught within a large collection of empty or unimportant data, such as finding the 50 largest items in a group of 2 billion records, or finding the non-zero items representing less than 0.1% of a huge collection).”
Source: Apache Cassandra Wikipedia
“Apache Cassandra is a free and open-source distributed database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple datacenters, with asynchronous masterless replication allowing low latency operations for all clients.
Cassandra also places a high value on performance. In 2012, University of Toronto researchers studying NoSQL systems concluded that ‘In terms of scalability, there is a clear winner throughout our experiments. Cassandra achieves the highest throughput for the maximum number of nodes in all experiments’ although ‘this comes at the price of high write and read latencies.’”
Source: Hortonworks Wikipedia
Hortonworks is a business computer software company based in Santa Clara, California. The company focuses on the development and support of Apache Hadoop, a framework that allows for the distributed processing of large data sets across computer clusters.
Source: Apache Mesos Wikipedia
Apache Mesos is an open-source cluster manager that was developed at the University of California, Berkeley. It “provides efficient resource isolation and sharing across distributed applications, or frameworks”. The software enables resource sharing in a fine-grained manner, improving cluster utilization.
Source: Apache ZooKeeper Wikipedia
Apache ZooKeeper is a software project of the Apache Software Foundation, providing an open-source distributed configuration service, synchronization service, and naming registry for large distributed systems. ZooKeeper was a sub-project of Hadoop but is now a top-level project in its own right.