Awesome HBase – Massive Collection of Resources
HBase is a distributed, scalable, big data store.
Projects
Clients
- asynchbase – Fully asynchronous, non-blocking HBase client.
- gohbase – Pure Go client for HBase.
- happybase – Python client for HBase.
Cloud
- Amazon EMR – Amazon’s Hadoop/HBase offering on AWS.
- Azure HDInsight – Microsoft’s Hadoop/HBase offering on Azure.
- Cloudera Director – Run Hadoop/HBase clusters on AWS, Azure, or Google Cloud.
- Google Cloud Bigtable – High-performance NoSQL database service accessible via HBase client API.
- Hortonworks Cloudbreak – Provision Hadoop/HBase clusters on AWS, Azure, Google Cloud, or OpenStack.
Frameworks
Datasets
- Kite – High-level data layer for Hadoop/HBase.
Document
- HDocDB – HBase as a JSON document database.
Entity/JPA
- DataNucleus – JPA persistence layer with support for HBase.
- Gora – Persistence library for big data with support for HBase.
- HEntityDB – HBase as an entity database.
- Kundera – JPA client with support for HBase.
Geospatial
- GeoMesa – Spatial-temporal database with support for Accumulo, HBase, Cassandra, and Kafka.
Graph
- Gradoop – Research framework for scalable graph analytics built on Flink and HBase.
- HGraphDB – HBase as a TinkerPop graph database.
- JanusGraph – Scalable graph database with support for Cassandra, HBase, Google Cloud Bigtable, and BerkeleyDB.
- NebulaGraph – High-performance distributed graph database.
- S2Graph – High-performance distributed graph database built on HBase.
SQL/OLAP
- AntsDB – Low-latency, high-concurrency, MySQL-compliant SQL layer for HBase.
- EsgynDB – Commercial SQL engine providing ACID transactions and BI analytics on top of Hadoop, based on Trafodion.
- Kylin – Extreme OLAP engine for big data that stores data in HBase.
- LeanXcale – Commercial fully ACID, full-SQL product built on Hadoop/HBase.
- Phoenix – SQL layer on top of HBase.
- Splice Machine – Commercial RDBMS built on top of HBase.
- Trafodion – Transactional SQL-on-Hadoop/HBase.
Time Series
- Axibase – Distributed time series database built on HBase.
- OpenTSDB – Scalable time series database built on HBase.
- Warp 10 – Time series database for sensor data.
Infrastructure
Secondary Indices
- hindex – Secondary index for HBase.
- Lily HBase Indexer – Quickly and easily search for content stored in HBase.
Transactions
- Haeinsa – Multi-row/multi-table transaction library for HBase.
- HBase-QoD – Vector-field consistency for HBase fine-grained transactional inter-DC replication.
- Omid – Transactional support for HBase.
- Tephra – Globally consistent transactions on top of HBase.
- Themis – Cross-row/cross-table transactions on HBase based on Google’s Percolator.
Integrations
- Apex – Apex-HBase connector.
- Beam – Beam HBase integration.
- Camel – Camel HBase component.
- Cascading – HBase adapters for Cascading.
- Cascalog – Wrapper around Cascading.HBase for use in Cascalog.
- Crunch – HBase adapters for Crunch.
- Drill – HBase storage plugin for Drill.
- Elasticsearch – Elasticsearch import river for HBase.
- Flink – Flink-HBase connector.
- Gearpump – Gearpump integration for HBase.
- Giraph – Giraph input and output formats for HBase.
- HAWQ – HAWQ PXF external tables on HBase.
- Hive – Hive HBase integration.
- Impala – Impala support for querying HBase tables.
- Kafka – HBase Kafka proxy.
- Pig – Pig HBase integration.
- Pulsar – HBase connector for Pulsar.
- Ranger – HBase plugin for Apache Ranger.
- Spark – Spark-HBase connector.
- Spring for Apache Hadoop – Spring-Hadoop integration, including HBase support.
- Storm – Storm/Trident integration for HBase.
- Tajo – Tajo integration with HBase.
- Zeppelin – HBase shell interpreter for Apache Zeppelin.
Tools
- Ambari – Software for provisioning, managing, and monitoring Hadoop/HBase clusters.
- Cloudera Manager – Tool for managing Hadoop/HBase in production.
- DbSchema – Diagram-oriented database designer with support for HBase.
- Hannibal – Tool to monitor and maintain HBase clusters.
- h-rider – GUI for viewing and manipulating data in HBase.
- Hue – Smart analytics workbench that includes an HBase browser.
- Sematext SPM – Tool for monitoring HBase, HDFS, etc.
Miscellaneous
- HubSpot HBase support – Configs and tools for HBase at HubSpot, including Hystrix integration and coprocessors.
Resources
Books
- HBase in Action – Experience-driven guide that shows you how to use HBase.
- HBase: The Definitive Guide – Comprehensive guide to HBase.
- Architecting HBase Applications – Includes HBase principles, cluster guidelines, and in-depth case studies.
- HBase Administration Cookbook – How to master HBase configuration and administration.
- HBase Essentials – A practical guide to using HBase.
- HBase Design Patterns – Successful patterns to develop scalable applications with HBase.
- Learning HBase – Learn the fundamentals of HBase administration and development.
- HBase High Performance Cookbook – Exciting projects that teach you how to use HBase.
- Apache HBase Primer – A compact guide to HBase essentials.
- Pro Apache Phoenix – Basics and best practices for using Phoenix.
Papers
- Bigtable: A Distributed Storage System for Structured Data – The inspiration for HBase.
- Apache Hadoop Goes Realtime at Facebook – How Facebook deployed HBase to production.