Let’s have the database models strut down the runway:
Relational (SQL):
- Data consist of rows in tables.
- Each row has multiple columns.
- Each column has a fixed type.
- Queries use filters and joins.
- Fixed schema is defined separately from the data.
- User-defined indexes improve query performance.
- Robust transaction/data-integrity support.
Graph (RDF):
- Data consist of nodes and labeled links between them.
- Nodes may have type information.
- Optional schema is defined in the same model (e.g. RDFS, OWL).
- Queries use graph-based constraints and filters.
- Indexes are not optimized for specific datasets.
- Limited transaction support.
Tree (XML, filesystem):
- Data are identified by a name and a position within a hierarchy.
- Queries use hierarchies and filters.
- Data may have type information.
- Optional schema may be defined in the same model (XML-Schema) or outside it (DTD).
- Transaction support varies by implementation.
Hash (BDB, memcached):
- Data are untyped binary blobs.
- Data are identified by unique keys in a flat namespace.
- Only key-based lookups are possible.
- Limited transaction support.
Inverted Index (Lucene/Solr):
- Data are text documents.
- Queries are free-form text with some special operators.
- Text queries are fast.
- Query results can be scored/ranked.
- Joins are inefficient; denormalized data are preferred.
- Limited transaction support.
Document-Oriented (CouchDB):
- Data are identified by unique keys in a flat namespace.
- Data have named fields.
- Optional schema defined by application code.
- Denormalized data are preferred.
- Limited type information.
Column-Oriented (Hbase, BigTable):
- Data consist of rows in tables.
- Each row has multiple columns, rows are assumed to be sparse.
- Data from a single column are stored contiguously to improve performance.
- Designed for distributed processing environments such as Hadoop and MapReduce.