DB: Database Basics

A database is an organized collection of data stored electronically so it can be easily accessed, managed, and updated. Instead of keeping information in scattered files or spreadsheets, databases provide a structured system for handling information reliably and efficiently.

A database is a structured system for storing, retrieving, and managing data reliably over time.

At its core, a database answers one fundamental question:

_How do we store data so that it survives program restarts, crashes, and scale?_

Without databases, data would live only in:

  • Memory (lost on restart)
  • Files (hard to query, unsafe, slow at scale)

Memory Is Volatile

txt
+-----------+
| Program   |
| Memory    |
| (RAM)     |
+-----------+
     |
     X  ← Power off → DATA LOST

RAM is fast, but:

  • Volatile
  • Process-bound
  • Not shareable safely

Storing data in files seems simple:

txt
users.txt
-----------
1, Amr
2, Sara
3, John

But quickly becomes problematic:

  • Searching requires full scans
  • Concurrent writes cause corruption
  • No indexing
  • No transactions
  • No schema enforcement

The Need for Databases

Databases solve five critical problems:

ProblemSolution
PersistenceData survives crashes
StructureSchema and constraints
QueryingFast lookups and filtering
ConcurrencyMultiple users safely
IntegrityConsistency and correctness

Types of Database

Databases come in two major families:

  • Non-relational Database (NoSQL)
  • Relational Database (SQL)

Both are designed to store and manage data, but they follow very different approaches.

Understanding their differences helps you choose the right database for the right project.

NoSQL Databases

NoSQL databases are non-relational, meaning they don’t rely on fixed table structures. Instead, they use flexible ways to store data like documents, key-value pairs, or graphs.

  • Flexible Data Models – Ideal for unstructured or semi-structured data, such as social media posts, logs, or sensor data.
  • High Scalability – Designed to scale horizontally across many servers, making them suitable for handling huge amounts of data in real time.
  • Eventual Consistency – Often favor speed and availability over strict consistency (unlike SQL). This works well for applications like e-commerce catalogs or messaging systems.
  • Examples – MongoDB (document-based), Redis (key-value), Cassandra (wide-column), Neo4j (graph).

Relational Database (SQL)

SQL databases are also known as relational databases. They use structured tables with rows and columns, and relatio

  • Structured Data – Best suited for data that can be clearly organized into predefined schemas. For example, customer information, bank accounts, or product inventories.
  • Standard Language (SQL) – Use Structured Query Language to insert, update, and query data. SQL provides consistency and portability across platforms.
  • ACID Transactions – Guarantee that operations are reliable, consistent, and recoverable. This makes SQL databases strong in banking, accounting, and enterprise systems.
  • Examples – MySQL, PostgreSQL, Oracle, Microsoft SQL Server.

Non-relational Database

Non-relational databases use a storage model that is optimized for the type of data being stored.

Data my be organized as JSON documents, key-value pairs or a graph with edges and vertices.

Non relational databases are better for handling unstructured diverse data. (Big data)

There are 4 main types of non-relational databases:

  • Document Database
  • Columnar Database

Document Database

Store semi-structured documents, designed to store JSON-like or XML-like data.

json
[
	{
	  "id": 1,
	  "name": "Amr",
	  "skills": ["C++", "Python"],
	  "Age": 35
	},
	{
	  "id": 2,
	  "name": "Sara",
	  "skills": ["Go", "SQL"]
	}
]

Not every entry has to have the same number of attributes, or the same attributes of the other entries, Flexible schema, but weaker guarantees.

But each document has to have a specific key that allows you to retrieve the document quickly, you can also query by specific fields contained inside.

MangoDB, and DynamoDB are popular non-relational databases.

Columnar Database

Optimized for analytics.

Database is organized by columns rather than rows, each row can have a different set of columns.

txt
Column Storage
--------------
id    | 1 | 2 | 3
name  | A | B | C
age   | 30| 25| 40

Excellent for:

  • Aggregations
  • Reports
  • Data warehouses

Cassandra is a cross between a key-value and columnar database.


Relational Database

Data is a relationship to at least one other piece of data in the database.

relational database is the most common way to store and organize data.

  • Data is organized into tables with columns and rows
  • Each column holds a certain type of data
  • Each row has a set of values related to one entity. (record)

txt
Users Table
+----+-------+-------+
| id | name  |  age  |
+----+-------+-------+
| 1  | Amr   |   35  |
| 2  | Sara  |   30  | 
+----+-------+-------+

Tables

A table is the foundation of a relational database. Think of it as a digital spreadsheet where each row represents a record and each column represents a type of information.

  • Entity representation – Each table typically represents one type of entity, such as _Customers_, _Orders_, or _Products_.
  • Structured design – The table structure ensures that all entries follow the same format, making data easy to maintain and query.

Rows (Records)

Within a table, rows store the actual data. Each row is a complete record that contains all the details for one instance of the entity.

  • Unique entries – A row is usually identified by a unique value, called a primary key, which prevents duplicates.
  • Example – In a Customers table, one row may represent _John Smith_ with values for his name, email, and phone number.

Columns (Fields)

Columns define what type of information can be stored in a table. They act like labels that describe the data for all rows.

  • Data definition – Each column has a name and a data type, such as INTEGERTEXT, or DATE, which keeps the data consistent.
  • Example – A Customers table might include columns for CustomerIDNameEmail, and PhoneNumber.

Multiple tables

In the same database you can have multiple related tables.

  • The tables are connected by a Foreign Key
  • The primary key guarantees uniqueness inside a single table.
  • The foreign key establishes connections between different tables.

RDBMS

Relational Database Management System is the software layer that allows users and applications to interact with the database.

It simplifies complex operations and guarantees data consistency.

  • Database Creation and Management
  • Lets administrators design the structure of databases, tables, and relationships.

  • Data Querying
  • Provides languages like SQL to filter, search, and analyze stored data.

  • User Access Control
  • Ensures only authorized users can access or modify sensitive information.

  • Data Integrity and Recovery
  • Maintains accuracy and restores information in case of failure.

Popular DBMS examples: MySQL, PostgreSQL, Oracle, Microsoft SQL Server, SQLite.

SQL

Many relational database management systems use Structured Query Language (SQL) to access and manage the database.