Engines
of
Information
A Database Management System (DBMS) is the software that interacts with end users, applications, and the database itself to capture and analyze data. This guide visualizes the hidden mechanics of storage, retrieval, and integrity.
SQL Queries
Structured Query Language (SQL) is the standard for communicating with databases. The four basic operations are CRUD: Create (INSERT), Read (SELECT), Update (UPDATE), and Delete (DELETE). Records are often identified by a Primary Key (a unique ID).
Terminal Output
// ReadyRetrieves data from the database. This query finds all students with a high GPA.
Live Database (Table: Students)
| ID | Name | GPA |
|---|---|---|
| 1 | Alice | 3.9 |
| 2 | Bob | 3.2 |
| 3 | Charlie | 3.7 |
SQL Joins
Visualizing how data from two distinct tables is combined based on a related column (Foreign Key).
Inner Join
Returns records that have matching values in both tables.
Left Join
Returns all records from the left table, and matched records from the right.
Right Join
Returns all records from the right table, and matched records from the left.
Full Outer Join
Returns all records when there is a match in either left or right table.
Self Join
Joining a table with itself (e.g., matching employees to managers in the same table).
Users (T1)
| ID | Name |
|---|---|
| 1 | Alice |
| 2 | Bob |
| 3 | Charlie |
| 4 | David |
Orders (T2)
| ID | User_ID | Item |
|---|---|---|
| 101 | 1 | Laptop |
| 102 | 2 | Mouse |
| 103 | 1 | Monitor |
| 104 | 5 | Phone |
Result Set
| Name | Item |
|---|---|
| Select Join Type | |
Indexing (B-Tree)
Without an index, the database must perform a "Full Table Scan" (O(N)). A B-Tree Index allows logarithmic time search (O(log N)).
Normalization
Normalization is the process of organizing data to minimize redundancy. Move through the stages from Unnormalized (UNF) to Third Normal Form (3NF).
ACID Transactions
To ensure data validity, transactions must be Atomic (all or nothing), Consistent (valid state), Isolated (independent), and Durable (saved permanently).
Transaction Log (WAL)
Row-Level Locking
Architectures
How clients connect to the database. 2-Tier connects directly (e.g., Desktop App). 3-Tier uses an Application Server (e.g., Web App).
Master-Slave Replication
Writes are sent to the Master node. The Master then copies (replicates) data to Read Replicas. This allows scaling reads but introduces replication lag.
SQL vs NoSQL
SQL (Relational) databases enforce strict schemas. NoSQL (Document) databases allow flexible, dynamic data structures.
Schema / Structure
Data Store
Choosing the Right Tool
- Structure: Table-based with strict, pre-defined schemas.
- Scaling: Vertical (Scale Up - adding RAM/CPU to one server).
- Query Language: SQL (Structured Query Language).
- Best For: Complex queries, multi-row transactions, ensuring data integrity (ACID).
- Examples: MySQL, PostgreSQL, Oracle, SQL Server.
- Structure: Flexible. Document (JSON), Key-Value, Graph, or Wide-Column.
- Scaling: Horizontal (Scale Out - adding more servers/sharding).
- Query Language: Dynamic schemas, often API-based.
- Best For: Unstructured data, rapid prototyping, massive data volume, real-time apps.
- Examples: MongoDB (Document), Redis (Key-Value), Cassandra (Wide-Column), Neo4j (Graph).
CAP Theorem
In a distributed data store, you can only simultaneously guarantee two out of three: Consistency, Availability, and Partition Tolerance.
Select two properties
Consistency: Every read receives the most recent write.
Availability: Every request receives a (non-error) response.
Partition Tolerance: The system continues to operate despite an arbitrary number of
messages being dropped or delayed by the network between nodes.
Key Definitions
DBMS
Database Management System software that interacts with end users, applications, and the database.
SQL
Structured Query Language, the standard for communicating with relational databases.
CRUD
Create, Read, Update, Delete - the four basic operations of persistent storage.
Primary Key
A unique identifier for a record in a table.
Foreign Key
A field that links to the primary key of another table, creating a relationship.
Normalization
Organizing data to minimize redundancy and dependency (e.g., 1NF, 2NF, 3NF).
ACID
Atomicity, Consistency, Isolation, Durability - properties that guarantee reliable transactions.
Index
A data structure (like B-Tree) that improves the speed of data retrieval operations.
Sharding
Horizontal scaling by splitting a database into smaller parts across multiple servers.
CAP Theorem
It is impossible for a distributed data store to simultaneously provide more than two out of three: Consistency, Availability, Partition Tolerance.