Vaelen's Database Design Document
Vaelen's Database (vDB) is a a simple database in the style of database systems commonly used in the 80s and 90s. It is designed to be used in retrocomputing projects where CPU and memory resources are limited. Target systems include DOS, MacOS 6/7/8/9, Linux, and UNIX. It is written in ANSI C for maximum portability.
It's main features include:
- Data stored in fixed-size blocks (called pages) for efficient random access.
- Empty blocks can be reused by future inserts.
- Records can use multiple adjacent pages if needed.
- All changes are first written to a journal file before being written to the database.
- Reduces the likelihood of data corruption due to power outage or other system failure.
- Indexes that speed up searches.
- Indexes are stored as B-Trees.
- A (required) primary index maps record IDs to physical page numbers.
- Additional (optional) indexes use a hash function to map strings to record IDs.
- Page sizes are set to 512 bytes for more efficient disk access on older computers.
- Deletion of data only marks the page as empty but does not clean up the data.
- The database file is regularly compacted, a form of de-fragmentation that relocates active records into free pages to reduce the total size of the data files on disk.
Data File Format
VDB data files are binary files organized into fixed-size 512-byte pages. This page-based architecture allows for both random access of specific records and easy reuse of freed space when records are removed. The 512-byte page size matches the BTree module's page size and is optimal for retro system disk I/O.
Note: All multi-byte values are stored in little-endian byte order. The library handles endian conversion automatically via macros in vdbutil.h, so database files are portable across big-endian and little-endian systems. All on-disk structures use explicit byte-by-byte serialization to avoid compiler alignment and padding differences.
Files
Each database is actually comprised of a number of files. Let's look at the Users database as an example:
- USERS.DAT - The actual data is stored here (includes header in first page)
- USERS.IDX - Primary index mapping Record ID → Page Number (B-Tree)
- USERS.I00 - Secondary index 00 (e.g., username → Record ID)
- USERS.I01 - Secondary index 01 (e.g., email → Record ID)
- USERS.JNL - Journal file for transaction recovery
Index Architecture:
- The
.IDXfile is the primary index that maps Record IDs to physical Page Numbers - All secondary indexes (
.I??files) map field values to Record IDs - This indirection allows records to move between pages without updating secondary indexes
- Only the primary
.IDXindex needs updating when a record's physical location changes
Database Header (Pages 0-1 of .DAT file)
The database header spans the first two pages of the .DAT file. Page 0 contains metadata and index definitions, page 1 contains the free page list.
Page 0: DBHeader + Index List
Header (32 bytes):
| Name | Type | Notes |
|---|---|---|
| signature | char[4] | "VDB" + '\0' for file type validation |
| version | uint16 | Format version (currently 1) |
| page_size | uint16 | Size of each page in bytes (always 512) |
| record_size | uint16 | Size of each record in bytes (1-130050) |
| record_count | int32 | Total number of active records |
| next_record_id | int32 | Next available Record ID |
| last_compacted | int32 | Timestamp of last compaction operation |
| journal_pending | bool | true if journal needs replay on open |
| index_count | byte | Number of secondary indexes (0-15) |
| reserved | byte[8] | Padding to 32 bytes |
Index List (480 bytes):
| Name | Type | Notes |
|---|---|---|
| indexes | DBIndexInfo[15] | Secondary index definitions (15 × 32 = 480) |
Total page 0 size: 32 + 480 = 512 bytes
Page 1: DBFreeList
| Name | Type | Notes |
|---|---|---|
| free_page_count | uint16 (2 bytes) | Total count of free pages in database |
| free_page_list_len | uint16 (2 bytes) | Number of entries in free_pages array (0-127) |
| free_pages | int32[127] | Page numbers of empty pages (508 bytes) |
DBIndexInfo (in page 0, after header):
| Name | Type | Notes |
|---|---|---|
| field_name | char[30] | The name of the field being indexed (30 bytes) |
| index_type | byte | 0 = IT_ID (int32), 1 = IT_STRING (char[64]) |
| index_number | byte | Maps to one of the I?? files (00-14) |
DBIndexInfo Size: 30 bytes (field_name) + 1 byte (index_type) + 1 byte (index_number) = 32 bytes exactly
Page 0 Layout: 32 bytes (header) + 480 bytes (15 indexes × 32 bytes) = 512 bytes
Index Type Values:
- 0 (IT_ID): Index on an int32 field
- 1 (IT_STRING): Index on a char[64] field
signature: Used to verify file format. Always "VDB" followed by null terminator.
version: Format version number. Current version is 1. Future incompatible changes increment this.
record_count: Tracks active records. Updated on add/delete operations.
next_record_id: Auto-incrementing counter for assigning new Record IDs. Never decreases.
page_size: Always 512 bytes. This constant size matches the BTree module and simplifies disk I/O.
record_size: Size of each record in bytes. All records in this database are the same size. This determines how many pages each record occupies:
- pages_per_record = ceiling(record_size / 506)
- Maximum: 65535 bytes (limited by uint16), which is ~130 pages
- Examples: 100 byte record = 1 page, 1000 byte record = 2 pages
Compaction: The process of removing empty pages and rebuilding indexes to reclaim disk space. The last_compacted field tracks when this was last done.
journal_pending: Set to true when a transaction begins, false when committed. If true on database open, the journal must be replayed to complete interrupted operations.
index_count: Number of active secondary indexes (0-15). Maximum 15 indexes supported.
reserved: 8 bytes of padding to bring header to exactly 32 bytes.
free_page_count (in DBFreeList, page 1): Total count of all free pages in the database (maximum 65535). This counter is incremented when a page is deleted and decremented when a page is allocated. If this value is 0, allocate a new page at end of file. If this value is > 0 but free_page_list_len = 0, call UpdateFreePages to repopulate the free_pages array.
free_page_list_len (in DBFreeList, page 1): Number of valid entries in the free_pages array. Maximum 127. The array acts as a stack (LIFO) - new entries are added at index free_page_list_len, and pages are allocated by decrementing free_page_list_len.
free_pages (page 1): Array of up to 127 page numbers that are known to be empty. This provides O(1) free page lookup. When the array is full and more pages are deleted, free_page_count continues to increment but the page numbers are not added to the array. When free_pages is empty but free_page_count > 0, UpdateFreePages is called to scan the database and refill the array.
indexes (page 0, bytes 32-511): Array of 15 index definitions. Only the first index_count entries are valid. Each index has a unique index_number (0-14) that maps to an .I?? file.
Data Page Layout
The .DAT file contains a series of pages of type DBPage stored in fixed 512-byte format. Pages are numbered sequentially starting from 0. Pages 0-1 contain the database header (with index list) and free list. Data pages start at page 2.
Records may span multiple consecutive pages depending on record_size:
- pages_per_record = ceiling(record_size / 506)
- Records are stored in consecutive page runs
- All pages in a multi-page record share the same Record ID
- Only the first page has status = PS_ACTIVE, additional pages have status = PS_CONTINUATION
| Name | Type | Notes |
|---|---|---|
| id | int32 (4 bytes) | Unique record identifier |
| status | byte (1 byte) | 0 = Empty, 1 = Active, 2 = Continuation |
| data | byte[506] | Actual record data (506 bytes per page) |
Page Size Calculation:
- Page size: 512 bytes (constant)
- Overhead: 4 bytes (id) + 1 byte (status) + 1 byte (reserved) = 6 bytes
- Data size: 512 - 6 = 506 bytes available for record data per page
Free Space Management:
- When a record is deleted (spanning pages_per_record pages):
- Set all pages
statusto PS_EMPTY - Increment
free_page_countby pages_per_record - Add first page number to free list if room (subsequent pages can be calculated)
- Set all pages
- When adding a record:
- Calculate required pages:
pages_needed = CalculatePagesNeeded(record_size) - If
free_page_count < pages_needed: Append new pages at end of file - Else if
free_page_list_len > 0:- Search free list for consecutive run of pages_needed pages
- Or append to end if no suitable run found
- Else: Call
UpdateFreePagesto scan and refill array - Mark first page status = PS_ACTIVE, additional pages status = PS_CONTINUATION
- Decrement
free_page_countby pages_needed
- Calculate required pages:
Primary Index (.IDX file)
The .IDX file is a B-Tree index (as defined by the btree module) that maps Record IDs to Page Numbers.
Index Structure:
- Keys: Record IDs (int32 values from
DBPage.id) - Values: Page Numbers (physical page number of the first page where record starts)
- Purpose: Translates logical Record IDs to physical page locations
For multi-page records, the index points to the first page only. Subsequent pages are at consecutive positions (pageNum+1, pageNum+2, etc.).
This primary index is updated whenever:
- A record is added (insert RecordID → FirstPageNum mapping)
- A record is moved to a new location during compaction (update RecordID mapping)
- A record is deleted (remove RecordID from index)
Secondary Indexes (.I?? files)
The .I?? files contain B-Tree indexes for specific fields. The ?? in the extension should be replaced by the index_number value formatted as a 2-digit decimal. For example, USERS.I00 or USERS.I01.
Index Structure:
- Keys: Field values (generated based on index_type)
- Values: Record IDs (NOT block IDs)
- Purpose: Look up records by field values
Index Key Generation:
IT_IDtype indexes: Use the field value directly as the B-Tree keyIT_STRINGtype indexes: Use theGenerateIndexKey()function to generate a CRC16-based key from the string
Index Values: The B-Tree values are Record IDs. To retrieve the actual record, perform a second lookup in the primary .IDX index to get the Page Number.
Journal File Structure
The journal file (<database>.jnl) contains pending operations that have not yet been committed. Each journal entry is a fixed-size record that describes an operation to be performed on the database.
DBJournalEntry
| Name | Type | Notes |
|---|---|---|
| operation | byte | 0 = None/Empty, 1 = Update, 2 = Delete, 3 = Add |
| page_num | int32 (4 bytes) | Page being modified, -1 for Add operations |
| record_id | int32 (4 bytes) | Record ID for verification |
| data | byte[507] | New record data (for Update and Add) |
| checksum | uint16 (2 bytes) | CRC16 of operation + page_num + record_id + data |
Journal Entry Size: 1 + 4 + 4 + 507 + 2 = 518 bytes
Operation Types:
- 0 (None/Empty): Unused journal entry
- 1 (Update): Update existing page at PageNum with new Data
- 2 (Delete): Mark page at PageNum as Empty
- 3 (Add): Add new record with Data (PageNum is -1, actual page assigned during replay)
Checksum: The CRC16 checksum ensures journal entry integrity. Corrupted entries are skipped during replay.
Example Usage: Find a User by ID (Record ID)
At program start, the program reads the database header from pages 0-1 of the .DAT file (header with index list, and free list).
To find a User by their Record ID, the program:
Opens the primary index: Opens
USERS.IDXusing the BTree moduleBTree idx_tree; OpenBTree(&idx_tree, "USERS.IDX");Finds the page number: Calls
FindInIndex(&idx_tree, record_id, values, max_values, &count)whererecord_idis the Record ID to search for- The
valuesarray will contain the physical page number of the first page
- The
Reads the data pages:
- Calculate pages needed:
pages_needed = CalculatePagesNeeded(db.header.record_size) - For each page i from 0 to pages_needed-1:
- Seeks to position
((page_num + i) * 512)inUSERS.DAT - Reads the
DBPagestruct (512 bytes) - Verifies
id == record_id - First page: verify
status == PS_ACTIVE - Additional pages: verify
status == PS_CONTINUATION - Append
datato record buffer
- Seeks to position
- Returns the assembled record data to the caller
- Calculate pages needed:
Closes the index:
CloseBTree(&idx_tree)
Example Usage: Find a User by Username
Finding by a secondary index requires two lookups: secondary index → Record ID, then primary index → Page Number.
Looks up the index: Search through the index list (page 0, bytes 32-511) for
field_name == "Username"- Example:
{field_name: "Username", index_type: IT_STRING, index_number: 0}
- Example:
Opens the secondary index file: Opens
USERS.I00BTree name_tree; OpenBTree(&name_tree, "USERS.I00");Generates string key:
key = GenerateIndexKey(IT_STRING, (const byte *)"alice")- This is case-insensitive, so "alice", "Alice", "ALICE" all produce the same key
Finds the Record ID:
FindInIndex(&name_tree, key, record_ids, max_values, &count)- The
record_idsarray contains Record IDs (may be multiple due to hash collisions)
- The
Lookup Page Number: For each Record ID:
- Open primary index:
OpenBTree(&idx_tree, "USERS.IDX") - Find page:
FindInIndex(&idx_tree, record_id, page_nums, max_values, &count) - Read record from .DAT file (may span multiple consecutive pages)
- Verify username matches (handle CRC16 hash collisions)
- Close primary index
- Open primary index:
Closes the secondary index:
CloseBTree(&name_tree)
Example Usage: Adding a New Record
Assign Record ID: Use
header.next_record_idand increment itCalculate pages needed:
pages_needed = CalculatePagesNeeded(header.record_size)Find free pages:
- If
free_list.free_page_count < pages_needed: Append new pages at end of file - Else if
free_list.free_page_list_len > 0:- Search free list for consecutive run of pages_needed pages
- If found: use those pages and remove from free list
- If not found: append new pages at end of file
- Else (count > 0 but list empty):
- Call
UpdateFreePagesto scan database and refillfree_pagesarray - Then search for consecutive run
- Call
- If
Write data across pages:
int offset = 0; for (i = 0; i < pages_needed; i++) { page.id = new_record_id; page.status = (i == 0) ? PS_ACTIVE : PS_CONTINUATION; copy_len = (DB_PAGE_DATA_SIZE < record_size - offset) ? DB_PAGE_DATA_SIZE : record_size - offset; memcpy(page.data, &user_data[offset], copy_len); /* Write to .DAT at (first_page_num + i) * DB_PAGE_SIZE */ offset += DB_PAGE_DATA_SIZE; }Update primary index:
- Open
USERS.IDX - Insert mapping:
InsertIntoIndex(&idx_tree, new_record_id, first_page_num) - Note: Only the first page number is stored
- Open
Update all secondary indexes:
- For username index:
InsertIntoIndex(&name_tree, GenerateIndexKey(IT_STRING, username), new_record_id) - For email index:
InsertIntoIndex(&email_tree, GenerateIndexKey(IT_STRING, email), new_record_id) - Note: Secondary indexes store Record IDs, not Page Numbers
- For username index:
Update header and free list:
- Increment
header.record_count - Increment
header.next_record_id - Decrement
free_list.free_page_countby pages_needed - Write header back to page 0
- Write free list back to page 1 (if modified)
- Increment
Example Usage: Deleting a Record
Find the record: Use primary index to get Record ID → First Page Number
Calculate pages to delete:
pages_needed = CalculatePagesNeeded(db.header.record_size)Mark all pages as empty:
for (i = 0; i < pages_needed; i++) { page.status = PS_EMPTY; /* Write updated page at (first_page_num + i) back to .DAT */ }Update primary index:
DeleteFromIndex(&idx_tree, record_id, record_id)- Remove Record ID from primary index
Update secondary indexes: Remove from all secondary indexes
DeleteFromIndex(&name_tree, GenerateIndexKey(IT_STRING, username), record_id)DeleteFromIndex(&email_tree, GenerateIndexKey(IT_STRING, email), record_id)
Add to free list:
- Increment
free_list.free_page_countby pages_needed - If
free_list.free_page_list_len < DB_MAX_FREE_PAGES:- Add first page number to
free_pages[free_page_list_len] - Increment
free_page_list_len - Note: Only store first page; consecutive pages can be calculated
- Add first page number to
- Else: Page numbers not added to array, but count still incremented
- Increment
Update header and free list:
- Decrement
header.record_count - Write header back to page 0
- Write free list back to page 1
- Decrement
Compaction: Eventually run compaction to consolidate pages and rebuild free list accurately
Example Usage: Updating a Record
Non-indexed field changes:
- Simply overwrite the data in the existing pages (all pages_per_record pages)
- No index updates needed
Indexed field changes (e.g., username change):
- Delete old secondary index entry:
DeleteFromIndex(&name_tree, GenerateIndexKey(IT_STRING, old_username), record_id) - Update the data pages: Write new data to existing pages
- Insert new secondary index entry:
InsertIntoIndex(&name_tree, GenerateIndexKey(IT_STRING, new_username), record_id) - Primary index unchanged: Record ID stays the same, so primary index is not modified
Moving a record to a new location (during compaction):
- Write data to new pages: Copy record to new physical location (pages_per_record consecutive pages)
- Update primary index: Change mapping from
record_id → old_first_page_numtorecord_id → new_first_page_num - Mark old pages empty: Set all old pages
status = PS_EMPTY - Secondary indexes unchanged: They map to Record IDs, so no updates needed
Transaction Protocol
The journal file provides crash recovery by recording operations before they are committed to the database. This ensures that partial writes due to power loss or crashes can be detected and completed.
Write Operation Sequence
When updating, deleting, or adding records, follow this sequence:
Append to journal: Write a
DBJournalEntryto the journal file with the operation details- For Update: operation=JO_UPDATE, page_num=target page, record_id=record ID, data=new data
- For Delete: operation=JO_DELETE, page_num=target page, record_id=record ID
- For Add: operation=JO_ADD, page_num=-1, record_id=new ID, data=new data
Set pending flag: Update
header.journal_pending = truein page 0 of .DAT fileFlush journal: Ensure journal is written to disk (fsync/flush)
Perform database update: Execute the actual operation on the .DAT file
Update indexes: Modify all affected indexes (.I?? files)
Clear pending flag: Set
header.journal_pending = falsein page 0 of .DAT fileTruncate journal: Set journal file size to 0 bytes
Multiple Operations: For operations affecting multiple records (e.g., bulk updates), write all journal entries before setting the journal_pending flag. This creates a multi-operation transaction.
Recovery on Database Open
When opening a database, check the journal_pending flag to determine if recovery is needed:
Check flag: Read
header.journal_pendingfrom page 0 of .DAT fileIf false: Open database normally, no recovery needed
If true: Recovery is required:
- Open journal file (.JNL)
- For each journal entry:
- Read entry and verify checksum using CRC16
- If checksum valid, replay operation:
- JO_UPDATE (1): Write
datato pagepage_numin .DAT file - JO_DELETE (2): Set page
page_numstatus to PS_EMPTY - JO_ADD (3): Find first empty page (or append), write
data
- JO_UPDATE (1): Write
- If checksum invalid, skip entry (corrupted)
- Rebuild all indexes from .DAT file (safer than trusting partial index updates)
- Set
header.journal_pending = falsein page 0 of .DAT file - Truncate journal file to 0 bytes
Index Rebuild: After crash recovery, always rebuild all indexes to ensure consistency. This is safer than attempting to replay partial index updates.
Database Helper Methods
The db module should provide high-level helper functions to encapsulate these operations:
Database Operations
bool OpenDatabase(const char *name, Database *db)
- Opens all database files (.DAT, .IDX, .I??)
- Loads metadata into memory
- Returns true on success
void CloseDatabase(Database *db)
- Closes all open files
- Writes any cached metadata
bool CreateDatabase(const char *name, uint16 record_size)
- Creates new .DAT and .IDX files
- Initializes with default values
- Returns true on success
Record Operations
bool AddRecord(Database *db, const byte *data, int32 *record_id)
- Finds free block or allocates new one
- Writes data to .DAT file
- Updates all indexes
- Returns new record ID via pointer
bool FindRecordByID(Database *db, int32 id, byte *data)
- Uses ID index to locate block
- Reads and returns record data
- Returns true if found
bool FindRecordByString(Database *db, const char *field_name, const char *value, byte *data, int32 *record_id)
- Generates key using GenerateIndexKey()
- Uses appropriate string index
- Handles hash collisions
- Returns first matching record
bool UpdateRecord(Database *db, int32 id, const byte *data)
- Finds record by ID
- Overwrites data in existing block
- Updates indexes if indexed fields changed
- Returns true on success
bool DeleteRecord(Database *db, int32 id)
- Marks block as empty
- Removes from all indexes
- Returns true on success
Index Operations
bool AddIndex(Database *db, const char *field_name, byte index_type)
- Creates new .I?? file
- Scans existing records to build index
- Updates index list in page 0 (increment
index_count, add toindexesarray) - Returns true on success
- Maximum 15 indexes supported
bool RebuildIndex(Database *db, int16 index_number)
- Clears and rebuilds specified index
- Useful for corruption recovery
- Returns true on success
Maintenance Operations
bool CompactDatabase(Database *db)
- Removes empty pages by moving active records
- Rebuilds all indexes
- Rebuilds free list with accurate
free_page_countandfree_page_list_len - Updates last_compacted timestamp
- Returns true on success
bool UpdateFreePages(Database *db)
- Scans .DAT file for pages with
status == PS_EMPTY - Fills
free_pagesarray with up to 127 empty page numbers - Sets
free_page_list_lento actual number found - Sets
free_page_countto actual total count of empty pages (for accuracy) - Called automatically when
free_page_count > 0butfree_page_list_len == 0 - Returns true on success
bool ValidateDatabase(Database *db)
- Checks for corruption
- Validates indexes against data
- Returns true if database is valid
Transaction Operations
bool BeginTransaction(Database *db)
- Sets
header.journal_pending = true - Prepares journal file for writing
- All subsequent operations will be journaled
bool CommitTransaction(Database *db)
- Sets
header.journal_pending = false - Truncates journal file to 0 bytes
- Finalizes all pending operations
bool RollbackTransaction(Database *db)
- Discards journal without applying
- Sets
header.journal_pending = false - Truncates journal file to 0 bytes
- Use when aborting an operation
bool ReplayJournal(Database *db)
- Reads and validates all journal entries
- Applies each operation to database
- Rebuilds all indexes
- Returns true on success
- Called automatically during
OpenDatabaseifjournal_pendingis true
Data Types
/* Index info - 32 bytes on disk */
typedef struct {
char field_name[30]; /* Field name (30 bytes total) */
byte index_type; /* 0 = IT_ID, 1 = IT_STRING */
byte index_number; /* Index number 0-14 */
} DBIndexInfo;
/* Database header - 512 bytes (page 0) */
typedef struct {
char signature[4]; /* "VDB" + '\0' */
uint16 version; /* Format version (1) */
uint16 page_size; /* Size of each page (always 512) */
uint16 record_size; /* Size of each record in bytes */
int32 record_count; /* Total active records */
int32 next_record_id;/* Next available Record ID */
int32 last_compacted;/* Timestamp of last compaction */
bool journal_pending;/* true if journal needs replay */
byte index_count; /* Number of secondary indexes (0-15) */
byte reserved[8]; /* Padding to 32 bytes */
DBIndexInfo indexes[DB_MAX_INDEXES]; /* Secondary index definitions */
} DBHeader;
/* Free page list - 512 bytes (page 1) */
typedef struct {
uint16 free_page_count; /* Total count of free pages in DB */
uint16 free_page_list_len; /* Number of entries in free_pages array */
int32 free_pages[DB_MAX_FREE_PAGES]; /* Page numbers of empty pages */
} DBFreeList;
/* Database page - 512 bytes */
typedef struct {
int32 id; /* Record ID */
byte status; /* PS_EMPTY, PS_ACTIVE, or PS_CONTINUATION */
byte reserved;
byte data[DB_PAGE_DATA_SIZE]; /* 506 bytes */
} DBPage;
/* Journal entry - 518 bytes */
typedef struct {
byte operation; /* JO_NONE, JO_UPDATE, JO_DELETE, or JO_ADD */
int32 page_num;
int32 record_id;
byte data[507];
uint16 checksum;
} DBJournalEntry;
/* Database handle */
typedef struct {
char name[64];
DBHeader header; /* Includes index list */
DBFreeList free_list;
FILE *data_file;
FILE *journal_file;
BTree *primary_index; /* .IDX - record_id → page_num */
bool is_open;
} Database;
Performance Considerations
- Primary index lookups: O(log n) via B-Tree - fast
- Secondary index lookups: O(log n) + O(log n) = two B-Tree lookups (secondary → primary)
- Trade-off: Slower reads for faster compaction and updates
- Finding free pages:
- O(1) when
free_page_list_len > 0(use array) - O(1) when
free_page_count == 0(append new page) - O(n) when
free_page_count > 0butfree_page_list_len == 0(scan to refill array)- After scan, next 127 allocations are O(1)
free_page_counttracks total free pages for quick "are there any?" check- Array holds up to 127 page numbers for fast access
- O(1) when
- Multiple indexes: Each insert/update/delete must update primary + all secondary indexes
- Hash collisions: GenerateIndexKey uses CRC16, so collisions are possible (1 in 65536)
- Always verify string match after hash lookup
- Journal overhead: Each write operation requires:
- One journal file append (518 bytes per entry, fsync)
- One header update (set journal_pending to true in page 0)
- Actual database modification
- Second header update (clear flag)
- Journal truncate
- Recovery time: Proportional to number of journal entries and total records (for index rebuild)
- Batch operations: Use multi-entry transactions to amortize journal overhead across multiple operations
- Compaction advantage: With indirect indexing, only primary index needs updating when moving records
- Secondary indexes are unchanged since Record IDs don't change
- Page alignment: 512-byte pages align well with disk sector sizes on retro systems
- Free list size: 4 bytes header + 127 entries × 4 bytes = 512 bytes (fits in one 512-byte page perfectly)
Future Enhancements
- Extended free list: Linked list of free list pages for tracking more than 128 empty pages
- Multi-operation transactions: Group related operations into atomic units
- Record-level checksums: Detect data corruption in .DAT file
- Compression: For large text fields
- Variable-length records: More space-efficient storage
- Index caching: Keep frequently-used index nodes in memory
- Incremental index updates: During recovery, replay index changes from journal instead of full rebuild