JAlcocerTech E-books

Advanced: The Future of Data Systems

The book ends by connecting databases, event logs, stream processors, caches, indexes, and application services into one larger idea: data systems are becoming more like dataflows.

From Systems of Record to Derived Views

Applications often need many specialized views of the same underlying facts: relational tables, search indexes, caches, graph views, analytics stores, and machine-learning features.

Trying to make one database serve every access pattern can produce poor compromises. A more flexible design keeps a clear source of truth and derives specialized views from it.
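As a minimal sketch of this idea, assume a small in-memory list of product records acts as the source of truth. The record fields and the function names build_lookup and build_search_index are hypothetical, chosen only for illustration. Both views are pure functions of the records, so either can be discarded and rebuilt at any time.

```python
from collections import defaultdict

# Hypothetical source of truth: a list of product facts.
records = [
    {"id": 1, "name": "red bicycle", "price": 120},
    {"id": 2, "name": "blue bicycle", "price": 150},
    {"id": 3, "name": "red helmet", "price": 30},
]


def build_lookup(records):
    """Derived view 1: key-value lookup by id."""
    return {r["id"]: r for r in records}


def build_search_index(records):
    """Derived view 2: inverted index from word to sorted list of ids."""
    index = defaultdict(set)
    for r in records:
        for word in r["name"].split():
            index[word].add(r["id"])
    return {word: sorted(ids) for word, ids in index.items()}


lookup = build_lookup(records)
search = build_search_index(records)

print(lookup[2]["price"])  # 150
print(search["red"])       # [1, 3]
```

Neither view is authoritative. If the search index is lost or its format changes, it can be recomputed from the records without touching the lookup view.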

Unbundling the Database

Traditional databases bundle storage, indexing, query execution, transactions, replication, and recovery. Modern architectures often separate these responsibilities across systems.

This gives flexibility but moves correctness questions into the application architecture. If a cache, search index, and database disagree, the system needs a defined way to detect the divergence, decide which copy is authoritative, and repair the others.
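One possible shape for such a reconciliation step is sketched below, assuming the database is authoritative and both stores fit in plain dictionaries. The names diff and repair are hypothetical helpers, not part of any specific product.

```python
def diff(source, view):
    """Compare a derived view against the authoritative source.

    Returns the entries the view is missing or has wrong (upserts)
    and the keys the view holds that no longer exist in the source (deletes).
    """
    upserts = {k: v for k, v in source.items() if k not in view or view[k] != v}
    deletes = sorted(k for k in view if k not in source)
    return upserts, deletes


def repair(view, upserts, deletes):
    """Bring the view back in line with the source, in place."""
    view.update(upserts)
    for key in deletes:
        del view[key]


# Hypothetical state: the cache has one stale value, one missing key,
# and one key that was deleted from the database.
database = {"a": 1, "b": 2, "c": 3}
cache = {"a": 1, "b": 99, "z": 0}

upserts, deletes = diff(database, cache)
print(upserts)  # {'b': 2, 'c': 3}
print(deletes)  # ['z']

repair(cache, upserts, deletes)
print(cache == database)  # True
```

A real system would run such a comparison in batches or by checksum rather than over whole datasets, but the principle is the same: the source wins and the view is repaired.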

Event Sourcing and Change Data Capture

Event sourcing records facts as an append-only sequence of events. Current state is derived by replaying those events.
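The replay step can be sketched as a fold over the event list. This example assumes a hypothetical bank account with two event kinds, "deposited" and "withdrawn". The Event class and the function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "deposited" or "withdrawn" (hypothetical event kinds)
    amount: int


def apply_event(balance, event):
    """Pure transition function: old state plus one event gives new state."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events, initial=0):
    """Derive the current state by applying every event in order."""
    balance = initial
    for event in events:
        balance = apply_event(balance, event)
    return balance


log = [Event("deposited", 100), Event("withdrawn", 30), Event("deposited", 5)]

print(replay(log))      # 75
print(replay(log[:2]))  # 70, the state as of the second event
```

Because the log is append-only and the transition function is deterministic, replaying a prefix of the log reconstructs any past state.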

Change data capture applies a related idea to databases: capture committed changes and feed them into downstream systems. This can keep derived views updated without scattering dual-write logic across services.
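A rough sketch of the consuming side follows, assuming the captured changes arrive as an ordered list of upsert and delete records with increasing offsets. The record layout and the DerivedView class are assumptions made for illustration and do not mirror any particular CDC tool's format.

```python
class DerivedView:
    """A downstream view that applies captured changes in log order."""

    def __init__(self):
        self.docs = {}
        self.last_offset = -1  # highest offset applied so far

    def apply(self, offset, change):
        """Apply one change; return False if it was already applied."""
        if offset <= self.last_offset:
            return False  # duplicate delivery, safe to skip
        if change["op"] == "delete":
            self.docs.pop(change["key"], None)
        else:  # treat anything else as an upsert
            self.docs[change["key"]] = change["value"]
        self.last_offset = offset
        return True


# Hypothetical committed changes captured from the primary database.
change_log = [
    {"op": "upsert", "key": "u1", "value": "Ada"},
    {"op": "upsert", "key": "u2", "value": "Grace"},
    {"op": "delete", "key": "u1"},
]

view = DerivedView()
for offset, change in enumerate(change_log):
    view.apply(offset, change)

# A redelivered change is ignored because its offset was already applied.
redelivered = view.apply(1, change_log[1])

print(view.docs)    # {'u2': 'Grace'}
print(redelivered)  # False
```

The application writes only to the database. The view follows the change log, so there is a single write path and a single ordering of changes.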

Correctness as End-to-End Thinking

Many bugs happen between systems, not inside one product. A database may provide strong guarantees locally, while the application loses correctness by combining it with a queue, cache, RPC call, or external API without an end-to-end protocol.

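Correctness must be defined at the workflow level, not per component. A request retried after a timeout, for instance, should take effect exactly once, which no single database, queue, or cache can guarantee by itself.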
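One common end-to-end technique is to have the client attach a unique request ID and have the final recipient deduplicate on it. The sketch below assumes a hypothetical in-memory PaymentService. In a real system the processed-request table and the balance update would need to be committed atomically in durable storage.

```python
import uuid


class PaymentService:
    """Hypothetical service that deduplicates charges by request ID."""

    def __init__(self, balance):
        self.balance = balance
        self.processed = {}  # request_id -> result of the first attempt

    def charge(self, request_id, amount):
        # A retry of an already-processed request returns the stored result
        # instead of charging again.
        if request_id in self.processed:
            return self.processed[request_id]
        if amount > self.balance:
            result = "rejected"
        else:
            self.balance -= amount
            result = "ok"
        self.processed[request_id] = result
        return result


service = PaymentService(balance=100)

# The client generates the ID once and reuses it for every retry.
request_id = str(uuid.uuid4())
first = service.charge(request_id, 40)
retry = service.charge(request_id, 40)  # e.g. sent again after a timeout

print(first, retry, service.balance)  # ok ok 60
```

The ID travels from the originating client to the point where the effect happens, so the guarantee does not depend on any intermediate queue or RPC layer suppressing duplicates.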

Human and Ethical Dimensions

Data systems affect people. Data collection, retention, derived inferences, auditing, privacy, and misuse are not purely technical questions. Architecture determines what is easy to observe, delete, combine, leak, or abuse.

Practical Takeaway

The direction of modern data architecture is toward explicit dataflows:

  • immutable logs of facts
  • rebuildable derived views
  • deterministic transformations
  • careful schema evolution
  • end-to-end correctness guarantees
  • operational observability
  • privacy and retention as design constraints