Add Deep Database Style #4925
Replaces SPACETIME_PROGRAMMING_STANDARDS.md with DEEP_CORE_STYLE.md. The document is reorganized around seven principles for the core (datastore, commitlog, snapshotting, replication):

1. Work towards zero dependencies
2. Work towards deterministic simulation testing
3. Work towards thread-per-core
4. Work towards no_std
5. Think in terms of persistent data structures
6. Think in terms of pipelining
7. Think in terms of unreliable processes

A short style section follows the principles, covering assertions, bounded loops and queues, error handling, control flow, naming, and formatting.
## 4. Work towards `no_std`

To control our failure modes, we should enforce no memory allocation inside the core. This is not absolute. Primitives like pages can be allocated outside the core and passed in. But the rule is that the deep core does not allocate.
I think the natural way to do this is to be generic over an allocator trait. Otherwise, this is probably going to be very intrusive in the datastore.
We expect these goals and guidelines to be intrusive. We cannot get to where we are going without intrusion.
That being said, I'd be curious to know how the allocator trait works. I think we'd like to have an allocator per resource type.
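For the sake of discussion, here is a minimal sketch of one way an allocator per resource type could look. The pool is created outside the core with a fixed capacity and passed in, and the core is generic over the trait. Every name here (`PageAlloc`, `FixedPagePool`, `PageId`, `Core`, `PAGE_SIZE`) is hypothetical and not an existing SpacetimeDB type; the page size is an arbitrary choice for the example.

```rust
// Hypothetical sketch: none of these names are existing SpacetimeDB APIs.

/// Page size chosen arbitrarily for the example.
pub const PAGE_SIZE: usize = 64;

/// Handle to a page owned by the pool.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PageId(usize);

/// One allocator trait per resource type; this one hands out pages.
pub trait PageAlloc {
    /// Returns a page handle, or `None` when the pool is exhausted.
    fn alloc_page(&mut self) -> Option<PageId>;
    /// Returns a page to the pool.
    fn free_page(&mut self, id: PageId);
    /// Gives mutable access to an allocated page.
    fn page_mut(&mut self, id: PageId) -> &mut [u8; PAGE_SIZE];
}

/// A pool built outside the core, with its capacity fixed at compile time.
pub struct FixedPagePool<const N: usize> {
    pages: [[u8; PAGE_SIZE]; N],
    in_use: [bool; N],
}

impl<const N: usize> FixedPagePool<N> {
    pub fn new() -> Self {
        Self { pages: [[0; PAGE_SIZE]; N], in_use: [false; N] }
    }
}

impl<const N: usize> PageAlloc for FixedPagePool<N> {
    fn alloc_page(&mut self) -> Option<PageId> {
        // Take the first free slot; the scan is bounded by N.
        let slot = self.in_use.iter().position(|used| !*used)?;
        self.in_use[slot] = true;
        Some(PageId(slot))
    }

    fn free_page(&mut self, id: PageId) {
        assert!(self.in_use[id.0], "freeing a page that is not allocated");
        self.in_use[id.0] = false;
    }

    fn page_mut(&mut self, id: PageId) -> &mut [u8; PAGE_SIZE] {
        assert!(self.in_use[id.0], "accessing a page that is not allocated");
        &mut self.pages[id.0]
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum CoreError {
    OutOfPages,
}

/// The core is generic over the allocator and never allocates by itself.
pub struct Core<A: PageAlloc> {
    alloc: A,
}

impl<A: PageAlloc> Core<A> {
    pub fn new(alloc: A) -> Self {
        Self { alloc }
    }

    /// Writes one byte into a fresh page. Exhaustion is an explicit error.
    pub fn store_byte(&mut self, byte: u8) -> Result<PageId, CoreError> {
        let id = self.alloc.alloc_page().ok_or(CoreError::OutOfPages)?;
        self.alloc.page_mut(id)[0] = byte;
        Ok(id)
    }

    pub fn release(&mut self, id: PageId) {
        self.alloc.free_page(id);
    }
}
```

With this shape, running out of pages becomes a value the core has to handle (`CoreError::OutOfPages`) rather than a hidden failure mode, and a test can pass in a tiny pool such as `FixedPagePool::<2>` to exercise the exhaustion path.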
## 5. Think in terms of persistent data structures

We want to support time-travel APIs, sub-transactions, background snapshotting, and potentially MVCC. Persistent data structures, such as Merkle trees and Postgres-style MVCC, naturally allow us to look at multiple versions of data and update versions atomically.
(We are deeply committed to mutable non-persistent structures in the datastore right now.)
This principle does not state that we cannot use mutable, non-persistent data structures. It means the overall system we build has to have the properties of a persistent data structure.
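To make that distinction concrete, here is a minimal sketch of a store whose internals are an ordinary mutable, append-only `Vec`, while every committed version stays readable from the outside. `VersionedStore`, `commit`, and `read_at` are hypothetical names for illustration only, and `Vec` is used for brevity even though the deep core itself would not allocate.

```rust
// Hypothetical sketch: mutable internals, persistent observable behavior.

/// One write, tagged with the version that produced it.
struct Entry {
    version: u64,
    key: u32,
    value: Option<u64>, // `None` records a delete
}

pub struct VersionedStore {
    log: Vec<Entry>, // mutable, append-only internal structure
    current: u64,
}

impl VersionedStore {
    pub fn new() -> Self {
        Self { log: Vec::new(), current: 0 }
    }

    /// Applies a batch of writes as one new version and returns that version.
    /// Readers only ever name a version, so they never see a partial batch.
    pub fn commit(&mut self, writes: &[(u32, Option<u64>)]) -> u64 {
        self.current += 1;
        for &(key, value) in writes {
            self.log.push(Entry { version: self.current, key, value });
        }
        self.current
    }

    /// Time-travel read: the value of `key` as of `version`.
    pub fn read_at(&self, version: u64, key: u32) -> Option<u64> {
        self.log
            .iter()
            .rev() // the newest matching entry wins
            .find(|e| e.version <= version && e.key == key)
            .and_then(|e| e.value)
    }
}
```

A read at an old version returns the same answer no matter how many commits follow it, which is the externally visible property the principle asks for.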
We should model the core's communication with the outside world (Tokio, disk I/O, networking, peers) as unreliable, asynchronous message passing.

This sharpens our error handling. Every message can be lost, delayed, or reordered, and the core's logic must remain correct under those conditions. It is also a natural fit with principle 6, since messages to other processes are inherently pipelined.
How about corruption and protection against cosmic rays?
That is included in this. I will make that explicit.
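As a rough illustration of what handling loss, reordering, duplication, and corruption at a message boundary could look like, here is a minimal sketch. `Envelope`, `Receiver`, and `Outcome` are hypothetical names, and FNV-1a is only a stand-in for whatever checksum the core would actually use.

```rust
// Hypothetical sketch: names and the checksum choice are illustrative.

/// FNV-1a (64-bit), used here only as a stand-in for a real checksum.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

#[derive(Clone, Debug)]
pub struct Envelope {
    pub seq: u64,
    pub payload: [u8; 8],
    pub checksum: u64,
}

impl Envelope {
    pub fn new(seq: u64, payload: [u8; 8]) -> Self {
        let checksum = Self::compute(seq, &payload);
        Self { seq, payload, checksum }
    }

    /// The checksum covers the sequence number as well as the payload.
    fn compute(seq: u64, payload: &[u8; 8]) -> u64 {
        let mut buf = [0u8; 16];
        buf[..8].copy_from_slice(&seq.to_le_bytes());
        buf[8..].copy_from_slice(payload);
        fnv1a(&buf)
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Accepted,
    Corrupt,   // checksum mismatch: treat like a lost message
    Duplicate, // already applied: safe to ignore
    Gap,       // arrived early: an earlier message is lost or delayed
}

pub struct Receiver {
    next_seq: u64,
}

impl Receiver {
    pub fn new() -> Self {
        Self { next_seq: 0 }
    }

    /// Accepts a message only if it is intact and next in sequence.
    pub fn receive(&mut self, msg: &Envelope) -> Outcome {
        if Envelope::compute(msg.seq, &msg.payload) != msg.checksum {
            return Outcome::Corrupt;
        }
        if msg.seq < self.next_seq {
            return Outcome::Duplicate;
        }
        if msg.seq > self.next_seq {
            return Outcome::Gap;
        }
        self.next_seq += 1;
        Outcome::Accepted
    }
}
```

Treating a corrupt message the same way as a lost one keeps the number of failure paths small: the sender's retry logic covers both.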
- Every loop has a static upper bound. If a loop must not terminate (an event loop, for example), that fact is itself asserted.
- Every queue has a fixed capacity. The deep core does not allocate to absorb load.
- No recursion in the deep core.
(This is currently done in many places in the datastore where we tree-walk AVs and ATs.)
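For reference, a minimal sketch of what these three rules could look like together: a queue with compile-time capacity that rejects work when full, and a tree walk that replaces recursion with an explicit fixed-size stack and a statically bounded loop. All names and limits (`BoundedQueue`, `Node`, `sum_tree`, `MAX_NODES`, `MAX_DEPTH`) are hypothetical, and the tree is a stand-in rather than the datastore's actual value or type trees.

```rust
// Hypothetical sketch: names and limits are illustrative.

/// A queue with a capacity fixed at compile time. It rejects work when full
/// instead of allocating to absorb load.
pub struct BoundedQueue<T: Copy + Default, const N: usize> {
    items: [T; N],
    head: usize,
    len: usize,
}

impl<T: Copy + Default, const N: usize> BoundedQueue<T, N> {
    pub fn new() -> Self {
        Self { items: [T::default(); N], head: 0, len: 0 }
    }

    /// Hands the item back to the caller when the queue is full.
    pub fn push(&mut self, item: T) -> Result<(), T> {
        if self.len == N {
            return Err(item);
        }
        let tail = (self.head + self.len) % N;
        self.items[tail] = item;
        self.len += 1;
        Ok(())
    }

    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        let item = self.items[self.head];
        self.head = (self.head + 1) % N;
        self.len -= 1;
        Some(item)
    }
}

/// A binary tree node stored in a slice; children are indices into the slice.
pub struct Node {
    pub value: u64,
    pub left: Option<usize>,
    pub right: Option<usize>,
}

pub const MAX_NODES: usize = 1024;
pub const MAX_DEPTH: usize = 64;

#[derive(Debug, PartialEq, Eq)]
pub enum WalkError {
    TooDeep,
    TooManyNodes,
}

/// Sums a tree without recursion, using an explicit fixed-size stack.
pub fn sum_tree(nodes: &[Node], root: usize) -> Result<u64, WalkError> {
    let mut stack = [0usize; MAX_DEPTH + 1];
    let mut top = 0;
    stack[top] = root;
    top += 1;
    let mut sum = 0u64;
    // Static upper bound: at most MAX_NODES nodes are visited.
    for _ in 0..MAX_NODES {
        if top == 0 {
            return Ok(sum);
        }
        top -= 1;
        let node = &nodes[stack[top]];
        sum += node.value;
        for child in [node.left, node.right] {
            if let Some(index) = child {
                if top == stack.len() {
                    return Err(WalkError::TooDeep);
                }
                stack[top] = index;
                top += 1;
            }
        }
    }
    if top == 0 { Ok(sum) } else { Err(WalkError::TooManyNodes) }
}
```

The bound also turns malformed input into an error instead of a hang: a structure that accidentally contains a cycle exhausts `MAX_NODES` and returns `WalkError::TooManyNodes`.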
### Control flow

Prefer simple, explicit control flow. Split compound conditions into nested `if/else` rather than chaining them. State invariants positively. Avoid macros where a function will do.
I don't think this necessarily makes code simpler to read or more fault tolerant. It can often lead to more repetition and, as a result, more error-prone code.
I concur on the macro part, tho.
Yes, agreed on the macros part.
I do not feel strongly about this one. I do feel strongly about the macros part though.
### Comments and formatting

- Comments explain *why*, not *what*. The code already says *what*.
I think "what"-paragraph-comments are also useful, as they can allow skipping sections of code by summarizing what they do. Why-comments are more valuable of course.
Sure, but they can also get out of sync with the code and end up more confusing than helpful. We actually just had a post-mortem on this. I think this is a good general guideline, although I don't think summarizing complex logic is an issue.
- Rename DEEP_CORE_STYLE.md to DEEP_DATABASE_STYLE.md and retitle to "Deep Database Style". The "deep core" remains the term we use for the core of the database inside the document.
- §4 (no_std): note that an allocator-trait-based, per-resource-type approach is the natural Rust expression, and that the resulting intrusion in the datastore is expected, not a cost to avoid.
- §5 (persistent data structures): clarify the principle is about the externally observable behavior of the system, not a ban on mutable internals. Components may still use mutable, non-persistent structures internally.
- §7 (unreliable processes): make corruption explicit, including cosmic rays and ordinary hardware faults; tie back to Merkle structures from §5.
- Style/Control flow: drop the parts I do not feel strongly about (compound-condition splitting, positive invariant phrasing); keep and expand the macros guidance.
- Style/Comments: allow short what-summaries for genuinely complex logic, while still warning about drift between comments and code.
Summary
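Adds `docs/DEEP_DATABASE_STYLE.md`, a style guide for the deep core of the SpacetimeDB database (datastore, commitlog, snapshotting, replication).

The document is organized around seven principles:

1. Work towards zero dependencies
2. Work towards deterministic simulation testing
3. Work towards thread-per-core
4. Work towards no_std
5. Think in terms of persistent data structures
6. Think in terms of pipelining
7. Think in terms of unreliable processes

A short style section follows the principles, covering assertions, bounded loops and queues, error handling, control flow, naming, and formatting. Inspired by TIGER STYLE, narrowed and adapted for Rust and our principles.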
This is a seed document. It will grow as we make the principles operational in code and as the practices that serve them become clearer with use.
Test plan