Merged
3 changes: 2 additions & 1 deletion CONTRIBUTING.md
@@ -24,7 +24,8 @@ would like to work on or if it has already been resolved.
- Write a clear description of the changes you made and the reasons behind them.

> [!IMPORTANT]
> It's assumed that by submitting a pull request, you agree to license your contributions under the project's license.
> By submitting a pull request, you agree to license your contributions under the project's license.


### Development Workflow

57 changes: 23 additions & 34 deletions README.md
@@ -66,30 +66,19 @@ reachable(X, Z) :- reachable(X, Y), edge(Y, Z).
% X = 3, Y = 4
```

Datalog is used in many application domains, especially when recursive querying over structured data is needed.
For example:

- Security and access control
- Role-based authorization with hierarchical permission inheritance and explicit denials
- Network reachability analysis through routing policies and firewall rules
- Taint analysis to trace untrusted data through program flows and detect vulnerabilities
- Data governance and compliance
- Data lineage tracking through ETL pipelines for GDPR and CCPA compliance
- PII propagation analysis with anonymization checkpoints
- Healthcare and life sciences
- Medical ontology reasoning with type hierarchies and property inheritance
- Drug-disease relationship inference and side effect prediction
- Software engineering
- Dependency resolution with transitive closure and cycle detection
- Points-to analysis and other static analyses over program representations

### Why Zodd?

- Written in pure Zig with a simple API
- Implements semi-naive evaluation for efficient recursive query processing
- Uses immutable, sorted, and deduplicated relations as core data structures
- Provides primitives for multi-way joins, anti-joins, secondary indexes, and aggregation
- Includes a Datalog frontend with a parser, a builder API, stratified negation, aggregates, and comparison operators
Datalog is useful for recursive queries over structured data, such as:

- **Access control**: managing role hierarchies and permissions.
- **Data lineage**: tracking how data moves through a system.
- **Software analysis**: resolving package dependencies or analyzing source code structure.
- **Graph queries**: finding paths and connections between data points.

### Features

- **Pure Zig**: a simple API built specifically for Zig projects.
- **Semi-naive evaluation**: handles recursion efficiently.
- **Core operations**: supports multi-way joins, anti-joins, secondary indexes, and aggregation.
- **Built-in frontend**: includes a Datalog parser, program builder, negation, and comparison filters.
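
The semi-naive evaluation mentioned in the list above can be sketched for the `reachable` rule shown earlier in this hunk. The sketch does not use Zodd's API. It stores relations as fixed-size boolean matrices, and the names `N` and `reachableClosure` are hypothetical, chosen only for illustration. The point is that each round joins only the facts derived in the previous round (`delta`) against `edge`, rather than re-joining everything derived so far.

```zig
const std = @import("std");

// Hypothetical node count; nodes are numbered 0..N-1.
const N = 5;

/// Computes reachable(X, Z) :- edge(X, Z).
///          reachable(X, Z) :- reachable(X, Y), edge(Y, Z).
/// using semi-naive evaluation over boolean adjacency matrices.
fn reachableClosure(edge: [N][N]bool) [N][N]bool {
    var reach = edge; // every fact derived so far
    var delta = edge; // facts that were new in the previous round
    while (true) {
        var next = std.mem.zeroes([N][N]bool);
        var changed = false;
        for (0..N) |x| {
            for (0..N) |y| {
                // Only the previous round's new facts take part in the join.
                if (!delta[x][y]) continue;
                for (0..N) |z| {
                    if (edge[y][z] and !reach[x][z]) {
                        next[x][z] = true;
                        changed = true;
                    }
                }
            }
        }
        if (!changed) return reach; // fixpoint: no new facts were derived
        for (0..N) |x| {
            for (0..N) |z| {
                if (next[x][z]) reach[x][z] = true;
            }
        }
        delta = next;
    }
}

test "semi-naive closure on the chain 1 -> 2 -> 3 -> 4" {
    var edge = std.mem.zeroes([N][N]bool);
    edge[1][2] = true;
    edge[2][3] = true;
    edge[3][4] = true;
    const reach = reachableClosure(edge);
    try std.testing.expect(reach[1][4]);
    try std.testing.expect(reach[3][4]);
    try std.testing.expect(!reach[4][1]);
}
```

A matrix representation keeps the sketch free of allocation. An engine working on sorted, deduplicated tuple sets would follow the same delta-driven loop with different data structures.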

See [ROADMAP.md](ROADMAP.md) for the list of implemented and planned features.

@@ -101,25 +90,24 @@ See [ROADMAP.md](ROADMAP.md) for the list of implemented and planned features.

### Getting Started

You can add Zodd to your project and start using it by following the steps below.
Follow the steps below to add Zodd to your project.

#### Installation

Run the following command in the root directory of your project to download Zodd:
Run this command in your project root to download Zodd:

```sh
zig fetch --save=zodd "https://github.com/CogitatorTech/zodd/archive/<branch_or_tag>.tar.gz"
```

Replace `<branch_or_tag>` with the desired branch or release tag, like `main` (for the developmental version) or `v0.1.0`.
This command will download Zodd and add it to Zig's global cache and update your project's `build.zig.zon` file.
Replace `<branch_or_tag>` with the branch or tag you want to use, such as `main` or `v0.1.0`.

> [!NOTE]
> Zodd is developed and tested with Zig version 0.16.0.

#### Adding to Build Script
#### Build Configuration

Next, modify your `build.zig` file to make Zodd available to your build target as a module.
Add Zodd as a module dependency in your `build.zig` file:

```zig
pub fn build(b: *std.Build) void {
@@ -205,7 +193,8 @@ Zodd is licensed under the MIT License (see [LICENSE](LICENSE)).

### Acknowledgements

* The logo shows a directed graph that edges form a Z, with a dashed arc for the derived fact `path(a, d)`.
* This project uses the [Minish](https://github.com/CogitatorTech/minish) for property-based testing and
the [Ordered](https://github.com/CogitatorTech/ordered) for B-tree indices.
* Zodd is inspired and modeled after the [Datafrog](https://github.com/frankmcsherry/blog/blob/master/posts/2018-05-19.md) Datalog engine for Rust.
* The logo shows a directed graph whose edges form a Z, with a dashed arc for the derived fact `path(a, d)`.
* This project uses [Minish](https://github.com/CogitatorTech/minish) for property-based testing and
[Ordered](https://github.com/CogitatorTech/ordered) for B-tree indices.
* Zodd is inspired by and modeled after the [Datafrog](https://github.com/frankmcsherry/blog/blob/master/posts/2018-05-19.md) Datalog engine for Rust.

4 changes: 2 additions & 2 deletions ROADMAP.md
@@ -1,6 +1,6 @@
## Project Roadmap

This document outlines the features implemented in Zodd and the future goals for the project.
This document lists the completed and planned features for Zodd.

> [!IMPORTANT]
> This roadmap is a work in progress and is subject to change.
@@ -18,7 +18,7 @@ This document outlines the features implemented in Zodd and the future goals for
- [x] `FilterAnti` - negation (filter out matching tuples)
- [x] `ExtendAnti` - anti-join (filter to keep non-matching values)

### Extra Features
### Other Features

- [x] Negation primitives (anti-join and anti-extend)
- [x] Aggregations
2 changes: 1 addition & 1 deletion build.zig.zon
@@ -1,6 +1,6 @@
.{
.name = .zodd,
.version = "0.1.0-alpha.5",
.version = "0.1.0-alpha.6",
.fingerprint = 0x2d03181bdd24914c, // Changing this has security and trust implications.
.minimum_zig_version = "0.16.0",
.dependencies = .{
20 changes: 10 additions & 10 deletions examples/README.md
@@ -2,16 +2,16 @@

#### List of Examples

| # | File | Description |
|---|--------------------------------------------------------------|----------------------------------------------------------------------------|
| 1 | [e1_network_reachability.zig](e1_network_reachability.zig) | Network zone reachability through routing and firewall rule analysis. |
| 2 | [e2_knowledge_graph.zig](e2_knowledge_graph.zig) | Medical ontology reasoning with type hierarchy and drug-disease inference. |
| 3 | [e3_data_lineage.zig](e3_data_lineage.zig) | Data lineage tracking for GDPR compliance with PII propagation. |
| 4 | [e4_rbac_authorization.zig](e4_rbac_authorization.zig) | RBAC authorization with role hierarchy, joins, and denial filtering. |
| 5 | [e5_taint_analysis.zig](e5_taint_analysis.zig) | Security taint analysis using leapfrog trie join for taint propagation. |
| 6 | [e6_dependency_resolution.zig](e6_dependency_resolution.zig) | Package dependency resolution with aggregation and reverse-dep index. |
| 7 | [e7_datalog_frontend.zig](e7_datalog_frontend.zig) | Textual Datalog frontend with recursion, negation, and aggregates. |
| 8 | [e8_comparison_filters.zig](e8_comparison_filters.zig) | Comparison filters for SLA monitoring over latencies and aggregates. |
| # | File | Description |
|---|--------------------------------------------------------------|--------------------------------------------------------------------------|
| 1 | [e1_network_reachability.zig](e1_network_reachability.zig) | Network zone reachability based on routing and firewall rules. |
| 2 | [e2_knowledge_graph.zig](e2_knowledge_graph.zig) | Ontology reasoning with type hierarchies and drug-disease relationships. |
| 3 | [e3_data_lineage.zig](e3_data_lineage.zig) | Data lineage tracking through transformations and anonymization. |
| 4 | [e4_rbac_authorization.zig](e4_rbac_authorization.zig) | RBAC authorization with role hierarchies and explicit denials. |
| 5 | [e5_taint_analysis.zig](e5_taint_analysis.zig) | Taint analysis tracing untrusted inputs to program sinks. |
| 6 | [e6_dependency_resolution.zig](e6_dependency_resolution.zig) | Package dependency resolution with size aggregation and indexes. |
| 7 | [e7_datalog_frontend.zig](e7_datalog_frontend.zig) | Datalog parser, evaluator, and query frontend. |
| 8 | [e8_comparison_filters.zig](e8_comparison_filters.zig) | Comparison filters to monitor latencies against SLA limits. |

#### Running Examples

8 changes: 3 additions & 5 deletions examples/e1_network_reachability.zig
@@ -1,12 +1,10 @@
const std = @import("std");
const zodd = @import("zodd");

// Network Reachability Analysis
// Network Reachability
//
// Determines which network zones can communicate through routing policies and
// firewall rules. A common task in enterprise security auditing to identify
// unintended exposure paths. For example, verifying that the internet cannot
// reach the database tier, or that PCI zones are properly isolated.
// Determines which network zones can communicate based on routing and firewall
// rules. For example, checking if someone from the internet can access the database.
//
// Datalog rules:
// reachable(A, B) :- link(A, B).
9 changes: 4 additions & 5 deletions examples/e2_knowledge_graph.zig
@@ -1,12 +1,11 @@
const std = @import("std");
const zodd = @import("zodd");

// Knowledge Graph Reasoning (Medical Ontology)
// Knowledge Graph Reasoning
//
// Infers new biomedical facts from a medical ontology through type hierarchy
// and property inheritance. This is a common pattern in healthcare, pharma,
// and biotech for drug repurposing, adverse effect prediction, and clinical
// decision support.
// Infers facts from a hierarchical ontology. For example, inheriting symptoms
// through type hierarchies (e.g., if a disease subtype inherits a symptom from
// a supertype).
//
// Datalog rules:
// is_a(X, Z) :- is_a(X, Y), is_a(Y, Z).
15 changes: 7 additions & 8 deletions examples/e3_data_lineage.zig
@@ -1,25 +1,24 @@
const std = @import("std");
const zodd = @import("zodd");

// Data Lineage for GDPR/CCPA Compliance
// Data Lineage Tracking
//
// Tracks how sensitive data (PII) flows through ETL pipelines and data
// warehouse transformations. Identifies which downstream datasets contain
// PII, verifies that anonymization steps properly cleanse data, and flags
// compliance violations when PII appears in public-facing datasets.
// Tracks how sensitive data flows through transformations. Identifies downstream
// datasets containing sensitive information, and flags when sensitive information
// reaches public datasets without anonymization.
//
// Datalog rules:
// contains_pii(D) :- source_pii(D).
// contains_pii(D2) :- contains_pii(D1), transform(D1, D2),
// NOT anonymizes(D1, D2).
// violation(D) :- contains_pii(D), public_dataset(D).

pub fn main() !void {
var gpa = std.heap.DebugAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();

std.debug.print("Zodd Datalog Engine - Data Lineage for Compliance\n", .{});
std.debug.print("Zodd Datalog Engine - Data Lineage Tracking\n", .{});
std.debug.print("=================================================\n\n", .{});

// Data pipeline:
12 changes: 6 additions & 6 deletions examples/e4_rbac_authorization.zig
@@ -1,22 +1,22 @@
const std = @import("std");
const zodd = @import("zodd");

// Role-Based Access Control (RBAC) Authorization Engine
// Role-Based Access Control (RBAC)
//
// Computes effective user permissions through role hierarchy inheritance,
// permission grants, and explicit denials using Datalog rules:
// Computes user permissions through role hierarchies, permission grants, and
// explicit denials using Datalog rules:
//
// has_role(U, R) :- user_role(U, R).
// has_role(U, R2) :- has_role(U, R1), role_hier(R1, R2).
// can_access(U, P) :- has_role(U, R), role_perm(R, P).
// effective(U, P) :- can_access(U, P), NOT denied(U, P).

pub fn main() !void {
var gpa = std.heap.DebugAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();

std.debug.print("Zodd Datalog Engine - RBAC Authorization Example\n", .{});
std.debug.print("Zodd Datalog Engine - RBAC Authorization\n", .{});
std.debug.print("=================================================\n\n", .{});

// Identifiers (using u32 for simplicity):
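
The four RBAC rules in the comment above can be sketched without Zodd's API to show the evaluation order that the negation implies. The sketch first runs the recursive `has_role` rule to a fixpoint, then derives `can_access` and filters it against `denied`. Because `denied` is an input relation, it is complete before the filter runs. Everything here is an assumption for illustration: the sizes, the `Facts` struct, and `effectivePermissions` are hypothetical names, and the fixpoint uses plain naive iteration for brevity rather than the engine's semi-naive strategy.

```zig
const std = @import("std");

// Hypothetical sizes; identifiers are small integers used as array indexes.
const USERS = 2;
const ROLES = 3;
const PERMS = 3;

const Facts = struct {
    user_role: [USERS][ROLES]bool = std.mem.zeroes([USERS][ROLES]bool),
    // role_hier[r1][r2] means a holder of r1 also holds r2.
    role_hier: [ROLES][ROLES]bool = std.mem.zeroes([ROLES][ROLES]bool),
    role_perm: [ROLES][PERMS]bool = std.mem.zeroes([ROLES][PERMS]bool),
    denied: [USERS][PERMS]bool = std.mem.zeroes([USERS][PERMS]bool),
};

fn effectivePermissions(facts: Facts) [USERS][PERMS]bool {
    // has_role(U, R)  :- user_role(U, R).
    // has_role(U, R2) :- has_role(U, R1), role_hier(R1, R2).
    var has_role = facts.user_role;
    var changed = true;
    while (changed) {
        changed = false;
        for (0..USERS) |u| {
            for (0..ROLES) |r1| {
                if (!has_role[u][r1]) continue;
                for (0..ROLES) |r2| {
                    if (facts.role_hier[r1][r2] and !has_role[u][r2]) {
                        has_role[u][r2] = true;
                        changed = true;
                    }
                }
            }
        }
    }

    // can_access(U, P) :- has_role(U, R), role_perm(R, P).
    // effective(U, P)  :- can_access(U, P), NOT denied(U, P).
    var result = std.mem.zeroes([USERS][PERMS]bool);
    for (0..USERS) |u| {
        for (0..ROLES) |r| {
            if (!has_role[u][r]) continue;
            for (0..PERMS) |p| {
                if (facts.role_perm[r][p] and !facts.denied[u][p]) result[u][p] = true;
            }
        }
    }
    return result;
}

test "an explicit denial overrides an inherited grant" {
    var facts = Facts{};
    facts.user_role[0][0] = true; // user 0 holds role 0
    facts.role_hier[0][1] = true; // role 0 inherits role 1
    facts.role_perm[0][0] = true; // role 0 grants permission 0
    facts.role_perm[1][1] = true; // role 1 grants permission 1
    facts.role_perm[1][2] = true; // role 1 grants permission 2
    facts.denied[0][2] = true; // permission 2 is denied to user 0

    const eff = effectivePermissions(facts);
    try std.testing.expect(eff[0][0]); // direct grant
    try std.testing.expect(eff[0][1]); // inherited through role 1
    try std.testing.expect(!eff[0][2]); // inherited but denied
}
```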
12 changes: 6 additions & 6 deletions examples/e5_taint_analysis.zig
@@ -1,10 +1,10 @@
const std = @import("std");
const zodd = @import("zodd");

// Taint Analysis for Security
// Taint Analysis
//
// Tracks the flow of untrusted (tainted) data through a program to detect
// potential security vulnerabilities such as SQL injection and XSS.
// Tracks how untrusted input flows through a program to detect potential
// security vulnerabilities (like SQL injection).
//
// Datalog rules:
// tainted(V) :- source(V).
@@ -13,13 +13,13 @@ const zodd = @import("zodd");
//
// Uses ExtendWith (leapfrog trie join) for taint propagation and
// FilterAnti for sanitizer filtering.

pub fn main() !void {
var gpa = std.heap.DebugAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();

std.debug.print("Zodd Datalog Engine - Taint Analysis Example\n", .{});
std.debug.print("Zodd Datalog Engine - Taint Analysis\n", .{});
std.debug.print("=============================================\n\n", .{});

// Simulated program:
11 changes: 5 additions & 6 deletions examples/e6_dependency_resolution.zig
@@ -1,10 +1,9 @@
const std = @import("std");
const zodd = @import("zodd");

// Dependency Resolution for a Package Manager
// Package Dependency Resolution
//
// Resolves transitive package dependencies, detects circular dependencies,
// computes total install sizes, and supports reverse-dependency lookups.
// Resolves package dependencies, detects cycles, and aggregates install sizes.
//
// Datalog rules:
// dep(A, B) :- direct_dep(A, B).
@@ -15,13 +14,13 @@ const zodd = @import("zodd");
// - Variable + Relation for transitive closure
// - aggregate for computing total install size per package
// - SecondaryIndex for efficient reverse-dependency lookups

pub fn main() !void {
var gpa = std.heap.DebugAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();

std.debug.print("Zodd Datalog Engine - Dependency Resolution Example\n", .{});
std.debug.print("Zodd Datalog Engine - Package Dependency Resolution\n", .{});
std.debug.print("===================================================\n\n", .{});

// Package IDs:
5 changes: 1 addition & 4 deletions src/zodd/frontend/explain.zig
@@ -20,10 +20,7 @@ fn predName(program: *const ast.Program, interner: *const Interner, pred: ast.Pr
}

fn writeValue(writer: *std.Io.Writer, interner: *const Interner, atom: dyntuple.Atom) WriteError!void {
switch (interner.resolve(atom)) {
.int => |v| try writer.print("{d}", .{v}),
.str => |s| try writer.print("\"{s}\"", .{s}),
}
try interner_mod.writeValueLiteral(writer, interner.resolve(atom));
}

fn writeVar(writer: *std.Io.Writer, rule: *const ast.Rule, var_id: ast.VarId) WriteError!void {
35 changes: 35 additions & 0 deletions src/zodd/frontend/interner.zig
@@ -24,6 +24,26 @@ pub const Value = union(enum) {
str: []const u8,
};

/// Writes a value as a Datalog literal.
pub fn writeValueLiteral(writer: *std.Io.Writer, value: Value) std.Io.Writer.Error!void {
switch (value) {
.int => |v| try writer.print("{d}", .{v}),
.str => |s| {
try writer.writeAll("\"");
for (s) |c| {
switch (c) {
0x22 => try writer.writeAll(&.{ 0x5c, 0x22 }),
0x5c => try writer.writeAll(&.{ 0x5c, 0x5c }),
0x0a => try writer.writeAll(&.{ 0x5c, 0x6e }),
0x09 => try writer.writeAll(&.{ 0x5c, 0x74 }),
else => try writer.writeAll(&.{c}),
}
}
try writer.writeAll("\"");
},
}
}

/// Errors produced when encoding values into the atom space.
pub const EncodeError = error{IntegerTooLarge};

@@ -132,6 +152,21 @@ test "Interner: round-trip ints and strings" {
try std.testing.expectEqualStrings("x", interner.resolve(str_atom).str);
}

test "Interner: value literal formatting escapes strings" {
var buffer: [64]u8 = undefined;
var writer = std.Io.Writer.fixed(&buffer);

const value = [_]u8{ 0x61, 0x22, 0x62, 0x5c, 0x63, 0x0a, 0x09 };
try writeValueLiteral(&writer, .{ .str = &value });

const expected = [_]u8{
0x22, 0x61, 0x5c, 0x22, 0x62, 0x5c,
0x5c, 0x63, 0x5c, 0x6e, 0x5c, 0x74,
0x22,
};
try std.testing.expectEqualSlices(u8, &expected, writer.buffered());
}

test "Interner: integers must fit in 63 bits" {
try std.testing.expectEqual(@as(Atom, PAYLOAD_MASK), try encodeInt(PAYLOAD_MASK));
try std.testing.expectError(error.IntegerTooLarge, encodeInt(PAYLOAD_MASK + 1));
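
The escaping test added in this file spells its input and expected output as raw byte values, which makes the intended mapping hard to read. The standalone sketch below restates the string branch of `writeValueLiteral` with character literals, so the mapping is visible: `"` becomes `\"`, `\` becomes `\\`, a newline becomes `\n`, and a tab becomes `\t`. It is not the project's code. `writeEscapedString` is a hypothetical name, and the sketch assumes the same `std.Io.Writer` calls the diff itself uses (`fixed`, `writeAll`, `buffered`).

```zig
const std = @import("std");

/// Hypothetical standalone mirror of the `.str` branch of `writeValueLiteral`:
/// wraps `s` in double quotes and escapes quote, backslash, newline, and tab.
fn writeEscapedString(writer: *std.Io.Writer, s: []const u8) std.Io.Writer.Error!void {
    try writer.writeAll("\"");
    for (s) |c| {
        switch (c) {
            '"' => try writer.writeAll("\\\""),
            '\\' => try writer.writeAll("\\\\"),
            '\n' => try writer.writeAll("\\n"),
            '\t' => try writer.writeAll("\\t"),
            else => try writer.writeAll(&.{c}), // all other bytes pass through
        }
    }
    try writer.writeAll("\"");
}

test "same input and expected bytes as the PR's test, written as string literals" {
    var buffer: [64]u8 = undefined;
    var writer = std.Io.Writer.fixed(&buffer);

    // Input bytes: a " b \ c <newline> <tab>
    try writeEscapedString(&writer, "a\"b\\c\n\t");

    // Output text: "a\"b\\c\n\t" (13 bytes, including the surrounding quotes)
    try std.testing.expectEqualStrings("\"a\\\"b\\\\c\\n\\t\"", writer.buffered());
}
```

Other control characters and non-ASCII bytes pass through unchanged in this sketch, as they do in the function shown in the diff.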