Working With Graphs

Guides

Working With Graphs

April 27, 2026·siddharth-mishra

Graph(T) is the container you use when the problem is not “store a list of items” but “store a topology and run passes over it”.

That usually means one of these:

reachability analysis
control-flow graphs
dependency graphs
graph rewrites that need stable traversal before destructive mutation

The graph container is generic, but it is not “object graph” generic in the OOP sense. It is closer to the graph style used by compilers, static analyzers, and systems libraries:

nodes have stable ids while they are live
node payloads are generic
edges are directed
traversal is explicit
deletion is deferred through mark-then-commit

The Data Model

At the type level, a graph is just:

typedef Graph(MyNodeType) MyGraph;

Each live node has:

a user payload of type T
an outgoing adjacency list
an incoming adjacency list
a scratch visit count
a stable node id while the node remains live

The important part is that node identity is not the payload.

That matters because these are all valid node payloads:

Str
BasicBlock *
Module *
a small by-value struct such as Intersection

Two nodes can carry equal payload values and still be distinct graph nodes.

Node Ids And Node Handles

The graph exposes two related but different concepts:

GraphNodeId
GraphNode

`GraphNodeId`

GraphNodeId is the stable public id for a live node.

Internally it packs:

a slot index
a generation

That lets the graph reuse storage after deletion without silently treating an old id as if it still referred to the same logical node.

Practical rule:

live id: valid for access and traversal
deleted id after GraphCommitChanges(...): stale, do not use with live-node APIs

`GraphNode`

GraphNode is the traversal handle used by the foreach macros and handle-based helpers.

Use it when you want to:

read the payload through GraphNodeData(...)
access the node id through GraphNodeGetId(...)
update the scratch visit count
iterate neighbors or predecessors

The usual pattern is:

GraphForeachNode(&graph, node) {
    WriteFmtLn("node id = {}", GraphNodeGetId(node));
}

Traversal Model

The graph exposes both directions explicitly.

Forward traversal:

GraphOutDegree(...)
GraphNeighborAt(...)
GraphNodeForeachNeighbor(node, neighbor)

Reverse traversal:

GraphInDegree(...)
GraphPredecessorAt(...)
GraphNodeForeachPredecessor(node, predecessor)

That reverse path is maintained incrementally by the graph. It is not computed lazily by scanning every edge on demand.

That is the right tradeoff for the workloads this container is meant for:

CFG predecessor queries
dependency backtracking
reverse reachability
graph rewrite passes

Reachability With Named Intersections

A city map is a good example because it shows the split between topology and external lookup.

You normally want:

graph topology for roads
node payloads for intersection data
a side index from name to node id

typedef Graph(Str) CityGraph;
typedef Map(const char *, GraphNodeId) CityIndex;

Then:

add the intersection name to Graph(Str)
store name -> node_id in a Map
connect intersections with GraphAddEdge(...)

That means the graph is responsible for connectivity, while the map handles name lookup.

Why The Name Index Lives Outside The Graph

Because names are not special. They are just one possible payload.

For another workload, the payload might instead be:

BasicBlock *
Module *
TaskId

The graph should stay neutral and let the application choose how payload lookup works.

A Simple Reachability Pattern

For small one-off traversals, the graph-owned scratch visit count is enough:

static bool reachable_from(GraphNode node, GraphNodeId goal_id) {
    if (GraphNodeVisitCount(node) > 0) {
        return false;
    }

    GraphNodeVisit(node);
    if (GraphNodeGetId(node) == goal_id) {
        return true;
    }

    GraphNodeForeachNeighbor(node, neighbor) {
        if (reachable_from(neighbor, goal_id)) {
            return true;
        }
    }

    return false;
}

Before starting a new traversal, reset visits:

GraphForeachNode(&graph, node) {
    GraphNodeUnvisit(node);
}

This is intentionally lightweight. It is good for:

DFS-style reachability
one traversal at a time
graph-local scratch state

It is not the only way to track traversal state.

External Side State

For more serious analysis, use your own side tables.

Typical examples:

Map(GraphNodeId, u64) for sparse counters
Vec(u64) keyed by slot index for dense side state
domain-specific arrays for colors, timestamps, distances, dominators, or worklist flags

This is the important caveat:

GraphNodeId generation changes when a slot is reused
GraphNodeIndex(node) can repeat across delete-and-reuse commits

So if you key a dense side array by slot index, you must either:

reset it when deleted slots can be reused
or pair it with generation-aware logic

That is not a bug in the graph. It is part of the storage contract.

CFGs And Dependency Graphs

Graph(T) is a good fit for control-flow or dependency analysis because both node payload styles are supported well.

CFG By Pointer

Most compiler pipelines already own their basic blocks elsewhere.

In that case, store pointers:

typedef Graph(BasicBlock *) Cfg;

Why pointer payload is often right here:

the graph should not own the whole IR object model
the payload is large and already managed elsewhere
graph passes usually care about connectivity first and block metadata second

Then predecessor traversal becomes natural:

GraphNode block = GraphGetNode(&cfg, block_id);

GraphNodeForeachPredecessor(block, pred) {
    BasicBlock *pred_bb = GraphNodeData(&cfg, pred);
    (void)pred_bb;
}

Dependency Graph By Value Or Pointer

For dependency analysis, either payload form can be fine:

Graph(Str) if the graph owns module names directly
Graph(Module *) if a larger module table owns the real objects

The graph does not force one style. That choice belongs to the surrounding system.

Deferred Deletion

The container does not support arbitrary destructive mutation during active traversal.

That is deliberate.

Instead, the graph uses a two-stage model:

mark nodes or edges
commit changes

Available operations:

GraphMarkNodeForDeletion(node)
GraphUnmarkNodeForDeletion(node)
GraphNodeMarkedForDeletion(node)
GraphMarkEdgeForRemoval(g, from, to)
GraphUnmarkEdgeForRemoval(g, from, to)
GraphEdgeMarkedForRemoval(g, from, to)
GraphCommitChanges(g)

This is useful for rewrite passes such as:

“remove all dead blocks after analysis”
“remove selected dependency edges after pruning”
“mark this subgraph now, actually delete it after traversal”

Why Not Delete Immediately?

Because immediate structural mutation during iterator-driven traversal creates fragile rules very quickly:

iteration order changes under your feet
neighbor arrays shift while you are still walking them
users end up depending on subtle invalidation behavior

Mark-then-commit avoids that.

What Is Safe During Traversal

Inside the graph foreach APIs, the intended operations are:

read payloads
update payloads
use GraphNodeVisit(...) and GraphNodeUnvisit(...)
mark and unmark nodes for deletion
mark and unmark edges for removal

What is not allowed:

adding nodes
adding edges
clearing the graph
committing changes

Those are structural mutations, and the iterators will abort if they detect them.

Important Pitfalls

1. Stale Ids Are Real Errors

If you delete a node and commit, the old id is not a “maybe false” query key. It is stale.

That means APIs that expect live ids, such as:

GraphNodeAt(...)
GraphOutDegree(...)
GraphInDegree(...)
GraphNeighborAt(...)
GraphPredecessorAt(...)
GraphHasEdge(...)

should only be called with ids you know are still live.

Use GraphContainsNode(...) when probing an old stored id.

2. `GraphNodeIndex(...)` Is Not A Permanent Identity

It is the slot index, not the full logical identity.

After delete-and-reuse:

slot index may repeat
generation changes

So index-keyed side tables need reset or generation-aware handling.

3. The Scratch Visit Count Is Shared Graph State

It is convenient, not magical.

If you need:

multiple independent traversals
nested algorithms with distinct state
richer metadata than one counter

use your own side tables.

A Good Mental Model

The clean way to think about Graph(T) is:

the graph owns topology
the graph owns the node payload T
GraphNodeId is live-node identity
GraphNode is the traversal handle
predecessor and successor lists are maintained eagerly
destructive edits are deferred
richer analysis state belongs outside the graph

That model scales from toy reachability all the way up to CFG and dependency analysis without forcing the graph container to become an application framework.

Last updated on April 28, 2026

Building and Testing MisraStdC

Working With Graphs

The Data Model

Node Ids And Node Handles

GraphNodeId

GraphNode

Traversal Model

Reachability With Named Intersections

Why The Name Index Lives Outside The Graph

A Simple Reachability Pattern

External Side State

CFGs And Dependency Graphs

CFG By Pointer

Dependency Graph By Value Or Pointer

Deferred Deletion

Why Not Delete Immediately?

What Is Safe During Traversal

Important Pitfalls

1. Stale Ids Are Real Errors

2. GraphNodeIndex(...) Is Not A Permanent Identity

3. The Scratch Visit Count Is Shared Graph State

A Good Mental Model

`GraphNodeId`

`GraphNode`

2. `GraphNodeIndex(...)` Is Not A Permanent Identity