Automated Migrations
These utility functions are provided in the utils package for writing migration transactions.
Automated Migrations with Context
1. Introduction
Version Requirements: Substrate and Shiroclient CLI must be at least version v2.208.0 (the first release containing the idempotency fix and the recursive in-transaction execution loop).
In production environments, large data migrations can quickly exceed transaction size limits, hit execution timeouts, and become cumbersome to manage. Historically, users had to comment and uncomment individual migration blocks, deploy each batch manually, and rerun initialization for every segment. This context-based migration feature automates batching, tracks progress via a serialized context map, and orchestrates repeated transactions until completion, ensuring safe, efficient, and resumable migrations.
2. Prerequisites
- Substrate Runtime: Version v2.208.0 or later with context-based migration utilities.
- Shiroclient CLI: Version v2.208.0 or later for init command enhancements.
- Phylum Code: Your application business logic written in ELPS.
- Testing Framework: Version v2.208.0 or later of shirotester, available in your local development environment.
3. Overview of Context-Based Migrations
Context-based migrations allow large data migrations to be split into multiple, safe transactions. Each migration step:
- Receives a context map from the previous step (or empty for the first run).
- Processes a batch of data.
- Returns a new context to continue, or empty to finish.
- Is idempotent: each (migration-id, serialized-input-context) pair caches its output, so re-execution with the same input returns the cached next-context without running the body again.
Two loops drive the work:
- Inner loop (inside a single transaction). Each call to run_ctx_migrations feeds the result of every registered migration back into itself via run-migration-to-completion, stopping when the migration returns empty (done), hits a fixed point (output equals input), or reaches the safety cap of 1000 inner iterations. Cached steps return instantly, so an inner loop typically fast-forwards through already-completed work and then executes fresh body code until the transaction's read/write set fills up.
- Outer loop (driven by the CLI). shiroclient init calls run_ctx_migrations repeatedly, threading each call's commit block forward as the dependent block of the next call. It stops when an iteration returns an empty result or the --iterations cap is hit.
4. Substrate Runtime Utilities
The following utilities are provided in the utils.lisp runtime package:
4.1 ctx-migrations Vector
A registry that holds all registered context migration functions.
4.2 register-migration-ctx
Macro to register a migration by ID and function body.
4.3 def-migration-ctx
Convenience macro that wraps register-migration-ctx with an idempotency cache. Each call serializes its input context to JSON and looks up the key ("utl" "migration-ctx" <id> <serialized-input-ctx>) in statedb. On a hit the cached output is returned without running the body; on a miss the body runs once and its return value is written to that key. This means a re-run with the same input context is free, and an MVCC retry of a partially-executed transaction is safe.
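The caching behaviour described above can be modelled in a few lines. The sketch below is Python rather than ELPS and is only a model: a plain dict stands in for statedb, and the names make_cached_migration, body, and calls are hypothetical. It assumes that equal contexts serialize to identical JSON, which the model enforces by sorting keys.

```python
import json

def make_cached_migration(migration_id, body, statedb):
    """Wrap `body` so each (id, serialized input ctx) pair runs at most once."""
    def step(ctx):
        # Sorting keys makes equal contexts serialize to the same cache key
        # (an assumption of this model, not a statement about the runtime).
        key = ("utl", "migration-ctx", migration_id,
               json.dumps(ctx, sort_keys=True))
        if key in statedb:
            return statedb[key]      # cache hit: the body is not run again
        out = body(ctx)              # cache miss: run the body once
        statedb[key] = out
        return out
    return step

calls = []                           # records every real execution of the body

def body(ctx):
    calls.append(ctx)
    return {"i": ctx.get("i", 0) + 1}

statedb = {}                         # hypothetical stand-in for the state database
step = make_cached_migration("demo", body, statedb)
first = step({"i": 0})
second = step({"i": 0})              # same input: served from the cache
```

After both calls, calls holds a single entry, which is the property that makes a re-run with the same input context free.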
4.4 run-ctx-migrations
Walks the ctx-migrations registry and, for each registered migration, calls run-migration-to-completion with the relevant slice of the supplied context map.
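As a rough model of this walk (Python, not the ELPS implementation), each registered migration receives only its own slice of the context map, and migrations that finish are dropped from the result. The registry layout, the single_step stand-in, and the migration IDs are assumptions made for illustration.

```python
def run_ctx_migrations(registry, ctx, run_to_completion):
    """Model: run every registered migration on its own slice of ctx."""
    result = {}
    for migration_id, step in registry:
        # A migration missing from ctx starts from an empty context.
        out = run_to_completion(step, ctx.get(migration_id, {}))
        if out:  # finished migrations (empty output) are dropped
            result[migration_id] = out
    return result

# Stand-in for run-migration-to-completion: a single step, for brevity.
single_step = lambda step, sub_ctx: step(sub_ctx)

registry = [
    ("finishes-now", lambda c: {}),
    ("keeps-going", lambda c: {"cursor": c.get("cursor", 0) + 1}),
]
result = run_ctx_migrations(registry, {"keeps-going": {"cursor": 4}}, single_step)
```

Here result contains only the "keeps-going" entry, because "finishes-now" returned an empty map.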
4.5 run-migration-to-completion
Repeatedly feeds a migration's output back as its next input until one of three things happens:
- The migration returns an empty map — done, drop from the result.
- Output equals input — fixed point reached, return as-is.
- The 1000-iteration safety cap is hit — return whatever the last step produced (a warning is logged).
Because def-migration-ctx caches each step, this loop typically scans through any already-completed steps for free, then performs fresh work until the surrounding Fabric transaction's read/write set is full. The remaining state is returned to the CLI, which commits and starts the next transaction.
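The three exit conditions can be sketched as a small loop. This is a Python model under stated assumptions, not the ELPS source: the returned reason string is added purely for illustration, and the model does not represent the transaction's read/write set filling up.

```python
MAX_INNER_ITERATIONS = 1000  # safety cap stated in the text

def run_migration_to_completion(step, ctx, max_iters=MAX_INNER_ITERATIONS):
    """Feed a step's output back as its input; return (final_ctx, reason)."""
    for _ in range(max_iters):
        out = step(ctx)
        if not out:
            return {}, "done"            # empty map: migration finished
        if out == ctx:
            return out, "fixed-point"    # no progress: stop and return as-is
        ctx = out
    return ctx, "cap"                    # safety cap reached

# Hypothetical steps exercising each exit path.
countdown = lambda c: {"n": c["n"] - 1} if c["n"] > 1 else {}
stuck = lambda c: dict(c)                # forgets to advance its cursor
runaway = lambda c: {"n": c["n"] + 1}    # never terminates on its own

done_result = run_migration_to_completion(countdown, {"n": 3})
stuck_result = run_migration_to_completion(stuck, {"n": 3})
cap_result = run_migration_to_completion(runaway, {"n": 0}, max_iters=5)
```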
5. Phylum Endpoint: run_ctx_migrations
This feature adds a new internal endpoint to the platform that is called by shiroclient automatically:
(defendpointnames "run_ctx_migrations" '("ctx") (ctx)
(route-success (utils:run-ctx-migrations ctx)))
This endpoint is called by the CLI in a loop, passing the context map and returning the next context.
6. CLI Integration: the init Command
The shiroclient init command has been enhanced to support context-based migrations. Two new flags drive the migration loop:
- --iterations/-l: Maximum number of migration iterations (default: 0). The CLI will call run_ctx_migrations up to this many times or stop early if an empty context is returned.
- --initial-ctx/-m: JSON string to seed the first migration context. Defaults to an empty map {} if omitted. Provide a JSON object whose keys are your migration IDs (the strings you passed to def-migration-ctx) and whose values are the per-migration context maps. Any migration not mentioned will start with an empty context. You can also use this to resume a migration that failed partway through execution.
Format of --initial-ctx:
{
"<migration-id-1>": { "...": "context for first migration" },
"<migration-id-2>": { "...": "context for second migration" }
}
Example: if you have two context-based migrations registered as def-migration-ctx "RD-451-update-property-location" … and def-migration-ctx "RD-12-normalize-user-emails" …, you might write:
shiroclient init \
-l 10 \
-m '{
"RD-451-update-property-location": { "bookmark": "", "batch_size": 500 },
"RD-12-normalize-user-emails": { "last_id": "user_123" }
}' \
<version> <phylum.zip>
This tells the CLI to seed each named migration with its initial sub-context.
Defaults:
- If you omit --initial-ctx, the CLI uses an empty map: all migrations begin at their default starting point.
- If you include only some migration IDs, the others still run but start from an empty context.
Execution loop and dependent-block handling. Shiroclient executes the following steps for the init command:
- Deploy the phylum via the update call.
- Reinitialize the CLI client against the new phylum version and sleep briefly for propagation.
- Loop up to --iterations, invoking run_ctx_migrations with the previous response context. Each call after the first passes the prior transaction's commit block number as the dependent block, so the new transaction is guaranteed to observe the cached results written by the previous step.
- Exit early if a call returns an empty context (all migrations finished) or an error occurs.
A single iteration may advance through many already-cached steps and then execute one or more fresh steps until the transaction's read/write set is full — so the number of iterations needed is roughly the number of full transactions of fresh work the migration produces, not the total number of steps.
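The outer loop can be modelled as follows. This is a Python sketch of the behaviour described in this section, not the CLI's actual code: FakeEndpoint, its per_tx budget, and the (ctx, commit_block) return shape are hypothetical stand-ins for a real run_ctx_migrations transaction.

```python
def init_migration_loop(run_ctx_migrations, initial_ctx, iterations):
    """Model of the CLI loop: returns (final_ctx, transactions_used)."""
    ctx = dict(initial_ctx)
    dependent_block = None           # the first call has no dependent block
    used = 0
    for _ in range(iterations):
        ctx, commit_block = run_ctx_migrations(ctx, dependent_block)
        used += 1
        if not ctx:                  # empty context: all migrations finished
            break
        dependent_block = commit_block  # thread the commit block forward
    return ctx, used

class FakeEndpoint:
    """Hypothetical endpoint: migrates `per_tx` items per transaction."""
    def __init__(self, total_items, per_tx):
        self.total, self.per_tx, self.block = total_items, per_tx, 0
        self.seen_dependents = []

    def __call__(self, ctx, dependent_block):
        self.seen_dependents.append(dependent_block)
        self.block += 1
        done = ctx.get("demo", {}).get("done", 0) + self.per_tx
        next_ctx = {} if done >= self.total else {"demo": {"done": done}}
        return next_ctx, self.block

endpoint = FakeEndpoint(total_items=5, per_tx=2)
final_ctx, used = init_migration_loop(endpoint, {}, iterations=10)
```

In this model, five items at two per transaction finish in three transactions even though the cap is ten, and a cap of 0 runs nothing at all.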
7. Writing Migrations in Your Phylum
7.1 Syntax of def-migration-ctx
Define a context-aware migration with an ID and a single parameter (the context map):
(def-migration-ctx "your-migration-id" (ctx)
;; migration body, returning next context or empty map when complete
)
NOTE: You can define multiple def-migration-ctx forms; all of them are executed within each iteration of the migration loop.
7.2 Context Parameter & Return Semantics
- Developers decide what to store in ctx. It must be serializable to JSON and contain all information required for the next transaction to continue where the last one left off.
- Return a non-empty map to continue migration; return an empty map (e.g. (sorted-map)) to signal completion.
- IMPORTANT: Each step is cached under its serialized input context. If a step returns the same context it was given (e.g. you forgot to advance a cursor), run-migration-to-completion detects the fixed point and stops, but the cached output and the in-tx loop both block forward progress. Always change at least one field (bookmark, cursor, counter, etc.) when you want the migration to continue.
- The inner loop is capped at 1000 iterations per call as a safety valve against runaway migrations; if you hit this cap it almost always means a bug in your migration body.
7.3 JSON Serialization Requirements
- All keys and values in the context map must be serializable via json:dump-bytes.
- Complex values should be converted to strings or basic types before returning.
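To illustrate the idea of reducing complex values to basic types, here is a minimal Python sketch. It is an analogy only: the phylum itself would do this in ELPS before returning the context, and the helper name to_json_safe_ctx and the chosen conversions (ISO 8601 strings for timestamps, hex for raw bytes) are assumptions.

```python
import json
from datetime import datetime, timezone

def to_json_safe_ctx(ctx):
    """Convert non-JSON values to strings so the context can be serialized."""
    safe = {}
    for key, value in ctx.items():
        if isinstance(value, datetime):
            value = value.isoformat()    # timestamps become ISO 8601 strings
        elif isinstance(value, bytes):
            value = value.hex()          # raw bytes become hex strings
        safe[str(key)] = value
    json.dumps(safe)  # raises TypeError if anything unserializable remains
    return safe

raw_ctx = {
    "cursor": b"\x01\xff",
    "started": datetime(2024, 1, 2, tzinfo=timezone.utc),
    "batch_size": 500,
}
safe_ctx = to_json_safe_ctx(raw_ctx)
```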
7.4 Example Migration Definition
;; Accumulator callback: append each key/value pair and record the latest bookmark.
(defun reduce-collector-page (acc key val bookmark)
  (sorted-map "keys" (concat 'list (get acc "keys")
                             (list (sorted-map "key" key "val" val)))
              "bookmark" bookmark))
...
;; Each step migrates one page of properties.
;; migrate-property is assumed to be defined elsewhere in the phylum.
(def-migration-ctx "RD-1337-update-property-location" (ctx)
  (let* ([bookmark (default (get ctx "bookmark") "")]
         [batch-size (default (get ctx "batch_size") 1000)]
         [result (range-page-bytes reduce-collector-page
                                   (sorted-map "keys" '() "bookmark" "")
                                   "property-index:"
                                   "property-index^"
                                   batch-size
                                   bookmark)]
         [properties (get result "keys")]
         [next-bookmark (get result "bookmark")])
    (map () (lambda (pid)
              (migrate-property (get pid "key")))
         properties)
    ;; Return the next context to continue, or an empty map when finished.
    (if (string? next-bookmark)
      (sorted-map "bookmark" next-bookmark "batch_size" batch-size)
      (sorted-map))))
8. Testing with shirotester
8.1 Unit Tests in utils_test.lisp
The current test suite covers four scenarios — see internal/substrate/shirocore/utils_test.lisp:
- register-migration-ctx: registration appends to ctx-migrations and the wrapped lambda runs without caching.
- def-migration-ctx: a one-shot migration returns its next context and writes the cache key; a second run-ctx-migrations call with the same input returns the cached output without re-running the body (verified via a call counter).
- def-migration-ctx-iterative: a multi-step migration runs through several steps within a single run-ctx-migrations call (the inner loop), and a subsequent call with the same starting context completes entirely from cache.
- def-migration-ctx-fixed-point and def-migration-ctx-max-iters: exercise the two non-completion exit paths of run-migration-to-completion.
8.2 Invoking run_ctx_migrations in Tests
(in-package 'utils)
(use-package 'utils)
(use-package 'testing)
(test "def-migration-ctx"
  ;; Define a migration that returns a next-context map
  (def-migration-ctx "test-migration" (context)
    (cc:infof (sorted-map "context" context) "Executing migration")
    (sorted-map "next_key" "next_value"))
  (let* ([ctx (sorted-map "test-migration" (sorted-map "key" "value"))]
         [result (run-ctx-migrations ctx)])
    (assert-deep-equal
      (sorted-map "next_key" "next_value")
      (get result "test-migration"))))
9. Best Practices & Caveats
9.1 Pagination via statedb:range-page
- Instead of scanning all keys, statedb:range-page retrieves up to page-size keys and provides a bookmark for continuation. This drastically reduces the total number of keys scanned in the ledger, shrinks the read/write set, and keeps transaction payloads small (avoiding size limits and timeouts).
- Tombstoned keys still occupy pagination space and cached writes are merged in, so a page may hold fewer or more results than the specified page-size (see section 11.3, Page Size Semantics), but no data is skipped across batches.
9.2 Handling Partial Batches & Tombstones
- Design your migration logic to handle smaller- or larger-than-expected batches gracefully.
- If strict batch sizes are required, loop internally and resize the result sets until the desired count is processed.
9.3 Safety Guards with --iterations
- The CLI's --iterations (-l) flag caps the number of run_ctx_migrations transactions in a single init invocation. Each transaction can advance through any number of cached steps for free plus enough fresh steps to fill its read/write set, so size -l against the expected number of full transactions of fresh work, not the total number of migration steps. The loop exits early as soon as a transaction returns an empty result.
- Independently, every call has an internal cap of 1000 iterations inside run-migration-to-completion to catch runaway migrations. Hitting this almost always indicates a logic bug.
- By default, --iterations is 0, and no context-based migrations run unless you specify a positive number.
10. FAQ
Q: Is each migration step executed as a single transaction? No — each run_ctx_migrations call is a single transaction, and that transaction may execute many migration steps via the inner run-migration-to-completion loop. Cached steps from prior transactions resolve instantly; fresh steps execute until the transaction's read/write set fills up, at which point the CLI commits and starts the next transaction.
Q: Are migrations paginated? Yes. Under the hood, migrations should use statedb:range-page (or a custom paging loop) to fetch a subset of keys and a bookmark for the next batch.
Q: Do we pass full migration logic for each step? Yes. Every step includes the complete migration function body; only the context map changes between invocations.
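Q: Why do we need to set --iterations? As a safety guard, --iterations caps the number of run_ctx_migrations transactions in a single init invocation, which prevents infinite loops caused by logic errors. Because the default is 0, no context-based migrations run unless you set a positive value. The loop still exits early if the migration returns an empty context.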
Q: What do the new features automate? Previously, developers managed batching and transaction control manually. With def-migration-ctx and CLI looping, migrations now run across multiple transactions automatically, with no manual intervention needed.
Q: Why can’t we run the entire migration in one transaction? Transactions have limits on size, key count, and execution time. Splitting into batches ensures reliable commits without hitting gRPC or ledger constraints.
11. Pagination range-page / range-page-bytes
11.1 statedb:range-page-bytes
Similar to range and range-bytes, but with pagination.
(range-page-bytes fn z start end page-size bookmark)
Best used to minimize ledger reads and transaction payloads.
- fn (function): signature (acc curr-key curr-val bookmark) -> new-acc, where curr-val is raw bytes.
- z: initial accumulator value acc.
- start: inclusive start key string.
- end: exclusive end key string.
- page-size: maximum entries to process per page (integer).
- bookmark: pagination token (string), "" for the first page. Set this aside in your acc.
Returns:
- new-acc: the accumulator value after processing this page.
Example:
(let* ([result (range-page-bytes collect-keys
                                 '()
                                 "user:"
                                 "user;"
                                 100
                                 "")]
       [next-bmk (get result "bookmark")])
  ;; collect-keys stores the keys and bookmark in a sorted map
  (when next-bmk
    (cc:infof "More pages available with bookmark: {}" next-bmk)))
11.2 statedb:range-page
(range-page fn z start end page-size bookmark)
- Same parameters as range-page-bytes, except fn has signature (acc curr-key curr-val bookmark) where curr-val is automatically deserialized.
Under the hood, range-page calls deserialize on each raw value before invoking fn.
Example:
(let* ([result (range-page collect-objects
                           '()
                           "order:"
                           "order;"
                           50
                           "")]
       [next-bmk (get result "bookmark")]
       [orders (get result "values")])
  ;; collect-objects stores the values and bookmark in a sorted map
  ;; orders is a list of maps, next-bmk a token for further pages
  (cc:infof "Fetched orders: {}" orders))
11.3 Page Size Semantics
- Page size as a hint: The page-size parameter caps how many keys the Fabric ledger scan will perform per call, but actual returned counts vary:
  - You may get more items than page-size if the transaction cache holds extra entries in the same range.
  - You may get fewer items than page-size if tombstoned keys occupy slots but are filtered out.
- Merged sources: Each pagination call unifies two data sources:
  - Paginated ledger query (up to page-size keys).
  - Unbounded in-memory cache query.
- Bookmark control: We pass and manage a bookmark (the last processed key) between calls via the iterator function fn. To keep the API consistent with the existing non-paginated range variants, range-page / range-page-bytes return only the accumulator (acc) instead of a tuple (acc, bookmark). Therefore, the pagination utility does not update the bookmark internally; it is the caller's responsibility to capture the final bookmark value within fn (for example, storing it in a sorted map or variable) and then supply that bookmark in the next invocation to continue without overlap.
In summary,
- Capture the last processed key within fn.
- Supply it as the next bookmark in the following call.
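The capture-and-resupply pattern can be sketched with a deliberately simplified Python model. It pages over a plain dict and ignores the transaction cache and tombstones, so it is not the platform's algorithm; the trailing ledger argument, the collect callback, and the rule "an empty page means done" are assumptions of this sketch, which also needs a page size of at least 2 to make progress.

```python
from bisect import bisect_left

def range_page(fn, z, start, end, page_size, bookmark, ledger):
    """Simplified model: ledger-only paging over a dict, no transaction cache."""
    keys = sorted(k for k in ledger if start <= k < end)
    # Resume at the bookmark (inclusive), then skip it to avoid reprocessing.
    begin = bisect_left(keys, bookmark) if bookmark else 0
    page = keys[begin:begin + page_size]
    if page and page[0] == bookmark:
        page = page[1:]
    acc = z
    for key in page:
        acc = fn(acc, key, ledger[key], key)  # last processed key is the bookmark
    return acc                                # only the accumulator is returned

def collect(acc, key, val, bookmark):
    """Callback that captures the bookmark inside the accumulator."""
    return {"keys": acc["keys"] + [key], "bookmark": bookmark}

def all_keys(ledger, start, end, page_size):
    """Drive repeated pages, supplying each captured bookmark to the next call."""
    seen, bookmark = [], ""
    while True:
        acc = range_page(collect, {"keys": [], "bookmark": ""},
                         start, end, page_size, bookmark, ledger)
        if not acc["keys"]:
            return seen
        seen += acc["keys"]
        bookmark = acc["bookmark"]

ledger = {f"user:{i}": i for i in range(1, 6)}
ledger["order:1"] = "outside the scanned range"
paged = all_keys(ledger, "user:", "user;", page_size=2)
```

Every key in the range is visited exactly once, although pages after the first yield one key fewer than page_size because the bookmark key itself is skipped.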
11.3.1 Technical Details / Algorithm
Under the hood, the platform uses the following algorithm:
- Shim query: Perform a paginated range query on the ledger from bookmark (inclusive) to endKey, limited by pageSize.
- Cache query: Perform a full range query on the in-memory transaction cache over the same bounds (no pageSize limit).
- Merge & dedupe: Combine both sorted lists of keys, remove duplicates, and preserve sort order.
- Skip bookmark: Omit the first key if it equals the input bookmark, preventing reprocessing.
- Determine next bookmark: The last key in the merged result becomes the next bookmark for subsequent calls.
11.3.2 Design Considerations
- No data loss: By merging and deduping, we include newly written and uncommitted keys and exclude deleted ones.
- Consistent progress: Using the last processed key as the next bookmark ensures forward-only paging without overlap.
- Performance bound: While page-size bounds ledger scan work, total returned items may vary. Including cached items in the same pass leverages already-loaded data and avoids additional queries.
- Simplified implementation: Merging ledger and cache streams abstracts complexity inside the pagination utility, so your migration logic remains straightforward.
- Disabling pagination: Setting pageSize ≤ 0 returns all keys in the range at once (both ledger and cache).
By understanding these semantics, you can reliably implement batch processing loops that adapt to both ledger and cache behaviors, while controlling transaction size and duration.
12. Appendix
12.1 Sample JSON Contexts
{ "bookmark": "", "batch_size": 500 }
{ "i": 3 }
{ "cursor": "abc123", "step": 2 }