Data flow

The end-to-end replication data path — changed-block identification, descriptors, rolling-window streaming, cloud-native writes, and checkpoint finalization.

Product: Datamotive Platform
Version: v2.0.3
Last updated: Jun 11, 2026
Reading time: 2 min

This page traces the path data takes during a replication iteration — from changed blocks on the source to a finalized recovery checkpoint on the target — and during recovery.

Replication data flow

Identify changed blocks
The source side uses the platform's incremental tracking — vSphere CBT, cloud-native changed-block APIs (e.g. EBS ListChangedBlocks) — to find data changed since the last iteration. The first iteration transfers full disk data.
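
As an illustration of what a changed-block query yields, the sketch below merges a list of changed block indices into contiguous byte extents. The function name and the 512 KiB block size are assumptions made for this example; they are not Datamotive values, and the actual tracking granularity depends on the source platform.

```python
from typing import Iterable, List, Tuple

BLOCK_SIZE = 512 * 1024  # assumed tracking granularity, illustrative only


def coalesce_changed_blocks(block_indices: Iterable[int],
                            block_size: int = BLOCK_SIZE) -> List[Tuple[int, int]]:
    """Merge changed block indices into (offset, length) byte extents."""
    extents: List[Tuple[int, int]] = []
    for index in sorted(set(block_indices)):
        offset = index * block_size
        if extents and extents[-1][0] + extents[-1][1] == offset:
            # Block is adjacent to the previous extent, so extend that extent.
            start, length = extents[-1]
            extents[-1] = (start, length + block_size)
        else:
            # Gap before this block, so start a new extent.
            extents.append((offset, block_size))
    return extents
```

On a first iteration there is no previous state to compare against, so every block of the disk would be treated as changed.
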
Create replication descriptors
Instead of streaming blocks sequentially, the engine builds descriptors — offset, length, chunk metadata, replication state — optimized for highly fragmented, non-sequential change sets.
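
A descriptor of this kind could be modeled roughly as follows. This is a minimal sketch: the class, field, and state names are hypothetical and only mirror the fields listed above (offset, length, replication state), and the splitting logic assumes the 1 MB default chunk size means 1024 * 1024 bytes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Tuple

CHUNK_SIZE = 1024 * 1024  # 1 MB default chunk size (binary MB assumed)


class ChunkState(Enum):
    # Hypothetical states, for illustration only.
    PENDING = "pending"
    IN_FLIGHT = "in_flight"
    WRITTEN = "written"


@dataclass
class Descriptor:
    disk_id: str
    offset: int
    length: int
    state: ChunkState = ChunkState.PENDING


def build_descriptors(disk_id: str,
                      extents: Iterable[Tuple[int, int]],
                      chunk_size: int = CHUNK_SIZE) -> List[Descriptor]:
    """Split each changed (offset, length) extent into chunk-sized descriptors."""
    descriptors: List[Descriptor] = []
    for offset, length in extents:
        end = offset + length
        while offset < end:
            size = min(chunk_size, end - offset)
            descriptors.append(Descriptor(disk_id, offset, size))
            offset += size
    return descriptors
```

Because each descriptor carries its own offset, the change set does not need to be contiguous or ordered on disk.
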
Schedule chunks into rolling windows
Chunks (default 1 MB) are scheduled into sliding per-disk windows (default 32 MB, adaptively scaled between 16 and 64 MB). Multiple chunks stay in flight simultaneously; there is no stop-and-wait between batches.
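
The rolling-window idea can be sketched as a byte budget that is refilled as soon as any chunk completes. The class below is an illustrative model, not the engine's scheduler; `RollingWindow` and its method names are hypothetical. With the defaults above, one 32 MB window holds 32 chunks of 1 MB.

```python
from collections import deque
from typing import Deque, List

MB = 1024 * 1024  # binary MB assumed


class RollingWindow:
    """Per-disk sliding window that caps the bytes in flight."""

    def __init__(self, window_bytes: int = 32 * MB) -> None:
        self.window_bytes = window_bytes
        self.in_flight_bytes = 0
        self.pending: Deque[int] = deque()

    def submit(self, chunk_bytes: int) -> None:
        self.pending.append(chunk_bytes)

    def admit(self) -> List[int]:
        """Move pending chunks in flight while they still fit in the window."""
        admitted: List[int] = []
        while (self.pending
               and self.in_flight_bytes + self.pending[0] <= self.window_bytes):
            size = self.pending.popleft()
            self.in_flight_bytes += size
            admitted.append(size)
        return admitted

    def complete(self, chunk_bytes: int) -> List[int]:
        """Release a finished chunk and immediately admit the next ones."""
        self.in_flight_bytes -= chunk_bytes
        return self.admit()
```

The window slides one chunk at a time: a completion frees space for the next pending chunk without waiting for the rest of the window to drain.
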
Stream in parallel
Parallel chunk streams run across multiple disks, workloads, and replication workers. Target-side workers drive a pull model: they request windows, control in-flight concurrency, and centrally manage replication pressure.
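
A pull model with bounded concurrency can be sketched with `asyncio`. Here the target side decides how many requests are outstanding at once; `pull_chunks`, the `fetch` callable, and the `max_in_flight` default are hypothetical names and values chosen for the example, and the real transport is not shown.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def pull_chunks(fetch: Callable[[int], Awaitable[bytes]],
                      chunk_ids: Sequence[int],
                      max_in_flight: int = 4) -> List[bytes]:
    """Pull chunks from the source, never exceeding max_in_flight requests."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def pull_one(chunk_id: int) -> bytes:
        # The target-side worker holds a slot for the duration of each request.
        async with semaphore:
            return await fetch(chunk_id)

    # gather() returns results in the order of chunk_ids.
    return await asyncio.gather(*(pull_one(c) for c in chunk_ids))
```

Because the receiver issues the requests, slowing down is a matter of acquiring fewer slots rather than signaling the sender to stop.
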
Apply backpressure
Node-level and per-disk windows adapt to cloud storage latency, write queue depth, WAN variability, and retry behavior — preventing memory amplification, cloud-write overload, and storage queue saturation.
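
One common way to adapt a window to observed pressure is additive increase with multiplicative decrease (AIMD). The sketch below uses AIMD as a stand-in; this page does not state which control law the platform uses. Only the 16 MB and 64 MB bounds come from the defaults above; the 50 ms latency target and 1 MB step are assumed values.

```python
MB = 1024 * 1024  # binary MB assumed
MIN_WINDOW = 16 * MB
MAX_WINDOW = 64 * MB


def adapt_window(window_bytes: int,
                 write_latency_ms: float,
                 target_latency_ms: float = 50.0,  # assumed threshold
                 step_bytes: int = MB) -> int:
    """Return the next window size given the latest observed write latency."""
    if write_latency_ms > target_latency_ms:
        # Storage is struggling: halve the window to shed load quickly.
        proposed = window_bytes // 2
    else:
        # Storage is keeping up: grow the window by one step.
        proposed = window_bytes + step_bytes
    # Clamp to the documented 16 to 64 MB range.
    return max(MIN_WINDOW, min(MAX_WINDOW, proposed))
```

The same shape of rule could take queue depth or retry counts as its input signal instead of latency.
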
Write to cloud-native storage
Chunks are validated and written through native storage APIs — AWS EBS snapshots/volumes, Azure Managed Disks, VMware datastores, GCP Persistent Disks.
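
The validate-then-write order can be sketched as follows. This example assumes SHA-256 as the chunk checksum and uses any seekable binary stream as a stand-in for a native storage API; neither detail is confirmed by this page, and `validate_and_write` is a hypothetical helper.

```python
import hashlib
from typing import BinaryIO


class ChunkValidationError(Exception):
    """Raised when a received chunk does not match its expected checksum."""


def validate_and_write(target: BinaryIO,
                       offset: int,
                       payload: bytes,
                       expected_sha256: str) -> int:
    """Verify the chunk checksum, then write it at its offset on the target."""
    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected_sha256:
        # Reject before touching storage so a bad chunk is never persisted.
        raise ChunkValidationError(
            f"checksum mismatch at offset {offset}: {actual} != {expected_sha256}"
        )
    target.seek(offset)
    return target.write(payload)
```

A rejected chunk leaves the target untouched, so it can be requested again without any cleanup.
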
Finalize the recovery checkpoint
After chunk validation, storage-write completion, metadata synchronization, and consistency validation, the iteration is finalized into a recovery checkpoint — the basis for failover, test recovery, and migration. Intermediate checkpoints are taken every 500 MB (default).
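
The finalization gate and the intermediate checkpoint interval can be illustrated with a small sketch. It assumes the 500 MB default is counted in bytes transferred during the iteration, which this page does not spell out, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List

MB = 1024 * 1024  # binary MB assumed


def intermediate_checkpoint_marks(total_bytes: int,
                                  interval_bytes: int = 500 * MB) -> List[int]:
    """Byte counts at which an intermediate checkpoint would be taken.

    The end of the iteration is excluded because it is covered by the
    final recovery checkpoint.
    """
    return list(range(interval_bytes, total_bytes, interval_bytes))


@dataclass
class IterationStatus:
    chunks_validated: bool = False
    writes_completed: bool = False
    metadata_synced: bool = False
    consistency_validated: bool = False

    def can_finalize(self) -> bool:
        # All four conditions must hold before the checkpoint is finalized.
        return all((self.chunks_validated, self.writes_completed,
                    self.metadata_synced, self.consistency_validated))
```

Under these assumptions, an iteration that moves 1200 MB would take intermediate checkpoints at 500 MB and 1000 MB, then finalize once all four conditions hold.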

Optional per-plan transforms apply on the wire: encryption, compression, and deduplication (via the DeDup Node's chunk checksum index). See Replication configuration.
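
As a rough illustration of checksum-based deduplication combined with compression, the sketch below keeps an in-memory checksum index and sends only a reference when a chunk has been seen before. SHA-256, zlib, and the `encode_chunk` helper are assumptions for this example and do not describe the DeDup Node's actual index or wire format. Encryption is left out because it would need a third-party library.

```python
import hashlib
import zlib
from typing import Dict, Tuple


def encode_chunk(payload: bytes,
                 checksum_index: Dict[str, int],
                 chunk_id: int) -> Tuple[str, bytes]:
    """Return ("ref", checksum) for a known chunk, else ("data", compressed)."""
    digest = hashlib.sha256(payload).hexdigest()
    if digest in checksum_index:
        # Already transferred: send only the checksum as a reference.
        return ("ref", digest.encode("ascii"))
    # First time this content is seen: record it and send compressed bytes.
    checksum_index[digest] = chunk_id
    return ("data", zlib.compress(payload))
```

Repeated content, such as zeroed regions or identical OS files across workloads, then costs one checksum on the wire instead of a full chunk.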

Recovery data flow

Recovery reverses the path using the data plane on the recovery site:

Disk reconstruction from the selected checkpoint (latest or point-in-time).
Snapshot finalization and system health checks for Windows workloads (via the Prep Node).
VM instantiation on the target platform's native compute.
Network attachment per the recovery configuration.
Boot-order orchestration with configured delays.
Recovery validation (Windows guests).
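
Boot-order orchestration with configured delays can be sketched as a loop over ordered groups. The group structure, the `power_on` callable, and the function name are hypothetical; the sketch only shows the ordering and delay behavior, not how VMs are actually started.

```python
import time
from typing import Callable, List, Sequence, Tuple


def boot_in_order(groups: Sequence[Tuple[Sequence[str], float]],
                  power_on: Callable[[str], None],
                  sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Power on VMs group by group, pausing after each group.

    Each group is (vm_names, delay_seconds_after_group).
    """
    started: List[str] = []
    for vm_names, delay_seconds in groups:
        for name in vm_names:
            power_on(name)
            started.append(name)
        if delay_seconds > 0:
            # Give this tier time to come up before starting the next one.
            sleep(delay_seconds)
    return started
```

For example, a plan might start a database tier, wait 30 seconds, start the application tier, wait 10 seconds, and then start the web tier. Passing `sleep` as a parameter keeps the function testable without real waiting.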

Recovery orchestration runs in parallel while respecting cloud API limits, quota constraints, and storage concurrency — see Limits.
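
One generic way to stay within a cloud API rate limit while still working in parallel is a token bucket. The sketch below is a standard token bucket, shown as an illustration only; this page does not say which mechanism the platform uses, and the class name and parameters are hypothetical. Time is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Allow bursts up to `capacity` calls, refilled at `rate_per_second`."""

    def __init__(self, rate_per_second: float, capacity: float,
                 now: float = 0.0) -> None:
        self.rate = rate_per_second
        self.capacity = capacity
        self.tokens = capacity
        self.updated_at = now

    def try_acquire(self, now: float, cost: float = 1.0) -> bool:
        """Return True and spend tokens if the call may proceed at time `now`."""
        elapsed = max(0.0, now - self.updated_at)
        # Refill for the elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = max(self.updated_at, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A caller that gets `False` would wait or retry later rather than issuing the API call and hitting a throttling error.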

What crosses which boundary

Workload data moves only between your protected and recovery sites — over private subnets or VPN tunnels, optionally encrypted on the wire.
Orchestration calls flow outbound from Datamotive nodes to the platform managers (vCenter, AWS/Azure/GCP APIs) over port 443. No organization data is transmitted on this path.
Metadata stays inside the Management Servers on your sites. Nothing transits Datamotive infrastructure.
