Data flow
The end-to-end replication data path — changed-block identification, descriptors, rolling-window streaming, cloud-native writes, and checkpoint finalization.
- Product: Datamotive Platform
- Version: v2.0.3
This page traces the path data takes during a replication iteration — from changed blocks on the source to a finalized recovery checkpoint on the target — and during recovery.
Replication data flow
Identify changed blocks
The source side uses the platform's incremental tracking (for example, vSphere CBT or a cloud-native changed-block API such as EBS ListChangedBlocks) to find data changed since the last iteration. The first iteration has no earlier baseline to compare against, so it transfers full disk data.
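To make the output of this step concrete, the sketch below coalesces a list of changed block indexes into contiguous byte extents. The block size, the function name `coalesce_changed_blocks`, and the input shape are assumptions chosen for illustration; they are not the platform's actual interface.

```python
from typing import Iterable, List, Tuple

BLOCK_SIZE = 512 * 1024  # hypothetical tracking granularity, in bytes


def coalesce_changed_blocks(
    block_indexes: Iterable[int], block_size: int = BLOCK_SIZE
) -> List[Tuple[int, int]]:
    """Merge changed block indexes into (offset, length) byte extents."""
    extents: List[Tuple[int, int]] = []
    for index in sorted(set(block_indexes)):
        offset = index * block_size
        if extents and extents[-1][0] + extents[-1][1] == offset:
            # This block directly follows the previous extent: extend it.
            start, length = extents[-1]
            extents[-1] = (start, length + block_size)
        else:
            # A gap separates this block from the previous one: start a new extent.
            extents.append((offset, block_size))
    return extents
```

Merging adjacent blocks early keeps the number of extents small when changes cluster, while scattered changes remain separate extents.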
Create replication descriptors
Instead of streaming blocks sequentially, the engine builds descriptors — offset, length, chunk metadata, replication state — optimized for highly fragmented, non-sequential change sets.
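One way to picture a descriptor is as a small record per chunk. The following sketch splits changed extents into chunk-sized descriptors; the class names, field names, and state values are hypothetical, and only the 1 MB default chunk size comes from this page.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Iterator, Tuple

CHUNK_SIZE = 1024 * 1024  # 1 MB default chunk size


class ChunkState(Enum):
    # Hypothetical replication states, for illustration only.
    PENDING = "pending"
    IN_FLIGHT = "in_flight"
    WRITTEN = "written"


@dataclass
class ChunkDescriptor:
    disk_id: str
    offset: int
    length: int
    state: ChunkState = ChunkState.PENDING


def build_descriptors(
    disk_id: str,
    extents: Iterable[Tuple[int, int]],
    chunk_size: int = CHUNK_SIZE,
) -> Iterator[ChunkDescriptor]:
    """Split (offset, length) extents into descriptors of at most chunk_size bytes."""
    for offset, length in extents:
        end = offset + length
        while offset < end:
            size = min(chunk_size, end - offset)
            yield ChunkDescriptor(disk_id, offset, size)
            offset += size
```

Because each descriptor carries its own offset and length, chunks can be scheduled and retried independently of their position on disk.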
Schedule chunks into rolling windows
Chunks (default 1 MB) are scheduled into sliding per-disk windows (default 32 MB, adaptively scaled between 16 and 64 MB). Multiple chunks stay in flight simultaneously, so there is no stop-and-wait between batches.
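The sliding behavior can be sketched as a byte budget per disk: a chunk is admitted whenever it fits, and each completion frees room for the next one. The `RollingWindow` class and `fill` helper below are illustrative names, not part of the product.

```python
from collections import deque
from typing import Deque, List

MB = 1024 * 1024


class RollingWindow:
    """Tracks in-flight bytes for one disk against a byte budget."""

    def __init__(self, window_bytes: int = 32 * MB) -> None:
        self.window_bytes = window_bytes
        self.in_flight = 0

    def try_admit(self, chunk_bytes: int) -> bool:
        """Admit the chunk if it fits in the remaining budget."""
        if self.in_flight + chunk_bytes > self.window_bytes:
            return False
        self.in_flight += chunk_bytes
        return True

    def complete(self, chunk_bytes: int) -> None:
        """Release budget when a chunk finishes, letting the window slide."""
        self.in_flight -= chunk_bytes


def fill(window: RollingWindow, pending: Deque[int]) -> List[int]:
    """Admit as many pending chunk sizes as fit; return the admitted sizes."""
    admitted: List[int] = []
    while pending and window.try_admit(pending[0]):
        admitted.append(pending.popleft())
    return admitted
```

Calling `fill` again after every `complete` keeps the window topped up one chunk at a time, which is what avoids waiting for a whole batch to drain.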
Stream in parallel
Parallel chunk streams run across multiple disks, workloads, and replication workers. Target-side workers drive a pull model: they request windows, control inflight concurrency, and centrally manage replication pressure.
Apply backpressure
Node-level and per-disk windows adapt to cloud storage latency, write queue depth, WAN variability, and retry behavior — preventing memory amplification, cloud-write overload, and storage queue saturation.
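This page does not specify how the windows adapt. The sketch below assumes a simple additive-increase, multiplicative-decrease rule driven by observed write latency, clamped to the 16 to 64 MB range given above. The latency target, the step size, and the function name are hypothetical.

```python
MB = 1024 * 1024
MIN_WINDOW = 16 * MB  # lower bound of the adaptive range
MAX_WINDOW = 64 * MB  # upper bound of the adaptive range


def next_window(
    current: int,
    write_latency_ms: float,
    target_latency_ms: float = 50.0,  # hypothetical latency target
) -> int:
    """Return the next per-disk window size in bytes (assumed AIMD rule)."""
    if write_latency_ms > target_latency_ms:
        proposed = current // 2      # back off quickly under pressure
    else:
        proposed = current + 4 * MB  # probe upward slowly
    return max(MIN_WINDOW, min(MAX_WINDOW, proposed))
```

Shrinking faster than growing is a common choice for this kind of controller, because overshooting a saturated storage queue costs more than briefly underusing the link.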
Write to cloud-native storage
Chunks are validated and written through native storage APIs — AWS EBS snapshots/volumes, Azure Managed Disks, VMware datastores, GCP Persistent Disks.
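Validation before write can be illustrated with a checksum comparison. The sketch below assumes SHA-256 and a seekable file-like target; the real checksum algorithm and the storage API calls are not described on this page, so both are stand-ins.

```python
import hashlib
from typing import BinaryIO


def validate_and_write(
    target: BinaryIO, offset: int, payload: bytes, expected_sha256: str
) -> None:
    """Write payload at offset only if its checksum matches the expected value."""
    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected_sha256:
        # Reject the chunk so the caller can request it again.
        raise ValueError(f"checksum mismatch at offset {offset}")
    target.seek(offset)
    target.write(payload)
```

Raising before any write means a corrupted chunk never reaches the target disk and can simply be retried.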
Finalize the recovery checkpoint
After chunk validation, storage-write completion, metadata synchronization, and consistency validation, the iteration is finalized into a recovery checkpoint — the basis for failover, test recovery, and migration. Intermediate checkpoints are taken every 500 MB (default).
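The intermediate-checkpoint interval can be pictured as a byte counter. The sketch below assumes the 500 MB default is measured in bytes written to the target, which this page does not state explicitly; `CheckpointTracker` is a hypothetical name.

```python
MB = 1024 * 1024


class CheckpointTracker:
    """Signals when enough data has been written to take an intermediate checkpoint."""

    def __init__(self, interval_bytes: int = 500 * MB) -> None:
        self.interval_bytes = interval_bytes
        self.since_last = 0
        self.checkpoints = 0

    def record_write(self, nbytes: int) -> bool:
        """Record a completed write; return True when the interval is crossed."""
        self.since_last += nbytes
        if self.since_last >= self.interval_bytes:
            # Carry the remainder over so the intervals do not drift.
            self.since_last -= self.interval_bytes
            self.checkpoints += 1
            return True
        return False
```

With 1 MB chunks and the default interval, this would signal a checkpoint after every 500th completed chunk write.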
Optional per-plan transforms apply on the wire: encryption, compression, and deduplication (via the DeDup Node's chunk checksum index). See Replication configuration.
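A chunk checksum index supports deduplication by recognizing payloads that were already sent. The following in-memory sketch assumes SHA-256 digests and an integer location per chunk; the DeDup Node's actual index format is not described here, and `ChunkIndex` is an illustrative name.

```python
import hashlib
from typing import Dict, Optional


class ChunkIndex:
    """Maps a chunk checksum to the location where that chunk was first stored."""

    def __init__(self) -> None:
        self._seen: Dict[str, int] = {}

    def lookup_or_add(self, payload: bytes, location: int) -> Optional[int]:
        """Return the earlier location of an identical chunk, or None if it is new."""
        digest = hashlib.sha256(payload).hexdigest()
        if digest in self._seen:
            return self._seen[digest]
        self._seen[digest] = location
        return None
```

When `lookup_or_add` returns a location, the sender can transmit a reference to it instead of the chunk payload.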
Recovery data flow
Recovery reverses the path using the data plane on the recovery site:
- Disk reconstruction from the selected checkpoint (latest or point-in-time).
- Snapshot finalization and system health checks for Windows workloads (via the Prep Node).
- VM instantiation on the target platform's native compute.
- Network attachment per the recovery configuration.
- Boot-order orchestration with configured delays.
- Recovery validation (Windows guests).
Recovery orchestration runs in parallel while respecting cloud API limits, quota constraints, and storage concurrency — see Limits.
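Boot-order orchestration with bounded parallelism can be sketched as ordered groups: VMs inside a group start in parallel up to a limit, and a configured delay follows each group. The group structure, the `start_vm` callback, and the `max_parallel` default are assumptions for illustration only.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def recover_in_order(
    groups: Sequence[Tuple[List[str], float]],
    start_vm: Callable[[str], None],
    max_parallel: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Start VMs group by group; each group is (vm_names, delay_after_seconds)."""
    started: List[str] = []
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        for vm_names, delay_after in groups:
            # list() waits for the whole group before the next group begins.
            list(pool.map(start_vm, vm_names))
            started.extend(vm_names)
            if delay_after > 0:
                sleep(delay_after)
    return started
```

Capping `max_parallel` is a simple stand-in for respecting cloud API limits and quotas, since no more than that many start calls are ever outstanding.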
What crosses which boundary
- Workload data moves only between your protected and recovery sites — over private subnets or VPN tunnels, optionally encrypted on the wire.
- Orchestration calls flow outbound from Datamotive nodes to the platform managers (vCenter, AWS/Azure/GCP APIs) over port 443. No organization data is transmitted on this path.
- Metadata stays inside the Management Servers on your sites. Nothing transits Datamotive infrastructure.
