Data flow

The end-to-end replication data path — changed-block identification, descriptors, rolling-window streaming, cloud-native writes, and checkpoint finalization.

Product: Datamotive Platform
Version: v2.0.3

This page traces the path data takes during a replication iteration — from changed blocks on the source to a finalized recovery checkpoint on the target — and during recovery.

Replication data flow

  1. Identify changed blocks

    The source side uses the platform's incremental tracking (vSphere CBT, or a cloud-native changed-block API such as EBS ListChangedBlocks) to find the data that changed since the last iteration. The first iteration transfers full disk data.

  2. Create replication descriptors

    Instead of streaming blocks sequentially, the engine builds descriptors — offset, length, chunk metadata, replication state — optimized for highly fragmented, non-sequential change sets.

  3. Schedule chunks into rolling windows

    Chunks (default 1 MB) are scheduled into sliding per-disk windows (default 32 MB, adaptively scaled between 16 MB and 64 MB). Multiple chunks stay in flight simultaneously; there is no stop-and-wait between batches.

  4. Stream in parallel

    Parallel chunk streams run across multiple disks, workloads, and replication workers. Target-side workers drive a pull model: they request windows, control in-flight concurrency, and centrally manage replication pressure.

  5. Apply backpressure

    Node-level and per-disk windows adapt to cloud storage latency, write queue depth, WAN variability, and retry behavior — preventing memory amplification, cloud-write overload, and storage queue saturation.

  6. Write to cloud-native storage

    Chunks are validated and written through native storage APIs — AWS EBS snapshots/volumes, Azure Managed Disks, VMware datastores, GCP Persistent Disks.

  7. Finalize the recovery checkpoint

    After chunk validation, storage-write completion, metadata synchronization, and consistency validation, the iteration is finalized into a recovery checkpoint — the basis for failover, test recovery, and migration. Intermediate checkpoints are taken every 500 MB (default).
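
Steps 2, 3, and 5 can be illustrated with a minimal sketch. It is not the Datamotive engine: `ChunkDescriptor`, `build_descriptors`, and `SlidingWindow` are hypothetical names, descriptors carry only offset and length, and the policy that picks a new window size is left out because this page does not describe it. Only the sizes (1 MB chunks, a 32 MB default window, 16 MB and 64 MB bounds) come from the text above.

```python
from collections import deque
from dataclasses import dataclass

MB = 1024 * 1024
CHUNK_SIZE = 1 * MB                                        # default chunk size
WINDOW_DEFAULT, WINDOW_MIN, WINDOW_MAX = 32 * MB, 16 * MB, 64 * MB


@dataclass(frozen=True)
class ChunkDescriptor:
    """Hypothetical descriptor: one chunk of changed data on a source disk."""
    disk_id: str
    offset: int
    length: int


def build_descriptors(disk_id, extents, chunk_size=CHUNK_SIZE):
    """Split changed extents, given as (offset, length), into chunk descriptors."""
    descriptors = []
    for offset, length in sorted(extents):
        end = offset + length
        while offset < end:
            size = min(chunk_size, end - offset)
            descriptors.append(ChunkDescriptor(disk_id, offset, size))
            offset += size
    return descriptors


class SlidingWindow:
    """Per-disk window: admits chunks while in-flight bytes fit the window."""

    def __init__(self, descriptors, window_bytes=WINDOW_DEFAULT):
        self.pending = deque(descriptors)
        self.in_flight = {}                 # offset -> descriptor
        self.window_bytes = window_bytes

    def in_flight_bytes(self):
        return sum(d.length for d in self.in_flight.values())

    def admit(self):
        """Start as many pending chunks as the window currently allows."""
        started = []
        while self.pending:
            nxt = self.pending[0]
            if self.in_flight_bytes() + nxt.length > self.window_bytes:
                break
            self.pending.popleft()
            self.in_flight[nxt.offset] = nxt
            started.append(nxt)
        return started

    def complete(self, descriptor):
        """Mark one chunk done and refill immediately (no stop-and-wait)."""
        del self.in_flight[descriptor.offset]
        return self.admit()

    def resize(self, new_bytes):
        """Backpressure hook: clamp the requested size to the 16-64 MB range."""
        self.window_bytes = max(WINDOW_MIN, min(WINDOW_MAX, new_bytes))
        return self.window_bytes
```

Because `complete` refills the window as soon as a single chunk finishes, the window slides forward continuously instead of draining a whole batch first. Shrinking the window with `resize` does not cancel work already in flight; it only delays new admissions until enough chunks complete.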

Optional per-plan transforms apply on the wire: encryption, compression, and deduplication (via the DeDup Node's chunk checksum index). See Replication configuration.
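
As a rough illustration of checksum-based deduplication, the sketch below keeps an in-memory index keyed by chunk digest and decides whether a chunk must be sent or can be referenced. Everything in it is an assumption made for illustration: the `ChecksumIndex` class, the `plan_transfer` helper, the use of SHA-256, and the shape of the returned record are not taken from the DeDup Node.

```python
import hashlib


class ChecksumIndex:
    """Hypothetical in-memory stand-in for a chunk checksum index."""

    def __init__(self):
        self._seen = {}  # hex digest -> (disk_id, offset) of the first copy

    def lookup_or_add(self, data, disk_id, offset):
        """Return (digest, location of an earlier identical chunk or None)."""
        digest = hashlib.sha256(data).hexdigest()
        existing = self._seen.get(digest)
        if existing is None:
            self._seen[digest] = (disk_id, offset)
        return digest, existing


def plan_transfer(index, disk_id, offset, data):
    """Decide whether a chunk is sent in full or replaced by a reference."""
    digest, existing = index.lookup_or_add(data, disk_id, offset)
    if existing is None:
        return {"action": "send", "checksum": digest, "bytes": len(data)}
    # An identical chunk was already seen, so only a reference is needed.
    return {"action": "reference", "checksum": digest,
            "source": existing, "bytes": 0}
```

In this sketch the first occurrence of a chunk is always sent and later identical chunks carry zero payload bytes. How the real index is persisted, and in what order the optional transforms are applied, is not covered on this page.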

Recovery data flow

Recovery reverses the path using the data plane on the recovery site:

  1. Disk reconstruction from the selected checkpoint (latest or point-in-time).
  2. Snapshot finalization and system health checks for Windows workloads (via the Prep Node).
  3. VM instantiation on the target platform's native compute.
  4. Network attachment per the recovery configuration.
  5. Boot-order orchestration with configured delays.
  6. Recovery validation (Windows guests).

Recovery orchestration runs in parallel while respecting cloud API limits, quota constraints, and storage concurrency — see Limits.
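
One way to picture boot-order orchestration under a concurrency limit is the sketch below. It assumes that VMs sharing a boot-order value start in parallel, that a bounded thread pool stands in for cloud API and quota limits, and that the configured delay is applied after each group. The `recover` function, its parameters, and the tuple layout are hypothetical and not part of the Datamotive API.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby


def recover(vms, start_vm, max_parallel=4, sleep=time.sleep):
    """Start VMs group by group.

    vms: list of (name, boot_order, delay_seconds) tuples (assumed layout).
    start_vm: callable that instantiates one VM and returns a result.
    max_parallel: cap on concurrent starts, standing in for platform limits.
    """
    started = []
    ordered = sorted(vms, key=lambda vm: vm[1])
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        for order, group in groupby(ordered, key=lambda vm: vm[1]):
            group = list(group)
            # VMs with the same boot order run in parallel, bounded by the pool.
            results = list(pool.map(start_vm, [vm[0] for vm in group]))
            started.append((order, results))
            # Wait for the longest configured delay before the next group.
            sleep(max(vm[2] for vm in group))
    return started
```

The `sleep` parameter is injectable so the ordering logic can be exercised without real waiting, for example `recover(vms, start_vm, sleep=lambda seconds: None)`.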

What crosses which boundary

  • Workload data moves only between your protected and recovery sites — over private subnets or VPN tunnels, optionally encrypted on the wire.
  • Orchestration calls flow outbound from Datamotive nodes to the platform managers (vCenter, AWS/Azure/GCP APIs) over port 443. No organization data is transmitted on this path.
  • Metadata stays inside the Management Servers on your sites. Nothing transits Datamotive infrastructure.
