Synthetic Data Generator

TrajPy provides a built-in synthetic trajectory generator (trajpy.traj_generator) that can produce four physically motivated types of particle motion. These synthetic trajectories are the foundation for building labelled training datasets for machine-learning-based trajectory classification.


Overview

The module trajpy.traj_generator exposes one generator function for each diffusion regime and a convenience helper to persist the results to disk:

Function               Description
---------------------  --------------------------------------------------------------------------
anomalous_diffusion()  Subdiffusion / superdiffusion via the Weierstrass–Mandelbrot function
normal_diffusion()     Brownian (Fickian) diffusion via a Monte-Carlo acceptance–rejection scheme
confined_diffusion()   Diffusion restricted to a circular confinement region
superdiffusion()       Directed (ballistic) motion at constant velocity
save_to_file()         Write trajectory arrays to CSV files

All generator functions share the same return convention:

x, y = generator(...)
# x – 1-D array of time points,  shape (n_steps,)
# y – position array
#     shape (n_steps,)          when n_samples == 1
#     shape (n_steps, n_samples) when n_samples  > 1
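
Because the shape of y depends on n_samples, downstream code may find it convenient to normalise both cases to a 2-D array. The sketch below uses a hypothetical helper, as_ensemble, which is not part of TrajPy and only illustrates the idea with NumPy:

```python
import numpy as np

def as_ensemble(y):
    """Return y as a 2-D array of shape (n_steps, n_samples)."""
    y = np.asarray(y, dtype=float)
    if y.ndim == 1:
        # Single trajectory: add a trailing sample axis.
        return y[:, np.newaxis]
    return y

# Placeholder arrays standing in for generator output.
single = as_ensemble(np.zeros(250))
ensemble = as_ensemble(np.zeros((250, 8)))
print(single.shape, ensemble.shape)
```

With this in place, loops over trajectories can always iterate over the second axis, whether one or many samples were requested.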

Diffusion Regimes

Anomalous Diffusion

Anomalous diffusion encompasses both subdiffusion (\(\alpha < 1\)) and superdiffusion (\(\alpha > 1\)). The mean squared displacement of an anomalous process scales as:

\[\langle r^2(t) \rangle \propto t^{\alpha}\]

Trajectories are generated using the Weierstrass–Mandelbrot (WM) stochastic function:

\[W(t) = \sum_{n=-\infty}^{\infty} \frac{\cos(\phi_n) - \cos(\gamma^n t^* + \phi_n)} {\gamma^{n\alpha/2}}\]

where \(\gamma = \sqrt{\pi}\), \(t^* = 2\pi t / N\) with \(N\) the total number of displacements in the trajectory, and \(\phi_n\) is a uniformly distributed random phase in \([0, 2\pi)\).
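
In practice the infinite sum has to be truncated. The following is a minimal sketch of such a truncated sum, assuming illustrative cut-offs n_min and n_max and one fixed set of phases per trajectory. The helper names and the cut-off values are assumptions made for this example and are not taken from the TrajPy source:

```python
import math
import random

GAMMA = math.sqrt(math.pi)

def draw_phases(n_min, n_max, rng):
    """One uniform random phase between 0 and 2*pi per term of the sum."""
    return [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_min, n_max + 1)]

def wm_truncated(t, n_total, alpha, phases, n_min):
    """Truncated Weierstrass-Mandelbrot sum W(t) for one set of phases."""
    t_star = 2.0 * math.pi * t / n_total
    total = 0.0
    for offset, phi_n in enumerate(phases):
        n = n_min + offset
        numerator = math.cos(phi_n) - math.cos(GAMMA**n * t_star + phi_n)
        total += numerator / GAMMA ** (n * alpha / 2.0)
    return total

rng = random.Random(0)
n_min, n_max = -8, 48          # illustrative cut-offs, chosen for this sketch
phases = draw_phases(n_min, n_max, rng)
w = [wm_truncated(t, 250, 0.5, phases, n_min) for t in range(250)]
```

By construction every term vanishes at t = 0, so the sketch always starts at W(0) = 0 regardless of the random phases.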

anomalous_diffusion(n_steps, n_samples, time_step, alpha)

Generate an ensemble of anomalous diffusion trajectories.

Parameters:
  • n_steps (int) – Number of time steps per trajectory.

  • n_samples (int) – Number of independent trajectories to generate.

  • time_step (float) – Duration of each time step \(\Delta t\).

  • alpha (float) – Anomalous exponent (\(0 < \alpha < 2\)). Values below 1 produce subdiffusion; values above 1 produce superdiffusion.

Returns:

(x, y) – time array and position array.

Return type:

tuple[numpy.ndarray, numpy.ndarray]
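
A quick way to sanity-check a generated trajectory is to estimate its exponent from the scaling \(\langle r^2(t) \rangle \propto t^{\alpha}\). The sketch below fits a straight line to the time-averaged mean squared displacement on a log-log scale. The function estimate_alpha and the choice of max_lag are assumptions for illustration, not TrajPy's own estimator:

```python
import numpy as np

def estimate_alpha(y, dt=1.0, max_lag=None):
    """Estimate the anomalous exponent of a 1-D trajectory from its MSD."""
    y = np.asarray(y, dtype=float)
    if max_lag is None:
        max_lag = y.size // 4          # short lags have the best statistics
    lags = np.arange(1, max_lag + 1)
    # Time-averaged MSD for each lag.
    msd = np.array([np.mean((y[lag:] - y[:-lag]) ** 2) for lag in lags])
    # Slope of log(MSD) versus log(time) is the exponent alpha.
    slope, _ = np.polyfit(np.log(lags * dt), np.log(msd), 1)
    return slope

t = np.arange(250) * 1.0
ballistic = 0.5 * t                    # constant-velocity test trajectory
alpha_hat = estimate_alpha(ballistic)
print(alpha_hat)
```

For the constant-velocity test trajectory the MSD grows exactly as the square of the lag, so the fitted exponent should sit at the ballistic value of 2.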

Normal Diffusion

Normal (Brownian) diffusion produces trajectories whose mean squared displacement grows linearly with time:

\[\langle r^2(t) \rangle = 4 D t\]

Steps are drawn using a Monte-Carlo acceptance–rejection method with the radial probability density:

\[p(u) = \frac{2u}{4D\,\Delta t} \exp\!\left(-\frac{u^2}{4D\,\Delta t}\right)\]

where \(u\) is the magnitude of the proposed displacement, \(D\) is the diffusion coefficient, and \(\Delta t\) is the time step.
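
The acceptance–rejection idea can be sketched as follows, with the density above evaluated over a single time step. This is a generic rejection sampler written for illustration: the function names, the uniform proposal on [0, u_max] and the cut-off factor are assumptions, and the proposal used inside TrajPy may differ:

```python
import numpy as np

def step_pdf(u, D, dt):
    """Radial step-length density p(u) for one time step dt."""
    return 2.0 * u / (4.0 * D * dt) * np.exp(-u**2 / (4.0 * D * dt))

def sample_step_lengths(n, D, dt, rng, u_max_factor=6.0):
    """Draw n step lengths from p(u) by acceptance-rejection."""
    sigma = np.sqrt(2.0 * D * dt)      # p(u) has its maximum at u = sigma
    p_max = step_pdf(sigma, D, dt)
    u_max = u_max_factor * sigma       # assumed cut-off; the tail beyond is negligible
    accepted = np.empty(0)
    while accepted.size < n:
        u = rng.uniform(0.0, u_max, size=n)           # proposed step lengths
        keep = rng.uniform(0.0, p_max, size=n) < step_pdf(u, D, dt)
        accepted = np.concatenate([accepted, u[keep]])
    return accepted[:n]

rng = np.random.default_rng(0)
D, dt = 1.0, 1.0
steps = sample_step_lengths(20000, D, dt, rng)
print(np.mean(steps**2))   # expected to be close to 4 * D * dt
```

The mean of the squared step lengths should approach \(4 D \Delta t\), which ties the sampler back to the mean squared displacement relation stated earlier.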

normal_diffusion(n_steps, n_samples, dx, y0, D, dt)

Generate an ensemble of normal diffusion trajectories.

Parameters:
  • n_steps (int) – Number of time steps per trajectory.

  • n_samples (int) – Number of independent trajectories.

  • dx (float) – Maximum proposed step length (defines the proposal interval \([-dx/2,\, dx/2]\)).

  • y0 (float) – Initial position.

  • D (float) – Diffusion coefficient.

  • dt (float) – Time step \(\Delta t\).

Returns:

(x, y) – time array and position array.

Return type:

tuple[numpy.ndarray, numpy.ndarray]

Confined Diffusion

Confined diffusion models a particle undergoing Brownian motion within a bounded region of radius \(R\). At each macroscopic time step, a short normal-diffusion sub-trajectory is simulated; only displacements that keep the particle inside the confinement region are accepted.
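
The accept-only-if-inside rule can be illustrated with a simplified 1-D sketch. It proposes a single Gaussian step per time step instead of a full sub-trajectory, so it is a reduced stand-in for the scheme described above, and confined_walk_1d is a hypothetical name:

```python
import numpy as np

def confined_walk_1d(radius, n_steps, D, dt, rng, y0=0.0):
    """1-D random walk that rejects any move leaving [-radius, radius]."""
    sigma = np.sqrt(2.0 * D * dt)      # width of a 1-D diffusive step
    y = np.empty(n_steps)
    y[0] = y0
    for k in range(1, n_steps):
        proposal = y[k - 1] + rng.normal(0.0, sigma)
        # Keep the previous position whenever the move would leave the region.
        y[k] = proposal if abs(proposal) <= radius else y[k - 1]
    return y

rng = np.random.default_rng(3)
y = confined_walk_1d(radius=5.0, n_steps=250, D=1.0, dt=1.0, rng=rng)
print(np.abs(y).max())
```

Rejecting a move and staying put is only one possible boundary treatment; reflecting the particle at the wall is a common alternative.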

confined_diffusion(radius, n_steps, n_samples, dx, y0, D, dt)

Generate trajectories under spatial confinement.

Parameters:
  • radius (float) – Confinement radius \(R\).

  • n_steps (int) – Number of time steps per trajectory.

  • n_samples (int) – Number of independent trajectories.

  • dx (float) – Maximum step length passed to the internal normal-diffusion sampler.

  • y0 (float) – Initial position.

  • D (float) – Diffusion coefficient.

  • dt (float) – Time step \(\Delta t\).

Returns:

(x, y) – time array and position array.

Return type:

tuple[numpy.ndarray, numpy.ndarray]

Superdiffusion (Directed Motion)

Superdiffusion via directed (ballistic) motion models a particle moving at constant velocity \(v\), such that:

\[y(t + \Delta t) = y(t) + v \,\Delta t\]

This corresponds to the ballistic limit \(\alpha = 2\), in which the mean squared displacement grows as \(t^2\). Directed motion is typically combined pairwise with normal diffusion components to create realistic active-motion trajectories.
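
The update rule and one possible way of combining it with a diffusive component can be sketched as below. Reading "combined pairwise" as element-wise addition of a drift and a Gaussian random walk is an assumption of this example, and directed_motion is a hypothetical stand-in, not the library function:

```python
import numpy as np

def directed_motion(velocity, n_steps, y0=0.0, dt=1.0):
    """Positions of a particle moving at constant velocity."""
    t = np.arange(n_steps) * dt
    return t, y0 + velocity * t

rng = np.random.default_rng(1)
n_steps, dt, D = 250, 1.0, 0.5

t, drift = directed_motion(2.0, n_steps, dt=dt)
# Diffusive component: cumulative sum of Gaussian steps with variance 2*D*dt.
noise = np.cumsum(rng.normal(0.0, np.sqrt(2.0 * D * dt), size=n_steps))
x_component = drift + noise            # directed motion plus diffusion
```

The drift term advances by exactly \(v\,\Delta t\) per step, while the added noise makes the trajectory look less like a perfect straight line.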

superdiffusion(velocity, n_steps, n_samples, y0, dt)

Generate directed (ballistic) motion trajectories.

Parameters:
  • velocity (float) – Constant drift velocity \(v\).

  • n_steps (int) – Number of time steps per trajectory.

  • n_samples (int) – Number of independent trajectories.

  • y0 (float) – Initial position.

  • dt (float) – Time step \(\Delta t\).

Returns:

(x, y) – time array and position array.

Return type:

tuple[numpy.ndarray, numpy.ndarray]


Saving Trajectories to Disk

save_to_file(y, param, path)

Save a trajectory array to a CSV file.

The output filename follows the pattern <path>/traj_<param>.csv and contains a header row t,x,y,... compatible with the trajpy.Trajectory loader.

Parameters:
  • y (numpy.ndarray) – Trajectory array of shape (n_steps, n_dims) or (n_steps,) for a single 1-D trajectory.

  • param (int | float | str) – A scalar or string that characterises the trajectory (e.g. the value of \(\alpha\) or \(D\)). Used in the filename.

  • path (str) – Directory where the file will be written.
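
For readers who want to see what the naming pattern amounts to, here is a standard-library sketch that writes a single 1-D trajectory to <path>/traj_<param>.csv. The helper write_trajectory_csv is hypothetical and the two-column t,x layout is an assumption. Whether save_to_file creates missing directories is not stated here, so the sketch creates the directory itself:

```python
import csv
import os
import tempfile

def write_trajectory_csv(t, y, param, path):
    """Write columns t,x to <path>/traj_<param>.csv (hypothetical helper)."""
    os.makedirs(path, exist_ok=True)               # make sure the folder exists
    filename = os.path.join(path, f"traj_{param}.csv")
    with open(filename, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["t", "x"])                # header row
        writer.writerows(zip(t, y))                # one row per time step
    return filename

out_dir = tempfile.mkdtemp()
saved = write_trajectory_csv([0.0, 1.0, 2.0], [0.0, 0.5, 0.3], 0.5, out_dir)
print(saved)
```

Creating the target folder up front is also a sensible precaution before running the usage examples, which write into subdirectories such as data/anomalous.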


Usage Examples

Anomalous Diffusion

Generate 20 anomalous trajectories spanning \(\alpha \in [0.1, 1.9]\), which stays inside the documented range \(0 < \alpha < 2\), and save each one:

import numpy as np
import trajpy.traj_generator as tjg

n_steps   = 250  # time steps per trajectory
n_samples = 1    # one trajectory per alpha value
dt        = 1.0  # time increment

# 20 evenly spaced exponents covering sub- and superdiffusion
alphas = np.linspace(0.1, 1.9, 20)

for alpha in alphas:
    x, y = tjg.anomalous_diffusion(n_steps, n_samples, dt, alpha=alpha)
    tjg.save_to_file(y, alpha, 'data/anomalous')

Normal Diffusion

Generate trajectories for several diffusion coefficients:

import numpy as np
import trajpy.traj_generator as tjg

n_steps   = 250
n_samples = 1
dt        = 1.0
diffusivity = np.array([10., 100., 1000., 10000.])

for D in diffusivity:
    x, y = tjg.normal_diffusion(n_steps, n_samples, dx=1.0, y0=0., D=D, dt=dt)
    tjg.save_to_file(y, D, 'data/normal')

Confined Diffusion

Generate confined trajectories for three confinement radii:

import numpy as np
import trajpy.traj_generator as tjg

n_steps   = 250
n_samples = 1
dt        = 1.0
D         = 100.
radii     = np.array([5., 10., 20.])

for R in radii:
    x, y = tjg.confined_diffusion(R, n_steps, n_samples, dx=1.0, y0=0.0, D=D, dt=dt)
    tjg.save_to_file(y, R, 'data/confined')

Superdiffusion (Directed Motion)

Generate directed-motion trajectories for several velocities:

import numpy as np
import trajpy.traj_generator as tjg

n_steps   = 250
n_samples = 1
dt        = 1.0
velocities = np.array([0.1, 1., 2., 5.])

for v in velocities:
    x, y = tjg.superdiffusion(v, n_steps, n_samples, y0=0., dt=dt)
    tjg.save_to_file(y, v, 'data/superdiff')

Parameter Guidelines

The table below lists recommended starting values for each trajectory type.

Regime            n_steps  dt   Key parameter  Suggested range
----------------  -------  ---  -------------  ---------------
Anomalous         ≥ 100    1.0  alpha          0.1 – 2.0
Normal diffusion  ≥ 100    1.0  D              10 – 10 000
Confined          ≥ 100    1.0  radius         5 – 50
Superdiffusion    ≥ 100    1.0  velocity       0.1 – 10

Tip

Use n_samples > 1 to produce an ensemble in a single call. The returned y array will have shape (n_steps, n_samples), which can be iterated column-wise when saving individual trajectories.
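
Column-wise iteration over such an ensemble might look like the sketch below. The array y here is a NumPy stand-in for generator output, so the example runs without TrajPy. The commented save call and its label format are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n_steps, n_samples = 250, 5

# Stand-in for the y array returned by a generator called with n_samples > 1.
y = np.cumsum(rng.normal(size=(n_steps, n_samples)), axis=0)

# Each column is one trajectory of shape (n_steps,).
trajectories = [y[:, j] for j in range(y.shape[1])]

for j, column in enumerate(trajectories):
    label = f"sample{j}"               # hypothetical per-trajectory label
    # tjg.save_to_file(column, label, 'data/normal')
    print(label, column.shape)
```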


Pre-built Dataset

Instead of generating trajectories from scratch, you can download the labelled dataset that was produced with TrajPy and is publicly available on Zenodo:

Dataset generated by TrajPy for training a trajectory classifier. https://zenodo.org/records/3627650

For more details: https://trajpy.readthedocs.io

The dataset covers four trajectory classes (normal diffusion, directed motion, anomalous diffusion and confined diffusion) and provides the following pre-computed feature columns:

Column        Description
------------  -----------------------------------------------------------------
alpha         Anomalous exponent derived from the mean squared displacement
ratio         Mean squared displacement ratio (short-time vs long-time scaling)
df            Fractal dimension of the trajectory
anisotropy    Anisotropy of the radius of gyration tensor
kurtosis      Kurtosis of the radius of gyration
straightness  Similarity of the trajectory to a straight line
gaussianity   Similarity of the displacement distribution to a Gaussian
trappedness   Probability that the particle is spatially trapped
diffusivity   Short-time diffusion coefficient
efficiency    Efficiency of the particle’s movement
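
To give a feel for what such a feature measures, the sketch below computes one commonly used definition of straightness: net displacement divided by total path length. This definition is an assumption chosen for illustration, and the formula used to build the dataset may differ:

```python
import numpy as np

def straightness(points):
    """Net displacement over path length for an (n_steps, n_dims) array."""
    points = np.asarray(points, dtype=float)
    steps = np.diff(points, axis=0)
    path_length = np.sum(np.linalg.norm(steps, axis=1))
    if path_length == 0.0:
        return 0.0                     # particle never moved
    net = np.linalg.norm(points[-1] - points[0])
    return net / path_length

line = np.column_stack([np.arange(10.0), 2.0 * np.arange(10.0)])
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]])
print(straightness(line), straightness(square))
```

Under this definition a perfectly straight path scores 1 and a closed loop scores 0, with diffusive trajectories falling in between.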

Tip

You can load the dataset directly with pandas and use it with any Scikit-Learn classifier:

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

df = pd.read_csv('trajpy_dataset.csv')
X = df.drop(columns=['label'])
y = df['label']

clf = RandomForestClassifier()
clf.fit(X, y)

API Reference

trajpy.traj_generator.anomalous_diffusion(n_steps: int, n_samples: int, time_step: float, alpha: float) -> Tuple[ndarray, ndarray]

Generates an ensemble of anomalous trajectories.

Parameters:
  • n_steps – total number of steps

  • n_samples – number of simulations

  • time_step – time step

  • alpha – anomalous exponent

Return x, y:

time, array containing N_samples trajectories with N_steps

trajpy.traj_generator.confined_diffusion(radius: float, n_steps: int, n_samples: int, dx: float, y0: float, D: float, dt: float) -> Tuple[ndarray, ndarray]

Generates trajectories under confinement.

Parameters:
  • radius – confinement radius

  • n_steps – number of displacements

  • n_samples – number of trajectories

  • dx – displacement

  • y0 – initial position

  • D – diffusion coefficient

  • dt – time step

Return x, y:

time, array containing N_samples trajectories with N_steps

trajpy.traj_generator.normal_diffusion(n_steps: int, n_samples: int, dx: float, y0: float, D: float, dt: float) -> Tuple[ndarray, ndarray]

Generates an ensemble of normal diffusion trajectories.

Parameters:
  • n_steps – total steps

  • n_samples – number of trajectories

  • dx – maximum step length

  • y0 – starting position

  • D – diffusivity

  • dt – time step

Return x, y:

time, array containing N_samples trajectories with N_steps

trajpy.traj_generator.normal_distribution(u: float, D: float, dt: float) -> float

This is the step-length probability density function for normal diffusion.

Parameters:
  • u – absolute distance travelled by the particle during the time interval dt

  • D – diffusivity

  • dt – time interval

Return pdf:

probability density function

trajpy.traj_generator.save_to_file(y: ndarray, param: int | float | str, path: str) -> None

Saves the trajectories to a file.

Parameters:
  • y – trajectory array

  • param – a parameter that characterizes the kind of trajectory

  • path – path to the folder where the file will be saved

trajpy.traj_generator.superdiffusion(velocity: float, n_steps: int, n_samples: int, y0: float, dt: float) -> Tuple[ndarray, ndarray]

Generates directed-motion trajectories. Combine pairwise with normal diffusion components.

Parameters:
  • velocity – constant velocity

  • n_steps – number of time steps

  • n_samples – number of trajectories

  • y0 – initial position

  • dt – time interval

Return x, y:

time, array containing N_samples trajectories with N_steps

trajpy.traj_generator.weierstrass_mandelbrot(t: float, n_displacements: int, alpha: float) -> float

Calculates the Weierstrass–Mandelbrot function:

\[W(t) = \sum_{n=-\infty}^{\infty} \frac{\cos(\phi_n) - \cos(\gamma^n t^* + \phi_n)}{\gamma^{n\alpha/2}}\]

Parameters:
  • t – time step

  • n_displacements – number of displacements

  • alpha – anomalous exponent

Returns:

anomalous step