Synthetic Data Generator¶
TrajPy provides a built-in synthetic trajectory generator (trajpy.traj_generator) that can produce
four physically motivated types of particle motion. These synthetic trajectories are the foundation for
building labelled training datasets for machine-learning-based trajectory classification.
Overview¶
The module trajpy.traj_generator exposes one generator function for each diffusion regime and a
convenience helper to persist the results to disk:
| Function | Description |
|---|---|
| anomalous_diffusion | Subdiffusion / superdiffusion via the Weierstrass–Mandelbrot function |
| normal_diffusion | Brownian (Fickian) diffusion via a Monte-Carlo acceptance–rejection scheme |
| confined_diffusion | Diffusion restricted to a circular confinement region |
| superdiffusion | Directed (ballistic) motion at constant velocity |
| save_to_file | Write trajectory arrays to CSV files |
All generator functions share the same return convention:
x, y = generator(...)
# x – 1-D array of time points, shape (n_steps,)
# y – position array
#     shape (n_steps,) when n_samples == 1
#     shape (n_steps, n_samples) when n_samples > 1
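Because the shape of y depends on n_samples, downstream code often has to handle both cases. The following is a minimal sketch of one way to normalize the output using only NumPy. The helper name as_columns is hypothetical and not part of TrajPy, and the arrays are stand-ins for generator output:

```python
import numpy as np

def as_columns(y):
    """Return y as a 2-D array of shape (n_steps, n_samples)."""
    y = np.asarray(y)
    # A single trajectory arrives as (n_steps,); give it an explicit column axis.
    return y.reshape(-1, 1) if y.ndim == 1 else y

# Stand-in arrays with the two documented shapes (not real generator output).
single = np.zeros(250)          # n_samples == 1
ensemble = np.zeros((250, 5))   # n_samples == 5

single_2d = as_columns(single)
ensemble_2d = as_columns(ensemble)
print(single_2d.shape, ensemble_2d.shape)   # (250, 1) (250, 5)
```

With this in place, code that loops over columns works the same way for one trajectory or many.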
Diffusion Regimes¶
Anomalous Diffusion¶
Anomalous diffusion encompasses both subdiffusion (\(\alpha < 1\)) and superdiffusion (\(\alpha > 1\)). The mean squared displacement of an anomalous process scales as:
\[\langle x^2(t) \rangle \propto t^{\alpha} \, .\]
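Trajectories are generated using the Weierstrass–Mandelbrot (WM) stochastic function:
\[W(t) = \sum_{n=-\infty}^{\infty} \frac{\cos{(\phi_n )} - \cos{(\gamma^n t^* + \phi_n )} }{\gamma^{n\alpha/2}} \, ,\]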
where \(\gamma = \sqrt{\pi}\), \(t^* = 2\pi t / N\), and \(\phi_n\) is a uniformly distributed random phase in \([0, 2\pi)\).
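In practice the infinite WM sum \(W(t) = \sum_n [\cos\phi_n - \cos(\gamma^n t^* + \phi_n)] / \gamma^{n\alpha/2}\) has to be truncated. The following NumPy sketch illustrates this and is not TrajPy's implementation. The truncation limits n_min and n_max, the function name wm_trajectory, and the choice to hold the phases fixed along one trajectory are all assumptions made for illustration:

```python
import numpy as np

def wm_trajectory(n_steps, alpha, rng, n_min=-8, n_max=48):
    """Evaluate a truncated Weierstrass-Mandelbrot sum at t = 0..n_steps-1."""
    gamma = np.sqrt(np.pi)
    t_star = 2.0 * np.pi * np.arange(n_steps) / n_steps    # t* = 2*pi*t/N
    n = np.arange(n_min, n_max + 1)                        # truncated index range (assumed)
    phi = rng.uniform(0.0, 2.0 * np.pi, size=n.size)       # one random phase per term
    # Broadcast to shape (n_steps, n_terms), then sum over the terms.
    phase = np.power(gamma, n)[None, :] * t_star[:, None] + phi[None, :]
    terms = (np.cos(phi)[None, :] - np.cos(phase)) / np.power(gamma, n * alpha / 2.0)
    return terms.sum(axis=1)

rng = np.random.default_rng(0)
w_sub = wm_trajectory(250, alpha=0.5, rng=rng)    # subdiffusive exponent
w_super = wm_trajectory(250, alpha=1.5, rng=rng)  # superdiffusive exponent
```

Every term vanishes at \(t = 0\), so each sketched trajectory starts at zero regardless of the random phases.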
- anomalous_diffusion(n_steps, n_samples, time_step, alpha)¶
Generate an ensemble of anomalous diffusion trajectories.
- Parameters:
n_steps (int) – Number of time steps per trajectory.
n_samples (int) – Number of independent trajectories to generate.
time_step (float) – Duration of each time step \(\Delta t\).
alpha (float) – Anomalous exponent (\(0 < \alpha < 2\)). Values below 1 produce subdiffusion; values above 1 produce superdiffusion.
- Returns:
(x, y) – time array and position array.
- Return type:
tuple[numpy.ndarray, numpy.ndarray]
Normal Diffusion¶
Normal (Brownian) diffusion produces trajectories whose mean squared displacement grows linearly with time:
\[\langle x^2(t) \rangle \propto t \, .\]
Steps are drawn using a Monte-Carlo acceptance–rejection method. Each proposed displacement is accepted or rejected according to a radial probability density that depends on \(u\), the magnitude of the proposed displacement, \(D\), the diffusion coefficient, and \(\Delta t\), the time step.
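The acceptance–rejection idea can be sketched as follows. This is an illustration only: the Gaussian-shaped density step_density is an assumed stand-in (the exact form and normalization of TrajPy's normal_distribution may differ), and sample_steps is a hypothetical helper.

```python
import numpy as np

def step_density(u, D, dt):
    """Assumed unnormalized step density; peaks at u = 0 with value 1."""
    return np.exp(-u**2 / (4.0 * D * dt))

def sample_steps(n, dx, D, dt, rng):
    """Draw n displacements from [-dx/2, dx/2] by acceptance-rejection."""
    steps = []
    while len(steps) < n:
        u = rng.uniform(-dx / 2.0, dx / 2.0)     # propose a displacement
        # Accept with probability density(u) / max density (the max is 1 here).
        if rng.uniform() < step_density(u, D, dt):
            steps.append(u)
    return np.array(steps)

rng = np.random.default_rng(1)
steps = sample_steps(1000, dx=1.0, D=0.05, dt=1.0, rng=rng)
trajectory = np.cumsum(steps)   # positions are the running sum of accepted steps
```

Small proposals are accepted more often than large ones, so the accepted steps follow the shape of the target density within the proposal interval.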
- normal_diffusion(n_steps, n_samples, dx, y0, D, dt)¶
Generate an ensemble of normal diffusion trajectories.
- Parameters:
n_steps (int) – Number of time steps per trajectory.
n_samples (int) – Number of independent trajectories.
dx (float) – Maximum proposed step length (defines the proposal interval \([-dx/2,\, dx/2]\)).
y0 (float) – Initial position.
D (float) – Diffusion coefficient.
dt (float) – Time step \(\Delta t\).
- Returns:
(x, y) – time array and position array.
- Return type:
tuple[numpy.ndarray, numpy.ndarray]
Confined Diffusion¶
Confined diffusion models a particle undergoing Brownian motion within a bounded region of radius \(R\). At each macroscopic time step, a short normal-diffusion sub-trajectory is simulated; only displacements that keep the particle inside the confinement region are accepted.
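A minimal 1-D sketch of the confinement rule, assuming Gaussian proposal steps with variance \(2 D \Delta t\) in place of TrajPy's internal normal-diffusion sub-trajectories. The name confined_walk is hypothetical:

```python
import numpy as np

def confined_walk(radius, n_steps, y0, D, dt, rng):
    """Random walk that rejects any move leaving the interval [-radius, radius]."""
    y = np.empty(n_steps)
    y[0] = y0
    sigma = np.sqrt(2.0 * D * dt)          # step scale for 1-D Brownian motion
    for i in range(1, n_steps):
        proposal = y[i - 1] + rng.normal(0.0, sigma)
        # Keep the move only if the particle stays inside the confinement region.
        y[i] = proposal if abs(proposal) <= radius else y[i - 1]
    return y

rng = np.random.default_rng(2)
y_conf = confined_walk(radius=5.0, n_steps=250, y0=0.0, D=1.0, dt=1.0, rng=rng)
```

Rejected moves leave the particle where it was, so the trajectory never leaves the region as long as y0 lies inside it.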
- confined_diffusion(radius, n_steps, n_samples, dx, y0, D, dt)¶
Generate trajectories under spatial confinement.
- Parameters:
radius (float) – Confinement radius \(R\).
n_steps (int) – Number of time steps per trajectory.
n_samples (int) – Number of independent trajectories.
dx (float) – Maximum step length passed to the internal normal-diffusion sampler.
y0 (float) – Initial position.
D (float) – Diffusion coefficient.
dt (float) – Time step \(\Delta t\).
- Returns:
(x, y) – time array and position array.
- Return type:
tuple[numpy.ndarray, numpy.ndarray]
Superdiffusion (Directed Motion)¶
Superdiffusion via directed (ballistic) motion models a particle moving at constant velocity \(v\), such that:
\[y(t) = y_0 + v\,t \, .\]
This represents the fastest possible transport regime and is typically combined pairwise with normal diffusion components to create realistic active-motion trajectories.
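For constant velocity the position is linear in time. The sketch below builds such a coordinate with NumPy and pairs it with a diffusive second coordinate, which is one possible reading of "combined pairwise". The Gaussian increments and all variable names are assumptions, not TrajPy's code:

```python
import numpy as np

n_steps, dt, v, y0, D = 250, 1.0, 2.0, 0.0, 1.0
rng = np.random.default_rng(3)

t = np.arange(n_steps) * dt
directed = y0 + v * t                                # ballistic coordinate
# Diffusive coordinate from Gaussian increments (assumed 1-D Brownian model).
increments = rng.normal(0.0, np.sqrt(2.0 * D * dt), size=n_steps - 1)
diffusive = np.concatenate(([y0], y0 + np.cumsum(increments)))

active = np.column_stack((directed, diffusive))      # shape (n_steps, 2)
```

The resulting two-column array drifts steadily along one axis while fluctuating randomly along the other.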
- superdiffusion(velocity, n_steps, n_samples, y0, dt)¶
Generate directed (ballistic) motion trajectories.
- Parameters:
velocity (float) – Constant drift velocity \(v\).
n_steps (int) – Number of time steps per trajectory.
n_samples (int) – Number of independent trajectories.
y0 (float) – Initial position.
dt (float) – Time step \(\Delta t\).
- Returns:
(x, y) – time array and position array.
- Return type:
tuple[numpy.ndarray, numpy.ndarray]
Saving Trajectories to Disk¶
- save_to_file(y, param, path)¶
Save a trajectory array to a CSV file.
The output file is written to <path>/traj_<param>.csv and contains a header row t,x,y,... compatible with the trajpy.Trajectory loader.
- Parameters:
y (numpy.ndarray) – Trajectory array of shape (n_steps, n_dims) or (n_steps,) for a single 1-D trajectory.
param (int | float | str) – A scalar or string that characterises the trajectory (e.g. the value of \(\alpha\) or \(D\)). Used in the filename.
path (str) – Directory where the file will be written.
Usage Examples¶
Anomalous Diffusion¶
Generate 20 anomalous trajectories spanning \(\alpha \in [0.1, 2.0]\) and save each one:
import numpy as np
import trajpy.traj_generator as tjg
n_steps = 250 # time steps per trajectory
n_samples = 1 # one trajectory per alpha value
dt = 1.0 # time increment
alphas = np.linspace(0.1, 2.0, 20)
for alpha in alphas:
    x, y = tjg.anomalous_diffusion(n_steps, n_samples, dt, alpha=alpha)
    tjg.save_to_file(y, alpha, 'data/anomalous')
Normal Diffusion¶
Generate trajectories for several diffusion coefficients:
import numpy as np
import trajpy.traj_generator as tjg
n_steps = 250
n_samples = 1
dt = 1.0
diffusivity = np.array([10., 100., 1000., 10000.])
for D in diffusivity:
    x, y = tjg.normal_diffusion(n_steps, n_samples, dx=1.0, y0=0., D=D, dt=dt)
    tjg.save_to_file(y, D, 'data/normal')
Confined Diffusion¶
Generate confined trajectories for three confinement radii:
import numpy as np
import trajpy.traj_generator as tjg
n_steps = 250
n_samples = 1
dt = 1.0
D = 100.
radii = np.array([5., 10., 20.])
for R in radii:
    x, y = tjg.confined_diffusion(R, n_steps, n_samples, dx=1.0, y0=0.0, D=D, dt=dt)
    tjg.save_to_file(y, R, 'data/confined')
Superdiffusion (Directed Motion)¶
Generate directed-motion trajectories for several velocities:
import numpy as np
import trajpy.traj_generator as tjg
n_steps = 250
n_samples = 1
dt = 1.0
velocities = np.array([0.1, 1., 2., 5.])
for v in velocities:
    x, y = tjg.superdiffusion(v, n_steps, n_samples, y0=0., dt=dt)
    tjg.save_to_file(y, v, 'data/superdiff')
Parameter Guidelines¶
The table below lists recommended starting values for each trajectory type.
| Regime | n_steps | dt | Key parameter | Suggested range |
|---|---|---|---|---|
| Anomalous | ≥ 100 | 1.0 | alpha | 0.1 – 2.0 |
| Normal diffusion | ≥ 100 | 1.0 | D | 10 – 10 000 |
| Confined | ≥ 100 | 1.0 | radius | 5 – 50 |
| Superdiffusion | ≥ 100 | 1.0 | velocity | 0.1 – 10 |
Tip
Use n_samples > 1 to produce an ensemble in a single call. The returned y
array will have shape (n_steps, n_samples), which can be iterated column-wise
when saving individual trajectories.
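A short sketch of that column-wise iteration. The arrays x and y here are stand-ins with the documented shapes, not real generator output. The loop pairs each trajectory with the shared time axis and leaves open how each one is then saved:

```python
import numpy as np

n_steps, n_samples, dt = 250, 4, 1.0
rng = np.random.default_rng(4)

x = np.arange(n_steps) * dt                                   # stand-in time axis
y = np.cumsum(rng.normal(size=(n_steps, n_samples)), axis=0)  # stand-in ensemble

trajectories = []
for i in range(y.shape[1]):
    # Column i is one trajectory; pair it with the shared time axis.
    trajectories.append(np.column_stack((x, y[:, i])))

print(len(trajectories), trajectories[0].shape)   # 4 (250, 2)
```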
Pre-built Dataset¶
Instead of generating trajectories from scratch, you can download the labelled dataset that was produced with TrajPy and is publicly available on Zenodo:
Dataset generated by TrajPy for training a trajectory classifier. https://zenodo.org/records/3627650
For more details: https://trajpy.readthedocs.io
The dataset covers four trajectory classes (normal diffusion, directed motion, anomalous diffusion and confined diffusion) and provides the following pre-computed feature columns:
- Anomalous exponent derived from the mean squared displacement
- Mean squared displacement ratio (short-time vs long-time scaling)
- Fractal dimension of the trajectory
- Anisotropy of the radius of gyration tensor
- Kurtosis of the radius of gyration
- Similarity of the trajectory to a straight line
- Similarity of the displacement distribution to a Gaussian
- Probability that the particle is spatially trapped
- Short-time diffusion coefficient
- Efficiency of the particle’s movement
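As an illustration of the anomalous-exponent feature, the exponent can be estimated as the slope of the mean squared displacement on a log-log scale. This is a generic sketch, not the routine used to build the dataset. The function names and the choice of lags are assumptions:

```python
import numpy as np

def time_averaged_msd(y, max_lag):
    """Time-averaged MSD of a 1-D trajectory for lags 1..max_lag."""
    lags = np.arange(1, max_lag + 1)
    msd = np.array([np.mean((y[lag:] - y[:-lag]) ** 2) for lag in lags])
    return lags, msd

def estimate_alpha(y, dt=1.0, max_lag=20):
    """Slope of log(MSD) versus log(lag time), i.e. MSD ~ t**alpha."""
    lags, msd = time_averaged_msd(y, max_lag)
    slope, _ = np.polyfit(np.log(lags * dt), np.log(msd), 1)
    return slope

# Ballistic test case: y = v * t has MSD = (v * lag)**2, so alpha should be 2.
t = np.arange(250, dtype=float)
alpha_ballistic = estimate_alpha(0.5 * t)
```

For noisy trajectories the estimate depends on max_lag, since the MSD at long lags is averaged over fewer displacement pairs.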
Tip
You can load the dataset directly with pandas and use it with any Scikit-Learn classifier:
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
df = pd.read_csv('trajpy_dataset.csv')
X = df.drop(columns=['label'])
y = df['label']
clf = RandomForestClassifier()
clf.fit(X, y)
API Reference¶
- trajpy.traj_generator.anomalous_diffusion(n_steps: int, n_samples: int, time_step: float, alpha: float) → Tuple[ndarray, ndarray]¶
Generates an ensemble of anomalous trajectories.
- Parameters:
n_steps – total number of steps
n_samples – number of simulations
time_step – time step
alpha – anomalous exponent
- Return x, y:
time, array containing N_samples trajectories with N_steps
- trajpy.traj_generator.confined_diffusion(radius: float, n_steps: int, n_samples: int, dx: float, y0: float, D: float, dt: float) → Tuple[ndarray, ndarray]¶
Generates trajectories under confinement.
- Parameters:
radius – confinement radius
n_steps – number of displacements
n_samples – number of trajectories
dx – displacement
y0 – initial position
D – diffusion coefficient
dt – time step
- Return x, y:
time, array containing N_samples trajectories with N_steps
- trajpy.traj_generator.normal_diffusion(n_steps: int, n_samples: int, dx: float, y0: float, D: float, dt: float) → Tuple[ndarray, ndarray]¶
Generates an ensemble of normal diffusion trajectories.
- Parameters:
n_steps – total steps
n_samples – number of trajectories
dx – maximum step length
y0 – starting position
D – diffusivity
dt – time step
- Return x, y:
time, array containing N_samples trajectories with N_steps
- trajpy.traj_generator.normal_distribution(u: float, D: float, dt: float) → float¶
This is the step-length probability density function for normal diffusion.
- Parameters:
u – absolute distance travelled by the particle during the time interval dt
D – diffusivity
dt – time interval
- Return pdf:
probability density function
- trajpy.traj_generator.save_to_file(y: ndarray, param: int | float | str, path: str) → None¶
Saves the trajectories to a file.
- Parameters:
y – trajectory array
param – a parameter that characterizes the kind of trajectory
path – path to the folder where the file will be saved
- trajpy.traj_generator.superdiffusion(velocity: float, n_steps: int, n_samples: int, y0: float, dt: float) → Tuple[ndarray, ndarray]¶
Generates directed-motion trajectories. Combine pairwise with normal diffusion components.
- Parameters:
velocity – constant velocity
n_steps – number of time steps
n_samples – number of trajectories
y0 – initial position
dt – time interval
- Return x, y:
time, array containing N_samples trajectories with N_steps
- trajpy.traj_generator.weierstrass_mandelbrot(t: float, n_displacements: int, alpha: float) → float¶
Calculates the Weierstrass–Mandelbrot function
\[W(t) = \sum_{n=-\infty}^{\infty} \frac{\cos{(\phi_n )} - \cos{(\gamma^n t^* + \phi_n )} }{\gamma^{n\alpha/2}} \, .\]
- Parameters:
t – time step
n_displacements – number of displacements
alpha – anomalous exponent
- Returns:
anomalous step