Performance Optimization

Tips and techniques for maximizing simulation performance.

Profiling Your Simulation

Before optimizing, identify bottlenecks:

from mechanics_dsl import PhysicsCompiler
from mechanics_dsl.utils.profiling import profile_simulation

compiler = PhysicsCompiler()
compiler.compile_dsl(source)

# Profile the simulation
with profile_simulation() as prof:
    solution = compiler.simulate(t_span=(0, 10))

# Print timing breakdown
prof.print_stats()

Typical output:

MechanicsDSL Profiling Report
=============================
Compilation:     0.234 s (12.3%)
Simulation:      1.567 s (82.4%)
  - RHS evals:   1.234 s (64.9%)
  - Integration: 0.333 s (17.5%)
Visualization:   0.100 s (5.3%)

Solver Selection

Choose the right integrator for your problem:

Solver	Best For	Notes
RK45	General purpose	Default, adaptive step
DOP853	High accuracy needs	8th order, fewer evals
LSODA	Unknown stiffness	Auto-switches methods
BDF	Stiff systems	Implicit, stable
Radau	Very stiff systems	Implicit, high order

Set solver in DSL:

\solve{DOP853}

Or in Python:

solution = compiler.simulate(t_span=(0, 10), method='DOP853')

Tolerance Tuning

Balance accuracy vs speed with tolerances:

# Faster but less accurate
solution = compiler.simulate(rtol=1e-3, atol=1e-6)

# Slower but very accurate
solution = compiler.simulate(rtol=1e-12, atol=1e-14)

Rule of thumb:

Visualization only: rtol=1e-3 is fine
Conservation checks: rtol=1e-6 to 1e-9
Research quality: rtol=1e-10 or tighter

Symbolic Simplification

Complex Lagrangians generate complex equations. Simplify:

compiler = PhysicsCompiler(simplify=True)  # Default

# For very complex systems, try aggressive simplification
compiler = PhysicsCompiler(simplify='aggressive')

This calls SymPy’s simplification routines which may take longer but produce faster runtime code.

Caching

Enable equation caching to avoid recompilation:

from mechanics_dsl.utils.caching import enable_cache

enable_cache(max_size=100)  # Cache last 100 compilations

# First call compiles
compiler.compile_dsl(source)  # ~0.5s

# Subsequent calls use cache
compiler.compile_dsl(source)  # ~0.01s

C++ Code Generation

For maximum performance, generate native code:

# Generate and compile C++
compiler.compile_to_cpp("simulation.cpp", compile_binary=True)

Typical speedups: 10-100x faster than Python.

See codegen/cpp for details.

Parallelization

For N-body or SPH simulations, use OpenMP:

compiler.compile_to_cpp("simulation.cpp", target="openmp")

Run with multiple threads:

export OMP_NUM_THREADS=8
./simulation

Memory Efficiency

For long simulations, avoid storing every time point:

# Store fewer points
solution = compiler.simulate(
    t_span=(0, 1000),
    num_points=1000  # Instead of default 10000
)

Or use dense output for interpolation:

solution = compiler.simulate(dense_output=True)

# Evaluate at any time
state_at_50 = solution.sol(50.0)

Common Performance Issues

Issue: Simulation slows down over time

Check for energy divergence (numerical instability)
Try smaller time step or implicit solver

Issue: Compilation takes too long

Simplify Lagrangian if possible
Enable caching
Pre-compile equations and reuse

Issue: Memory usage grows

Reduce num_points
Use streaming output for long simulations
Process results in chunks

Benchmarks

Reference performance on Intel i7-10700K:

System	Points	Python	C++
Simple pendulum	10,000	50 ms	1 ms
Double pendulum	10,000	210 ms	8 ms
3-body problem	10,000	530 ms	15 ms
Figure-8 orbit	10,000	1.2 s	35 ms
SPH (1000 particles)	2000 frames	N/A	4.2 s