Parallelization

Functions in circuit_analyzer, such as monte_carlo_corners, support parallelization: they use multiple cores on a single machine to run several simulations at the same time. The work can be split over the wavelength range or over the (independent) simulation samples. With a small number of samples or a small wavelength range, the overhead of starting the workers may cancel out the gain, so the simulation time might not improve.
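To illustrate what splitting over the wavelength range means, here is a minimal sketch using only the Python standard library. It is not the circuit_analyzer implementation: split_into_chunks, simulate_chunk and simulate_parallel are hypothetical names, and the "simulation" is a placeholder formula. Each worker receives one contiguous slice of the wavelength points, and the partial results are joined in order.

```python
from concurrent.futures import ProcessPoolExecutor


def split_into_chunks(points, n_chunks):
    """Split a list into at most n_chunks contiguous, near-equal pieces."""
    n_chunks = max(1, min(n_chunks, len(points)))
    size, extra = divmod(len(points), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional point each.
        stop = start + size + (1 if i < extra else 0)
        chunks.append(points[start:stop])
        start = stop
    return chunks


def simulate_chunk(wavelengths):
    """Hypothetical stand-in for a circuit simulation over some wavelengths."""
    return [1.0 / (1.0 + (wl - 1.55) ** 2) for wl in wavelengths]


def simulate_parallel(wavelengths, n_workers):
    """Run simulate_chunk on each slice in a separate worker process."""
    chunks = split_into_chunks(wavelengths, n_workers)
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(simulate_chunk, chunks))
    # Flatten the per-chunk results back into one list, in the original order.
    return [value for chunk in results for value in chunk]


if __name__ == "__main__":
    wavelengths = [1.5 + 0.001 * i for i in range(101)]
    transmission = simulate_parallel(wavelengths, n_workers=4)
    print(len(transmission))
```

With only a handful of wavelength points, each chunk is so small that starting the worker processes costs more than the computation itself, which is why short runs may see no improvement.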

The default parallelization technique uses concurrent.futures.ProcessPoolExecutor. The number of threads (strictly speaking, worker processes) is chosen automatically by ProcessPoolExecutor, but can be overridden with the LUCEDA_NUM_THREADS environment variable.
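The sketch below shows one way an environment variable can control the size of a ProcessPoolExecutor. It is an illustration only: resolve_worker_count is a hypothetical helper, and how the library itself reads and validates LUCEDA_NUM_THREADS is an assumption here. Passing max_workers=None lets ProcessPoolExecutor pick its own default.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def resolve_worker_count(env_var="LUCEDA_NUM_THREADS"):
    """Return the worker count from the environment, or None if it is unset.

    Hypothetical helper: the library's own parsing may differ.
    """
    raw = os.environ.get(env_var)
    if raw is None or not raw.strip():
        return None  # let ProcessPoolExecutor choose automatically
    count = int(raw)
    if count < 1:
        raise ValueError(f"{env_var} must be a positive integer, got {raw!r}")
    return count


if __name__ == "__main__":
    # Setting the variable from Python; it can equally be set in the shell
    # before the script is launched.
    os.environ["LUCEDA_NUM_THREADS"] = "4"
    with ProcessPoolExecutor(max_workers=resolve_worker_count()) as pool:
        print(list(pool.map(abs, [-1, -2, -3])))
```

Since it is not stated here when the variable is read, setting it in the shell before starting Python is the more cautious choice.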

Tips and tricks

Make sure to wrap the main script in an if __name__ == '__main__' clause; otherwise, the workers will attempt to run everything in the main file many times.
Parallelization only pays off once the circuits are large enough, the Monte Carlo run has enough samples, or there are enough wavelength points.
The gain per added thread is greatest at small thread counts. More threads still speed up the simulation, but the speedup does not scale linearly. Large simulations benefit the most.
Increasing the number of threads beyond the number of physical cores of your system slows down the simulation.
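The first tip can be sketched as follows, again with the standard library only. run_sample is a hypothetical stand-in for one independent Monte Carlo sample, and the cap of 4 workers is an arbitrary choice for the example. All work that starts processes lives in main(), which is called only from the guarded block, so a worker that re-imports this file does not start the run again.

```python
import os
import random
from concurrent.futures import ProcessPoolExecutor


def run_sample(seed):
    """Hypothetical stand-in for one independent Monte Carlo sample."""
    rng = random.Random(seed)
    return rng.gauss(0.0, 1.0)


def main():
    # os.cpu_count() reports logical cores, which can exceed the number of
    # physical cores on machines with simultaneous multithreading.
    n_workers = min(4, os.cpu_count() or 1)
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        samples = list(pool.map(run_sample, range(100)))
    print(f"{len(samples)} samples, mean = {sum(samples) / len(samples):.3f}")


if __name__ == "__main__":
    # Without this guard, platforms that start workers by re-importing the
    # main module would execute the top-level code again in each worker.
    main()
```

Seeding each sample explicitly keeps the run reproducible regardless of which worker picks up which sample.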