{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# JIT Particles and Scipy particles\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "This tutorial is meant to highlight the potentially very large difference in computational time between running Parcels in **JIT** (Just-In-Time compilation) mode and in **Scipy** mode. It also discusses how to sample more efficiently in Scipy mode.\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Short summary: JIT is faster than Scipy\n", "\n", "In the code snippet below, we use `AdvectionRK4` to advect 100 particles in the Peninsula `FieldSet`. We first do it in JIT mode (by setting `pclass=JITParticle` in the declaration of `pset`) and then we also do it in Scipy mode (by setting `pclass=ScipyParticle` in the declaration of `pset`).\n", "\n", "In both cases, we advect the particles for 1 hour, with a timestep of 30 seconds.\n", "\n", "To measure the computational time, we use Parcels' `timer` module.\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "100%|██████████| 3600.0/3600.0 [00:01<00:00, 1843.14it/s]\n", "100%|██████████| 3600.0/3600.0 [00:00<00:00, 533400.25it/s]\n", "(100%) Timer root : 3.165e+00 s\n", "( 3%) ( 3%) Timer fieldset creation : 9.443e-02 s\n", "( 65%) ( 65%) Timer scipy : 2.073e+00 s\n", "( 32%) ( 32%) Timer jit : 9.969e-01 s\n" ] } ], "source": [ "from datetime import timedelta as delta\n", "\n", "from parcels import (\n", "    AdvectionRK4,\n", "    FieldSet,\n", "    JITParticle,\n", "    ParticleSet,\n", "    ScipyParticle,\n", "    download_example_dataset,\n", "    timer,\n", ")\n", "\n", "timer.root = timer.Timer(\"root\")\n", "\n", "timer.fieldset = timer.Timer(\"fieldset creation\", parent=timer.root)\n", "\n", "example_dataset_folder = download_example_dataset(\"Peninsula_data\")\n", "fieldset = FieldSet.from_parcels(\n", "    f\"{example_dataset_folder}/peninsula\", allow_time_extrapolation=True\n", ")\n", "timer.fieldset.stop()\n", "\n", "ptype = {\"scipy\": ScipyParticle, \"jit\": JITParticle}\n", "ptimer = {\n", "    \"scipy\": timer.Timer(\"scipy\", parent=timer.root, start=False),\n", "    \"jit\": timer.Timer(\"jit\", parent=timer.root, start=False),\n", "}\n", "\n", "for p in [\"scipy\", \"jit\"]:\n", "    pset = ParticleSet.from_line(\n", "        fieldset=fieldset,\n", "        pclass=ptype[p],\n", "        size=100,\n", "        start=(3e3, 3e3),\n", "        finish=(3e3, 45e3),\n", "    )\n", "\n", "    ptimer[p].start()\n", "    pset.execute(AdvectionRK4, runtime=delta(hours=1), dt=delta(seconds=30))\n", "    ptimer[p].stop()\n", "\n", "timer.root.stop()\n", "timer.root.print_tree()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "As you can see above, even in this very small example **Scipy mode took more than twice as long** (2.1 seconds versus 1.0 seconds) as JIT mode. For larger simulations, Scipy mode can be hundreds of times slower.\n", "\n", "This is just an illustrative example; depending on the number of calls to `AdvectionRK4`, the size of the `FieldSet`, the size of the `pset`, the ratio between `dt` and `outputdt` in `.execute()`, etc., the difference between JIT and Scipy can vary significantly. However, JIT will almost always be faster!\n", "\n", "So why does Parcels support both JIT and Scipy mode then? Because Scipy mode is easier to debug when writing custom kernels, so it allows faster development of new features.\n" ] },
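{ "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "To make that concrete, the (unexecuted) cell below is a minimal, hypothetical sketch of what Scipy-mode debugging can look like: in Scipy mode a kernel is executed as ordinary Python, so while developing you can drop a plain `print` statement (or a debugger) into it to inspect intermediate values. The kernel name `AdvectionEE_debug` and the printed diagnostic are our own illustration rather than part of the Parcels API, and the cell reuses the `fieldset` and the imports from the timing example above.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def AdvectionEE_debug(particle, fieldset, time):\n", "    # Same velocity sampling as the Euler-Forward kernels later in this tutorial\n", "    (u1, v1) = fieldset.UV[time, particle.depth, particle.lat, particle.lon]\n", "    # In Scipy mode this kernel runs as ordinary Python, so a plain print\n", "    # (or a breakpoint) can be used to inspect intermediate values\n", "    print(f\"particle {particle.id}: lon={particle.lon}, u={u1}\")\n", "    particle.lon += u1 * particle.dt\n", "    particle.lat += v1 * particle.dt\n", "\n", "\n", "pset_debug = ParticleSet.from_line(\n", "    fieldset=fieldset,\n", "    pclass=ScipyParticle,  # Scipy mode, so the kernel above is executed as plain Python\n", "    size=5,\n", "    start=(3e3, 3e3),\n", "    finish=(3e3, 45e3),\n", ")\n", "pset_debug.execute(AdvectionEE_debug, runtime=delta(minutes=5), dt=delta(seconds=30))" ] },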
{ "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "_As an aside, you may wonder why we time the runs above with `time.time` (via Parcels' `timer` module) rather than with the `timeit` module. That's because `timeit` affects the AST of the kernels, causing errors in JIT mode._\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Further digging into Scipy mode: adding the `particle` keyword to `Field` sampling\n", "\n", "Sometimes you may still want to run Parcels in Scipy mode anyway. In that case, there are ways to make Parcels a bit faster.\n", "\n", "As background, one of the most computationally expensive operations in Parcels is [Field Sampling](https://docs.oceanparcels.org/en/latest/examples/tutorial_sampling.html). With the default sampling in Scipy mode, we don't keep track of _where_ in the grid a particle is, which means that for every sampling call we have to search again for the grid cell the particle is in.\n", "\n", "Let's see how this works in the simple Peninsula `FieldSet` used above. We now use simple Euler-Forward advection to make the point. In particular, we use two versions of the advection kernel:\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def AdvectionEE_depth_lat_lon_time(particle, fieldset, time):\n", "    (u1, v1) = fieldset.UV[time, particle.depth, particle.lat, particle.lon]\n", "    particle.lon += u1 * particle.dt\n", "    particle.lat += v1 * particle.dt\n", "\n", "\n", "def AdvectionEE_depth_lat_lon_time_particle(particle, fieldset, time):\n", "    (u1, v1) = fieldset.UV[\n", "        time,\n", "        particle.depth,\n", "        particle.lat,\n", "        particle.lon,\n", "        particle,  # note the extra particle argument here\n", "    ]\n", "    particle.lon += u1 * particle.dt\n", "    particle.lat += v1 * particle.dt\n", "\n", "\n", "kernels = {\n", "    \"dllt\": AdvectionEE_depth_lat_lon_time,\n", "    \"dllt_p\": AdvectionEE_depth_lat_lon_time_particle,\n", "}" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0%| | 0/3600.0 [00:00