{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# JIT Particles and Scipy particles\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "This tutorial is meant to highlight the potentially very large difference in computational time between running Parcels in **JIT** (Just-In-Time compilation) mode and in **Scipy** mode. It also discusses how to sample more efficiently in Scipy mode.\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Short summary: JIT is faster than Scipy\n", "\n", "In the code snippet below, we use `AdvectionRK4` to advect 100 particles in the Peninsula `FieldSet`. We first do it in JIT mode (by setting `pclass=JITParticle` in the declaration of `pset`) and then we also do it in Scipy mode (by setting `pclass=ScipyParticle` in the declaration of `pset`).\n", "\n", "In both cases, we advect the particles for 1 hour, with a timestep of 30 seconds.\n", "\n", "To measure the computational time, we use Parcels' `timer` module.\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "100%|██████████| 3600.0/3600.0 [00:01<00:00, 1843.14it/s]\n", "100%|██████████| 3600.0/3600.0 [00:00<00:00, 533400.25it/s]\n", "(100%) Timer root : 3.165e+00 s\n", "( 3%) ( 3%) Timer fieldset creation : 9.443e-02 s\n", "( 65%) ( 65%) Timer scipy : 2.073e+00 s\n", "( 32%) ( 32%) Timer jit : 9.969e-01 s\n" ] } ], "source": [ "from datetime import timedelta as delta\n", "\n", "from parcels import (\n", "    AdvectionRK4,\n", "    FieldSet,\n", "    JITParticle,\n", "    ParticleSet,\n", "    ScipyParticle,\n", "    download_example_dataset,\n", "    timer,\n", ")\n", "\n", "timer.root = timer.Timer(\"root\")\n", "\n", "timer.fieldset = timer.Timer(\"fieldset creation\", parent=timer.root)\n", "\n", "example_dataset_folder = download_example_dataset(\"Peninsula_data\")\n", "fieldset = FieldSet.from_parcels(\n", "    f\"{example_dataset_folder}/peninsula\", allow_time_extrapolation=True\n", ")\n", "timer.fieldset.stop()\n", "\n", "ptype = {\"scipy\": ScipyParticle, \"jit\": JITParticle}\n", "ptimer = {\n", "    \"scipy\": timer.Timer(\"scipy\", parent=timer.root, start=False),\n", "    \"jit\": timer.Timer(\"jit\", parent=timer.root, start=False),\n", "}\n", "\n", "for p in [\"scipy\", \"jit\"]:\n", "    pset = ParticleSet.from_line(\n", "        fieldset=fieldset,\n", "        pclass=ptype[p],\n", "        size=100,\n", "        start=(3e3, 3e3),\n", "        finish=(3e3, 45e3),\n", "    )\n", "\n", "    ptimer[p].start()\n", "    pset.execute(AdvectionRK4, runtime=delta(hours=1), dt=delta(seconds=30))\n", "    ptimer[p].stop()\n", "\n", "timer.root.stop()\n", "timer.root.print_tree()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "As you can see above, even in this very small example **Scipy mode took more than twice as long** (2.1 seconds versus 1.0 seconds) as JIT mode. For larger simulations, Scipy mode can be hundreds of times slower.\n", "\n", "This is just an illustrative example; depending on the number of calls to `AdvectionRK4`, the size of the `FieldSet`, the size of the `pset`, the ratio between `dt` and `outputdt` in `.execute()`, etc., the difference between JIT and Scipy can vary significantly. However, JIT will almost always be faster!\n", "\n", "So why does Parcels support both JIT and Scipy mode then? Because Scipy mode is easier to debug when writing custom kernels, so it allows faster development of new features.\n" ] },
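{ "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "To make that concrete, the (unexecuted) cell below is a minimal, hypothetical sketch of what Scipy-mode debugging can look like: in Scipy mode a kernel is executed as ordinary Python, so while developing you can drop a plain `print` statement (or a debugger) into it to inspect intermediate values. The kernel name `AdvectionEE_debug` and the printed diagnostic are our own illustration rather than part of the Parcels API, and the cell reuses the `fieldset` and the imports from the timing example above.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def AdvectionEE_debug(particle, fieldset, time):\n", "    # Same velocity sampling as the Euler-Forward kernels later in this tutorial\n", "    (u1, v1) = fieldset.UV[time, particle.depth, particle.lat, particle.lon]\n", "    # In Scipy mode this kernel runs as ordinary Python, so a plain print\n", "    # (or a breakpoint) can be used to inspect intermediate values\n", "    print(f\"particle {particle.id}: lon={particle.lon}, u={u1}\")\n", "    particle.lon += u1 * particle.dt\n", "    particle.lat += v1 * particle.dt\n", "\n", "\n", "pset_debug = ParticleSet.from_line(\n", "    fieldset=fieldset,\n", "    pclass=ScipyParticle,  # Scipy mode, so the kernel above is executed as plain Python\n", "    size=5,\n", "    start=(3e3, 3e3),\n", "    finish=(3e3, 45e3),\n", ")\n", "pset_debug.execute(AdvectionEE_debug, runtime=delta(minutes=5), dt=delta(seconds=30))" ] },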
{ "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "_As an aside, you may wonder why we time the runs above with `time.time` (via Parcels' `timer` module) rather than with the `timeit` module. That's because `timeit` affects the AST of the kernels, causing errors in JIT mode._\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Further digging into Scipy mode: adding the `particle` keyword to `Field` sampling\n", "\n", "Sometimes you may still want to run Parcels in Scipy mode anyway. In that case, there are ways to make Parcels a bit faster.\n", "\n", "As background, one of the most computationally expensive operations in Parcels is [Field Sampling](https://docs.oceanparcels.org/en/latest/examples/tutorial_sampling.html). With the default sampling in Scipy mode, we don't keep track of _where_ in the grid a particle is, which means that for every sampling call we have to search again for the grid cell the particle is in.\n", "\n", "Let's see how this works in the simple Peninsula `FieldSet` used above. We now use simple Euler-Forward advection to make the point. In particular, we use two versions of the advection kernel:\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def AdvectionEE_depth_lat_lon_time(particle, fieldset, time):\n", "    (u1, v1) = fieldset.UV[time, particle.depth, particle.lat, particle.lon]\n", "    particle.lon += u1 * particle.dt\n", "    particle.lat += v1 * particle.dt\n", "\n", "\n", "def AdvectionEE_depth_lat_lon_time_particle(particle, fieldset, time):\n", "    (u1, v1) = fieldset.UV[\n", "        time,\n", "        particle.depth,\n", "        particle.lat,\n", "        particle.lon,\n", "        particle,  # note the extra particle argument here\n", "    ]\n", "    particle.lon += u1 * particle.dt\n", "    particle.lat += v1 * particle.dt\n", "\n", "\n", "kernels = {\n", "    \"dllt\": AdvectionEE_depth_lat_lon_time,\n", "    \"dllt_p\": AdvectionEE_depth_lat_lon_time_particle,\n", "}" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 0%| | 0/3600.0 [00:00