{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "[<img src=\"https://gitlab.irit.fr/toc/etu-n7/controle-optimal/-/raw/master/ressources/Logo-toulouse-inp-N7.png\" alt=\"N7\" height=\"80\"/>](https://gitlab.irit.fr/toc/etu-n7/controle-optimal)\n", "<img src=\"https://gitlab.irit.fr/toc/ens-n7/texCoursN7/-/raw/main/logo-insa.png\" alt=\"INSA\" height=\"80\" style=\"margin-left:50px\"/>\n", "\n", "# Calcul de dérivées\n", "\n", "- Date : 2024-2025\n", "- Durée approximative : 1h15\n", "\n", "<div style=\"width:95%;\n", " margin:10px;\n", " padding:8px;\n", " background-color:#afa;\n", " border:2px solid #bbffbb;\n", " border-radius:10px;\n", " font-weight:bold;\n", " font-size:1.5em;\n", " text-align:center;\">\n", "Introduction\n", "</div>\n", "\n", "Il existe plusieurs façon de calculer une dérivée sur un calculateur : \n", "\n", "- par différences finies;\n", "- par différentiation complexe;\n", "- en utilisation la différentiation automatique;\n", "- en utilisant le calcul formel et un générateur de code.\n", "\n", "Nous allons étudier ici quelques cas que nous allons appliquer au calcul des équations variationnelles (ou linéarisées) des équations différentielles.\n", "\n", "On notera $\\|\\cdot\\|$ la norme euclidienne usuelle et $\\mathcal{N}(0,1)$ une variable aléatoire Gaussienne centrée réduite." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# activate local project\n", "using Pkg\n", "Pkg.activate(\".\")\n", "\n", "# load packages\n", "using DualNumbers\n", "using ForwardDiff\n", "using LinearAlgebra\n", "using Plots\n", "using Plots.Measures\n", "using Printf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div style=\"width:95%;\n", " margin:10px;\n", " padding:8px;\n", " background-color:#afa;\n", " border:2px solid #bbffbb;\n", " border-radius:10px;\n", " font-weight:bold;\n", " font-size:1.5em;\n", " text-align:center;\">\n", "Dérivées par différences finies avant\n", "</div>\n", "\n", "Soit $f$ une fonction lisse de $\\mathbb{R}^{n}$ dans $\\mathbb{R}^{m}$, $x$ un point de $\\mathbb{R}^{n}$ et $v$ un vecteur de $\\mathbb{R}^{n}$. Si on note $g \\colon h \\mapsto f(x+hv)$, alors on a d'après la formule de Taylor-Young :\n", "$$\n", " g(h) = \\sum_{i=0}^{n} \\frac{h^i}{i!} g^{(i)}(0) + R_n(h), \\quad R_n(h) = o(h^n),\n", "$$\n", "ou d'après Taylor-Lagrange, \n", "$$\n", " \\| R_n(h) \\| \\leq \\frac{M_n h^{n+1}}{(n+1)!},\n", "$$\n", "de même, \n", "$$\n", " g(-h) = \\sum_{i=0}^{n} \\frac{(-h)^i}{i!} g^{(i)}(0) + R_n(h).\n", "$$\n", "\n", "La méthode des différences finies avants consiste à approcher la différentielle de $f$ en $x$ dans la direction $v$ par la formule suivante : \n", "$$\n", " \\frac{f(x+hv) - f(x)}{h} = \n", " \\frac{g(h)-g(0)}{h} = g'(0) + \\frac{h}{2} g^{2}(0) + \\frac{h^2}{6} g^{(3)}(0) + o(h^2).\n", "$$\n", "L'approximation ainsi obtenue de $ g'(0) = f'(x) \\cdot v \\in \\mathbb{R}^m$ est d'ordre 1 si $g^{(2)}(0) \\neq 0$ ou au moins d'ordre 2 sinon. \n", "\n", "**Remarque.** Sur machine, Les calculs se font en virgule flottante. On note epsilon machine, le plus petit nombre $\\mathrm{eps}_\\mathrm{mach}$ tel que $1+\\mathrm{eps}_\\mathrm{mach}\\ne 1$. Cette quantité dépend des machines et de l'encodage des données. Pour l'optenir en `Julia` il suffit de taper `eps()` (ou `eps(Float64)` pour les flottants codés sur 64 bits).\n", "\n", "Notons $\\mathrm{num}(g,\\, h)$ la valeur de $g(h)$ calculée numériquement et supposons que l'on puisse majorer l'erreur relative numérique par : \n", "$$\n", " \\left\\| \\mathrm{num}(g,h) - g(h) \\right\\| := \\| e_h\\| \\leq \\mathrm{eps}_\\mathrm{mach} L_f,\n", "$$\n", "ou $L_f$ est une constante qui dépend de la valeur de $f$ sur le domaine d'intérêt. Ainsi on a : \n", "\\begin{align*}\n", " \\left\\| \\frac{\\mathrm{num}(g,h) - \\mathrm{num}(g,0)}{h} - g'(0) \\right\\|\n", " &= \\left\\| \\frac{g(h) + e_h - g(0) - e_0}{h} - g'(0) \\right\\|, \\\\[1em]\n", " &= \\left\\| \\frac{R_1(h)}{h} + \\frac{e_h - e_0}{h} \\right\\|, \\\\[1em]\n", " &\\leq \\left\\| \\frac{R_1(h)}{ h} \\right\\| + \\left\\| \\frac{e_h - e_0}{h} \\right\\|, \\\\[1em]\n", " & \\leq \n", " \\underbrace{ \\frac{M_1 h}{2}}_{{\\text{Erreur d'approximation}}} + \n", " \\underbrace{2 \\frac{\\mathrm{eps}_\\mathrm{mach}L_f}{h}}_{{\\text{Erreur numérique}}}.\n", "\\end{align*} \n", "Le majorant trouvé atteint son minimum en \n", "$$\n", " h_{*} = 2 \\sqrt{\\frac{\\mathrm{eps}_\\mathrm{mach} L_f }{M_1}}.\n", "$$\n", "\n", "En considérant que $L_f \\simeq M_1$, alors le choix se révélant le plus optimal est \n", "$$\n", " h_{*} \\approx \\sqrt{\\mathrm{eps}_\\mathrm{mach}}.\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div style=\"width:95%;\n", " margin:10px;\n", " padding:8px;\n", " background-color:#afa;\n", " border:2px solid #bbffbb;\n", " border-radius:10px;\n", " font-weight:bold;\n", " font-size:1.5em;\n", " text-align:center;\">\n", "Dérivées par différences finies centrées\n", "</div>\n", "\n", "On peut utiliser un schéma de différences finies centrée pour calculer la dérviée de $g$. \n", "$$\n", " \\frac{f(x+hv) - f(x-hv)}{2h} = \\frac{g(h) - g(-h)}{2h} = \n", " g'(0) + g^{(3)}(0) \\frac{h^2}{6} + \\mathcal{O}(h^4),\n", "$$\n", "l'approximation ainsi obtenue de $f'(x) \\cdot v \\in \\mathbb{R}^{m}$ est d'ordre 2 si $g^{(3)}(0) \\neq 0$ ou au moins d'ordre 4 sinon. À noter que ce schéma nécessite plus d'évaluations de la fonction $f$. On peut montrer comme précédemment que le meilleur $h$ est de l'ordre \n", "$$\n", " h_* \\approx \\sqrt[3]{\\mathrm{eps}_\\mathrm{mach}}.\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div style=\"width:95%;\n", " margin:10px;\n", " padding:8px;\n", " background-color:#afa;\n", " border:2px solid #bbffbb;\n", " border-radius:10px;\n", " font-weight:bold;\n", " font-size:1.5em;\n", " text-align:center;\">\n", "Dérivées par différentiation complexe\n", "</div>\n", "\n", "Les formules des schémas avant et centrée sont sensibles aux calculs de la différence $\\Delta f = f(x+h) - f(x)$ ou $\\Delta f = f(x+h) - f(x-h)$. Pour remédier à ce problème, les [différences finies à pas complexe](https://dl.acm.org/doi/10.1145/838250.838251) ont été introduites.\n", "Si on suppose que la fonction $g$ est holomorphe, c'est-à-dire dérivable au sens complexe,\n", "on peut considérer un pas complexe $ih$. Un développement limité de $g$ en $0$ s'écrit\n", "\n", "$$\n", " f(x+ih v) = g(ih) = g(0) + ih g'(0) - \\frac{h^2}{2} g^{(2)}(0) - i\\frac{h^3}{6} g^{(3)}(0) + o(h^3),\n", "$$\n", "\n", "On considère alors l'approximation : \n", "\n", "$$\n", " f'(x) \\cdot v = g'(0) \\approx \\frac{\\mathrm{Im}(f(x+ihv))}{h}.\n", "$$\n", "\n", "On peut prouver que l'approximation ci-dessus est au moins d'ordre 2 et aussi démontrer que tout pas inférieur à $h_*$ est optimal, avec \n", "$$\n", " h_{*} \\approx \\sqrt{\\mathrm{eps}_{\\mathrm{mach}}}.\n", "$$\n", "\n", "**Remarque.** Utiliser en `Julia` la commande `imag` pour calculer la partie imaginaire d'un nombre complexe et la variable `im` pour représenter l'unité imaginaire." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div style=\"width:95%;\n", " margin:10px;\n", " padding:8px;\n", " background-color:#afa;\n", " border:2px solid #bbffbb;\n", " border-radius:10px;\n", " font-weight:bold;\n", " font-size:1.5em;\n", " text-align:center;\">\n", "Dérivées par différentiation automatique via les nombres duaux\n", "</div>\n", "\n", "Les nombres duaux s'écrivent sous la forme $a + b\\, \\varepsilon$ avec $(a,b)\\in \\mathbb{R}^2$ et $\\varepsilon^2 = 0$. Nous allons voir comment nous pouvons les utiliser pour calculer des dérivées.\n", "\n", "Soit deux fonctions $f, g \\colon \\mathbb{R} \\to \\mathbb{R}$ dérivables, de dérivées respectives $f'$ et $g'$. On pose\n", "\n", "$$\n", "f(a + b\\, \\varepsilon) := f(a) + f'(a)\\, b\\, \\varepsilon\n", "$$\n", "\n", "et\n", "\n", "$$\n", "g(a + b\\, \\varepsilon) := g(a) + g'(a)\\, b\\, \\varepsilon.\n", "$$\n", "\n", "On a alors automatiquement les propriétés suivantes. Posons $d = x + \\varepsilon$, alors :\n", "\n", "- $(f + g)(d) = (f+g)(x) + (f+g)'(x) \\, \\varepsilon$\n", "- $(fg)(d) = (fg)(x) + (fg)'(x) \\, \\varepsilon$\n", "- $(g \\circ f)(d) = (g \\circ f)(x) + (g \\circ f)'(x) \\, \\varepsilon$\n", "\n", "Voici comment créer un nombre dual en `Julia` et récupérer les parties réelles et duales (avec ce que j'ai défini ci-dessous) :\n", "\n", "```julia\n", "using DualNumbers\n", "\n", "# scalar case\n", "d = 1 + 2ε # ou 1 + 2 * ε ou 1 + ε * 2\n", "real(d) # 1\n", "dual(d) # 2\n", "\n", "# vector case\n", "d = [1, 3] + [2, 4]ε # ou [1, 3] + [2, 4] * ε ou [1, 3] + ε * [2, 4] ou [1+2ε, 3+4ε]\n", "real(d) # [1, 3]\n", "dual(d) # [2, 4]\n", "```\n", "\n", "**Remarque.** On peut aussi utiliser le package `ForwardDiff` pour calculer des dérivées automatiquement. Il est plus performant que `DualNumbers`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div style=\"width:95%;\n", " margin:10px;\n", " padding:8px;\n", " background-color:#afa;\n", " border:2px solid #bbffbb;\n", " border-radius:10px;\n", " font-weight:bold;\n", " font-size:1.5em;\n", " text-align:center;\">\n", "Fonctions auxiliaires\n", "</div>" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# available methods\n", "methods = (:forward, :central, :complex, :dual, :forward_ad)\n", "\n", "# type of x or its coordinates\n", "function mytypeof(x::Union{T, Vector{<:T}}) where T<:AbstractFloat\n", " return T\n", "end\n", "\n", "# default step value\n", "function _step(x, v, method)\n", " T = mytypeof(x)\n", " eps_value = T isa AbstractFloat ? eps(T) : eps(1.)\n", " if method == :forward\n", " step = √(eps_value)\n", " elseif method == :central\n", " step = (eps_value)^(1/3)\n", " elseif method == :complex\n", " step = √(eps_value)\n", " else\n", " step = 0.0\n", " end\n", " step *= √(max(1., norm(x))) / √(max(1.0, norm(v)))\n", " return step\n", "end\n", "\n", "# default method value\n", "function _method()\n", " return :forward \n", "end;\n", "\n", "# creation of dual number ε\n", "import Base.*\n", "*(e::Function, x::Union{Number, Vector{<:Number}}) = e(x)\n", "*(x::Union{Number, Vector{<:Number}}, e::Function) = e(x)\n", "ε(x=1) = begin \n", " if x isa Number\n", " return Dual.(0.0, x)\n", " else\n", " return Dual.(zeros(length(x)), x)\n", " end\n", "end\n", "em = ε\n", "dual(x::Union{Dual, Vector{<:Dual}}) = dualpart.(x)\n", "real(x::Union{Dual, Vector{<:Dual}}) = realpart.(x);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div style=\"width:95%;\n", " margin:10px;\n", " padding:8px;\n", " background-color:#afa;\n", " border:2px solid #bbffbb;\n", " border-radius:10px;\n", " font-weight:bold;\n", " font-size:1.5em;\n", " text-align:center;\">\n", "La méthode principale pour le calcul de dérivées\n", "</div>\n", "\n", "La fonction `derivative` ci-dessous calcule la dérivée directionnelle\n", "\n", "$$\n", " f'(x) \\cdot v.\n", "$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## TO COMPLETE\n", "\n", "# compute directional derivative\n", "function derivative(f, x, v; method=_method(), h=_step(x, v, method))\n", " if method ∉ methods \n", " error(\"Choose a valid method in \", methods)\n", " end\n", " if method == :forward\n", " return f(x) # TO UPDATE\n", " elseif method == :central\n", " return f(x) # TO UPDATE\n", " elseif method == :complex\n", " return f(x) # TO UPDATE\n", " elseif method == :dual \n", " return f(x) # TO UPDATE\n", " elseif method == :forward_ad\n", " if x isa Number\n", " return ForwardDiff.derivative(f, x)*v\n", " else\n", " return ForwardDiff.jacobian(f, x)*v\n", " end\n", " end\n", "end;" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# function to print derivative values and errors\n", "function print_derivatives(f, x, v, sol)\n", "\n", " println(\"Hand derivative: \", sol, \"\\n\")\n", "\n", " for method ∈ methods\n", " dfv = derivative(f, x, v, method=method)\n", " println(\"Method: \", method)\n", " println(\" derivative: \", dfv)\n", " @printf(\" error: %e\\n\", norm(dfv - sol))\n", " if method ∈ (:forward, :central, :complex)\n", " step = _step(x, v, method)\n", " println(\" step: \", step)\n", " @printf(\" error/step: %e\\n\", norm(dfv - sol) / step)\n", " end\n", " println()\n", " end\n", "\n", "end;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercice 1\n", "\n", "1. Compléter la fonction `derivative` ci-dessus avec les méthodes de différences finies avant, centrée, par différentiation complexe et par différentiation automatique via les nombres duaux.\n", "2. Exécuter le code ci-dessous et vérifier les résultats obtenus." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Scalar case\n", "\n", "# check if the derivatives are correct\n", "f(x) = cos(x)\n", "x = π/4\n", "v = 1.0\n", "\n", "# solution\n", "sol = -sin(x)*v\n", "\n", "# print derivatives and errors for each method\n", "print_derivatives(f, x, v, sol)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Vectorial case\n", "\n", "# check if the derivatives are correct\n", "f(x) = [0.5*(x[1]^2 + x[2]^2); x[1]*x[2]]\n", "x = [1.0, 2.0]\n", "v = [1.0, -1.0]\n", "\n", "# solution\n", "sol = [x[1]*v[1]+x[2]*v[2], x[1]*v[2]+x[2]*v[1]]\n", "\n", "# print derivatives and errors for each method\n", "print_derivatives(f, x, v, sol)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div style=\"width:95%;\n", " margin:10px;\n", " padding:8px;\n", " background-color:#afa;\n", " border:2px solid #bbffbb;\n", " border-radius:10px;\n", " font-weight:bold;\n", " font-size:1.5em;\n", " text-align:center;\">\n", "Pas optimal\n", "</div>\n", "\n", "On se propose de tester pour la fonction $\\cos$ aux points $x_0=\\pi/3$, $x_1 = 10^6\\times\\pi/3$ et la fonction $\\cos+10^{-8} \\mathcal{N}(0, \\, 1)$ au point $x_0=\\pi/3$ l'erreur entre les différences finies et la dérivée au point considéré en fonction de $h$. On prendra $h=10^{-i}$ pour $i= \\{1,\\ldots,16\\}$ et on tracera ces erreurs dans une échelle logarithmique (en `Julia`, avec le package `Plots` on utilise l'option `scale=:log10`).\n", "\n", "### Exercice 2\n", "\n", "- Visualiser les différentes erreurs en fonction de $h$ pour les différentes méthodes de calcul de dérivées. Commentaires.\n", "- Modifier la précision de $x_0$ et $x_1$ en `Float32`. Commentaires." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# affichage des erreurs en fonction de h\n", "function plot_errors(steps, errors, h_star, title)\n", "\n", " steps_save = steps\n", " ymax = 10^10\n", "\n", " # supprimer les erreurs nulles\n", " non_nul_element = findall(!iszero, errors) \n", " errors = errors[non_nul_element]\n", " steps = steps[non_nul_element]\n", "\n", " # Courbe des erreurs pour les differents steps en bleu\n", " plt = plot((10.).^(-steps), errors, xscale=:log10, yscale=:log10, linecolor=:blue, lw=2, legend=false)\n", "\n", " # régler xlims pour toujours avoir tous les steps de départ\n", " plot!(plt, xlims=(10^(-maximum(steps_save)), 10^(-minimum(steps_save))))\n", "\n", " # ylims toujours entre 10^-16 et ymax\n", " plot!(plt, ylims=(10^(-16), ymax))\n", "\n", " # Ligne verticale pour situer l'erreur optimale h* en rouge\n", " plot!(plt,[h_star, h_star], [10^(-16), ymax], linecolor=:red, lw=1, linestyle=:dash)\n", "\n", " # titre de la figure et xlabel\n", " plot!(plt, xlabel = \"h\", title = title, legend=false, titlefontsize=10)\n", "\n", " # ajouter des marges en bas de la figure pour mieux voir le xlabel \n", " plot!(plt, bottom_margin = 5mm)\n", "\n", " #\n", " return plt\n", "\n", "end;" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Les différentes fonctions et la dérivée théorique\n", "fun1(x) = cos(x)\n", "fun2(x) = cos(x) + 1.e-8*randn()\n", "dfun(x) = -sin(x);" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# method \n", "method = :forward # TO PLAY WITH\n", "h_star = √(eps(1.0)) # TO UPDATE ACCORDING TO THE METHOD\n", "\n", "# Points pour lesquels on souhaite effectuer les tests\n", "x0 = π/3\n", "x1 = 1.e6*π/3\n", "\n", "# steps pour faire les tests\n", "steps = range(1, 16, 16)\n", "\n", "# Initialisation des vecteurs d'erreur\n", "err_x0 = zeros(length(steps))\n", "err_x0p = zeros(length(steps))\n", "err_x1 = zeros(length(steps))\n", "\n", "# Calcul des erreurs\n", "for i in 1:length(steps)\n", " h = 10^(-steps[i])\n", " err_x0[i] = abs(derivative(fun1, x0, 1.0, h=h, method=method) - (dfun(x0)))\n", " err_x1[i] = abs(derivative(fun1, x1, 1.0, h=h, method=method) - (dfun(x1)))\n", " err_x0p[i] = abs(derivative(fun2, x0, 1.0, h=h, method=method) - (dfun(x0)))\n", "end\n", "\n", "# Affichage des erreurs\n", "p1 = plot_errors(steps, err_x0, h_star, \"cos(x0)\")\n", "p2 = plot_errors(steps, err_x0p, h_star, \"cos(x0) + perturbation\")\n", "p3 = plot_errors(steps, err_x1, h_star, \"cos(x1)\")\n", "\n", "plot(p1, p2, p3, layout=(1,3), size=(850, 350))" ] } ], "metadata": { "kernelspec": { "display_name": "Julia 1.9.0", "language": "julia", "name": "julia-1.9" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.9.0" } }, "nbformat": 4, "nbformat_minor": 2 }