Merge pull request #13 from vi3itor/math-diff-calc

Correct typos and formatting in differential calculus tutorial
main
Aurélien Geron 2022-05-22 09:57:47 +12:00 committed by GitHub
commit 44f2c3f8d5
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 26 additions and 29 deletions

@ -49,7 +49,6 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"#@title\n", "#@title\n",
"%matplotlib inline\n",
"import matplotlib as mpl\n", "import matplotlib as mpl\n",
"import matplotlib.pyplot as plt\n", "import matplotlib.pyplot as plt\n",
"import numpy as np\n", "import numpy as np\n",
@ -167,7 +166,7 @@
"id": "gcb7eqkmGGXf" "id": "gcb7eqkmGGXf"
}, },
"source": [ "source": [
"But what if you want to know the slope of something else than a straight line? For example, let's consider the curve defined by $y = f(x) = x^2$:" "But what if you want to know the slope of something other than a straight line? For example, let's consider the curve defined by $y = f(x) = x^2$:"
] ]
}, },
{ {
@ -226,7 +225,7 @@
"id": "4qCXg9nQSp6S" "id": "4qCXg9nQSp6S"
}, },
"source": [ "source": [
"How can we put numbers on these intuitions? Well, say we want to estimate the slope of the curve at a point $\\mathrm{A}$, we can do this by taking another point $\\mathrm{B}$ on the curve, not too far away, and then computing the slope between these two points:\n" "How can we put numbers on these intuitions? Well, say we want to estimate the slope of the curve at a point $\\mathrm{A}$. We can do this by taking another point $\\mathrm{B}$ on the curve, not too far away, and then computing the slope between these two points:\n"
] ]
}, },
{ {
@ -963,7 +962,7 @@
"source": [ "source": [
"# Differentiability\n", "# Differentiability\n",
"\n", "\n",
"Note that some functions are not quite as well-behaved as $x^2$: for example, consider the function $f(x)=|x|$, the absolute value of $x$:" "Note that some functions are not quite as well-behaved as $x^2$. For example, consider the function $f(x)=|x|$, the absolute value of $x$:"
] ]
}, },
{ {
@ -2231,7 +2230,7 @@
"& = \\underset{x_\\mathrm{B} \\to x_\\mathrm{A}}\\lim x_\\mathrm{B} \\, + \\underset{x_\\mathrm{B} \\to x_\\mathrm{A}}\\lim x_\\mathrm{A}\\quad && \\text{since the limit of a sum is the sum of the limits}\\\\\n", "& = \\underset{x_\\mathrm{B} \\to x_\\mathrm{A}}\\lim x_\\mathrm{B} \\, + \\underset{x_\\mathrm{B} \\to x_\\mathrm{A}}\\lim x_\\mathrm{A}\\quad && \\text{since the limit of a sum is the sum of the limits}\\\\\n",
"& = x_\\mathrm{A} \\, + \\underset{x_\\mathrm{B} \\to x_\\mathrm{A}}\\lim x_\\mathrm{A} \\quad && \\text{since } x_\\mathrm{B}\\text{ approaches } x_\\mathrm{A} \\\\\n", "& = x_\\mathrm{A} \\, + \\underset{x_\\mathrm{B} \\to x_\\mathrm{A}}\\lim x_\\mathrm{A} \\quad && \\text{since } x_\\mathrm{B}\\text{ approaches } x_\\mathrm{A} \\\\\n",
"& = x_\\mathrm{A} + x_\\mathrm{A} \\quad && \\text{since } x_\\mathrm{A} \\text{ remains constant when } x_\\mathrm{B}\\text{ approaches } x_\\mathrm{A} \\\\\n", "& = x_\\mathrm{A} + x_\\mathrm{A} \\quad && \\text{since } x_\\mathrm{A} \\text{ remains constant when } x_\\mathrm{B}\\text{ approaches } x_\\mathrm{A} \\\\\n",
"& = 2 x_\\mathrm{A}\n", "& = 2x_\\mathrm{A} &&\n",
"\\end{align*}\n", "\\end{align*}\n",
"$\n", "$\n",
"\n", "\n",
@ -2307,7 +2306,7 @@
"& = \\underset{\\epsilon \\to 0}\\lim\\dfrac{{x}^2 + 2x\\epsilon + \\epsilon^2 - {x}^2}{\\epsilon}\\quad && \\text{since } (x + \\epsilon)^2 = {x}^2 + 2x\\epsilon + \\epsilon^2\\\\\n", "& = \\underset{\\epsilon \\to 0}\\lim\\dfrac{{x}^2 + 2x\\epsilon + \\epsilon^2 - {x}^2}{\\epsilon}\\quad && \\text{since } (x + \\epsilon)^2 = {x}^2 + 2x\\epsilon + \\epsilon^2\\\\\n",
"& = \\underset{\\epsilon \\to 0}\\lim\\dfrac{2x\\epsilon + \\epsilon^2}{\\epsilon}\\quad && \\text{since the two } {x}^2 \\text{ cancel out}\\\\\n", "& = \\underset{\\epsilon \\to 0}\\lim\\dfrac{2x\\epsilon + \\epsilon^2}{\\epsilon}\\quad && \\text{since the two } {x}^2 \\text{ cancel out}\\\\\n",
"& = \\underset{\\epsilon \\to 0}\\lim \\, (2x + \\epsilon)\\quad && \\text{since } 2x\\epsilon \\text{ and } \\epsilon^2 \\text{ can both be divided by } \\epsilon\\\\\n", "& = \\underset{\\epsilon \\to 0}\\lim \\, (2x + \\epsilon)\\quad && \\text{since } 2x\\epsilon \\text{ and } \\epsilon^2 \\text{ can both be divided by } \\epsilon\\\\\n",
"& = 2 x\n", "& = 2x &&\n",
"\\end{align*}\n", "\\end{align*}\n",
"$\n", "$\n",
"\n", "\n",
@ -2343,7 +2342,7 @@
"\n", "\n",
"The $f'$ notation is Lagrange's notation, while $\\dfrac{\\mathrm{d}f}{\\mathrm{d}x}$ is Leibniz's notation.\n", "The $f'$ notation is Lagrange's notation, while $\\dfrac{\\mathrm{d}f}{\\mathrm{d}x}$ is Leibniz's notation.\n",
"\n", "\n",
"There are also other less common notations, such as Newton's notation $\\dot y$ (assuming $y = f(x)$) or Euler's notation $\\mathrm{D}f$." "There are other less common notations, such as Newton's notation $\\dot y$ (assuming $y = f(x)$) or Euler's notation $\\mathrm{D}f$."
] ]
}, },
{ {
@ -4164,7 +4163,7 @@
"\n", "\n",
"It is possible to chain many functions. For example, if $f(x)=g(h(i(x)))$, and we define $y=i(x)$ and $z=h(y)$, then $\\dfrac{\\mathrm{d}f}{\\mathrm{d}x} = \\dfrac{\\mathrm{d}f}{\\mathrm{d}z} \\dfrac{\\mathrm{d}z}{\\mathrm{d}y} \\dfrac{\\mathrm{d}y}{\\mathrm{d}x}$. Using Lagrange's notation, we get $f'(x)=g'(z)\\,h'(y)\\,i'(x)=g'(h(i(x)))\\,h'(i(x))\\,i'(x)$\n", "It is possible to chain many functions. For example, if $f(x)=g(h(i(x)))$, and we define $y=i(x)$ and $z=h(y)$, then $\\dfrac{\\mathrm{d}f}{\\mathrm{d}x} = \\dfrac{\\mathrm{d}f}{\\mathrm{d}z} \\dfrac{\\mathrm{d}z}{\\mathrm{d}y} \\dfrac{\\mathrm{d}y}{\\mathrm{d}x}$. Using Lagrange's notation, we get $f'(x)=g'(z)\\,h'(y)\\,i'(x)=g'(h(i(x)))\\,h'(i(x))\\,i'(x)$\n",
"\n", "\n",
"The chain rule is crucial in Deep Learning, as a neural network is basically as a long composition of functions. For example, a 3-layer dense neural network corresponds to the following function: $f(\\mathbf{x})=\\operatorname{Dense}_3(\\operatorname{Dense}_2(\\operatorname{Dense}_1(\\mathbf{x})))$ (in this example, $\\operatorname{Dense}_3$ is the output layer).\n" "The chain rule is crucial in Deep Learning, as a neural network is basically a long composition of functions. For example, a 3-layer dense neural network corresponds to the following function: $f(\\mathbf{x})=\\operatorname{Dense}_3(\\operatorname{Dense}_2(\\operatorname{Dense}_1(\\mathbf{x})))$ (in this example, $\\operatorname{Dense}_3$ is the output layer).\n"
] ]
}, },
{ {
@ -4296,7 +4295,7 @@
"\n", "\n",
"At each iteration, the step size is proportional to the slope, so the process naturally slows down as it approaches a local minimum. Each step is also proportional to the learning rate: a parameter of the Gradient Descent algorithm itself (since it is not a parameter of the function we are optimizing, it is called a **hyperparameter**).\n", "At each iteration, the step size is proportional to the slope, so the process naturally slows down as it approaches a local minimum. Each step is also proportional to the learning rate: a parameter of the Gradient Descent algorithm itself (since it is not a parameter of the function we are optimizing, it is called a **hyperparameter**).\n",
"\n", "\n",
"Here is an animation of this process on the function $f(x)=\\dfrac{1}{4}x^4 - x^2 + \\dfrac{1}{2}$:" "Here is an animation of this process for the function $f(x)=\\dfrac{1}{4}x^4 - x^2 + \\dfrac{1}{2}$:"
] ]
}, },
{ {
@ -5253,8 +5252,6 @@
], ],
"source": [ "source": [
"#@title\n", "#@title\n",
"from mpl_toolkits.mplot3d import Axes3D\n",
"\n",
"def plot_3d(f, title):\n", "def plot_3d(f, title):\n",
" fig = plt.figure(figsize=(8, 5))\n", " fig = plt.figure(figsize=(8, 5))\n",
" ax = fig.add_subplot(111, projection='3d')\n", " ax = fig.add_subplot(111, projection='3d')\n",
@ -5367,7 +5364,7 @@
"$\\nabla f(\\mathbf{x}_\\mathrm{A}) = \\begin{pmatrix}\n", "$\\nabla f(\\mathbf{x}_\\mathrm{A}) = \\begin{pmatrix}\n",
"\\dfrac{\\partial f}{\\partial x_1}(\\mathbf{x}_\\mathrm{A})\\\\\n", "\\dfrac{\\partial f}{\\partial x_1}(\\mathbf{x}_\\mathrm{A})\\\\\n",
"\\dfrac{\\partial f}{\\partial x_2}(\\mathbf{x}_\\mathrm{A})\\\\\n", "\\dfrac{\\partial f}{\\partial x_2}(\\mathbf{x}_\\mathrm{A})\\\\\n",
"\\vdots\\\\\\\n", "\\vdots\\\\\n",
"\\dfrac{\\partial f}{\\partial x_n}(\\mathbf{x}_\\mathrm{A})\\\\\n", "\\dfrac{\\partial f}{\\partial x_n}(\\mathbf{x}_\\mathrm{A})\\\\\n",
"\\end{pmatrix}$" "\\end{pmatrix}$"
] ]
@ -5407,7 +5404,7 @@
"source": [ "source": [
"# Jacobians\n", "# Jacobians\n",
"\n", "\n",
"Until now we have only considered functions that output a scalar, but it is possible to output vectors instead. For example, a classification neural network typically outputs one probability for each class, so if there are $m$ classes, the neural network will output an $d$-dimensional vector for each input.\n", "Until now, we have only considered functions that output a scalar, but it is possible to output vectors instead. For example, a classification neural network typically outputs one probability for each class, so if there are $m$ classes, the neural network will output a $d$-dimensional vector for each input.\n",
"\n", "\n",
"In Deep Learning we generally only need to differentiate the loss function, which almost always outputs a single scalar number. But suppose for a second that you want to differentiate a function $\\mathbf{f}(\\mathbf{x})$ which outputs $d$-dimensional vectors. The good news is that you can treat each _output_ dimension independently of the others. This will give you a partial derivative for each input dimension and each output dimension. If you put them all in a single matrix, with one column per input dimension and one row per output dimension, you get the so-called **Jacobian matrix**.\n", "In Deep Learning we generally only need to differentiate the loss function, which almost always outputs a single scalar number. But suppose for a second that you want to differentiate a function $\\mathbf{f}(\\mathbf{x})$ which outputs $d$-dimensional vectors. The good news is that you can treat each _output_ dimension independently of the others. This will give you a partial derivative for each input dimension and each output dimension. If you put them all in a single matrix, with one column per input dimension and one row per output dimension, you get the so-called **Jacobian matrix**.\n",
"\n", "\n",
@ -5532,7 +5529,7 @@
"& = \\underset{\\epsilon \\to 0}\\lim\\dfrac{g(x+\\epsilon)h(x+\\epsilon) - g(x)h(x+\\epsilon)}{\\epsilon} + \\underset{\\epsilon \\to 0}\\lim\\dfrac{g(x)h(x + \\epsilon) - g(x)h(x)}{\\epsilon} && \\quad \\text{since the limit of a sum is the sum of the limits}\\\\\n", "& = \\underset{\\epsilon \\to 0}\\lim\\dfrac{g(x+\\epsilon)h(x+\\epsilon) - g(x)h(x+\\epsilon)}{\\epsilon} + \\underset{\\epsilon \\to 0}\\lim\\dfrac{g(x)h(x + \\epsilon) - g(x)h(x)}{\\epsilon} && \\quad \\text{since the limit of a sum is the sum of the limits}\\\\\n",
"& = \\underset{\\epsilon \\to 0}\\lim{\\left[\\dfrac{g(x+\\epsilon) - g(x)}{\\epsilon}h(x+\\epsilon)\\right]} \\,+\\, \\underset{\\epsilon \\to 0}\\lim{\\left[g(x)\\dfrac{h(x + \\epsilon) - h(x)}{\\epsilon}\\right]} && \\quad \\text{factorizing }h(x+\\epsilon) \\text{ and } g(x)\\\\\n", "& = \\underset{\\epsilon \\to 0}\\lim{\\left[\\dfrac{g(x+\\epsilon) - g(x)}{\\epsilon}h(x+\\epsilon)\\right]} \\,+\\, \\underset{\\epsilon \\to 0}\\lim{\\left[g(x)\\dfrac{h(x + \\epsilon) - h(x)}{\\epsilon}\\right]} && \\quad \\text{factorizing }h(x+\\epsilon) \\text{ and } g(x)\\\\\n",
"& = \\underset{\\epsilon \\to 0}\\lim{\\left[\\dfrac{g(x+\\epsilon) - g(x)}{\\epsilon}h(x+\\epsilon)\\right]} \\,+\\, g(x)\\underset{\\epsilon \\to 0}\\lim{\\dfrac{h(x + \\epsilon) - h(x)}{\\epsilon}} && \\quad \\text{taking } g(x) \\text{ out of the limit since it does not depend on }\\epsilon\\\\\n", "& = \\underset{\\epsilon \\to 0}\\lim{\\left[\\dfrac{g(x+\\epsilon) - g(x)}{\\epsilon}h(x+\\epsilon)\\right]} \\,+\\, g(x)\\underset{\\epsilon \\to 0}\\lim{\\dfrac{h(x + \\epsilon) - h(x)}{\\epsilon}} && \\quad \\text{taking } g(x) \\text{ out of the limit since it does not depend on }\\epsilon\\\\\n",
"& = \\underset{\\epsilon \\to 0}\\lim{\\left[\\dfrac{g(x+\\epsilon) - g(x)}{\\epsilon}h(x+\\epsilon)\\right]} \\,+\\, g(x)h'(x) && \\quad \\text{using the definition of h'(x)}\\\\\n", "& = \\underset{\\epsilon \\to 0}\\lim{\\left[\\dfrac{g(x+\\epsilon) - g(x)}{\\epsilon}h(x+\\epsilon)\\right]} \\,+\\, g(x)h'(x) && \\quad \\text{using the definition of }h'(x)\\\\\n",
"& = \\underset{\\epsilon \\to 0}\\lim{\\left[\\dfrac{g(x+\\epsilon) - g(x)}{\\epsilon}\\right]}\\underset{\\epsilon \\to 0}\\lim{h(x+\\epsilon)} + g(x)h'(x) && \\quad \\text{since the limit of a product is the product of the limits}\\\\\n", "& = \\underset{\\epsilon \\to 0}\\lim{\\left[\\dfrac{g(x+\\epsilon) - g(x)}{\\epsilon}\\right]}\\underset{\\epsilon \\to 0}\\lim{h(x+\\epsilon)} + g(x)h'(x) && \\quad \\text{since the limit of a product is the product of the limits}\\\\\n",
"& = \\underset{\\epsilon \\to 0}\\lim{\\left[\\dfrac{g(x+\\epsilon) - g(x)}{\\epsilon}\\right]}h(x) + h(x)g'(x) && \\quad \\text{since } h(x) \\text{ is continuous}\\\\\n", "& = \\underset{\\epsilon \\to 0}\\lim{\\left[\\dfrac{g(x+\\epsilon) - g(x)}{\\epsilon}\\right]}h(x) + h(x)g'(x) && \\quad \\text{since } h(x) \\text{ is continuous}\\\\\n",
"& = g'(x)h(x) + g(x)h'(x) && \\quad \\text{using the definition of }g'(x)\n", "& = g'(x)h(x) + g(x)h'(x) && \\quad \\text{using the definition of }g'(x)\n",
@ -5620,7 +5617,7 @@
"& = \\underset{\\epsilon \\to 0}\\lim{\\left[\\dfrac{1}{\\epsilon} \\, \\ln\\left(1 + \\dfrac{\\epsilon}{x}\\right)\\right]} && \\quad \\text{just moving things around a bit}\\\\\n", "& = \\underset{\\epsilon \\to 0}\\lim{\\left[\\dfrac{1}{\\epsilon} \\, \\ln\\left(1 + \\dfrac{\\epsilon}{x}\\right)\\right]} && \\quad \\text{just moving things around a bit}\\\\\n",
"& = \\underset{\\epsilon \\to 0}\\lim{\\left[\\dfrac{1}{xu} \\, \\ln\\left(1 + u\\right)\\right]} && \\quad \\text{defining }u=\\dfrac{\\epsilon}{x} \\text{ and thus } \\epsilon=xu\\\\\n", "& = \\underset{\\epsilon \\to 0}\\lim{\\left[\\dfrac{1}{xu} \\, \\ln\\left(1 + u\\right)\\right]} && \\quad \\text{defining }u=\\dfrac{\\epsilon}{x} \\text{ and thus } \\epsilon=xu\\\\\n",
"& = \\underset{u \\to 0}\\lim{\\left[\\dfrac{1}{xu} \\, \\ln\\left(1 + u\\right)\\right]} && \\quad \\text{replacing } \\underset{\\epsilon \\to 0}\\lim \\text{ with } \\underset{u \\to 0}\\lim \\text{ since }\\underset{\\epsilon \\to 0}\\lim u=0\\\\\n", "& = \\underset{u \\to 0}\\lim{\\left[\\dfrac{1}{xu} \\, \\ln\\left(1 + u\\right)\\right]} && \\quad \\text{replacing } \\underset{\\epsilon \\to 0}\\lim \\text{ with } \\underset{u \\to 0}\\lim \\text{ since }\\underset{\\epsilon \\to 0}\\lim u=0\\\\\n",
"& = \\underset{u \\to 0}\\lim{\\left[\\dfrac{1}{x} \\, \\ln\\left((1 + u)^{1/u}\\right)\\right]} && \\quad \\text{since }a\\ln(b)=\\ln(a^b)\\\\\n", "& = \\underset{u \\to 0}\\lim{\\left[\\dfrac{1}{x} \\, \\ln\\left((1 + u)^{1/u}\\right)\\right]} && \\quad \\text{since }a\\ln(b)=\\ln(b^a)\\\\\n",
"& = \\dfrac{1}{x}\\underset{u \\to 0}\\lim{\\left[\\ln\\left((1 + u)^{1/u}\\right)\\right]} && \\quad \\text{taking }\\dfrac{1}{x} \\text{ out since it does not depend on }\\epsilon\\\\\n", "& = \\dfrac{1}{x}\\underset{u \\to 0}\\lim{\\left[\\ln\\left((1 + u)^{1/u}\\right)\\right]} && \\quad \\text{taking }\\dfrac{1}{x} \\text{ out since it does not depend on }\\epsilon\\\\\n",
"& = \\dfrac{1}{x}\\ln\\left(\\underset{u \\to 0}\\lim{(1 + u)^{1/u}}\\right) && \\quad \\text{taking }\\ln\\text{ out since it is a continuous function}\\\\\n", "& = \\dfrac{1}{x}\\ln\\left(\\underset{u \\to 0}\\lim{(1 + u)^{1/u}}\\right) && \\quad \\text{taking }\\ln\\text{ out since it is a continuous function}\\\\\n",
"& = \\dfrac{1}{x}\\ln(e) && \\quad \\text{since }e=\\underset{u \\to 0}\\lim{(1 + u)^{1/u}}\\\\\n", "& = \\dfrac{1}{x}\\ln(e) && \\quad \\text{since }e=\\underset{u \\to 0}\\lim{(1 + u)^{1/u}}\\\\\n",
@ -5644,9 +5641,9 @@
"\n", "\n",
"We know the derivative of the exponential: $g'(x)=e^x$. We also know the derivative of the natural logarithm: $\\ln'(x)=\\dfrac{1}{x}$ so $h'(x)=\\dfrac{r}{x}$. Therefore:\n", "We know the derivative of the exponential: $g'(x)=e^x$. We also know the derivative of the natural logarithm: $\\ln'(x)=\\dfrac{1}{x}$ so $h'(x)=\\dfrac{r}{x}$. Therefore:\n",
"\n", "\n",
"$f'(x) = \\dfrac{r}{x}\\exp\\left({\\ln(x^r)}\\right)$\n", "$f'(x) = \\dfrac{r}{x} e^{\\ln(x^r)}$\n",
"\n", "\n",
"Since $a = \\exp(\\ln(a))$, this equation simplifies to:\n", "Since $e^{\\ln(a)} = a$, this equation simplifies to:\n",
"\n", "\n",
"$f'(x) = \\dfrac{r}{x} x^r$\n", "$f'(x) = \\dfrac{r}{x} x^r$\n",
"\n", "\n",
@ -5657,7 +5654,7 @@
"Note that the power rule works for any $r \\neq 0$, including negative numbers and real numbers. For example:\n", "Note that the power rule works for any $r \\neq 0$, including negative numbers and real numbers. For example:\n",
"\n", "\n",
"* if $f(x) = \\dfrac{1}{x} = x^{-1}$, then $f'(x)=-x^{-2}=-\\dfrac{1}{x^2}$.\n", "* if $f(x) = \\dfrac{1}{x} = x^{-1}$, then $f'(x)=-x^{-2}=-\\dfrac{1}{x^2}$.\n",
"* if $f(x) = \\sqrt(x) = x^{1/2}$, then $f'(x)=\\dfrac{1}{2}x^{-1/2}=\\dfrac{1}{2\\sqrt{x}}$" "* if $f(x) = \\sqrt{x} = x^{1/2}$, then $f'(x)=\\dfrac{1}{2}x^{-1/2}=\\dfrac{1}{2\\sqrt{x}}$"
] ]
}, },
{ {
@ -5800,17 +5797,17 @@
"source": [ "source": [
"The circle is the unit circle (radius=1).\n", "The circle is the unit circle (radius=1).\n",
"\n", "\n",
"Assuming $0 < \\theta < \\dfrac{\\pi}{2}$, the area of the blue triangle (area $\\mathrm{A}$) is equal to its height ($\\sin(\\theta)$), times its base ($\\cos(\\theta)$), divided by 2. So $\\mathrm{A} = \\dfrac{1}{2}\\sin(\\theta)\\cos(\\theta)$.\n", "Assuming $0 < \\theta < \\dfrac{\\pi}{2}$, the area of the blue triangle (area $\\mathrm{A}$) is equal to its height ($\\sin(\\theta)$) times its base ($\\cos(\\theta)$) divided by 2. So $\\mathrm{A} = \\dfrac{1}{2}\\sin(\\theta)\\cos(\\theta)$.\n",
"\n", "\n",
"The unit circle has an area of $\\pi$, so the circular sector (in the shape of a pizza slice) has an area of A + B = $\\pi\\dfrac{\\theta}{2\\pi} = \\dfrac{\\theta}{2}$.\n", "The unit circle has an area of $\\pi$, so the circular sector (in the shape of a pizza slice) has an area of A + B = $\\pi\\dfrac{\\theta}{2\\pi} = \\dfrac{\\theta}{2}$.\n",
"\n", "\n",
"Next, the large triangle (A + B + C) has an area equal to its height ($\\tan(\\theta)$) multiplied by its base (1) divided by 2, so A + B + C = $\\dfrac{\\tan(\\theta)}{2}$.\n", "Next, the large triangle (A + B + C) has an area equal to its height ($\\tan(\\theta)$) multiplied by its base (of length 1) divided by 2, so A + B + C = $\\dfrac{\\tan(\\theta)}{2}$.\n",
"\n", "\n",
"When $0 < \\theta < \\dfrac{\\pi}{2}$, we have $\\mathrm{A} < \\mathrm{A} + \\mathrm{B} < \\mathrm{A} + \\mathrm{B} + \\mathrm{C}$, therefore:\n", "When $0 < \\theta < \\dfrac{\\pi}{2}$, we have $\\mathrm{A} < \\mathrm{A} + \\mathrm{B} < \\mathrm{A} + \\mathrm{B} + \\mathrm{C}$, therefore:\n",
"\n", "\n",
"$\\dfrac{1}{2}\\sin(\\theta)\\cos(\\theta) < \\dfrac{\\theta}{2} < \\dfrac{\\tan(\\theta)}{2}$\n", "$\\dfrac{1}{2}\\sin(\\theta)\\cos(\\theta) < \\dfrac{\\theta}{2} < \\dfrac{\\tan(\\theta)}{2}$\n",
"\n", "\n",
"We can multiply all the terms by 2 to get rid of the $\\dfrac{1}{2}$ factors. We can also divide by $\\sin(\\theta)$, which is stricly positive (assuming $0 < \\theta < \\dfrac{\\pi}{2}$), so the inequalities still hold:\n", "We can multiply all the terms by 2 to get rid of the $\\dfrac{1}{2}$ factors. We can also divide by $\\sin(\\theta)$, which is strictly positive (assuming $0 < \\theta < \\dfrac{\\pi}{2}$), so the inequalities still hold:\n",
"\n", "\n",
"$cos(\\theta) < \\dfrac{\\theta}{\\sin(\\theta)} < \\dfrac{\\tan(\\theta)}{\\sin(\\theta)}$\n", "$cos(\\theta) < \\dfrac{\\theta}{\\sin(\\theta)} < \\dfrac{\\tan(\\theta)}{\\sin(\\theta)}$\n",
"\n", "\n",
@ -5843,7 +5840,7 @@
"\n", "\n",
"$\\dfrac{1}{cos(\\theta)} > \\dfrac{\\sin(\\theta)}{\\theta} > \\cos(\\theta)$\n", "$\\dfrac{1}{cos(\\theta)} > \\dfrac{\\sin(\\theta)}{\\theta} > \\cos(\\theta)$\n",
"\n", "\n",
"assuming $-\\dfrac{\\theta}{2} < \\theta < \\dfrac{\\pi}{2}$ and $\\theta \\neq 0$\n", "assuming $-\\dfrac{\\pi}{2} < \\theta < \\dfrac{\\pi}{2}$ and $\\theta \\neq 0$\n",
"<hr />\n", "<hr />\n",
"\n", "\n",
"Since $\\cos$ is a continuous function, $\\underset{\\theta \\to 0}\\lim\\cos(\\theta)=\\cos(0)=1$. Similarly, $\\underset{\\theta \\to 0}\\lim\\dfrac{1}{cos(\\theta)}=\\dfrac{1}{\\cos(0)}=1$.\n", "Since $\\cos$ is a continuous function, $\\underset{\\theta \\to 0}\\lim\\cos(\\theta)=\\cos(0)=1$. Similarly, $\\underset{\\theta \\to 0}\\lim\\dfrac{1}{cos(\\theta)}=\\dfrac{1}{\\cos(0)}=1$.\n",
@ -5872,12 +5869,12 @@
"\\begin{align*}\n", "\\begin{align*}\n",
"\\underset{\\theta \\to 0}\\lim\\dfrac{\\cos(\\theta) - 1}{\\theta} & = \\underset{\\theta \\to 0}\\lim\\dfrac{\\cos(\\theta) - 1}{\\theta}\\frac{\\cos(\\theta) + 1}{\\cos(\\theta) + 1} && \\quad \\text{ multiplying and dividing by }\\cos(\\theta)+1\\\\\n", "\\underset{\\theta \\to 0}\\lim\\dfrac{\\cos(\\theta) - 1}{\\theta} & = \\underset{\\theta \\to 0}\\lim\\dfrac{\\cos(\\theta) - 1}{\\theta}\\frac{\\cos(\\theta) + 1}{\\cos(\\theta) + 1} && \\quad \\text{ multiplying and dividing by }\\cos(\\theta)+1\\\\\n",
"& = \\underset{\\theta \\to 0}\\lim\\dfrac{\\cos^2(\\theta) - 1}{\\theta(\\cos(\\theta) + 1)} && \\quad \\text{ since }(a-1)(a+1)=a^2-1\\\\\n", "& = \\underset{\\theta \\to 0}\\lim\\dfrac{\\cos^2(\\theta) - 1}{\\theta(\\cos(\\theta) + 1)} && \\quad \\text{ since }(a-1)(a+1)=a^2-1\\\\\n",
"& = \\underset{\\theta \\to 0}\\lim\\dfrac{\\sin^2(\\theta)}{\\theta(\\cos(\\theta) + 1)} && \\quad \\text{ since }\\cos^2(\\theta) - 1 = \\sin^2(\\theta)\\\\\n", "& = \\underset{\\theta \\to 0}\\lim\\dfrac{-\\sin^2(\\theta)}{\\theta(\\cos(\\theta) + 1)} && \\quad \\text{ since }\\cos^2(\\theta) - 1 = -\\sin^2(\\theta)\\\\\n",
"& = \\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(\\theta)}{\\theta}\\dfrac{\\sin(\\theta)}{\\cos(\\theta) + 1} && \\quad \\text{ just rearranging the terms}\\\\\n", "& = -\\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(\\theta)}{\\theta}\\dfrac{\\sin(\\theta)}{\\cos(\\theta) + 1} && \\quad \\text{ just rearranging the terms}\\\\\n",
"& = \\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(\\theta)}{\\theta} \\, \\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(\\theta)}{\\cos(\\theta) + 1} && \\quad \\text{ since the limit of a product is the product of the limits}\\\\\n", "& = -\\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(\\theta)}{\\theta} \\, \\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(\\theta)}{\\cos(\\theta) + 1} && \\quad \\text{ since the limit of a product is the product of the limits}\\\\\n",
"& = \\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(\\theta)}{\\cos(\\theta) + 1} && \\quad \\text{ since } \\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(\\theta)}{\\theta}=1\\\\\n", "& = -\\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(\\theta)}{\\cos(\\theta) + 1} && \\quad \\text{ since } \\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(\\theta)}{\\theta}=1\\\\\n",
"& = \\dfrac{0}{1+1} && \\quad \\text{ since } \\underset{\\theta \\to 0}\\lim\\sin(\\theta)=0 \\text{ and } \\underset{\\theta \\to 0}\\lim\\cos(\\theta)=1\\\\\n", "& = -\\dfrac{0}{1+1} && \\quad \\text{ since } \\underset{\\theta \\to 0}\\lim\\sin(\\theta)=0 \\text{ and } \\underset{\\theta \\to 0}\\lim\\cos(\\theta)=1\\\\\n",
"& = 0\\\\\n", "& = 0 &&\n",
"\\end{align*}\n", "\\end{align*}\n",
"$\n", "$\n",
"\n", "\n",
@ -5911,7 +5908,7 @@
"\\begin{align*}\n", "\\begin{align*}\n",
"f'(x) & = \\underset{\\theta \\to 0}\\lim\\dfrac{f(x+\\theta) - f(x)}{\\theta} && \\quad\\text{by definition}\\\\\n", "f'(x) & = \\underset{\\theta \\to 0}\\lim\\dfrac{f(x+\\theta) - f(x)}{\\theta} && \\quad\\text{by definition}\\\\\n",
"& = \\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(x+\\theta) - \\sin(x)}{\\theta} && \\quad \\text{using }f(x) = \\sin(x)\\\\\n", "& = \\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(x+\\theta) - \\sin(x)}{\\theta} && \\quad \\text{using }f(x) = \\sin(x)\\\\\n",
"& = \\underset{\\theta \\to 0}\\lim\\dfrac{\\cos(x)\\sin(\\theta) + \\sin(x)\\cos(\\theta) - \\sin(x)}{\\theta} && \\quad \\text{since } cos(a+b)=\\cos(a)\\sin(b)+\\sin(a)\\cos(b)\\\\\n", "& = \\underset{\\theta \\to 0}\\lim\\dfrac{\\cos(x)\\sin(\\theta) + \\sin(x)\\cos(\\theta) - \\sin(x)}{\\theta} && \\quad \\text{since } \\sin(a+b)=\\cos(a)\\sin(b)+\\sin(a)\\cos(b)\\\\\n",
"& = \\underset{\\theta \\to 0}\\lim\\dfrac{\\cos(x)\\sin(\\theta)}{\\theta} + \\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(x)\\cos(\\theta) - \\sin(x)}{\\theta} && \\quad \\text{since the limit of a sum is the sum of the limits}\\\\\n", "& = \\underset{\\theta \\to 0}\\lim\\dfrac{\\cos(x)\\sin(\\theta)}{\\theta} + \\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(x)\\cos(\\theta) - \\sin(x)}{\\theta} && \\quad \\text{since the limit of a sum is the sum of the limits}\\\\\n",
"& = \\cos(x)\\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(\\theta)}{\\theta} + \\sin(x)\\underset{\\theta \\to 0}\\lim\\dfrac{\\cos(\\theta) - 1}{\\theta} && \\quad \\text{bringing out } \\cos(x) \\text{ and } \\sin(x) \\text{ since they don't depend on }\\theta\\\\\n", "& = \\cos(x)\\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(\\theta)}{\\theta} + \\sin(x)\\underset{\\theta \\to 0}\\lim\\dfrac{\\cos(\\theta) - 1}{\\theta} && \\quad \\text{bringing out } \\cos(x) \\text{ and } \\sin(x) \\text{ since they don't depend on }\\theta\\\\\n",
"& = \\cos(x)\\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(\\theta)}{\\theta} && \\quad \\text{since }\\underset{\\theta \\to 0}\\lim\\dfrac{\\cos(\\theta) - 1}{\\theta}=0\\\\\n", "& = \\cos(x)\\underset{\\theta \\to 0}\\lim\\dfrac{\\sin(\\theta)}{\\theta} && \\quad \\text{since }\\underset{\\theta \\to 0}\\lim\\dfrac{\\cos(\\theta) - 1}{\\theta}=0\\\\\n",