{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "from ipywidgets import *\n",
    "import matplotlib.pyplot as plt\n",
    "from IPython.display import set_matplotlib_formats\n",
    "set_matplotlib_formats('svg')\n",
    "import numpy as np\n",
    "import scipy.stats as stats"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "def spearman_pearson(example=\"1\", show=False):\n",
    "    fig, axes = plt.subplots(figsize=(5,5))\n",
    "    n = 50\n",
    "    x = np.linspace(0, 1, n)\n",
    "    y = x \n",
    "    if example==\"2\":\n",
    "        y = x**5\n",
    "    elif example==\"3\":\n",
    "        y = x + np.random.normal(0, 0.2, n)\n",
    "        x[-5:] = np.linspace(2, 3, 5)\n",
    "    elif example==\"4\":\n",
    "        y = x + np.random.normal(0, 1, n)\n",
    "    plt.scatter(x, y)\n",
    "    plt.xlabel(\"x\")\n",
    "    plt.ylabel(\"y\")\n",
    "    r = round(stats.pearsonr(x,y)[0],2)\n",
    "    rs = round(stats.spearmanr(x,y)[0],2)\n",
    "    if show:\n",
    "        plt.title(r\"$r_{Pearson}=\"+str(r)+\"$   $r_{Spearman}=\"+str(rs)+\"$\")\n",
    "    plt.grid()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Testy nieparametryczne"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Test znaków"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "-  Odpowiednik sparowanego testu t dla dwóch populacji\n",
    "-  Założenia:\n",
    "\t-  próba losowa prosta\n",
    "\t-  pary $(X,Y)$ niezależne\n",
    "-  Skala co najmniej porządkowa\n",
    "-  Układ hipotez:\n",
    "\n",
    "$\\;\\;\\;\\;\\;H_0: p=P(Y>X)=0.5$\n",
    "\n",
    "$\\;\\;\\;\\;\\;H_1: p \\neq 0.5 |  p > 0.5 |  p < 0.5$\n",
    "\n",
    "-  Odrzucenie X=Y\n",
    "-  T: liczba znaków(+)\n",
    "-  $T_{H_0} \\sim B_n(0.5,n)$\n",
    "- Dla $np_0=n(1-p_0)>5$:\n",
    "\n",
    "$$Z=\\frac{T-np_0}{\\sqrt{np_0(1-p_0)}} \\sim N(0,1)$$\n",
    "\n",
    "$$Z =\\frac{2T-n}{\\sqrt{n}} \\sim N(0,1)$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Test znaków - przykład"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "|   lp | test 1 | test 2 | różnica | znak |\n",
    "| ---  |  ---   |  ---   |  ---    | --- |\n",
    "|   1 | 1.00 | 10.00 | 9.00 | $+$ |\n",
    "|     2 | 4.00 | 6.00 | 2.00 | $+$ |\n",
    "|     3 | 2.00 | 8.00 | 6.00 | $+$ |\n",
    "|     4 | 3.00 | 9.00 | 6.00 | $+$ |\n",
    "|     5 | 0.00 | 0.00 | \\textbf{0.00} |\n",
    "|     6 | 5.00 | 9.00 | 4.00 | $+$ |\n",
    "|     7 | 10.00 | 7.00 | -3.00 | $-$|\n",
    "|     8 | 9.00 | 5.00 | -4.00 | $-$|\n",
    "|     9 | 8.00 | 7.00 | -1.00 | $-$|\n",
    "|     10 | 8.00 | 4.00 | -4.00 | $-$|\n",
    "|     11 | 2.00 | 5.00 | 3.00 | $+$|\n",
    "|     12 | 3.00 | 5.00 | 2.00 | $+$|\n",
    "|     13 | 6.00 | 4.00 | -2.00 | $-$|\n",
    "|     14 | 5.00 | 7.00 | 2.00 | $+$|\n",
    "|     15 | 8.00 | 8.00 | \\textbf{0.00} |\n",
    "|     16 | 1.00 | 9.00 | 8.00 | $+$|"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\alpha = 0.05$\n",
    "\n",
    "$H_0: p=0.5$\n",
    "<br>$H_1: p \\neq 0.5$\n",
    "\n",
    "$n=14$\n",
    "\n",
    "$S_n=T=9$\n",
    "\n",
    "$Z=\\frac{2T-n}{\\sqrt{n}}=\\frac{2\\cdot9-14}{\\sqrt{14}}=1.069$\n",
    "\n",
    "$\\textrm{Zbiór krytyczny: }(-\\infty, -1.96)\\cup(1.96,\\infty)$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Test Wilcoxona"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "-  Odpowiednik sparowanego testu t dla dwóch populacji\n",
    "-  Założenia:\n",
    "\t-  rozkład różnic symetryczny\n",
    "\t-  różnice są niezależne\n",
    "-  Układ hipotez:\n",
    "\n",
    "$\\;\\;\\;\\;\\;H_0: median(Y-X)=0$\n",
    "\n",
    "$\\;\\;\\;\\;\\;H_1: median(Y-X)\\neq0$\n",
    "\n",
    "-  Rangowanie wartości bezwzględnych różnic\n",
    "\n",
    "-  Dla równych różnic średnia arytmetyczna rang\n",
    "\n",
    "-  Statystyka $T=min[\\sum(+),\\sum(-)]$\n",
    "\n",
    "-  Zbiór krytyczny: $C_{kr}=[0, T_{kr}]$\n",
    "\n",
    "-  Dla dużych prób:\n",
    "\n",
    "$\\;\\;\\;\\;\\;\\mu_T = \\frac{n(n+1)}{4}$\n",
    "\n",
    "$\\;\\;\\;\\;\\;\\sigma_T = \\sqrt{\\frac{n(n+1)(2n+1)}{24}}$\n",
    "\n",
    "$\\;\\;\\;\\;\\;Z = \\frac{T-\\mu_T}{\\sigma_T}$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Test Wilcoxona - przykład"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "|   lp | test 1 | test 2 | różnica | moduł różnicy | ranga |\n",
    "| ---  |  ---   |  ---   | ---     |  ---          |  ---  |\n",
    "|   1 | 1.00 | 10.00 | 9.00 | 9.00 | |\n",
    "|     2 | 4.00 | 6.00 | 2.00 | 2.00 ||\n",
    "|     3 | 2.00 | 8.00 | 6.00 | 6.00 ||\n",
    "|     4 | 3.00 | 9.00 | 6.00 | 6.00 ||\n",
    "|     5 | 0.00 | 0.00 | 0.00 | 0.00 | |\n",
    "|     6 | 5.00 | 9.00 | 4.00 | 4.00 ||\n",
    "|     7 | 10.00 | 7.00 | -3.00 | 3.00 ||\n",
    "|     8 | 9.00 | 5.00 | -4.00 | 4.00 ||\n",
    "|     9 | 8.00 | 7.00 | -1.00 | 1.00 ||\n",
    "|     10 | 8.00 | 4.00 | -4.00 |4.00||\n",
    "|     11 | 2.00 | 5.00 | 3.00 | 3.00||\n",
    "|     12 | 3.00 | 5.00 | 2.00 | 2.00||\n",
    "|     13 | 6.00 | 4.00 | -2.00 | 2.00||\n",
    "|     14 | 5.00 | 7.00 | 2.00 | 2.00||\n",
    "|     15 | 8.00 | 8.00 | 0.00 | 0.00 ||\n",
    "|     16 | 1.00 | 9.00 | 8.00 | 8.00|| "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "|   lp | test 1 | test 2 | różnica | moduł różnicy | ranga |\n",
    "| ---  |  ---   |  ---   | ---     |  ---          |  ---  |\n",
    "|   1 | 1.00 | 10.00 | 9.00 | 9.00 | \\textbf{14.00}|\n",
    "|     2 | 4.00 | 6.00 | 2.00 | 2.00 | \\textbf{3.50}|\n",
    "|     3 | 2.00 | 8.00 | 6.00 | 6.00 | \\textbf{11.50}|\n",
    "|     4 | 3.00 | 9.00 | 6.00 | 6.00 | \\textbf{11.50}|\n",
    "|     5 | 0.00 | 0.00 | 0.00 | 0.00 | $-$|\n",
    "|     6 | 5.00 | 9.00 | 4.00 | 4.00 | \\textbf{9.00}|\n",
    "|     7 | 10.00 | 7.00 | -3.00 | 3.00 | 6.50|\n",
    "|     8 | 9.00 | 5.00 | -4.00 | 4.00 | 9.00|\n",
    "|     9 | 8.00 | 7.00 | -1.00 | 1.00 | 1.00|\n",
    "|     10 | 8.00 | 4.00 | -4.00 |4.00| 9.00|\n",
    "|     11 | 2.00 | 5.00 | 3.00 | 3.00| \\textbf{6.50}|\n",
    "|     12 | 3.00 | 5.00 | 2.00 | 2.00| \\textbf{3.50}|\n",
    "|     13 | 6.00 | 4.00 | -2.00 | 2.00| 3.50|\n",
    "|     14 | 5.00 | 7.00 | 2.00 | 2.00| \\textbf{3.50}|\n",
    "|     15 | 8.00 | 8.00 | 0.00 | 0.00 | $-$|\n",
    "|     16 | 1.00 | 9.00 | 8.00 | 8.00| \\textbf{13.00}|"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\alpha = 0.05$\n",
    "\n",
    "$H_0: median(Y-X)=0$\n",
    "\n",
    "$H_1: median(Y-X)\\neq0$\n",
    "\n",
    "$\\sum(-) = 6.5 + 9 + 1 + 9 + 3.5 = 29$\n",
    "\n",
    "$\\sum(+) = 14 + 3.5 + 11.5 + 11.5 + 9 + 6.5 + 3.5 + 3.5 + 14 = 76$\n",
    "\n",
    "$T=min[\\sum(+),\\sum(-)] = 29$\n",
    "\n",
    "$\\textrm{Zbiór krytyczny: }[0,26]$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Współczynnik korelacji Spearmana"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Współczynnik korelacji Pearsona dla rang obserwacji\n",
    "- Jeśli brak rang wiązanych:\n",
    "\n",
    "$$r_s=1-\\frac{6\\cdot\\sum_{i=1}^{n}d_i^2}{n\\cdot(n^2-1)}$$\n",
    "\n",
    "- Test istotności:\n",
    "\n",
    "$\\;\\;\\;\\;\\;H_0: \\rho_s=0$\n",
    "<br>$\\;\\;\\;\\;\\;H_1: \\rho_s\\neq0$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Przykład"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "|X|-4|8|9|\n",
    "|--|--|--|--|\n",
    "|Y|10|2|1|"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "|X    |-4|8|9|\n",
    "|--|--|--|--|\n",
    "|Ranga| | | |"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "|Y    |10|2|1|\n",
    "|--|--|--|--|\n",
    "|Ranga|  | | |"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\bar{x} = $\n",
    "\n",
    "$\\bar{y} = $"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$r = \\frac{\\sum_{i=1}^{n}(x_i-\\bar{x})(y_i-\\bar{y})}{\\sqrt{\\sum_{i=1}^{n}(x_i-\\bar{x})^2}\\sqrt{\\sum_{i=1}^{n}(y_i-\\bar{y})^2}}$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Współczynnik korelacji Pearsona a Spearmana"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "340c3d9062804660a1cd088e39ed75a7",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "interactive(children=(Dropdown(description='example', options=('1', '2', '3', '4'), value='1'), Checkbox(value…"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "<function __main__.spearman_pearson(example='1', show=False)>"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "interact(spearman_pearson, example=[\"1\",\"2\",\"3\", \"4\"])"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.9"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
   "autoclose": false,
   "autocomplete": true,
   "bibliofile": "biblio.bib",
   "cite_by": "apalike",
   "current_citInitial": 1,
   "eqLabelWithNumbers": true,
   "eqNumInitial": 1,
   "hotkeys": {
    "equation": "Ctrl-E",
    "itemize": "Ctrl-I"
   },
   "labels_anchors": false,
   "latex_user_defs": false,
   "report_style_numbering": false,
   "user_envs_cfg": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}