{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Statystyka i Analiza danych\n",
    "# Laboratorium 10 - Test $\\chi^2$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Ćwiczenie 1: \n",
    "\n",
    "Typowy rozkład wiekowy ludności w pewnym regionie jest zgodny z podanym w tabeli Typowe. Natomiast w tabeli Zaobserwowane przedstawiono rozkład wiekowy ludności z miasteczka znajdującego się w rozpatrywanym regionie. Sprawdź czy rozkład ten można uznać za zgodny z rozkładem typowym dla tego regionu."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "kategorie <- c(\"<15\", \"15-24\", \"25-34\", \"35-44\", \"45-54\", \"55-64\", \"65-74\", \">74\")\n",
    "Typowe <- c(475,304,182,190,208,170,111,72)\n",
    "Zaobserwowane <- c(3016,2438,2037,2031,1253,977,585,163)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Sformułuj hipotezy zerową i alternatywną, zinterpretuj hipotezy. Przyjęty poziom istotności $\\alpha = 0.01$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# H0:\n",
    "# H1:\n",
    "alpha <- 0.01"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Oblicz liczbę stopni swobody:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df <- "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Oblicz tabelę wartości oczekiwanych:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Oczekiwane <- "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Oblicz wartość statystyki testowej:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "chi2 <- "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wyznacz wartość krytyczną i podejmij decyzję."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wykonaj test korzystając z funkcji `chisq.test()`, podając jako pierwszy argument liczebności zaobserwowane, a jako parametr *p* -- odpowiadające im prawdopodobieństwa założone w hipotezie zerowej"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "chisq.test(Zaobserwowane, p=Oczekiwane/sum(Oczekiwane))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Czyścimy workspace:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rm(list = ls())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Ćwiczenie 2\n",
    "\n",
    "Dane do tego zadania opisują wszystkie osoby (2201) na pokładzie dużego statku wycieczkowego z punktu widzenia 3 prostych atrybutów jakościowych i pewnego rodzaju, nieznanego, jakościowego atrybutu decyzyjnego (zadowolenie z wycieczki? sympatia do kapitana? Co to są za dane?). Sporządź tabelę wielodzielczą i wykres. Wykonaj test niezależności chi-kwadrat dla atrybutów: płeć i decyzja, klasa i decyzja."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Dane <- read.csv(url(\"http://www.cs.put.poznan.pl/swilk/siad/11-cw2.csv\"), sep=\";\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Przyjmij poziom istotności 0.001:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "alpha <- 0.001"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Utwórz tablicę wielodzielczą dla atrybutów wiek i płeć korzystając z funkcji `table()`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "WiekDecyzja <- "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Stwórz wykres tablicy korzystając z funkcji `plot()`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Oblicz sumę obserwacji oraz liczebności brzegowe korzystając z funkcji `margin.table()` podając jako pierwszy argument tablicę wielodzielczą, jako drugi 1 dla wierszy a 2 dla kolumn"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sum_rows <- margin.table( # ...\n",
    "sum_cols <- margin.table( # ...\n",
    "total <- "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Oblicz liczbę stopni swobody:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df <- "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Oblicz tablicę wartości oczekiwanych. Utwórz macierz korzystając z funkcji `outer(sum_rows, sum_cols)`, wykonującej iloczyn zewnętrzny/Kroneckera (iloczyny wszystkich kombinacji współrzędnych wektorów)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "E <- "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Oblicz wartość statystyki testowej:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "chi2  <- "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Podaj wartość krytyczną i podejmij decyzję odnośnie hipotezy zerowej:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wykonaj test korzystając z funkcji `chisq.test()` podając jako argument tablicę wielodzielczą"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "chisq.test(WiekDecyzja, correct=FALSE) #bez poprawki Yatesa"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wykonaj wszystkie powyższe czynności dla atrybutów klasa i decyzja. Tablica wielodzielcza:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "KlasaDecyzja <- "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wykres tablicy:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Suma obserwacji oraz liczebności brzegowe: "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "margin_rows <- \n",
    "margin_cols <- \n",
    "total <- "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Liczba stopni swobody:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df <- "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Tablica wartości oczekiwanych:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "E <-"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wartość statystyki testowej:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "chi2  <- "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wartość krytyczna i decyzja:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Test używając funkcji `chisq.test()`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Czyszczenie przestrzeni roboczej:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rm(list = ls())"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "R",
   "language": "R",
   "name": "ir"
  },
  "language_info": {
   "codemirror_mode": "r",
   "file_extension": ".r",
   "mimetype": "text/x-r-source",
   "name": "R",
   "pygments_lexer": "r",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}