{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When we want to work with SQL easily in an Ipython notebook, we'll load the ipython-sql [extension](https://github.com/catherinedevlin/ipython-sql) as follows:\n",
    "\n",
    "**Note: DO NOT PANIC** \n",
    "* Don't worry if you get (a) a big red-highlighted warning or (b) a note that the extension has already been loaded!  As long as your SQL commands work, it's loaded properly!\n",
    "\n",
    "* If you don't understand all the SQL queries... very soon, you will!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Library/Python/2.7/site-packages/IPython/config.py:13: ShimWarning: The `IPython.config` package has been deprecated. You should import from traitlets.config instead.\n",
      "  \"You should import from traitlets.config instead.\", ShimWarning)\n",
      "/Library/Python/2.7/site-packages/IPython/utils/traitlets.py:5: UserWarning: IPython.utils.traitlets has moved to a top-level traitlets package.\n",
      "  warn(\"IPython.utils.traitlets has moved to a top-level traitlets package.\")\n"
     ]
    }
   ],
   "source": [
    "%load_ext sql"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we'll load an SQLite database stored as a file as follows:\n",
    "\n",
    "**NOTE: We load a file below (here, \"dataset_1.db\", which must be in the same directory as the notebook. You'll use this file throughout the next few lectures and PS1. Make sure to download the from the webpage!!**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "ERROR:root:Line magic function `%sql` not found.\n"
     ]
    }
   ],
   "source": [
    "%sql sqlite:///dataset_1.db"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets look at one table from the US National Oceanic and Atmospheric Administration (NOAA) Rainfall dataset- `precipitation_full`- having the following schema:\n",
    "\n",
    "> * `state_code`\n",
    "> * `station_id`\n",
    "> * `year`\n",
    "> * `month`\n",
    "> * `day`\n",
    "> * `hour`\n",
    "> * `precipitation`\n",
    "> * `flag_1`\n",
    "> * `flag_2`\n",
    "\n",
    "Each tuple in this table describes one hour of rainfall (`precipitation`- in hundredths of an inch) at one station (`station_id`) in one state (`state_code`).  Note that tuples with `hour=25` record the total rainfall for that day, and that we can ignore the values of attributes `flag_1` and `flag_2` for now.\n",
    "\n",
    "Let's take a look at some sample rows of the table via SQL queries."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that you can also load an empty in-memory database using:\n",
    "\n",
    "``%sql sqlite://``\n",
    "\n",
    "Now let's try out some queries!  (Don't worry, if you're new to SQL, we'll go over all the syntax in more depth next lecture)\n",
    "\n",
    "**`%sql` is used for single line SQL commands:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%sql SELECT * FROM precipitation_full LIMIT 5;"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%sql SELECT COUNT(*) FROM precipitation_full;"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And **`%%sql` is used for multi-line SQL commands:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%sql\n",
    "SELECT SUM(p.precipitation) \n",
    "FROM precipitation_full p, states s \n",
    "WHERE p.state_code = s.code AND s.abbrev = 'CA';"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also work with the output of the queries we issue:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "result = %sql SELECT * FROM states;"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "result.keys"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*Note that the first row is the header row with column names!*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "result[1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "result[1].name"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now feel free to have fun with the dataset- we'll see more of it later though!"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python [ipykernel_py3]",
   "language": "python",
   "name": "Python [ipykernel_py3]"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}