{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Activity 6-1\n",
"------------\n",
"\n",
"The goal for this activity will be to compute some BCNF decompositions, using the tools from last lecture\n",
"\n",
"First we'll load those tools, and some sample data:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from closure import compute_closure, display_side_by_side, print_setup"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Connected: None@None'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%load_ext sql\n",
"%sql sqlite://"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n",
"Done.\n",
"1 rows affected.\n",
"1 rows affected.\n",
"1 rows affected.\n"
]
},
{
"data": {
"text/plain": [
"[]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql DROP TABLE IF EXISTS T;\n",
"CREATE TABLE T(course VARCHAR, classroom INT, time INT);\n",
"INSERT INTO T VALUES ('CS 364', 132, 900);\n",
"INSERT INTO T VALUES ('CS 245', 140, 1000);\n",
"INSERT INTO T VALUES ('EE 101', 210, 900);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Exercise 1\n",
"\n",
"First, let's decompose `T` into BCNF! Explicitly go through the steps of the BCNF algorithm using the `compute_closure` function, then decompose the following table (i.e. by creating new SQL tables) into BCNF:\n",
"\n",
"We've also made a function, `display_side_by_side`, for nicer display!"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"
\n",
" \n",
" course | \n",
" classroom | \n",
" time | \n",
"
\n",
" \n",
" CS 364 | \n",
" 132 | \n",
" 900 | \n",
"
\n",
" \n",
" CS 245 | \n",
" 140 | \n",
" 1000 | \n",
"
\n",
" \n",
" EE 101 | \n",
" 210 | \n",
" 900 | \n",
"
\n",
"
"
],
"text/plain": [
"[('CS 364', 132, 900), ('CS 245', 140, 1000), ('EE 101', 210, 900)]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql SELECT * FROM T;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We are given the following FDs:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Attributes = {time,classroom,course}\n",
"FDs:\n",
"\n",
"=> \tcourse -> classroom\n",
"\n",
"=> \ttime,classroom -> course\n"
]
}
],
"source": [
"A = set(['course', 'classroom', 'time'])\n",
"F = [('course', 'classroom'), (set(['classroom', 'time']), 'course')]\n",
"print_setup(A, F)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q:** What real-world constraints do these FDs express?\n",
"\n",
"Now, use the `compute_closure` function to help decompose this table to BCNF:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'classroom', 'course'}"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# We find an X s.t. X^+ != A and X^+ != X\n",
"X = set(['course'])\n",
"compute_closure(X, F)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Compose into two tables, $T_1$ and $T_2$:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n",
"Done.\n"
]
},
{
"data": {
"text/plain": [
"[]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"DROP TABLE IF EXISTS T1;\n",
"CREATE TABLE T1 AS SELECT DISTINCT * FROM (\n",
" SELECT course, classroom FROM T\n",
");"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n",
"Done.\n"
]
},
{
"data": {
"text/plain": [
"[]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"DROP TABLE IF EXISTS T2;\n",
"CREATE TABLE T2 AS SELECT DISTINCT * FROM (\n",
" SELECT course, time FROM T\n",
");"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now run the below to display the decomposed tables side-by-side:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"course | classroom |
---|
CS 364 | 132 |
CS 245 | 140 |
EE 101 | 210 |
course | time |
---|
CS 364 | 900 |
CS 245 | 1000 |
EE 101 | 900 |
"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"l = %sql SELECT * FROM T1;\n",
"r = %sql SELECT * FROM T2;\n",
"display_side_by_side(l,r)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q:** Is this now in BCNF?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Exercise 2\n",
"\n",
"In the next section of lecture, we'll discuss a shortcoming of BCNF decompositions; let's see if we can get a glimpse of this now.\n",
"\n",
"See if you can insert rows into $T_1$ and/or $T_2$ _which respect the local FDs that still hold_, such that **when $T_1$ and $T_2$ are now recomposed, the original FDs do not hold!**"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 rows affected.\n",
"1 rows affected.\n"
]
},
{
"data": {
"text/plain": [
"[]"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"INSERT INTO T1 VALUES ('CS 145', 132);\n",
"INSERT INTO T2 VALUES ('CS 145', 900);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, reconstruct and print the re-composed table using a SQL query:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" course | \n",
" classroom | \n",
" time | \n",
"
\n",
" \n",
" CS 364 | \n",
" 132 | \n",
" 900 | \n",
"
\n",
" \n",
" CS 245 | \n",
" 140 | \n",
" 1000 | \n",
"
\n",
" \n",
" EE 101 | \n",
" 210 | \n",
" 900 | \n",
"
\n",
" \n",
" CS 145 | \n",
" 132 | \n",
" 900 | \n",
"
\n",
"
"
],
"text/plain": [
"[('CS 364', 132, 900),\n",
" ('CS 245', 140, 1000),\n",
" ('EE 101', 210, 900),\n",
" ('CS 145', 132, 900)]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT T1.course, T1.classroom, T2.time\n",
"FROM T1, T2\n",
"WHERE T1.course = T2.course;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q:** What went wrong?? And how could we prevent this from occuring?"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 1
}