{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### Simulating the PHYS 211 M&M lab\n", "\n", "NOTE: In this notebook I use the `stats` sub-module of `scipy` for all statistics functions, including generation of random numbers. Other modules offer overlapping functionality, e.g., the standard Python `random` module and the `numpy.random` module, but I do not use them here. The `stats` sub-module provides tools for a large number of distributions and a large, growing set of statistical functions, all with a unified class structure. (Namespace issues are also minimized.) See https://docs.scipy.org/doc/scipy/reference/stats.html." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from scipy import stats\n", "\n", "import matplotlib as mpl\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ " \n", "# M.L. modification of matplotlib defaults\n", "# Changes can also be put in matplotlibrc file, \n", "# or effected using mpl.rcParams[]\n", "plt.rc('figure', figsize = (6, 4.5)) # Reduces overall size of figures\n", "plt.rc('axes', labelsize=16, titlesize=14)\n", "plt.rc('figure', autolayout = True) # Adjusts subplot params for new size" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Intro\n", "+ 6 Colors: Yellow, Blue, Orange, Red, Green, and Brown\n", "+ Assume 60 M&Ms in every bag\n", "+ Assume equal probabilities (well mixed, large \"reservoir\")\n", "+ Assume 24 students (bags) per section" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To get started, sample one bag of M&Ms, and count the number of brown M&Ms.
\n", "Do this by generating 60 random integers from the set 0, 1, 2, 3, 4, 5, and let's say that \"brown\" = 0." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[3 0 5 1 2 5 4 5 2 0 5 5 5 1 5 2 0 0 4 1 5 2 1 5 0 4 4 0 3 5 5 0 4 4 0 2 4\n", " 0 3 2 2 1 5 3 1 3 2 2 2 3 1 4 5 3 3 3 1 2 1 1]\n" ] } ], "source": [ "bag = stats.randint.rvs(0,6,size = 60) # or np.random.randint(0,6,60)\n", "print(bag)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ Count the number of each color in the bag using `np.bincount(bag)`. The first element in the array is the number of occurrences of 0 in `bag`, the second element is the number of occurrences of 1, etc." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 9, 10, 11, 9, 8, 13], dtype=int64)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.bincount(bag)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ For our \"brown\" = 0 choice, the number of brown M&Ms is the first element in the array returned by `bincount`, or `np.bincount(bag)[0]`." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "9" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.bincount(bag)[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "+ Now sample many bags\n", "+ Record number of brown M&Ms in each bag" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([10., 10., 6., 10., 14., 11., 8., 13., 7., 16., 8., 12., 9.,\n", " 10., 13., 9., 9., 8., 6., 7., 13., 9., 15., 15.])" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Long version of sampling many bags\n", "nb = 24 # number of bags \n", "data_section = np.zeros(nb) # array for the data from one lab section\n", "for i in range(nb):\n", "    bag = stats.randint.rvs(0,6,size=60)\n", "    data_section[i] = np.bincount(bag)[0]\n", "\n", "data_section" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([13, 10, 9, 12, 6, 12, 12, 13, 6, 7, 14, 8, 10, 7, 7, 9, 19,\n", " 9, 8, 12, 6, 8, 8, 10], dtype=int64)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Concise version of sampling many bags\n", "nb = 24 # number of bags\n", "data_section = np.array([np.bincount(stats.randint.rvs(0,6,size=60))[0] for i in range(nb)])\n", "data_section" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(9.791666666666666, 3.0547663122114956, 0.6369628076720711)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Mean, population standard deviation, and standard error of the mean\n", "np.mean(data_section), np.std(data_section), np.std(data_section)/np.sqrt(len(data_section)-1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Answer for results from this single lab section:
\n", "$\\overline N = 9.8 \\pm 0.6$" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plt.figure()\n", "nbins = 20\n", "low = 0\n", "high = 20\n", "plt.hist(data_section,nbins,[low,high],rwidth = 0.9)\n", "plt.xlim(0,20)\n", "plt.title(\"Histogram of brown M&Ms per bag - Single Section\",fontsize=14)\n", "plt.xlabel(\"Number of brown M&Ms\")\n", "plt.ylabel(\"Occurences\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Version information\n", "`version_information` is from J.R. Johansson (jrjohansson at gmail.com); see Introduction to scientific computing with Python for more information and instructions for package installation.\n" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "%load_ext version_information" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "application/json": { "Software versions": [ { "module": "Python", "version": "3.11.5 64bit [MSC v.1916 64 bit (AMD64)]" }, { "module": "IPython", "version": "8.15.0" }, { "module": "OS", "version": "Windows 10 10.0.26100 SP0" }, { "module": "numpy", "version": "1.23.2" }, { "module": "scipy", "version": "1.11.1" }, { "module": "matplotlib", "version": "3.7.2" } ] }, "text/html": [ "
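<table><tr><th>Software</th><th>Version</th></tr><tr><td>Python</td><td>3.11.5 64bit [MSC v.1916 64 bit (AMD64)]</td></tr><tr><td>IPython</td><td>8.15.0</td></tr><tr><td>OS</td><td>Windows 10 10.0.26100 SP0</td></tr><tr><td>numpy</td><td>1.23.2</td></tr><tr><td>scipy</td><td>1.11.1</td></tr><tr><td>matplotlib</td><td>3.7.2</td></tr><tr><td colspan='2'>Sat Feb 08 14:33:34 2025 Eastern Standard Time</td></tr></table>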
" ], "text/latex": [ "\\begin{tabular}{|l|l|}\\hline\n", "{\\bf Software} & {\\bf Version} \\\\ \\hline\\hline\n", "Python & 3.11.5 64bit [MSC v.1916 64 bit (AMD64)] \\\\ \\hline\n", "IPython & 8.15.0 \\\\ \\hline\n", "OS & Windows 10 10.0.26100 SP0 \\\\ \\hline\n", "numpy & 1.23.2 \\\\ \\hline\n", "scipy & 1.11.1 \\\\ \\hline\n", "matplotlib & 3.7.2 \\\\ \\hline\n", "\\hline \\multicolumn{2}{|l|}{Sat Feb 08 14:33:34 2025 Eastern Standard Time} \\\\ \\hline\n", "\\end{tabular}\n" ], "text/plain": [ "Software versions\n", "Python 3.11.5 64bit [MSC v.1916 64 bit (AMD64)]\n", "IPython 8.15.0\n", "OS Windows 10 10.0.26100 SP0\n", "numpy 1.23.2\n", "scipy 1.11.1\n", "matplotlib 3.7.2\n", "Sat Feb 08 14:33:34 2025 Eastern Standard Time" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "version_information numpy, scipy, matplotlib" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 4 }