Estimating mean and standard deviation from numerical data

One way to estimate the mean is to find the median. There are 15 data points, so the median is the 8th value in sorted order: 4.19365, with 7 values larger and 7 smaller. The median can be a poor estimate of the mean if the distribution is very asymmetric, but these numbers go up to about 4.5 and down to about 3.9, so they look fairly well centered on 4.2.

H&H's "rough and ready" estimate (p. 11) for the standard deviation starts from the maximum value minus the mean, which here is about 0.3. Taking 2/3 of that gives an estimated standard deviation of 0.2.

Quantitative check

In [1]:
import numpy as np
In [2]:
data = np.array([4.1075, 4.39831, 4.19365, 4.20259, 4.26921,
                 4.13037, 3.97548, 4.51314, 4.01286, 4.0101,
                 4.15578, 4.35153, 4.30801, 4.21082, 3.94315])
In [3]:
np.std(data, ddof=1)  # ddof=1: sample standard deviation (divides by N - 1)
Out[3]:
0.1640200187259383

Not too bad! The rough estimate of 0.2 is within about 25% of the computed sample standard deviation of 0.164.
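
The by-eye steps can be checked numerically in the same way. The cell below is a minimal sketch, not part of the original notebook: it repeats the `data` array so that it runs on its own, and the names `median`, `mean`, `rough_sigma`, and `sample_sigma` are illustrative. It applies the 2/3 rule as described in the text above, to the maximum value minus the mean.

```python
import numpy as np

# Same 15 data points as in the cell above, repeated so this sketch is self-contained
data = np.array([4.1075, 4.39831, 4.19365, 4.20259, 4.26921,
                 4.13037, 3.97548, 4.51314, 4.01286, 4.0101,
                 4.15578, 4.35153, 4.30801, 4.21082, 3.94315])

# Median versus mean: these should be close if the data are roughly symmetric
median = np.median(data)
mean = np.mean(data)

# Rough estimate from the text: 2/3 of (maximum value minus the mean)
rough_sigma = 2 / 3 * (data.max() - mean)

# Sample standard deviation for comparison (ddof=1 divides by N - 1)
sample_sigma = np.std(data, ddof=1)

print(f"median       = {median:.5f}")
print(f"mean         = {mean:.5f}")
print(f"rough sigma  = {rough_sigma:.3f}")
print(f"sample sigma = {sample_sigma:.3f}")
```

Using unrounded values, the rough estimate comes out slightly above the 0.2 quoted earlier, because the maximum deviation is a little more than 0.3.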

In [4]:
%load_ext version_information
%version_information numpy
Out[4]:
Software  Version
Python    3.7.8 64bit [GCC 7.5.0]
IPython   7.17.0
OS        Linux 3.10.0 1127.19.1.el7.x86_64 x86_64 with centos 7.9.2009 Core
numpy     1.19.1
Sun Jan 16 15:32:31 2022 EST