Some basics around probabilities
basics of probability theory¶
In the context of a course in Computational Neuroscience, I am teaching a basic introduction to probabilities, Bayes' rule and the free-energy principle.
Let's learn to use probabilities in practice by generating some "synthetic data", that is, by using the computer's pseudo-random number generator (see also the companion notebook 2017-03-13_NeuroComp_FEP).
Let's begin with a die and the numpy.random library, which provides a wide range of functions to handle random variables:
dice¶
the numpy library is imported as a module:
import numpy as np
np
from which you can access the numpy.random submodule:
np.random
help(np.random.randint)
Let's roll a six-sided die 10 times (note that randint(6) returns integers from 0 to 5):
np.random.randint(6, size=10)
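With many more rolls, the empirical frequency of each face should approach 1/6. A quick sketch (the shift by 1 to get faces 1 to 6 is a convenience added here, not in the original):

```python
import numpy as np

# roll a six-sided die many times; np.random.randint(6) returns
# values in {0, ..., 5}, so we add 1 to get faces 1..6
rolls = np.random.randint(6, size=100_000) + 1

# the empirical frequency of each face should be close to 1/6 ≈ 0.167
faces, counts = np.unique(rolls, return_counts=True)
freqs = counts / rolls.size
print(dict(zip(faces.tolist(), np.round(freqs, 3).tolist())))
```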
a note on RNGs¶
On a computer, randomness is (in general) deterministic! A pseudo-random number generator produces a sequence that is fully determined by its seed:
np.random.randint(6, size=10)
np.random.seed(42)
np.random.randint(6, size=10)
np.random.seed(43)
np.random.randint(6, size=10)
np.random.seed(42)
np.random.randint(6, size=10)
np.random.seed(None)
np.random.randint(6, size=10)
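As a side note, recent versions of NumPy also offer a Generator API (np.random.default_rng) that keeps the random state local to a generator object instead of global; a minimal sketch of the same reproducibility experiment:

```python
import numpy as np

# create a generator with a fixed seed; its state is local, so it
# does not interfere with np.random.seed
rng = np.random.default_rng(42)
a = rng.integers(6, size=10)

# re-creating a generator with the same seed reproduces the same draws
rng2 = np.random.default_rng(42)
b = rng2.integers(6, size=10)
print(a, b)
```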
coin¶
result = np.random.randint(2, size=10)
print(result)
result.mean(), result.std(), result.var()
N = 1000
result = np.random.randint(2, size=N)
print('Mean ', result.mean(), ', std ', result.std())
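The empirical values can be checked against theory: a fair coin is a Bernoulli variable with p = 1/2, whose mean is p = 0.5 and whose standard deviation is sqrt(p(1-p)) = 0.5. A quick comparison:

```python
import numpy as np

# draw N fair coin flips and compare empirical moments to theory:
# mean = p = 0.5 and std = sqrt(p * (1 - p)) = 0.5
N = 1000
result = np.random.randint(2, size=N)
print('empirical mean:', result.mean(), ', empirical std:', result.std())
print('theoretical mean: 0.5 , theoretical std:', np.sqrt(.5 * .5))
```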
N = 1000
N_trials = 100
result = np.random.randint(2, size=(N, N_trials))
print('Mean ', result.mean(axis=0), ', std ', result.mean(axis=0).std())
print('Grand average=', result.mean())
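To make the indexing explicit: with size=(N, N_trials), taking the mean over axis=0 averages the N draws within each trial, yielding one average per trial. A small shape check:

```python
import numpy as np

# with a (N, N_trials) array, axis=0 collapses the N draws of each
# trial, leaving a vector of N_trials per-trial averages
N, N_trials = 1000, 100
result = np.random.randint(2, size=(N, N_trials))
trial_means = result.mean(axis=0)
print(trial_means.shape)
```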
%matplotlib inline
import matplotlib.pyplot as plt
N = 1000
N_trials = 100
result = np.random.randint(2, size=(N, N_trials))
print('Mean ', result.mean(axis=0), ', std ', result.mean(axis=0).std())
fig, ax = plt.subplots(figsize=(13, 8))
ax.hist(result.mean(axis=0), bins=np.linspace(0, 1, 100));
N = 100
N_trials = 1000
result = np.random.randint(2, size=(N, N_trials))
print('Mean ', result.mean(axis=0), ', std ', result.mean(axis=0).std())
fig, ax = plt.subplots(figsize=(13, 8))
ax.hist(result.mean(axis=0), bins=np.linspace(0, 1, 100));
N = 10000
N_trials = 10
result = np.random.randint(2, size=(N, N_trials))
print('Mean ', result.mean(axis=0), ', std ', result.mean(axis=0).std())
fig, ax = plt.subplots(figsize=(13, 8))
ax.hist(result.mean(axis=0), bins=np.linspace(0, 1, 100));
N = 10
N_trials = 10000
result = np.random.randint(2, size=(N, N_trials))
print('Mean ', result.mean(axis=0), ', std ', result.mean(axis=0).std())
fig, ax = plt.subplots(figsize=(13, 8))
ax.hist(result.mean(axis=0), bins=np.linspace(0, 1, 100));
introducing the normal variable: the Gaussian distribution¶
N = 10000
N_trials = 10000
result = np.random.randint(2, size=(N, N_trials))
print('Mean ', result.mean(axis=0), ', std ', result.mean(axis=0).std())
fig, ax = plt.subplots(figsize=(13, 8))
ax.hist(result.mean(axis=0), bins=np.linspace(0.48, .52, 100));
Central limit theorem:
Over multiple measurements, the average of independent, identically distributed random variables converges to a normal law: $$ p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \cdot \exp\left(- \frac{1}{2} \cdot \frac{(x - m)^2}{\sigma^2}\right) $$
(Moreover, we know from this theorem that the asymptotic mean is the mean of the random variable and that the variance of the average decreases inversely proportionally with the number of measurements.)
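The 1/N scaling of the variance can be checked numerically: for a fair coin the standard deviation of the trial averages should be close to 0.5 / sqrt(N). A sketch of that check across a few sample sizes:

```python
import numpy as np

# for each sample size N, the std of the per-trial averages should
# shrink like 0.5 / sqrt(N) (0.5 being the std of a single fair flip)
N_trials = 10_000
for N in (10, 100, 1000):
    means = np.random.randint(2, size=(N, N_trials)).mean(axis=0)
    print(N, means.std(), .5 / np.sqrt(N))
```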
mean = result.mean(axis=0).mean()
std = result.mean(axis=0).std()
bins = np.linspace(0.48, .52, 100)
fig, ax = plt.subplots(figsize=(13, 8))
ax.hist(result.mean(axis=0), bins=bins, density=True)
ax.plot(bins, 1 / np.sqrt(2*np.pi) / std * np.exp(-.5*(bins-mean)**2/std**2), 'r', lw=2 );
The central limit theorem holds for (nearly) any distribution of the underlying random variable (provided its variance is finite), for instance a uniform one:
N = 10000
N_trials = 1000
result = np.random.rand(N, N_trials)
print(result)
mean = result.mean(axis=0).mean()
std = result.mean(axis=0).std()
bins = np.linspace(0.48, .52, 100)
fig, ax = plt.subplots(figsize=(13, 8))
ax.hist(result.mean(axis=0), bins=bins, density=True)
ax.plot(bins, 1 / np.sqrt(2*np.pi) / std * np.exp(-.5*(bins-mean)**2/std**2), 'r', lw=2 );
N = 10000
N_trials = 1000
result = np.random.randn(N, N_trials) + .5
mean = result.mean(axis=0).mean()
std = result.mean(axis=0).std()
bins = np.linspace(0.48, .52, 100)
fig, ax = plt.subplots(figsize=(13, 8))
ax.hist(result.mean(axis=0), bins=bins, density=True)
ax.plot(bins, 1 / np.sqrt(2*np.pi) / std * np.exp(-.5*(bins-mean)**2/std**2), 'r', lw=2 );
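The same convergence can be checked with a strongly skewed distribution, for instance exponential draws (np.random.exponential has mean 1 and std 1 by default): the distribution of the trial averages is still approximately Gaussian, centered on 1 with a standard deviation close to 1/sqrt(N).

```python
import numpy as np

# exponential draws are very skewed, yet the per-trial averages
# concentrate around the mean (1) with std ≈ 1 / sqrt(N)
N, N_trials = 10_000, 1000
result = np.random.exponential(size=(N, N_trials))
means = result.mean(axis=0)
print(means.mean(), means.std(), 1 / np.sqrt(N))
```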