5. Computational Statistics Lectures

Author:

Raymond Bisdorff, Emeritus Professor of Applied Mathematics and Computer Science, University of Luxembourg

Url:

https://rbisdorff.github.io/

Copyright:

Bisdorff © 2013-2023

5.1. Introduction

From 2007 to 2011, the Algorithmic Decision Theory COST Action IC0602, coordinated by Alexis Tsoukiàs, brought together researchers from fields such as Decision Theory, Discrete Mathematics, Theoretical Computer Science and Artificial Intelligence in order to improve decision support in the presence of massive databases, combinatorial structures, partial and/or uncertain information, and distributed, possibly interoperating decision makers.

One positive outcome, among others, of this COST Action was the organisation, from 2010 to 2020, of a semester course on Computational Statistics at the University of Luxembourg as part of its Master in Information and Computer Science.

Gathered below are 2x2 reduced copies of the presentation slides for the 8 lectures of the Winter Semester 2019.

5.2. Lectures

L1. Generating random numbers for simulations

On numbers “chosen at random”, computer generated random numbers and multiple recursive random number generators over F2
Recommendations and traps to watch for with home-brewed generators
Combining random number generators and testing randomness

L2. Introduction to statistical computing

On generating simulation data with Python and exploring simulation data with gretl. Getting started with R. Introducing R objects: vectors, matrices, lists, data frames. Reading CSV data files into data frames
Doing linear algebra in R. Constructing matrix objects, matrix operations and inversion, solving linear systems, eigenvalues and eigenvectors, singular value decomposition, Cholesky and QR decompositions
Principal component analysis (PCA) and discrete Markov chain simulation

L3. Continuous Random Variables

Probability distributions in R-core, simulating a continuous uniform random distribution, the spectral test for random number generators
Simulating random variables by a continuous inverse transform, standard exponential law based generators
Gaussian random variables: important properties, simulating Gaussian random variables
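
As an illustration of the continuous inverse transform named in this lecture, here is a minimal Python sketch (not taken from the slides, which work in R). It assumes the textbook formula X = -ln(1 - U) / rate for an exponential variate; the function name, the seed and the rate value are hypothetical choices made for this example.

```python
import math
import random


def exponential_inverse_transform(rate, rng):
    """Draw one Exp(rate) variate as -ln(1 - U) / rate, U uniform on [0, 1)."""
    u = rng.random()  # U in [0, 1), so 1 - U lies in (0, 1] and the log is finite
    return -math.log(1.0 - u) / rate


rng = random.Random(2019)  # arbitrary fixed seed, for reproducibility only
rate = 2.0                 # illustrative rate parameter
sample = [exponential_inverse_transform(rate, rng) for _ in range(100_000)]
sample_mean = sum(sample) / len(sample)  # theoretical mean is 1 / rate
```

With these settings the sample mean should land close to the theoretical mean 1 / rate.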

L4. Simulating from Discrete Random Variables

Simulating Bernoulli and Binomial random variables. The CLT for binomial distributions
Simulating a Poisson random variable and Poisson processes with exponential time intervals
Simulating Gamma variables, integer alpha parameter and the sum rule for Gamma Variables
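
The link between Poisson counts and exponential time intervals can be sketched in a few lines of Python. This is an illustrative example, not the lecture's own code: it counts arrivals in a time window when the gaps between arrivals are independent exponential variates, using the standard library's `random.expovariate`. The function name and the parameter values are assumptions made for this sketch.

```python
import random


def poisson_count(rate, horizon, rng):
    """Count arrivals in [0, horizon] when inter-arrival times are Exp(rate)."""
    t = rng.expovariate(rate)  # time of the first arrival
    count = 0
    while t <= horizon:
        count += 1
        t += rng.expovariate(rate)  # add the next exponential gap
    return count


rng = random.Random(42)  # arbitrary fixed seed
rate, horizon = 3.0, 1.0  # illustrative values; counts are Poisson(rate * horizon)
counts = [poisson_count(rate, horizon, rng) for _ in range(50_000)]
mean = sum(counts) / len(counts)
variance = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
```

For a Poisson distribution the mean and the variance coincide, so both estimates should be close to rate * horizon.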

L5. Simulating from arbitrary empirical random distributions

Single-pass estimation of arbitrary quantiles: computing sample quantiles, quantiles via selection algorithms, tracking the M-largest element in a single pass
Computing quantiles from binned data: equally binned observation data, linear integration formulas, regular binned data quantiles
Incremental quantile estimation with the IQ-agent: using the IQ-agent for Monte-Carlo simulations
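
One common way to track the M largest values of a stream in a single pass is a bounded min-heap; the sketch below assumes that approach and is not necessarily the algorithm presented in the slides. The function name is hypothetical; `heapq.heappush` and `heapq.heapreplace` are standard-library calls.

```python
import heapq


def m_largest_single_pass(stream, m):
    """Return the m largest values of an iterable, in decreasing order.

    Only m values are kept in memory; heap[0] is always the smallest of them.
    """
    heap = []
    for x in stream:
        if len(heap) < m:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            heapq.heapreplace(heap, x)  # drop the current smallest, keep x
    return sorted(heap, reverse=True)


top3 = m_largest_single_pass([5, 1, 9, 3, 7, 8, 2], 3)
```

The last entry of the returned list is the M-th largest value seen, which is the quantity needed for an upper sample quantile.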

L6. Two distributions, are they of the same kind?

Comparing statistical distributions: methodological approach and statistical tests
Comparing histograms: Chi-square test against a known distribution, comparing two binned data sets, testing uniform randomness
Comparing continuous distributions with the Kolmogorov-Smirnov test

L7. On Averaging

The benefit from averaging: the law of large numbers, estimating distribution parameters, how to reduce noise
Convergence of averaging: convergence of the mean for a standard Gaussian, what happens if there are outliers, non-convergence of a Cauchy mean
Comparing two empirical means: robustness of the t statistic, estimating t statistics, Monte Carlo simulation of the H0 rejection
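
The contrast between a Gaussian mean and a Cauchy mean can be explored with a short Python sketch. This is an illustrative example under stated assumptions, not the lecture's code: standard Cauchy variates are generated with the inverse transform tan(pi * (U - 1/2)), and the helper names, seed and checkpoints are hypothetical.

```python
import math
import random
import statistics


def standard_cauchy(rng):
    """Standard Cauchy variate via the inverse transform tan(pi * (U - 1/2))."""
    return math.tan(math.pi * (rng.random() - 0.5))


def running_means(values, checkpoints):
    """Return {n: mean of the first n values} for each n in checkpoints."""
    means, total, wanted = {}, 0.0, set(checkpoints)
    for n, x in enumerate(values, start=1):
        total += x
        if n in wanted:
            means[n] = total / n
    return means


rng = random.Random(7)  # arbitrary fixed seed
n_max = 100_000
checkpoints = [100, 1_000, 10_000, 100_000]

gauss = [rng.gauss(0.0, 1.0) for _ in range(n_max)]
cauchy = [standard_cauchy(rng) for _ in range(n_max)]

gauss_means = running_means(gauss, checkpoints)    # shrinks towards 0
cauchy_means = running_means(cauchy, checkpoints)  # need not settle down
cauchy_median = statistics.median(cauchy)          # the median does converge
```

The mean of n independent standard Cauchy variates is again standard Cauchy, so printing `cauchy_means` typically shows no stabilisation as n grows, whereas the Gaussian running mean and the Cauchy sample median both approach 0.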

L8. Accept-Reject Simulation Methods

Classical Monte Carlo Integration: principles and applications
Accept-reject simulation methods
Accept-reject simulation applications: pi estimation, Box-Muller transform, ratio-of-uniforms method
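
The pi estimation mentioned above is usually presented as follows; this Python sketch assumes that classical setup and is not copied from the slides. Points are drawn uniformly in the unit square and accepted when they fall inside the quarter disc of radius 1, whose area is pi / 4. The function name, seed and sample size are hypothetical.

```python
import random


def estimate_pi(n_points, rng):
    """Estimate pi as 4 times the acceptance rate of the quarter-disc test."""
    accepted = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()  # uniform point in the unit square
        if x * x + y * y <= 1.0:           # accept if inside the quarter disc
            accepted += 1
    return 4.0 * accepted / n_points


pi_hat = estimate_pi(200_000, random.Random(1))  # arbitrary seed and sample size
```

The standard error of this estimator decreases like 1 / sqrt(n_points), so each extra correct digit costs roughly a hundred times more points.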