The lottery is supposed to be a game of chance. So let's check whether that's true!

For this study, I retrieved the past results of the French lottery, available here, from October 2008 to February 2016, for a total of 1152 draws.

The goals of this post are first to present a concrete application of statistical tests to real data, and then to look for some funny features that could be hiding in the lottery dataset. Since the chi2 test is the best-known statistical test, I will first give a short introduction to how it works, as an example of a statistical test, and show how to compute the statistic with an experiment on dice. Then we will move to the lottery dataset and check the randomness hypothesis on the draw results using our chi2 test knowledge. Finally, we will look for some behaviours that may appear in this dataset.

From time to time I will try to edit this article, adding results for new features that may appear in the data and updating the dataset with new draws. Here are the findings so far.

- Choose the 'numero chance' 1
- There are 42% fewer winners on Mondays than during the rest of the week. So why not play on Mondays, to have fewer opponents and fewer people to share a small win with?
- There are 42% fewer winners in June than during the rest of the year.

You can either read the text version of this article to learn more about the chi2 test and see the results, or the Python notebook version to see the code behind the results. There is less explanation in the notebook version, but it can be interesting to see what is behind the results. The Python code is also available on my GitHub.

In practice, the chi2 table can be constructed with a Monte Carlo method. The protocol consists of simulating a sample from the theoretical distribution, computing the chi2 statistic and storing it. We repeat this process many times (100 000 times, for instance). From all these realisations, we can then estimate the probability of obtaining a score greater than x when the sample really is drawn from the theoretical distribution.
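
To make this protocol concrete, here is a minimal sketch in plain Python for a fair six-sided die. It is only an illustration under my own assumptions, not the code from the notebook: the function names (`chi2_statistic`, `simulate_chi2_null`, `tail_probability`), the 60 rolls per sample, the fixed seed and the reduced number of simulations are all chosen for the example.

```python
import random


def chi2_statistic(observed, expected):
    """Pearson chi2 statistic: sum over categories of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def simulate_chi2_null(probs, n_draws, n_sims, seed=0):
    """Chi2 statistics of samples that really are drawn from `probs`."""
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    categories = range(len(probs))
    expected = [p * n_draws for p in probs]
    stats = []
    for _ in range(n_sims):
        # Simulate one sample from the theoretical distribution.
        sample = rng.choices(categories, weights=probs, k=n_draws)
        # Count how many times each category was observed.
        observed = [0] * len(probs)
        for c in sample:
            observed[c] += 1
        # Compute the statistic and store it.
        stats.append(chi2_statistic(observed, expected))
    return stats


def tail_probability(stats, x):
    """Estimated probability of a chi2 score greater than x."""
    return sum(s > x for s in stats) / len(stats)


# Fair die, 60 rolls per simulated sample (illustrative values).
# 10 000 simulations keep the run short; the text suggests 100 000.
die = [1 / 6] * 6
null_stats = simulate_chi2_null(die, n_draws=60, n_sims=10_000)
p_value = tail_probability(null_stats, 11.07)
```

The threshold 11.07 is the usual 5% critical value of the chi2 distribution with 5 degrees of freedom, so `p_value` should land near 0.05 if the simulation agrees with the standard table. Calling `tail_probability(null_stats, x)` for a range of `x` values gives a simulated version of one row of the table.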