[HOME]

Mathematical / Statistical Topics

The links below point to tutorials that I’ve written on various topics in statistical analysis, machine learning, and data science. Links to related resources (e.g., books on Bayesian data analysis) appear farther down the page.

NOTE: In case it helps, I run the Jupyter notebooks from Anaconda-Navigator using different environments for each notebook.

Extreme Value Analysis (EVA)

“EVA is widely used in many disciplines, such as structural engineering, finance, earth sciences, traffic prediction, and geological engineering. For example, EVA might be used in the field of hydrology to estimate the probability of an unusually large flooding event, such as the 100-year flood.” – Wikipedia

This is the PDF file from a presentation on Extreme Value Analysis (EVA) that I gave to the central Texas IEEE Section on Oct 20, 2022.
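To make the “100-year flood” in the quote above concrete, here is a minimal sketch of the arithmetic behind a return period. It is not taken from the presentation. It assumes that each year is independent and that a T-year event has an annual exceedance probability of 1/T. The function name `prob_at_least_one` is my own for illustration.

```python
def prob_at_least_one(return_period_years: float, horizon_years: int) -> float:
    """Chance of at least one exceedance within the horizon.

    Assumption: years are independent, each with annual exceedance
    probability 1 / return_period_years.
    """
    annual_p = 1.0 / return_period_years
    # Complement of "no exceedance in any of the years".
    return 1.0 - (1.0 - annual_p) ** horizon_years


# How likely is at least one 100-year flood over different horizons?
for horizon in (1, 30, 100):
    p = prob_at_least_one(100, horizon)
    print(f"{horizon:>3} years: {p:.3f}")
```

Under these assumptions, a 100-year flood is far from guaranteed within any given century, and it is far from impossible within a 30-year span.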

Covid-19 and Blood Types (using a Bayesian approach)

A recent paper by Jiao Zhao et al. reported that blood group (A, B, AB, or O) appears to affect the likelihood of becoming infected with the Covid-19 virus. In short, people with blood type A appear to be more susceptible to the virus, while people with blood type O appear to be less susceptible.

The authors of the paper performed several types of statistical analyses to arrive at their conclusion: one-way ANOVA, 2-tailed chi-square, and a meta-analysis using random effects models. In this Jupyter notebook, I’ve performed a different type of analysis, Bayesian Data Analysis (BDA), using the data reported in their paper.

[CAVEAT: No one has checked my work, so there could be errors in it.] This BDA appears to support their conclusion, and it also provides posterior density estimates for the proportions of A, B, AB, and O blood groups among the infected, along with credible intervals for those proportions. See the four posterior density plots at the end of the notebook.
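For readers who want a feel for how posterior proportions and credible intervals like these can be obtained, here is a minimal sketch using a Dirichlet-multinomial model. It is an assumption on my part that this model is a reasonable stand-in, and it is not necessarily the model used in the notebook. The counts are made-up placeholders, NOT the data from the paper, and the helper names are hypothetical.

```python
import random


def dirichlet_sample(alphas, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]


def posterior_summary(counts, prior=1.0, draws=5000, seed=0):
    """Posterior mean and 95% credible interval for each proportion.

    Assumption: symmetric Dirichlet(prior) prior on the proportions,
    so the posterior is Dirichlet(count + prior).
    """
    rng = random.Random(seed)
    groups = list(counts)
    alphas = [counts[g] + prior for g in groups]
    samples = [dirichlet_sample(alphas, rng) for _ in range(draws)]
    summary = {}
    for i, group in enumerate(groups):
        column = sorted(s[i] for s in samples)
        lower = column[int(0.025 * draws)]
        upper = column[int(0.975 * draws) - 1]
        summary[group] = (sum(column) / draws, lower, upper)
    return summary


# Hypothetical counts for illustration only (not from the paper).
hypothetical_counts = {"A": 40, "B": 25, "AB": 10, "O": 25}
for group, (mean, lower, upper) in posterior_summary(hypothetical_counts).items():
    print(f"{group:>2}: mean={mean:.3f}, 95% interval=({lower:.3f}, {upper:.3f})")
```

The notebook itself should be treated as the reference for the actual model, priors, and data.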

Bayesian Beta-Binomial Example

This Jupyter notebook provides a very simple example of Bayesian parameter estimation using the Beta-Binomial model. Both analytical and simulation-based results are presented, and three different approaches are used to obtain a parameter estimate for this model.
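As a rough illustration of the Beta-Binomial model itself (not the notebook's code), the sketch below compares the analytical posterior mean with a simulation-based estimate. The prior, the data (7 successes in 10 trials), and the function names are all assumptions chosen for the example.

```python
import random


def beta_binomial_posterior(a, b, successes, trials):
    """Conjugate update: Beta(a, b) prior plus binomial data gives
    a Beta(a + successes, b + failures) posterior."""
    return a + successes, b + (trials - successes)


def beta_mean(a, b):
    """Analytical mean of a Beta(a, b) distribution."""
    return a / (a + b)


def simulated_mean(a, b, draws=20000, seed=0):
    """Monte Carlo estimate of the Beta(a, b) mean."""
    rng = random.Random(seed)
    return sum(rng.betavariate(a, b) for _ in range(draws)) / draws


# Hypothetical setup: uniform Beta(1, 1) prior, 7 successes in 10 trials.
post_a, post_b = beta_binomial_posterior(1.0, 1.0, successes=7, trials=10)
print(f"posterior: Beta({post_a}, {post_b})")
print(f"analytical mean: {beta_mean(post_a, post_b):.4f}")
print(f"simulated mean:  {simulated_mean(post_a, post_b):.4f}")
```

The two means should agree closely, which is a handy sanity check before moving on to models that have no closed-form posterior.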

Monoids 101 for Apache Spark

This Jupyter notebook describes what monoids are and the role they play in reduction and aggregation in Spark, specifically PySpark. It includes several examples that illustrate the use of the monoid concept.
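To give a flavor of the idea, here is a minimal sketch in plain Python rather than PySpark, so that it runs without a Spark installation. It is not taken from the notebook. It treats `(sum, count)` pairs as a monoid: `combine` is associative and `ZERO` is its identity, which is what lets each partition be reduced on its own before the partial results are merged. The partition layout and all names are assumptions for illustration.

```python
from functools import reduce

# Identity element: combining with ZERO leaves a value unchanged.
ZERO = (0.0, 0)


def combine(left, right):
    """Associative operation on (sum, count) pairs."""
    return (left[0] + right[0], left[1] + right[1])


def lift(x):
    """Map a raw number into the monoid."""
    return (float(x), 1)


def reduce_partition(values):
    """Reduce one partition; an empty partition yields ZERO."""
    return reduce(combine, map(lift, values), ZERO)


# Hypothetical partitions, including an empty one.
partitions = [[1, 2, 3], [], [4, 5], [6]]
partials = [reduce_partition(p) for p in partitions]

# Merge the per-partition results; associativity makes grouping irrelevant.
total, count = reduce(combine, partials, ZERO)
mean = total / count
print(f"sum={total}, count={count}, mean={mean}")
```

In PySpark, the same identity and combine function could, as far as I know, be passed to `RDD.fold(zeroValue, op)`, or split into the `seqOp` and `combOp` arguments of `RDD.aggregate`. Note that a plain mean is not a monoid by itself, which is why the sketch carries the sum and the count together.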

Misc. Resources
