Statistics with R and SQL

Associative Statistics – One sample T-Test with TSQL and R

In this post I am going to explore a statistical procedure called the ‘One Sample T-Test’. A one-sample T-Test compares the mean of a data sample against the known mean of the population from which the sample was drawn. An example would be, if the average voting preference of the… Continue reading Associative Statistics – One sample T-Test with TSQL and R

Statistics with R and SQL

Statistics with TSQL and R: Chi Square Test

As I move on from descriptive, largely univariate (single-variable) analysis of data into multivariate analysis, one of the first tests that comes to mind is the Chi Square Test. It is very commonly used to understand the relationship between two variables that are largely categorical in nature.… Continue reading Statistics with TSQL and R: Chi Square Test

Statistics with R and SQL

Statistics with T-SQL and R – the Pearson’s Correlation Coefficient

In this post I will explore the calculation of a very basic statistic that describes the linear relationship between two variables: a number that tells you whether two numeric variables in a dataset are possibly correlated and, if so, to what degree. Pearson’s coefficient is a number that attempts to measure this… Continue reading Statistics with T-SQL and R – the Pearson’s Correlation Coefficient

Data Mining · Statistics with R and SQL

Descriptive Statistics with SQL and R – II

In the previous post I looked into some very basic and common measures of descriptive statistics (mean, median and mode) and how to derive them using T-SQL, R, and a combination of the two in SQL Server 2016. These measures are also called measures of ‘Central Tendency’. In this post I am going to… Continue reading Descriptive Statistics with SQL and R – II

Statistics with R and SQL

Script to create demo database and load data for statistics and R

Make sure you have a working install of SQL Server 2016. The size of the database is only 8 MB. USE [master] GO /****** Object: Database [WorldHealth] Script Date: 7/15/2016 4:44:58 PM ******/ CREATE DATABASE [WorldHealth] CONTAINMENT = NONE ON PRIMARY ( NAME = N'WorldHealth', FILENAME = N'D:\DATA\WorldHealth.mdf' , SIZE = 8192KB , MAXSIZE =… Continue reading Script to create demo database and load data for statistics and R