Statistics with R and SQL

Associative Statistics – One sample T-Test with TSQL and R

In this post I am going to explore a statistical procedure called the ‘One Sample T Test’. A T-Test is used to test the mean value of a data sample against a known mean of the entire population from which the sample came. An example would be, if the average voting preference of the…
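As a rough illustration of the idea in this excerpt, here is a minimal sketch of the one-sample t statistic. It is written in plain Python rather than the T-SQL and R used in the post, and the function name `one_sample_t` and the sample values are made up for the example; they are not taken from the post.

```python
import math
import statistics


def one_sample_t(sample, population_mean):
    """Return (t statistic, degrees of freedom) for a one-sample t-test.

    Hypothetical helper for illustration only; not the post's code.
    """
    n = len(sample)
    sample_mean = statistics.mean(sample)
    # statistics.stdev uses the n - 1 denominator (sample standard deviation).
    sample_sd = statistics.stdev(sample)
    standard_error = sample_sd / math.sqrt(n)
    t_stat = (sample_mean - population_mean) / standard_error
    return t_stat, n - 1


# Made-up data: eight observations tested against an assumed population mean of 4.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t_stat, dof = one_sample_t(sample, population_mean=4)
print(round(t_stat, 4), dof)
```

The t statistic would then be compared against a t distribution with `dof` degrees of freedom to get a p-value, a step this sketch leaves out.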

Statistics with R and SQL

Statistics with TSQL and R: Chi Square Test

As I move on from descriptive and largely univariate (one-variable) analysis of data to more multivariate data, one of the first data analysis tests that came to mind is the Chi Square Test. It is a very commonly used test to understand relationships between two variables that are largely categorical in nature.…
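To make the excerpt a little more concrete, the sketch below computes the Pearson chi-square statistic for a contingency table of two categorical variables. It is plain Python, not the post's T-SQL or R, and it assumes no continuity correction; the function name `chi_square_statistic` and the counts are invented for illustration.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts.

    `table` is a list of rows of observed counts. Hypothetical helper,
    with no Yates continuity correction applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Made-up 2x2 table of counts.
observed = [[10, 20], [20, 10]]
chi2, dof = chi_square_statistic(observed)
print(round(chi2, 4), dof)
```

A large statistic relative to the chi-square distribution with `dof` degrees of freedom suggests the two variables are not independent.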

Statistics with R and SQL

Statistics with T-SQL and R – the Pearson’s Correlation Coefficient

In this post I will explore the calculation of a very basic statistic based on the linear relationship between two variables: a number that tells you whether two numeric variables in a dataset are possibly correlated and, if so, to what degree. Pearson’s coefficient is a number that attempts to measure this…
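For readers who want to see the number being described, here is a minimal sketch of Pearson's correlation coefficient computed from its definition. It uses plain Python instead of the post's T-SQL and R; the function name `pearson_r` and the data are assumptions made for this example.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's r for two equal-length numeric sequences.

    Hypothetical helper for illustration; assumes neither sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Made-up data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate little or no linear relationship.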

DBA

Script for creating test data for odds ratio

Make sure you have a working version of SQL Server 2016.

USE [master]
GO
/****** Object: Database [WorldHealth] ******/
CREATE DATABASE [WorldHealth]
CONTAINMENT = NONE
ON PRIMARY ( NAME = N'WorldHealth', FILENAME = N'D:\Microsoft SQL Server\DATA\WorldHealth.mdf', SIZE = 8192KB, MAXSIZE = UNLIMITED, FILEGROWTH = 65536KB )
LOG ON ( NAME = N'WorldHealth_log', FILENAME…