ANOVA, or analysis of variance, is the name given to a set of statistical models used to analyze differences among groups and to determine whether those differences are statistically significant enough to support a conclusion. The models were developed by statistician and evolutionary biologist Ronald Fisher. To give a very simplistic definition – ANOVA… Continue reading Understanding ANOVA
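To make the "differences among groups" idea concrete, here is a minimal sketch of the one-way ANOVA F statistic, which compares variation between group means with variation inside the groups. It is written in plain Python rather than the R or T-SQL this blog normally uses, and the function name `one_way_anova_f` and the three small groups are made up for illustration; they are not taken from the post.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations

    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical data: three groups of three measurements each.
sample_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = one_way_anova_f(sample_groups)
print(f"F = {f_stat:.2f}")
```

A large F value suggests the group means differ by more than the within-group noise would explain. Turning F into a p-value needs the F distribution, which this sketch leaves out.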
R is particularly good at drawing graphs from data. Some graphs are familiar to most DBAs because we have seen and used them over time – bar charts, pie charts and so on. Some are not. Understanding exploratory graphics is vitally important to the R programmer/data science newbie. This week I wanted… Continue reading Box-and-whisker plot and data patterns with R and T-SQL
What is the difference between reading numbers as they are presented, and interpreting them in a mature, deeper way? One way to approach the latter is perhaps what statisticians call a ‘confidence interval’. Suppose I look at a sample of 100 Americans who are asked if they approve of the job the Supreme Court is… Continue reading Confidence Intervals for a proportion – using R
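As a rough illustration of the idea, here is a minimal sketch of a 95% confidence interval for a proportion using the normal (Wald) approximation. It is written in plain Python rather than R, and it is not necessarily the method the full post uses. The function name `proportion_ci` and the count of 40 approvals out of 100 are assumptions made up for this example; the teaser above does not give the actual survey result.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    z=1.96 corresponds to roughly 95% confidence.
    """
    p_hat = successes / n                      # sample proportion
    se = math.sqrt(p_hat * (1 - p_hat) / n)    # standard error of p_hat
    margin = z * se
    return p_hat - margin, p_hat + margin


# Hypothetical count: 40 of 100 respondents approve (made-up number).
low, high = proportion_ci(40, 100)
print(f"95% CI: ({low:.3f}, {high:.3f})")  # prints: 95% CI: (0.304, 0.496)
```

So instead of reporting "40% approve", the deeper reading is "somewhere between about 30% and 50% approve", which is a much more honest picture of a sample of 100.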
I am still trying to get up to speed on blogging after a gap. Today I managed to push myself to write some R code and test it, and it worked. I am getting there, although it needs more work to turn into a blog post. So, here is another post along the lines of professional development.… Continue reading What is networking, really?
I have been trying to get my blogging going again after a gap of two months. It has been incredibly hard. To warm up, I decided to try some non-technical posts. One of them is something I have been wanting to write for a long time – with this year I will complete attending 14… Continue reading 14 years of Summit…
The past two months have been very hectic for me. I had an unexpected job offer towards the end of July, which I gladly accepted – that was followed by some much-needed home renovation, and a long vacation/tour of the west coast with my beloved sister. All of this has taken a toll on my… Continue reading Getting back to blogging
In this post we will explore a common statistical term – Relative Risk, otherwise called Risk Ratio. Relative Risk is a term that is important to understand when you are doing comparative studies of two groups that are different in some specific way. The most common usage of this is in drug testing – with… Continue reading Understanding Relative Risk – with T-SQL
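For a quick feel of the calculation, here is a minimal sketch of relative risk as the ratio of the event rate in one group to the event rate in the other. It is written in plain Python rather than the T-SQL the full post uses, and the function name `relative_risk` and all four counts are hypothetical values made up for illustration.

```python
def relative_risk(group_a_events, group_a_total, group_b_events, group_b_total):
    """Risk in group A divided by risk in group B."""
    risk_a = group_a_events / group_a_total
    risk_b = group_b_events / group_b_total
    return risk_a / risk_b


# Hypothetical counts: 30 of 200 people in group A had the outcome,
# versus 15 of 200 in group B.
rr = relative_risk(30, 200, 15, 200)
print(f"Relative risk: {rr:.2f}")  # prints: Relative risk: 2.00
```

A value of 1 means both groups carry the same risk. A value above 1 means the outcome is more likely in group A, and a value below 1 means it is less likely.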