My Epic Life Quest

I have always maintained a private bucket list. I have not had the courage to actually put it down in writing – but this year I decided that it is time. My good friend Brent Ozar has been doing this for a few years now, and his list is my inspiration; he links to the original post that inspired him, too. My goals for the next 15 years are below. I put them in three categories – 2017 (this year), 2018 (next year), and long term (after that).

2017:
1 Complete the Microsoft Data Science Program and the Diploma in Healthcare Analytics from UC Davis.
2 Stick to blogging goals – one blog post per week, one contribution to sqlservercentral.com per two weeks.
3 Keep up exercise goals of 10,000 steps per day and one yoga workout per week.
4 Speak at the local user group as often as I can (my limitations with travel do not allow me to speak at too many SQL Saturdays or out-of-town events).
5 Submit to speak at PASS speaker idol event.
6 Hike the Grand Canyon with my sister – we are travelling companions and love to see places together.
7 See at least two new countries – Mexico with SQL Cruise, and one more towards the end of the year – undecided for now. But two countries it is.
8 Blog on books read, so that I can understand the time I devote to reading and the range I cover in that time.
9 Get home renovation work done – I am undecided on whether I want to keep this condo or sell it, but either way I'd have to get work done on it. Best if it got done this year, but it involves a considerable financial commitment that I am not sure I can meet. As of now it looks doable for this year, but may move to next year if I have to reset goals.
10 Increase my collection of annotated classics by year end. This is an ongoing goal to build a library for retirement. The only books I buy in print are annotated ones or those with pictures. There are not many of those, and my collection is already up to 30-40% of what I need. I keep adding to it at 3-4 books a year.
11 Take a course on cartooning and one on short story writing – both of these are pet hobbies I have never had as much time for as I'd like. This year I would like to at least take a course on each, to deepen my love and interest.
2018:
1  Submit to speak at PASS Summit.
2 Organize SQL Saturday #10 at Louisville (not clear how different this will be yet…).
3 Keep up same goals for exercising.
4 Visit one new country with my sister – I am looking at Bali/Indonesia now.
5 Visit one more new country, on SQL Cruise hopefully, or on my own. Either way, I do it.
6 Biggie – pay off my mortgage. Yes, this is important, and I am not that far away. The only thing that keeps me from it is being a bit undecided on how long I can live here, job opportunities being what they are. But I will assume those will stay the same, and in that case the house will be ready to be paid off in 2018.
7 Do actual analytics work – by this time I will have a reasonable understanding of R/SAS/Microsoft data science skills, and expect them to take me to the next level professionally.

Long term goals:
1 Be a consultant on analytics – right now I am looking at healthcare analytics, but the area of application may change depending on how my career pans out. In my 60s I hope to have the knowledge and expertise to be a consultant – hopefully without a lot of travel – make the kind of $$$ I want to make, and only work the hours I want to.
2 Be respected in SQL/Data community for both knowledge and community work.
3 Attend the weddings of my favorite niece and nephew – it has been 25+ years since I attended an Indian wedding. That goes back to the days when I attended the wedding of a very wealthy classmate/friend, got treated without much respect, and vowed to never set foot in another wedding again. And I have not. But I have two youngsters whose weddings I'd like to attend and break that vow – I will not name them, as I do not want them to feel pressure around getting married (which is already enormous in Indian culture) – but if they choose to and have a wedding, I'd love to attend and offer my love and blessings.
4 Buy a home by the beach in my native Chennai, where I want to retire. Honestly, this looks like the most difficult one to pull off. Real estate costs a lot of $$$, and homes by the beach are even more expensive. Also, for a lot of NRIs this depends on how well the USD holds up against other currencies. But I will put it out there and see what fate deals me and what I can do to make this happen.
5 Travel, travel, travel… this list is so enormously long that I don't know where to start, but I will list the top 12 countries I want to visit in this lifetime.
1 Bali-Indonesia – for the temples, culture, food and scenic beauty – I’ve read about Bali and always wanted to visit, but never found time.
2 Thailand-Singapore-Malaysia – I group these together as they are close and can be done in one trip. Singapore, for the sheer inspiration of one of the world's safest, most successful economies; Thailand and Malaysia, for Buddhist temples and culture.
3 Tulip/flower festival, Holland – I am a flower buff and have been enamored of Holland's flower festival for years. I am looking at a National Geographic tour for this but may look at other, cheaper options too.
4 Belgium – for Tintin and chocolate. Enough said.
5 Norway – for the wonderful fjords and scenic beauty.
6 UK – cultural connections and so many places to see. I think I’d need a month in London alone.
7 Revisit Italy and Spain with my sister. I so love these two countries – in particular Italy, for the food and the history. I love showing off what I've already seen to family members, and she is my first pick to show off to :))
8 Australia/New Zealand – no reason to not go, and I have family members there.
9 Costa Rica/Panama Canal – hopefully this will happen soon.
10 Switzerland – scenic beauty and chocolate, again.
11 Austria – Sound of Music Country, not possible to not visit.
12 Antarctica – last but not least, the ice-laden continent would be a great item to check off the bucket list.

Aside from these places, I'd love to see as many national parks in the USA as possible, and many, many places in India – too many to list. I am getting a National Geographic map with pins to put on my wall next year, and hopefully when I am done there won't be too many unpinned places on it. I look forward to updating this list each year and seeing where I get with it.

Thank you, Brent Ozar and Steve Jones, for your inspiration.


2016 – A Year to Remember

2016 has undoubtedly been a landmark year in my life. To me it marked my first conscious entry into middle age. It was the first year that I really pondered some of the questions that people need to think about as they get older – with a clarity I had not enjoyed before. I think that clarity only comes with age, and at the right time, no matter how hard we try to make it happen earlier. Some of these questions, for me, included –

1 How much longer do I want to work in IT?
2 Am I doing the kind of work that energizes me and makes me feel like I am contributing something real to the world in some way?
3 What do I want out of where I live? (Or in other words, am I happy with the connections I have and the social life I am having?)

I have been pondering these questions for a couple of years now, but it all came to a head towards the end of 2015. I was at a new job – a DBA position again. I was making great $$$, the benefits were very good, and the place was just a couple of miles from where I lived. But the job had certain issues that led me to ponder these questions more deeply. It became a mental struggle that made it very hard for me to go in to work every day with a positive mindset. I looked at my savings, and talked to some of the many connections I had made in the SQL community. It became clear to me that I needed a sabbatical to ponder some of this – with some part-time consulting work to keep up with my bills and stay in touch with technology. So I decided to leave the position to do just that – take a sabbatical with a part-time job and ponder what I wanted to do next.

Although it sounds like a romantic/cool thing to do now – it really did not feel that way. It felt like relief, and the extra time was a true blessing – but there were fears that went with it: fears that I was doing something very radical, that I would run out of money, or fall sick, on and on. In truth, none of that happened. I spent a good 4 months doing consulting, learning some great new things, catching up on my reading, taking walks, meditating, and pondering the questions I had set myself to answer. I came to understand that a switch to BI and analytics would be a better option for me, after two decades of production DBA work. I also figured that working in healthcare-related analytics would give me the kind of satisfaction I craved – that my work was making a difference, in some small way, to the bigger world, and was not just about putting out fires on servers.

By the end of March I had found myself a BI position with a healthcare analytics firm, not too far from home. It also involves a significant amount of DBA work, which I was glad for, as someone switching lanes. At the end of December I am glad to say that I am loving what I do, and plan to keep at it as long as I can. I am also blogging and writing articles on analytics, in addition to pursuing an associate degree. So it all lined up like it was meant to. The year was hard in so many ways – there were some health challenges towards the end of it, and finding time to put into learning is still very hard. But I believe I am on the right path, and will be guided towards my eventual goal of retiring happy and doing the kind of work I want to be doing.

My goals for 2017 are as below:
1 Make time for what matters – not let work run my life. By that I mean: eat well, exercise, meditate, take time to blog and learn outside work. The time management is easier said than done, but setting the goal is the first step.
2 Understand that time is limited, and retain that understanding proactively. This is the big difference in thinking between youth and middle age. I believe I have 10-12 years of full-time work left. In that time I want to be doing what I enjoy, and not give in to fears and insecurities.
3 Take time for the connections that matter – for friends, family and people who need me. To me that includes connecting with my family of origin (at least one trip to India every two years), connecting with #sqlfamily (PASS Summit, as many SQL Saturdays as possible), and staying active with the local community – organizing SQL Saturdays with my partners in crime, John Morehouse, Chris Yates and several loyal volunteers, speaking at the local user group and so on.

My suggestions for anyone else in the same place as I am – or getting there:

1 Life is short. If you are stuck in a seriously unhappy job or doing work that does not seem to mean anything – reconsider. Honor your heart’s calling, and take time to find it.
2 If you are over 45 – make a bucket list, and check off items. Make solid plans to get at least one or two items off the list every year.
3 Learn proactively – very few people I know got ahead by just learning on the job. You need a fantastically good job for that, and granted, there are a few, but not many of us are that lucky. How does one find time? Yes, that is a hard question, but it should never be left unanswered. Two simple things that I am doing are:
1 Listening to podcasts or watching Pluralsight videos while I exercise,
2 Blogging on one thing I learned every week.

I want to increase this as I go, but I am making strides even with this much.

Wish you all health, peace and happiness in 2017!! Thank you for reading.


Multivariate Variable Analysis using R

So far I’ve worked on simple analytical techniques using one or two variables in a dataset. This article is a sort of summary of the various techniques we can use for such datasets, depending on the types of variables in question – how to get the relevant summary statistics out, and how to plot an appropriate descriptive graph. There are fundamentally two types of variables –

1 Continuous – variables which are not restricted in their values, such as height, weight, temperature, etc.
2 Categorical – variables which can only take specific values, or values from a specific domain.
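A quick way to see this distinction in R is str(), which shows which columns of a dataset are factors (categorical) and which are numeric (continuous) – a small sketch using the built-in Orange dataset that also appears later in this post:

# Factors are categorical; numeric columns are continuous
str(Orange)
# $ Tree          : Ord.factor  -> categorical
# $ age           : num         -> continuous
# $ circumference : num         -> continuous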

The types of relationships we want to look at are

1 Categorical-Continuous
2 Categorical-Categorical
3 Continuous-Continuous

1 Categorical and Continuous Variables:  For this purpose we will use one of R’s built-in datasets, called Orange. This dataset has growth records for five orange trees, numbered 1 to 5, with their circumference and age. The categorical variable here is the tree; for the continuous variable we can consider either circumference or age.

1 List the dataset. We can do this by simply typing Orange.

2 To get the summary statistics for each tree –
by(Orange$circumference, Orange$Tree, summary)


3 Now we want to understand what this data is made of. What are the min and max values, which is the biggest tree and which is the smallest, what is the overlap in values, and so on. The easiest way to do this is to get a box-and-whisker plot of the data

boxplot(circumference~Tree, data=Orange, main="Circumference of Orange Trees",  xlab="Tree", ylab="Circumference", col=c("red","blue", "yellow","orange","darkgreen"))


From the plot we can see that tree 3 has the smallest circumference while tree 4 has the largest, with tree 2 close to tree 4. We can also see that tree 1 has the narrowest dispersion of circumference while tree 4 has the widest, closely followed by tree 2. We can also see that there are no significant outliers in this data.

2 Relationship between two categorical variables:

For this purpose I can use the built-in dataset in R called ‘HairEyeColor’. This dataset has gender-wise hair and eye color counts for a group of people. Gender, hair color and eye color are all categorical variables. For simplicity’s sake I am only going with gender and eye color.

Let us say we want the following information from this dataset –

1 % of men and women across eye colors (and the reverse – the gender split within each eye color)

2 % of men and women across hair colors (and the reverse – the gender split within each hair color)

3 Count of men and women in the mix (total)

4 Count of men and women for each eye/hair color

The code that accomplishes all of this for gender and eye color is as below:

# Flatten the data into a gender/eye color table
gendereyemix <- xtabs(Freq ~ Sex + Eye, data.frame(HairEyeColor))
# % of men and women across eye color
prop.table(gendereyemix, 1)
# % of men and women for each specific eye color
prop.table(gendereyemix, 2)
# Number of men and women in the mix
margin.table(gendereyemix, 1)
# Number of men and women per eye color
margin.table(gendereyemix, 2)


We can accomplish the same using 3 variables in the mix for both prop.table and margin.table functions.
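A minimal sketch of that three-variable version – the table name genderhaireyemix is mine, not from the original code:

# Keep all three categorical variables when flattening
genderhaireyemix <- xtabs(Freq ~ Sex + Eye + Hair, data.frame(HairEyeColor))
# Proportions that sum to 1 within each gender, across all eye/hair combinations
prop.table(genderhaireyemix, 1)
# Collapse back down to counts by gender and eye color only
margin.table(genderhaireyemix, c(1, 2))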

We can also run many tests of independence with them.
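For example, a chi-squared test of independence on the gendereyemix table built above – a sketch:

# Are gender and eye color independent? A small p-value suggests they are not.
chisq.test(gendereyemix)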

An appropriate chart for comparing categorical variables is the bar chart.

barplot(gendereyemix, main="Gender-Eye Color Distribution", xlab="Eye Color",
        col=c("darkblue","red"), legend=rownames(gendereyemix))

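As a variation, passing beside=TRUE draws the bars side by side instead of stacked, which can make the gender comparison within each eye color easier to read – a sketch:

barplot(gendereyemix, beside=TRUE, main="Gender-Eye Color Distribution",
        xlab="Eye Color", col=c("darkblue","red"), legend=rownames(gendereyemix))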

3 Relationship between two continuous variables:

For this let us consider the airquality data frame, and two of the continuous variables it has: wind speed and temperature. To begin with we can get a summary of wind speed at each temperature value with

by(airquality$Wind, airquality$Temp, summary)


Although this gives a decent idea of wind speeds at each temperature, we have no idea whether there is any specific relationship between the two. As a quick first check we can run a t-test on the two variables –

t.test(airquality$Wind, airquality$Temp)


The p-value is really small – but note what this t-test actually checks: whether the means of the two variables differ. That is not the same thing as the two being correlated; a correlation test is the more direct tool for that.
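A minimal sketch of that more direct check – cor.test reports the estimated (Pearson) correlation along with its own p-value:

# Test the correlation between wind speed and temperature directly
cor.test(airquality$Wind, airquality$Temp)

For this data it should report a negative correlation. A graph helps us see what that relationship looks like – I tried a scatter plot: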

plot(airquality$Wind, airquality$Temp, main="Scatterplot of Wind versus Temperature",
     xlab="Wind", ylab="Temperature", pch=19)


Now we add a smoothed line of best fit (lowess) to the graph

lines(lowess(airquality$Wind,airquality$Temp), col="blue")


From this we can see a negative relationship – temperature falls as wind speed increases.

These are just a few examples of how to relate variables to each other. In many cases there may be a lot more than two variables in the mix, and there are many strategies for studying the correlations. But understanding what types of variables we have, and which graphs and tests suit them best, helps further our data analysis skills. Thanks for reading!

Associative Analytics: Two sample T Test

In the previous post we looked at a one-sample t-test, which helped us determine whether a selected sample was truly representative of the larger population it came from. A two-sample t-test goes a step further – it helps us determine whether two samples came from the same population or, in other words, whether the difference between their means is zero.

I am using an example similar to the one in the earlier post – but with two people. There are two walkers, A and B. They want to know if having a heavy meal has any influence on walking. Below are the steps they took in the week following Thanksgiving – walker A (me), by the way, did not indulge in a heavy Thanksgiving meal, but walker B did. Our data is as below:

           Day 1   Day 2   Day 3   Day 4   Day 5   Day 6   Day 7
Walker A   10001   10300   10200    9900   10005    9900   10000
Walker B    8000    9200    8200    8900    9800    9900    9700

For applying the two sample T-test – we need the following conditions to be met:

1 There needs to be one continuous dependent variable (walking steps) and one categorical independent variable with two levels (whether the walker had a heavy meal, or not).

2 The two samples are independent – the walkers don’t walk together or at the same place.

3 The two samples follow normal distributions – there is no real, direct way to ensure this, and there are many debates on how to determine it. For simplicity’s sake I ran a normality check on each walker’s data and found nothing that stopped me, so am good with proceeding with the two sample t-test.
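Here is that check, using the Shapiro-Wilk test – a sketch, noting the direction of the interpretation: a p-value above 0.05 means we cannot reject normality:

# Normality check on each walker's steps (Shapiro-Wilk)
a <- c(10001, 10300, 10200, 9900, 10005, 9900, 10000)  # walker A
b <- c(8000, 9200, 8200, 8900, 9800, 9900, 9700)       # walker B
shapiro.test(a)  # p > 0.05 => no evidence against normality
shapiro.test(b)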

That done, let us define the null hypothesis, also called H0. Our null hypothesis here is that the two samples have no difference in means – in other words, that the meal had no influence on walking steps. The alternative hypothesis is the opposite: that walker B walked less than walker A because of the heavy meal.

There are also parameters you can pass to the t.test function in R – if the variances of the two datasets are the same, you have to say var.equal = TRUE to get the pooled test. In this case the variances happen to be quite different, so we can safely leave that out and go with the default (Welch) version.
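For comparison, the pooled (equal-variance) version would look like this – a sketch reusing the a and b vectors from the normality check above:

var(a); var(b)                  # compare the two sample variances first
t.test(a, b, var.equal = TRUE)  # pooled test - only appropriate if variances are similar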

Below are the calls to the function – first from R, then T-SQL, then R via T-SQL. For R I did not choose to connect to SQL, because the dataset in this case is seriously small.

I just open RStudio and run code as below:

Using R:

a <- c(10001, 10300, 10200, 9900, 10005, 9900, 10000)
b <- c(8000, 9200, 8200, 8900, 9800, 9900, 9700)
t.test(a, b)

R gives me a t value of 3.181, with 6.46 degrees of freedom.

I can do the same calculation of the T value using T-SQL. I cannot calculate the p value from T-SQL, as that comes from a distribution table, but it is possible to look it up. I imported the set of values into a table called WalkingSteps with two columns, walkerAsteps and walkerBsteps. For the math: the T value here is the difference between the two means divided by the standard error, t = (meanA − meanB) / √(varA/n + varB/n). My T-SQL code is as below

Using T-SQL:

/*TSQL to calculate T value*/
DECLARE @meanA FLOAT, @meanB FLOAT, @varA FLOAT, @varB FLOAT, @count INT
-- AVG on integer columns does integer division, so cast to FLOAT for a true mean
SELECT @meanA = AVG(CAST(walkerAsteps AS FLOAT)) FROM WalkingSteps
SELECT @meanB = AVG(CAST(walkerBsteps AS FLOAT)) FROM WalkingSteps
SELECT @varA = VAR(walkerAsteps) FROM WalkingSteps
SELECT @varB = VAR(walkerBsteps) FROM WalkingSteps
SELECT @count = COUNT(*) FROM WalkingSteps
SELECT 'T value:', (@meanA - @meanB) / SQRT((@varA / @count) + (@varB / @count))

The result I get is a T value of 3.181.

To get the corresponding P value I need to stick this into a calculator, like the one here.


I get the same result I got with R – 0.17233. Note that I took the degrees of freedom from what R gave me – 6.46. If I had no access to R, I’d go with 6, as that makes logical sense. To understand more on degrees of freedom, go here.
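If R is handy, that calculator lookup is a one-liner with the pt function, using the t value and the degrees of freedom R reported:

# Two-tailed p-value from t = 3.181 with 6.46 degrees of freedom
2 * pt(-abs(3.181), df = 6.46)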

TSQL with R:

I can run the exact same R code from within T-SQL and get the same results – as below:

EXEC sp_execute_external_script
 @language = N'R'
 ,@script = N'Tvalue<-t.test(InputDataSet$WalkerAsteps, InputDataSet$WalkerBsteps);
 print(Tvalue)'
 ,@input_data_1 = N'SELECT WalkerAsteps, WalkerBsteps FROM [WorldHealth].[dbo].[WalkingSteps];'


From the results we can see that the t-statistic is 3.181 and the p-value is 0.007907, well below the 0.05 threshold for the 95% confidence level – so we reject the null hypothesis. Hence we conclude that, with this data, there is some evidence that a heavy meal does impact walking/exercise the week after.

I’d eat well, regardless :)) HAPPY THANKSGIVING :))

PASSion Award and what it means to me


2016 is going to be a special year in my life. There was an article on the Oscar awards a while ago – on why the Oscars are the most watched awards ceremony around the world. No, it is not just because of the movie stars. Everyone – secretly or publicly – longs for their ‘Oscar’ moment: the moment when the world (or the equivalent of the ‘world’, a lot of people) gets to see who they are and what they did, and gives them kudos for it. Not everyone gets to have that moment.

Many of us do not work in environments that are exposed to public scrutiny. Some companies offer ‘outstanding employee’ awards, which come close to a form of public recognition – but they are usually ridden with politics in how they are awarded. Personally, I have received two awards like this at places I worked – I promptly got rid of the trophies after I left their employment. There was too much jealousy, back-stabbing and politics surrounding them, much of it revealed openly rather than kept private or polite. The memories associated with them were not pleasant or rewarding – an awkwardness lingered for a long time afterward. I do not say work-related awards do not mean anything – they do, and must be appreciated – but they also generate a lot of tension and politics in most places. The need for recognition, for ‘Oscar moments’, though, is human and universal. Some people are able to get it met. The rest of us have to dig deeper to find other ways of meeting our need for recognition – or, perhaps, do work that is its own reward.

Part of the reason I started to work in the community was that it was proactive work, or service to people. It has no titles attached, no $$ attached, and it is a lot of hard work. It is work with people who do it because they enjoy doing it, and nothing else. I wanted to be around that kind of people, and I wanted to grow that feeling in me – of finding reward in what I do, not waiting for someone to call me out or give me something in return. This year marked 10 years since I started the Louisville SQL Server User Group – which began in the public library in 2005 with 12 people in attendance. Soon after that I also started running SQL Saturdays; we clocked our 8th this year. I can’t claim to have been supremely happy all through these years – I’ve had my frustrations and low moments with it – but I did get a glimpse of doing work for the joy of the work alone, serving people and working with others who thought and believed likewise.

My Oscar moment came without asking – during the summit this year, when I was given the PASSion award, the highest honor for a PASS volunteer, for outstanding service. The award was presented by PASS president Adam Jorgensen at a ceremony in front of a huge audience of 5000+ people. I was congratulated by hundreds of people – many of whom I do not know or have never met in person. It was a huge, huge honor, and one that I am still coming to terms with. There are many people I want to thank for this – the list is long, but there are some I can and must single out, as people who have inspired me to stick with community work as its own reward.

Kevin Kline – one of the founder leads of the PASS organization itself – he has educated me with many stories of the hard work and thought that went into the early days of the organization. He is always there for me as friend, mentor and guide.
Karla Landrum – it would have been impossible for me to run so many events without Karla’s rock-solid support to lean on. I’ve cried on her shoulder many, many times when I’ve been frustrated and tired – she has always been there for me as a true friend and guide. I will miss her dearly in the years to come.
Rob Farley – when I had some real moments of frustration some years ago with things not going well – Rob helped me understand the real purpose and meaning of community work, and to remember that it was often meant to be its own reward. Rob’s sense of humor and spirit helped me persist with what I was doing and got me where I am today.
Jes Borland – I met Jes on SQL Cruise six years ago. She and I are as different as chalk and cheese (she is the cheese in that analogy) – but she is at the top of the list of SQL women I look up to – for sensitivity, understanding, guidance and just pure fun.

Thank you to all of you, and to the PASS organization – for making my Oscar moment happen. Not everyone gets to experience it – it is life altering, and it is a blessing of pure love and regard that is very hard to find elsewhere. I am humbled, and hope to continue to live up to it.


Days 1, 2 and 3 of PASS Summit 2016

Today is Thursday, October 27th already. For some of us the summit began Monday – with precons, and PASS volunteering-related meetings on Tuesday. For most other attendees the first day was Wednesday.

I arrived in the afternoon on Sunday with six other friends from Louisville, including my good friend Chris Yates. I have been travelling to the summit for 11 years now – this is the first year that I had so many co-passengers from my town heading there. I plan to write a blog post entirely on that subject, but I was proud and happy to see attendance and interest growing from our small town. Following arrival I met with one of my favorite friends in #SQLfamily, Arlene Rose – we went shopping at Pike Place Market. I have been going to Pike Place for many years now, and was a bit sad to see a few of my favorite stores gone. They included a Tibetan Buddhist store selling masks, a consignment store selling gently used scarves and jackets, and a herbal store. The market is a way of life, and I hope they are doing well somewhere.

Monday, Day 1: I went in to attend Itzik Ben-Gan’s precon on Advanced T-SQL. I had not been attending precons since they stopped recording the sessions – it was too pricey, and it was easier to find cheaper equivalents at SQL Saturdays and other places. But Ben-Gan is someone you really want to learn T-SQL from, and he does not teach at too many other places. The class was worth every dime – packed with tips and tricks, presented in elegant, simple, easy-to-understand ways. I greatly enjoyed it and would highly recommend it to anyone considering it next year. In the evening I had dinner with Chana Cohn, one of my old friends from my days at Kindred Healthcare. We had a great evening catching up.

Tuesday, Day 2: Tuesdays are usually reserved for PASS volunteer-related meetings – mine started with rehearsing for Wednesday’s keynote (more on that below). It was fascinating to witness the amount of work that goes on behind the scenes for the keynote – from PASS staff and volunteers, including directors and several others. We as attendees, and even as volunteers at other levels, do not normally see this – we owe them a thank-you if we enjoy a keynote, and not just for the content; it is a ton of hard work to pull off. Following this we had the yearly meet-up of SQL Saturday organizers. Many items were discussed, including funding from PASS, the anti-harassment policy, website changes, sponsorship changes and so on. Overall it was a productive and informative meeting. In the afternoon we had a meeting of Regional Mentors. I could not attend the meeting for chapter leaders, as I had some work to take care of. But all the meetings were useful, and it was great to meet volunteer friends you don’t get to see otherwise. I attended the opening ceremony in the evening – which had some really good food options for vegans and vegetarians. I got my fill of dinner there and decided to pass on the volunteer party, given the weather and the distance – a good six miles from where I was. I retired early, since I knew the next day would be a long one.

Wednesday, Day 3: I normally do not dress up for PASS event days – I just wear one of my many SQL Saturday shirts, and jeans to go with it. But today was special – today was my day, as I had won the PASSion award. The news had been communicated to me a month ago, but I was asked to keep it secret as NDA information. The PASSion award is a true honor – the highest award in the PASS community for service to the community. It comes via nomination from fellow community members and approval by the board. I was humbled and honored to receive it. After a bit of make-up and a tiara, I made my way to the convention center. I was given a special seat in the front row – which in itself was an honor – among so many outstanding volunteers, Microsoft customers and VIPs. My good friends and directors Grant Fritchey, Allen White and Argenis Fernandez were around to help with the nerves. The ceremony was over quickly, and social media all but blew up my phone with tweets and Facebook messages of congratulations. It was a unique, once-in-a-lifetime experience, and one that I shall greatly treasure and remember. Sadly, there is no video recording available yet – I am told it will be on the summit recordings, and I will be happy to share a clip when I find one.

The rest of my day was taken up with hugs and thanking many people – many of whom I did not know at all. I want to say THANK YOU again to the awesome SQL community who made all this possible. I am very humbled and honored by your love and regard, and hope to continue to live up to it.


TSQL Tuesday #83 – The Stats update solution

TSQL Tuesday is a monthly blog party hosted by a different blogger every month – it was started by Adam Machanic. This month’s T-SQL Tuesday is hosted by Andy Mallon – the topic is ‘We’re dealing with the same problem’. I have chosen to write about a common problem I have encountered at many places I have worked: queries or stored procedures that suddenly start to perform poorly, when no changes have been made to the code or to the server.

The common perception/misunderstanding I have encountered is that this is purely a statistics issue, and that updating statistics with a full scan should take care of it. In many cases that is indeed the reason. In some cases it really isn’t. It could be a parameter sniffing issue – a plan being reused that was generated for one set of parameters and is appropriate only for that set. But most people jump straight to fixing statistics. This is especially true when they don’t have the runtime plan that was actually used, can’t find it in the cache, and are just going by past experience.

At one place I worked, people would update statistics every 10 minutes or so in a frantic attempt to ‘fix the slow query’, which many times would not respond at all. At another place they actually had an automated check on when the stored procedure finished running, and if it ran beyond its normal duration, a full-scan statistics update would fire off. None of this is wrong – but repeatedly doing the same thing when the query does not improve, and assuming that is the only possible reason for the problem, is.

What I do is recompile the plan and test it with different sets of parameters to compare performance. If you get a difference in performance with a different set of parameters, it is probably a parameter sniffing issue – and if the stats updates did not fix it, that points the same way. Statistics updates are never really a bad thing to do, but they will not fix every slow query there is. Check whether the issue is parameter sniffing as well, and also make sure no changes went out – to the code or to the environment – that may be contributing to it.


11 years of PASS Summit

This is the story of my 11-year association with PASS, the many ways it helped me grow as a person and in my career – and the many ways I saw other people grow.

Summit #1 – 2006, Gaylord, TX: I was a visa-holding DBA-developer at a small shop. The Microsoft marketing person who came to meet my boss sent me some info on the summit when I asked him for SQL Server related training. I could only afford two days, along with paying for lodging and airfare. The resort was lovely. I did not know anyone in the crowd, and most of what was discussed went over my head as a small-shop DBA. In the vendor area I met a bald guy named Rushabh Mehta who was handing out fliers about starting user groups. I found out from him that there was no user group in Louisville. He encouraged me to start one, and readily gave me his cell number in case I had any questions. On my way back home I met a lady at the airport who was from my town and worked as a DBA. She and I struck up a conversation, and she was willing to help me start the user group. Our first user group meeting was at the local library, attended by 12 people. Rushabh was the first person in the SQL community I got to be friends with. Through the year he responded patiently to my many phone calls about setting up the site, getting speakers, getting sponsors, on and on.

Summit #2 – Denver, CO: By now the user group was going strong and I had gotten to know many people in the community as a result of running it. Craig Utley, Dave Fackler and Sarah Barela were among my first speakers. I got permission from work to spend a whole week at the summit – and since registration was now comped for chapter leads, I could afford to. My best memory of the Denver summit is sitting at the breakfast table with a tall guy from Chicago named Brent Ozar, who said he was an aspiring speaker. I enjoyed the summit greatly and learned many new things.

Summit #3 – Seattle, WA: This was my first ‘proper’ summit, as this was the year they started doing chapter leader meetings. I still did not know too many people – Rushabh and another volunteer named Sujata from Rhode Island were the only ones I knew. But I met many people at the chapter leader meeting and liked the discussions a lot. My earliest memories are of meeting TJ Belt and Troy Schuh. I also got a chance to meet Kevin Kline and Andy Warren. Andy talked to me about this day-long event called SQL Saturday that he was doing in Orlando. He readily offered me his cell number and help with setting one up in our town. Kevin offered to drive in from Nashville to speak for our user group. What impressed me right away was how sincere and committed they were to the cause. SQL Saturday #1 at Louisville started this year, with Andy’s coaching, at a small venue in New Horizons Louisville. Although we only had 50-60 attendees, it was a huge success and appreciated by many. We also had the honor of being sponsored by another user group – John Magnabosco from IndyPASS was our first sponsor. I don’t think there are too many SQL Saturdays that have been helped in this manner.

Summit #5 – Seattle, WA: By now I had started doing other things besides being a chapter lead and running a SQL Saturday – I wrote regularly for the PASS newsletter, I was a Regional Mentor for the South Asia region, and this year I also helped run the PASS booth at Tech Ed. The summit had a table per chapter at lunch – it was at this table that I met a gentleman who would open doors for my next job soon after I got home. Two days after I was home, I received a phone call with a message from a large firm with a great reputation – the DBA manager wanted to talk to me. Someone on his team had been at the summit, met me there, and recommended me for a senior role based on our conversation. I could hardly believe my ears. I am not a naturally extroverted person, and it is even harder for me to drum up my skills when needed. In this case, all I did was have a conversation with somebody at the lunch table. I met the person who called me, and in a week I landed the best job of my career as a senior DBA. They even included in the contract that they would pay every dime of my expenses to the summit.

Summit #6, 7, 8, 9, 10 and this year… 11 – time flies in a blur. I have done so many activities with PASS during these years – served on the selection committee, moderated 24HOP, been a first-time attendee mentor… in fact I forget some of those titles now, as so much time has gone by. We have 10 years of SQL Saturdays to our credit now. I intentionally book my room a little further from the summit for quiet time after each day, since I can barely walk 10 steps without someone calling my name. I have never, ever looked for jobs using headhunters or Monster or Dice or any such thing after that one incident when I received a phone call. It has always been via referrals through the community. I think that is what I’d consider the best reward ever, professionally – that jobs come to you; you don’t go searching for them. And the friendships and relationships I’ve made via this community really don’t have a price tag. They have all grown along with me, as family – we will grow old together, retire, and recall many of these good times.

Thank you SQLPASS, and #SQLFAMILY.


Why SQL Cruise?

I was riding the elevator up from lunch today, at work. I am relatively new at my job and do not know several people at my workplace – yet. I live in a small town, and quite a lot of people know me as someone active in the local community. I am very used to strangers asking me SQL-community-related questions at places like Trader Joe’s, the DMV, the airport and all sorts of other places. Today there was a young lady riding up with me. I knew her to be from the same workplace, but not well, so we exchanged polite smiles and I looked down at my shoes, as I always do when I am in the elevator with someone I barely know 🙂  When I got off she asked rather hesitantly – ‘Excuse me, can I ask you something? Do you know much about what they call SQL Cruise?’ Needless to say, I was delighted. SQL Cruise is one of my utter favorite topics of conversation, even with total strangers. For the next 10 minutes or so she and I had a great conversation – I told her all about the cruise, about the places we’ve been to, the fun activities on board and on shore, the great training, office hours, everything. Most importantly, what I told her was this –

I am an 11-year regular attendee of the PASS summit. Whenever anyone mentions a conference regarding SQL Server, the summit is what comes to my mind first to recommend. I love the PASS community and try to promote it whenever and wherever I can. But there are things I don’t get at the summit. One of them is the time and leisure to grow good bonds and get to know people better. Make no mistake – 11 years of the summit have yielded me many, many friends, for which I am wholly grateful. But not everyone can keep going that long, and not everyone has the social skills to make friends from among 3000+ people. The summit is a huge gathering, and there are way too many distractions or things that get in the way of really hanging out, even with people you know. I’ve lost count of the times when just the noise and the crowd got to me and left me with a sort of dazed look by Thursday evening. By Friday I want to go home, and all other thoughts are pretty much gone. I am wiped and need to recharge. I have decided, in the past 3 years, that the summit is a week to touch base with people I know, and to attend some sessions on topics I don’t know and want to know. It does not work for me as a place to grow close friendships, or even to network very well.

That is where SQL Cruise comes in. When you meet, eat and spend time with people and their families, a bond develops. People get interested in you as a person, and you in turn are interested in them. A friendship is born, and that can lead to many amazing possibilities – including job offers. And, to a die-hard traveller like me, the amazing places I get to see are in themselves worth every dime. I get to see them with friends – people I really respect and have regard for. I cannot think of the gorgeous beaches of St Thomas, the picture-perfect Amalfi Coast, the grandeur of the Colosseum in Rome, the food in Barcelona, or the sights of Mendenhall Glacier – without thinking of them at the same time. And those kinds of memories are simply not created at any other conference or training.

So, if you’ve been reading my post so far – and if your goals are the same as mine – to see fun places, learn good SQL from some of the best teachers, have discussions against the backdrop of waves rocking the boat, and make some good friends who are genuinely interested in your career and your success – sign up now! You will not regret it, I promise – and you will come back for more.

Associative Statistics – One sample T-Test with TSQL and R

In this post I am going to explore a statistical procedure called the ‘One Sample T Test’.

A T-Test is used to test the mean value of a data sample against the known mean of the entire population the sample came from. An example would be: if the average voting preference in the USA is Democrat, I want to test whether the average voting preference in KY corresponds to that national average. Another, simpler example: if all coke bottles produced are 150 ml on average, I want to see if this is true (on average) of the 10 coke bottles I have in the fridge.

For this post I decided to go with a simple example – how many steps I walked per day for the month of August. My goal is 10,000 steps per day – that has been my average over the year, but is it true of the data I gathered in August? I have a simple table with two columns – day and steps. Each record has how many steps I took on one day in August, for 30 days. So SELECT AVG(steps) FROM [dbo].[mala-steps] gives me 8262 as my average number of steps per day in August. I want to know if I am consistently underperforming my goal, or if this is just the result of being less active in August alone. Let me state my problem first – or state what is called the ‘null hypothesis’:

I walk an average of 10,000 steps per day over the year.

How is the T value calculated? The formula is –

t = (x̄ − μ₀) / (s / √n)

The numerator: x̄ (x-bar) is the mean of the sample, and μ₀ (mu-zero) is the hypothesized mean – what I expect of the sample – in my case, 10,000 steps. In the denominator, s is the standard deviation of the sample and n is the sample size, so the denominator is the standard error of the mean. Without pulling our hair out over the notation, what the t value really measures is the difference between the sample mean and the hypothesized mean, compared to the ‘inner noise’ – the variation within the sample itself. A high value means the sample is probably not fitting my hypothesis – the gap is too big relative to the noise. A low value means the opposite – the gap is within what the sample’s own variation can explain.
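To mirror the formula, here is the same computation by hand in R – the steps vector below is a made-up stand-in, since my real data lives in the [dbo].[mala-steps] table:

steps <- c(9500, 8200, 7900, 10400, 8100)  # hypothetical sample
xbar  <- mean(steps)                       # sample mean (x-bar)
s     <- sd(steps)                         # sample standard deviation
n     <- length(steps)                     # sample size
(xbar - 10000) / (s / sqrt(n))             # t = (x-bar - mu0) / (s / sqrt(n))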

My goal is to prove or disprove this with the sample selected from August. I am using a significance level of 0.05, or what is called a 95% confidence level (this matches the conf.level=0.95 passed to R below).

Using TSQL:

SELECT AVG(CAST(steps AS FLOAT)) AS 'MEAN' FROM [dbo].[mala-steps]
SELECT (AVG(CAST(steps AS FLOAT)) - 10000) / (SQRT(VAR(steps)) / SQRT(COUNT(*))) AS 'T VALUE'
FROM [dbo].[mala-steps]


We get a mean of 8262.36 and a T value of -5.023.

Calculating the p value, or probability, for this T value can be done via a calculator like the one here – unfortunately this is not possible in plain T-SQL. If we stick in the values of 5.023 and 29 degrees of freedom (30 values – 1), we get a really low p value.
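In R, though, that lookup is a single pt call – here for the one-sided (‘less’) alternative used in the code below:

# One-tailed p-value for t = -5.023 with 29 degrees of freedom
pt(-5.023, df = 29)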

R does the entire math for us in one simple step as below:

install.packages("RODBC")
library(RODBC)
# Open the connection first ('cn' was assumed); the connection string below is a placeholder
cn <- odbcDriverConnect('driver={SQL Server};server=localhost;database=WorldHealth;trusted_connection=true')
datasteps <- sqlQuery(cn, 'SELECT steps FROM [dbo].[mala-steps]')
t.test(datasteps$steps, mu=10000, alternative="less", conf.level=0.95)


The same R code can be called from within T-SQL too, giving the same results:

-- calculate one way T value
EXEC sp_execute_external_script
 @language = N'R'
 ,@script = N' tvalue <-t.test(InputDataSet$steps, mu=10000, alternative="less", conf.level=0.95);
 print(tvalue)'
 ,@input_data_1 = N'SELECT steps FROM [dbo].[mala-steps];'


From the output, we can see that the mean number of steps I took per day in August is 8262. The t value is 5.02 in magnitude (the negative sign just says the sample mean is below the hypothesized 10,000). That means the difference between the sample mean and the hypothesized mean is much larger than the ‘inner noise’ – the variation between values within the sample itself.

The p-value – the probability of getting a t value this extreme if the null hypothesis were true – is really, really low: 1.187e-05. In other words, I cannot accept the null hypothesis that I walk an average of 10,000 steps a day, based on this sample – I have to reject it. Maybe August was not the best month to judge my commitment to exercise… or maybe I should try more samples than August alone! More on those in the next post!