Sunday, July 17, 2016

Kobe Bryant Shot Selection: Data Exploration & Visualization

The dataset, Kobe Bryant Shot Selection, comes from Kaggle. You can find more information about it here: https://www.kaggle.com/c/kobe-bryant-shot-selection/data

I was inspired by Alexandru Papiu's report: https://www.kaggle.com/apapiu/kobe-bryant-shot-selection/exploring-kobe-s-shots/notebook He gave a very detailed description of his thinking and produced many fantastic visualizations in R. My question is: can I do the same thing in Tableau? Let's try.

1. Loading in the data:

The file I downloaded directly from Kaggle is data.csv, a plain-text file, so choose Text File when connecting to the data.
You can keep the connection live or use an extract. I won't change the original data here, so I choose Extract.
The data are shown here:
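
For readers who want to check the same file outside Tableau, here is a minimal Python sketch of the loading step. The three-row sample is made up and stands in for data.csv, and treating an empty shot_made_flag cell as a null is an assumption about how the file encodes missing values.

```python
import csv
import io

# Hypothetical three-row sample standing in for data.csv; the values are made up.
SAMPLE_CSV = """combined_shot_type,loc_x,loc_y,shot_distance,shot_made_flag
Jump Shot,-157,0,15,0
Dunk,0,0,0,1
Jump Shot,167,72,18,
"""

def load_shots(handle):
    """Read shot rows from an open CSV file handle into a list of dicts."""
    rows = []
    for row in csv.DictReader(handle):
        flag = row["shot_made_flag"]
        # Assumption: an empty cell marks a shot whose outcome must be predicted.
        row["shot_made_flag"] = int(float(flag)) if flag else None
        for field in ("loc_x", "loc_y", "shot_distance"):
            row[field] = int(row[field])
        rows.append(row)
    return rows

shots = load_shots(io.StringIO(SAMPLE_CSV))
print(len(shots), "rows loaded")
```

To read the real file, pass open("data.csv", newline="") instead of the in-memory sample.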

2. Shot_made_flag

According to the dataset description and Alexandru Papiu's report, shot_made_flag = 1 means Kobe made the shot, 0 means he missed, and observations with a null value are the ones we are asked to predict.
Therefore, move shot_made_flag from Measures to Dimensions for later use.
Using a table in Tableau:
In this dataset, Kobe made 11,465 shots and missed 14,232. The 5,000 nulls are the shots to predict.
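
The same tally can be sketched in plain Python. The list of flags below is hypothetical and much smaller than the real dataset; None stands in for a null.

```python
from collections import Counter

# Hypothetical flags: 1 = made, 0 = missed, None = outcome to predict.
flags = [1, 0, 0, None, 1, 0, None]

def tally_flags(values):
    """Count shots per outcome, treating the flag as a category, not a number."""
    labels = {1: "made", 0: "missed", None: "to predict"}
    return Counter(labels[value] for value in values)

counts = tally_flags(flags)
print(counts)
```

Treating the flag as a category rather than summing it is the code equivalent of moving it from Measures to Dimensions.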

3. Plot

In his report, Alexandru Papiu created two plot functions:
* pplot() plots accuracy by feature (bar chart, x = feature, showing the counts).
* courtplot() plots shot position by feature (point graph, x = lon, y = lat).
We can do similar things in Tableau.
1) The locations of the various shot types
Move loc_x and loc_y to Dimensions. Build a scatter plot with row = loc_x and column = loc_y, filter on shot_made_flag = 1 or 0, and use combined_shot_type for color.

2) As the scatter plot shows, there are many red points (Jump Shot). Next, highlight the other shot types.

3) Shot from up close
Move shot_distance to Dimensions. Add two filters: shot_made_flag = 0 or 1, and shot_distance < 5. Color by shot_made_flag, using green and orange for the two values (as defined above, 1 = made and 0 = missed).
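
The two filters in this step can be sketched in Python as follows. The rows are hypothetical and carry only the two fields the filters use.

```python
# Hypothetical rows; only the two fields used by the filters are included.
shots = [
    {"shot_distance": 2, "shot_made_flag": 1},
    {"shot_distance": 4, "shot_made_flag": 0},
    {"shot_distance": 3, "shot_made_flag": None},  # outcome to predict
    {"shot_distance": 18, "shot_made_flag": 1},
]

def close_range(rows, max_distance=5):
    """Keep labeled shots taken from closer than max_distance."""
    return [
        row for row in rows
        if row["shot_made_flag"] is not None
        and row["shot_distance"] < max_distance
    ]

close = close_range(shots)
made = sum(row["shot_made_flag"] for row in close)
print(len(close), "close shots,", made, "made")
```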
4) Shot Made vs. Shot Missed

Distance really matters when shooting.
5) Histogram of Shot & Accuracy by x_bins
Create bins using the suggested bin size.


Put x_bins on Columns and the count of Number of Records on Rows, then edit the table calculation. Filter on shot_made_flag = 0 and 1, and use it for color as well.
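
As a rough sketch of what the binned histogram computes, the Python below counts shots per bin and outcome. Binning on loc_x and the bin width of 50 are both assumptions made for illustration; Tableau's suggested bin size will differ, and the sample rows are made up.

```python
from collections import Counter

# Hypothetical rows; loc_x is assumed to be the binned field.
shots = [
    {"loc_x": -157, "shot_made_flag": 0},
    {"loc_x": -120, "shot_made_flag": 1},
    {"loc_x": 10, "shot_made_flag": 1},
    {"loc_x": 30, "shot_made_flag": 1},
    {"loc_x": 40, "shot_made_flag": 0},
    {"loc_x": 167, "shot_made_flag": None},  # outcome to predict
]

def bin_start(value, width):
    """Return the left edge of the fixed-width bin that contains value."""
    return (value // width) * width

def histogram_by_bin(rows, field, width):
    """Count shots per (bin start, shot_made_flag), skipping unlabeled shots."""
    counts = Counter()
    for row in rows:
        flag = row["shot_made_flag"]
        if flag is None:
            continue
        counts[(bin_start(row[field], width), flag)] += 1
    return counts

histogram = histogram_by_bin(shots, "loc_x", width=50)
for (start, flag), n in sorted(histogram.items()):
    print(start, flag, n)
```

Floor division keeps negative coordinates in the correct bin, which matters here because loc_x runs to both sides of the basket.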

6) Histogram & Accuracy by action_type

Find the action types with a count below 20.

Then create a set that excludes these action types. Put the count of Number of Records on Columns, edit the table calculation, and put the set of action types with a count above 20 on Rows.
Kobe favored the Jump Shot. However, his accuracy on the Dunk Shot was high.
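
The set of frequent action types used in this step can be sketched in Python. The rows are hypothetical, and keeping types with a count of at least min_count is an assumption, since the text gives both "< 20" and "> 20" without saying where exactly 20 falls.

```python
from collections import Counter

def frequent_action_types(rows, min_count=20):
    """Return the action types that occur at least min_count times."""
    counts = Counter(row["action_type"] for row in rows)
    return {name for name, n in counts.items() if n >= min_count}

# Hypothetical rows; a threshold of 2 is used so the tiny sample shows the effect.
rows = (
    [{"action_type": "Jump Shot"}] * 3
    + [{"action_type": "Driving Dunk Shot"}] * 2
    + [{"action_type": "Hook Bank Shot"}]
)

kept = frequent_action_types(rows, min_count=2)
filtered = [row for row in rows if row["action_type"] in kept]
print(sorted(kept), len(filtered))
```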
7) Shot zone area
Column: loc_x. Row: loc_y. Scatter Plot. Shot_Zone_Area as color.
8) Shot Zone Basic
Column: loc_x. Row: loc_y. Scatter Plot. Shot_Zone_Basic as color.
9) Shot Zone Range
Column: loc_x. Row: loc_y. Scatter Plot. Shot_Zone_Range as color.
10) Histogram & Accuracy by minutes_remaining

Move minutes_remaining from Measures to Dimensions. Column: minutes_remaining. Row: count of Number of Records with a table calculation (Percent of Total, Table Down).
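
One way to read the percent-of-total calculation is as the share of made and missed shots within each value of the dimension. The Python sketch below computes that under this assumption; the rows are hypothetical, and the function is not a statement of how Tableau evaluates the calculation internally.

```python
from collections import Counter, defaultdict

# Hypothetical rows; only the two fields used here are included.
shots = [
    {"minutes_remaining": 0, "shot_made_flag": 0},
    {"minutes_remaining": 0, "shot_made_flag": 0},
    {"minutes_remaining": 0, "shot_made_flag": 1},
    {"minutes_remaining": 0, "shot_made_flag": 0},
    {"minutes_remaining": 5, "shot_made_flag": 1},
    {"minutes_remaining": 5, "shot_made_flag": 1},
    {"minutes_remaining": 11, "shot_made_flag": 1},
    {"minutes_remaining": 11, "shot_made_flag": 0},
    {"minutes_remaining": 11, "shot_made_flag": None},  # outcome to predict
]

def percent_of_total(rows, field):
    """For each value of field, return the percentage of missed (0) and made (1) shots."""
    counts = defaultdict(Counter)
    for row in rows:
        flag = row["shot_made_flag"]
        if flag is not None:
            counts[row[field]][flag] += 1
    result = {}
    for value, by_flag in sorted(counts.items()):
        total = sum(by_flag.values())
        result[value] = {flag: 100.0 * by_flag[flag] / total for flag in (0, 1)}
    return result

shares = percent_of_total(shots, "minutes_remaining")
for value, by_flag in shares.items():
    print(value, by_flag)
```

The same function works for period, seconds_remaining, or any other field treated as a dimension.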
11) Histogram & Accuracy by Period

Move period from Measures to Dimensions. Column: period. Row: count of Number of Records with a table calculation (Percent of Total, Table Down).
Kobe's team was pretty strong and rarely needed to play to a 7th period. Kobe's shooting accuracy is also fairly stable across periods.
12) Histogram & Accuracy by Seconds Remaining
Move seconds_remaining from Measures to Dimensions. Column: seconds_remaining. Row: count of Number of Records with a table calculation (Percent of Total, Table Down).
From this histogram, we can see that Kobe really tried his best as time ran out. But his accuracy there is only around 30%, the lowest.
13) Histogram & Accuracy by Season

Kobe took the most shots in 2005-2006, and his shooting accuracy was highest in 2007-2008.
14) Histogram & Accuracy by shot distance
Everyone likes shooting near the basket. From far away, he did a really good job at distances of around 30 to 45.
15) Histogram & Accuracy by Combined shot type

4. Tableau Dashboards

I decided to make dashboards to answer two questions: "Where?" and "When?"
https://public.tableau.com/views/KobeBryantShots/KobeBryantShots?:embed=y&:display_count=yes&:showTabs=y