Plot Your Data

Originally posted on Data Column: The Collaborative Student Blog for the Institute for Advanced Analytics

Visual data exploration or how filters will change your life

 

Plotting your data is a necessary first step in any project driven by a large data set, whether that is forecasting, predictive modeling, or just providing summary statistics.

There are many ways to plot in as many software packages as you can imagine.  I’ve enjoyed Tableau for a few reasons.

 

1- For larger data sets, being able to summarize millions of rows into an interactive picture is a plus.

 

2- The ability to connect directly to SAS data sets is especially useful.

 

3- Filters.  I love using filters to subcategorize data.  If you are used to exploring your data in SAS, think of Tableau filters as dynamic versions of SAS “where” statements in a data step.

I use Tableau to connect to my data and then employ the filters to dynamically pinpoint missing and mis-keyed values.  The filters allow me to exclude these values from the visualization without altering the data set itself.
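For readers who think in code, the filter idea can be sketched in a few lines of Python.  This is only an illustration of the concept, not how Tableau works internally; the column names (`department`, `amount`), the sample rows, and the helper functions are all made up for the example.

```python
# Hypothetical rows; the column names and values are invented for illustration.
rows = [
    {"department": "Administration", "amount": 1200.0},
    {"department": "", "amount": 300.0},               # missing department
    {"department": "Admnistration", "amount": 50.0},   # mis-keyed value
    {"department": "Revenue", "amount": None},         # missing amount
    {"department": "Revenue", "amount": 875.5},
]

# Values accepted as valid; in practice this would come from the data dictionary.
KNOWN_DEPARTMENTS = {"Administration", "Revenue"}


def apply_filter(data, keep):
    """Return a new list of the rows for which keep(row) is true.

    The input list is never modified, much like a filter that hides
    values from a view without altering the data set itself.
    """
    return [row for row in data if keep(row)]


def is_clean(row):
    """Keep rows with a recognized department and a non-missing amount."""
    return row["department"] in KNOWN_DEPARTMENTS and row["amount"] is not None


clean_rows = apply_filter(rows, is_clean)
flagged_rows = apply_filter(rows, lambda row: not is_clean(row))

print(len(rows), len(clean_rows), len(flagged_rows))
```

The flagged rows are the ones worth a closer look, and the original `rows` list is left exactly as it was loaded.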

 

Once I’ve found some interesting relationships, I can use the most useful filter variables and their values as a guide for traditional SAS programming and SQL queries.

Lastly, I appreciate the ability to output the data used to create any visualization, as well as to see and export the full underlying data.

Showing is better than telling, right?  Up next, an example of a visualization built primarily for exploration...

 

Visualize Whirled Peas

 

Let’s say for the sake of argument you don’t have any finance or budgeting background.

 

Let’s also pretend you’re given a data set of all the General Government state budget line items for the past 13 years for North Carolina, roughly 3 million rows of transactional data.

 

Finally, let’s pretend your team needs to present to representatives from the Office of State Budget and Management.

 

With very little domain knowledge, how are you going to understand the data you are given well enough to present it to subject matter experts?

 

My answer: plot it and explore it with filters.

Reversion Exploratory Dashboard

 

The visual above was created in Tableau, but you could build it in other programs.

 

At first I experimented with different fields for both the X and Y axes, referring often to the data dictionary.  Instead of writing a new query in code for each view, I could change the view with a drag and drop.

 

In the view above, I needed actual spending in relation to authorized spending to see when departments went over or under budget.  I used the filters to get to the correct actual and authorized fields, as well as the correct fund and account category, but I only knew what to look for after inspecting many options and seeing all aspects of the data set.  The hierarchy of filters let me drill down to the individual account code.
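The comparison behind that view is simple arithmetic, and a rough Python sketch may make it concrete.  Everything here is an assumption for illustration: the field names (`year`, `department`, `kind`, `amount`), the sample figures, and the `budget_variance` helper are invented and do not reflect the actual OSBM data layout.

```python
from collections import defaultdict

# Hypothetical line items; field names and figures are invented for illustration.
line_items = [
    {"year": 2010, "department": "Administration", "kind": "authorized", "amount": 1000.0},
    {"year": 2010, "department": "Administration", "kind": "actual", "amount": 400.0},
    {"year": 2010, "department": "Administration", "kind": "actual", "amount": 450.0},
    {"year": 2010, "department": "Revenue", "kind": "authorized", "amount": 500.0},
    {"year": 2010, "department": "Revenue", "kind": "actual", "amount": 620.0},
    {"year": 2011, "department": "Administration", "kind": "authorized", "amount": 1100.0},
    {"year": 2011, "department": "Administration", "kind": "actual", "amount": 1100.0},
]


def budget_variance(items):
    """Sum authorized and actual amounts per (year, department).

    Returns a dict mapping (year, department) to authorized minus actual:
    positive means under budget, negative means over budget.
    """
    totals = defaultdict(lambda: {"authorized": 0.0, "actual": 0.0})
    for item in items:
        key = (item["year"], item["department"])
        totals[key][item["kind"]] += item["amount"]
    return {key: t["authorized"] - t["actual"] for key, t in totals.items()}


variance = budget_variance(line_items)
for (year, dept), diff in sorted(variance.items()):
    if diff > 0:
        status = "under budget"
    elif diff < 0:
        status = "over budget"
    else:
        status = "on budget"
    print(year, dept, f"{diff:+.2f}", status)
```

In the dashboard the same grouping happens through filters and drag and drop; the code version is mainly useful once you already know which fields matter.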

 

The exploratory sheet I used is here

 

Feel free to play with it and create some of the views that just don’t make any sense.

Why was the point-and-click visual approach better than coding a number of visuals to see data relationships?  To me it felt more like exploring an unfamiliar physical object.  I found it easier to pull in variables in turn and see them here than to code one variable or set of variables at a time into static output.  Bottom line: the dynamic approach got me to insights faster.

The exploratory sheet was the basis for a suite of dashboards created using the same filter-based data exploration.  The process was:

    • explore a set of variables creating a view

 

 

    • combine those views to answer questions and provide insight into the data

 

Ultimately we wanted to allow the state budget office to dynamically explore their data in ways they might not have thought of before.   See Dashboard here

 

I learned a number of tips and techniques while creating this suite of dashboards, which I’ll summarize in a future post.

 

Important note:  The OSBM data set report and presentation were a team effort, and while much of the data exploration I discuss here was my own work, it is due in no small part to the hard work of the entire team.  The above is posted here with their kind permission.  Go Team Blue 3!

 

 

Elevator pitches that just won’t work for IAA Employer Information Sessions

 

I sometimes feel awkward mingling at networking events, and the thought of pitching myself in 20 seconds with something memorable gives me the heebie jeebies. Here are a few pitches or memorable phrases that I know won’t work, but that may be useful to exorcise my mental block demons. Hopefully, with these out of the way, I can come up with something that does work.

 

Hello, (pause) Evan Miracle (Shake hand, maintain eye contact),

 

  1. I’m an avid kickboxer and Octagon of Doom four time champion.
  2. I’m a childhood survivor of a concerted campaign of wedgies.
  3. My body is made of nearly 50% aftermarket computer parts from Radio Shack.
  4. I have three children so these dark circles are all natural with no zombie makeup required.
  5. I once ate 11 hamburgers and 2 large fries as well as a large vanilla shake in one sitting for a bet.
  6. I am a new convert and Tableau zealot (best used during mingling opportunities at SAS).
  7. Scream “Constant Vigilance!” Best used when their back is turned.
  8. I wandered in here from ABB on the third floor.  I’m just here for the free food.
  9. What’s your sign? You’ve got a very interesting and dynamic aura.
  10. I’d like to recite for you all of the prologue to Chaucer’s Canterbury Tales, “Whan that Aprille with his shoures soote …”
  11. Do you ever feel like squirrels are watching you from the trees and oh so silently judging you? Me too! (Note: do not wait for an answer before saying “Me too”)

 

It was great to meet you and thank you for coming to talk with us today.

 

Ask question then …

 

I would love to follow up with you by email as I have a few more questions. Can I give you my card?

 

 

5 ways the summer project gets it right

 


The summer project at the IAA is a “toss you into the deep end of the pool” experience to teach you how to swim.  Swim, in this case, as a data scientist, or at least the in-training version of one.

 

The project structure gets many things right, and the first is:


Shock you out of your comfort zone

Unlike most projects in an academic setting, which come at the end of a semester’s worth of lectures on a single subject area, the summer project is a learn-by-doing affair.  It shakes up the paradigm and forces participants to learn while attacking a seemingly daunting task, and many of us need that initial shaking to let us know this isn’t your typical master’s program.  But how, you might ask, can you successfully complete a project with little background experience and without crucial subject knowledge or full ability?


Team work from the word go 

Luckily, you aren’t alone.  You have four other people with different backgrounds, experience, and talents to draw from and rely on.  Again, unlike most academic projects I have been involved in, you work as a team rather than on your own.  The volume, pace, and scope of the project aren’t achievable by a single individual, but the teams handled the load and made it to presentation time in good order.


High level of initial motivation, talent and work ethic 

I may have been lucky in my group, but from talking with other students it doesn’t seem so.  These people are good.  They have talent and drive, as well as a wealth of damn useful experiences they bring to the table.  Our team, and other teams I spoke to, learned a great deal from each other.  Whether it was using R in unexpected ways, coding novel SAS macros, slide design and visualization, or organizational skills, the whole team experience is more than the sum of its parts.


Diversity of experience 

Every group had the same data, but from looking at and sitting in on presentations during the final phase, the learnings were as diverse as the cohort.  We saw analyses that took perspectives on aspects of the data our group never saw.  Employee workload across regions and policy initiation versus claims were two that stood out to me as data crops we could have harvested but just didn’t see.


Learn as you work 

The institute doesn’t leave us to our own devices for a month and expect a complete project.  There is a learn-by-doing, just-in-time teaching methodology at play.  Formal lectures and training on all aspects of data analysis happen concurrently with the project.  The combination of hands-on experience and lecture material is enough to push the student to learn more and investigate, with completion of the summer project as the overarching goal.  A lecture on data cleaning in the morning can be directly applied to the project data set in the evening, while a visualization tenet can be applied to slide design the same day.  Having a reason to use and learn the material, other than its presence on a midterm in the nebulous future, is a big reason the summer project works for teaching the cohort the career of data analytics.  I learned more about SAS in 3 weeks than I did in 10 years on the job because I needed to use it to get the task done.


Take Away 

It is in no way a magic bullet, but a big takeaway for my future analysis is this: given a set of data on, for example, income, one can achieve powerful results by adding time and/or location to the basic data.  Moving forward, transforming or enhancing the data with new variables of this type will definitely become part of my analytics toolbox.  The summer project experience also showcased the talent and diversity of the cohort as a whole, and it increased my energy and enthusiasm for the program and for the team-based methodology that is central to the IAA.

 

Icons from thenounproject.com used under a Creative Commons Attribution license.  Click each icon for links to the original and author.

Introducing the new and improved Evan Miracle (now with less employment)

Greetings,

I’ve created a space to record my experiences as an M.S. candidate in Analytics at the Institute for Advanced Analytics at North Carolina State University.

Who am I?

My name is Evan Miracle (Robert Evan Miracle or R. Evan Miracle depending on what you’ve asked me specifically). I am a 38-year-old husband and father of 3.  For just over 10 years I was an employee of the University in the FBNS department, where I worked for Dr. MaryAnne Drake managing her instrumental flavor analysis section of the Sensory Service Center.

Now I am a master’s candidate at the Institute for Advanced Analytics here at State.

I am coming to this blog a bit late in the 10-month program, as we are already 1 month in, but I hope to record at least some thoughts on the previous month.  Moving forward, I plan to jot down thoughts, impressions, and tips, and to give future me a look back on how I felt and what I learned.