Garrett Mayock's Blog


Why do I even want to be a data scientist?

I'm not just interested in the buzz; it's something real.


  Garrett Mayock posted 2019-01-14 21:49:05 UTC

Is it a buzzword that caught my attention? Did my jaw hit the ground when I saw the salary figures?

Naw, bruh. I worked with data scientists for a couple years and I found it really interesting. Plus, the more experience I got around it, the more I realized it was something I could do.

Even in the short time I’ve been practicing data science using Python on my own, I’ve found my love for mathematics rekindled by the fact that it is so much better doing math with code than longhand. Like, honestly, if I had known about Python when I was deciding my career, I would have studied statistics instead of finance. Every aspect of math that I didn’t enjoy as a child is completely removed by the convenience of using code.

And what’s the point of learning everything longhand? Do they force you to build a car from scratch in driver’s ed before allowing you to get behind the wheel? No! But I digress.

I also really enjoy the coding aspect of it. I like the process of research, experimentation, troubleshooting, and the joy of deploying working code. I haven’t been a programmer by trade, but I programmed my own website (with a text-editor, not a site-builder). And placed in the top 13% of Kaggle’s Titanic competition. And I’ve built a few dozen Excel macros in my career to speed up my reporting. All of that requires the process. I just didn’t happen to discover how much I liked it until I was through with university.

And I’ve been an analyst in my career. I have a strong intuition for data and analysis. Better understanding how to use computer programming to analyze data is a natural extension of the work I’ve already been doing.

Furthermore, I also have a knack for interacting with people. If I can build my capabilities with programming and statistical analysis, I will have a very marketable skillset.

There’s a strong demand for data scientists who can interact well with customers. So yeah, maybe I’m coming from a different direction than many data scientists – from the customer-facing side towards the mathematical side, rather than vice versa – but I think that will provide me with unique advantages.

So that’s why my two-year career goal is to become a competent data scientist. It seems really enjoyable, and I’d be good at it.

Wait, what do I mean by “competent data scientist”?

There are three main skills required to be a competent data scientist:

1.      Data engineering skills

2.      Business intelligence skills

3.      Data science skills

For data scientists, data engineering (not feature engineering) is a necessary skill because in order to analyze data, one needs to get the data. Therefore, understanding how databases work, how to access databases, how to query, join, and export the data, et cetera, is very important. Furthermore, in order for work to be put in production, the “data science” code will have to be hooked up to databases, dashboards, and so on.

Business intelligence skills (analyzing data and presenting actionable insights) are also very important. For a data scientist’s work to mean anything, they need an intuition for data analysis, a sense of which types of analysis are appropriate when, and the ability to present that information in a way that helps people make decisions. It’s one thing to be able to calculate the mean of data. It’s another thing to understand how to use that information to impact the business.

“Data science skills” means knowing how to understand data, use code to manipulate it, apply machine learning techniques, and create meaningful output (sometimes meant for direct human consumption and sometimes for consumption by artificial intelligence software). It requires a strong understanding of the mathematical concepts behind the various techniques used.

Why am I confident I can become one?

I have all of the right aptitudes:

Languages (spoken and programming):

I have a knack for languages, specifically grammar. I believe spoken and programming language aptitudes are highly correlated. That is because programming languages like Python are very conscious of “grammar” – that is, syntax, white space, etc. My accomplishments:

·        I became fluent in German, and have maintained fluency since. I also studied Spanish for four years in high school and two in college, French (one and one), and Japanese (one in high school).

·        I became fluent in VBA years ago through self-direction (although it's been a while)

·        I hand-coded my own website in HTML, CSS, JavaScript, and PHP at www.gmayock.com

·        I am studying Python & data science (check out www.gmayock.com/blog for the latest updates)

Mathematics:

One of my weak spots is also one of my strengths.

The weakness is that I haven’t had formal education in mathematical theory since I was 16 (I consider my degree in Finance and my study of Economics to be more “applied” than theory).

The strength is the reason why - I finished all of my high school’s math offerings by the end of tenth grade (including AP Calculus and AP Statistics).

Even in the self-directed study I’ve had over the past two months, I have not lost my intuitive understanding of the concepts I learned half a lifetime ago. Therefore, I believe I am capable of learning all of the mathematics required to be a competent data scientist.

Analysis, troubleshooting, and presentation:

Successfully analyzing data requires being able to make sense of the data and investigate it. I have a history of successfully finding meaning in data and presenting it in useful ways.

A good example comes from when I worked for Disney. I was the analyst chosen to participate in the integration of a third-party system to help manage ad inventory. A few weeks after their system went live, we began to see a massive spike in inventory on one ad channel. However, this was not usable inventory.

·        The ad type in question was a video ad that played in ad breaks for live-streaming content.

·        Those ad breaks have capacity based on duration (see first image).

·        They had previously run only 30-second ads (see second image).

·        A new backfill method meant they ran 5-second ads in the remaining time (see third image).

Because the vendor’s inventory projections ignored ad duration, ad breaks like the one in the example started showing capacity for four ads instead of one. This resulted in a huge inventory spike. However, the actual capacity for 30-second ads had not increased.
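The arithmetic behind that spike can be sketched in a few lines of Python. This is only an illustration, not the vendor's code: the 45-second break length is an assumption (chosen so that one 30-second ad plus three 5-second backfill ads fill the break exactly), and the function names are hypothetical.

```python
# Illustrative sketch only. The break length and function names are
# assumptions made for this example, not the vendor's actual logic.

def slots_by_count(ads_served):
    """Count-based projection: every ad that ran counts as one slot,
    regardless of how long it was."""
    return len(ads_served)


def slots_by_duration(break_seconds, ad_seconds=30):
    """Duration-aware projection: how many ads of the given length
    actually fit in the break."""
    return break_seconds // ad_seconds


BREAK_SECONDS = 45        # assumed break length
served = [30, 5, 5, 5]    # one 30-second ad plus three 5-second backfill ads

naive_capacity = slots_by_count(served)            # ignores duration
real_capacity = slots_by_duration(BREAK_SECONDS)   # sellable 30-second slots

print(f"Projected slots when duration is ignored: {naive_capacity}")
print(f"30-second ads that actually fit: {real_capacity}")
```

Under these assumptions, the count-based projection reports four slots for a break that still only holds one 30-second ad, which is the kind of gap that leads to oversell.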

I was able to troubleshoot this problem without access to the vendor’s code. I had to create projections on my own and compare the results to theirs without seeing behind the curtain. I found the key variable which accounted for the difference in projections, presented the results to executives internally so they could inform their teams and prevent oversell, designed the solution, and shared it with the vendor to prevent recurrence. This prevented millions of dollars of discounted oversell.

At Microsoft, analysis, troubleshooting, and presentation were also central to my responsibilities. I even won the Financial Analyst Service Excellence award in December 2013. Later, because of my competence with these skills, I was chosen to lead a project for department leadership to analyze the viability of a new role.

At Maana, I was a customer-facing solutions analyst doing consulting-type work, which also included analyzing and understanding the data. In other words, the customers would only have to tell us once what the data meant – I would come prepared, ask them the questions I needed to get complete understanding, and then serve as the expert to internal teams throughout the project.

Summary:

I’ve got the right aptitudes and desire to be a competent data scientist because I enjoy coding, mathematics, data analysis, and problem solving.

My background excelling in mathematics and languages projects a high ceiling as a data scientist. My career as an analyst has proven a knack for digging into and making sense of data. My recent experiences with programming have piqued my interest and have not raised any red flags.

I wouldn’t be trying to be a data scientist if I had encountered any red flags. There are certain qualifications I am putting in (like saying I want to do project-based work rather than academia or research) because I know what I’m interested in, but I can be confident in the statement I am putting out. This is not sales messaging. This is a report on the results of my self-analysis on whether or not I’d be good at data science.

If you or someone you know is interested in talking with me about this, please reach out.

contact me