[ETR #62] I Got The Same ? In 12 Interviews


Extract. Transform. Read.

A newsletter from Pipeline

Hi past, present or future data professional!

I once participated in a remote job interview in which the interviewer was on the video call while driving... and smoking.

While that was among my most memorable interviews (for the wrong reasons), plenty of others have blended together and faded into the recesses of my mind.

The common denominator, however, was the insistence on asking one question.

The answer you provide can make or break your interview.

The question I heard repeatedly, especially after I presented a project from my portfolio, was: “Where did you get your data?”

It’s an innocent question, but it’s a brilliant way for an interviewer to gauge your resourcefulness.

And while there’s no truly "wrong" answer, I quickly learned there's a definite best answer. The truth is, relying on perfectly clean, pre-packaged data from repositories like Kaggle is a trap. I’m not saying Kaggle is bad; I’ve used it myself for school projects. It just isn't representative of most of the data sources you'll encounter on the job.

As I got deeper into the field and understood employer expectations, I realized that real-world data is messy, incomplete, and rarely comes in a perfectly formatted CSV. Using a stock dataset doesn’t show a potential employer that you’re ready for the reality of the job; it just demonstrates your ability to use read_csv.

When I started offering responses that showed my ability to source and manipulate data in a novel way, the interviews took a noticeable turn for the better.

Here’s what you should be saying:

  • “I scraped the data from a website and converted it to a dataframe.”
  • “I combined an existing dataset with data scraped from a Wikipedia table.”
  • “I accessed an API and built a pipeline to gather the information.”

These answers signal a crucial skill: you’re not just a data consumer; you’re an aggregator of information. You’re resourceful and you're not afraid of the messiness that accompanies the process of mining real-world data.
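
To make the third answer concrete, here is a minimal sketch of what "accessed an API and built a pipeline" could look like in Python. Everything specific in it is an assumption for illustration: the endpoint (api.example.com), the "results" key, the "name" and "price" fields, and the function names are all hypothetical, so adapt them to whatever API you actually use.

```python
import json
from typing import Callable, Iterator, Optional
from urllib.request import urlopen

# Hypothetical endpoint, for illustration only; swap in a real API you can access.
BASE_URL = "https://api.example.com/v1/listings"


def fetch_page(page: int) -> dict:
    """Fetch one page of JSON from the (hypothetical) API."""
    with urlopen(f"{BASE_URL}?page={page}", timeout=10) as resp:
        return json.load(resp)


def extract(fetch: Callable[[int], dict] = fetch_page) -> Iterator[dict]:
    """Walk the pages in order until one comes back with no results."""
    page = 1
    while True:
        results = fetch(page).get("results", [])
        if not results:
            break
        yield from results
        page += 1


def transform(record: dict) -> Optional[dict]:
    """Tidy one raw record; return None when it is too incomplete to use."""
    name = (record.get("name") or "").strip()
    price = record.get("price")
    if not name or price in (None, ""):
        return None
    # Prices may arrive as numbers or as strings like "$1,200.50".
    amount = float(str(price).replace("$", "").replace(",", ""))
    return {"name": name.title(), "price": amount}


def run_pipeline(fetch: Callable[[int], dict] = fetch_page) -> list:
    """Extract every page, transform each record, and drop the unusable ones."""
    cleaned = (transform(record) for record in extract(fetch))
    return [row for row in cleaned if row is not None]
```

Passing the fetch function in as an argument is a deliberate choice: you can test the pipeline with canned pages before pointing it at a live API, which is also a nice detail to mention in an interview.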

Creating your own unique dataset (even a small, niche one) demonstrates three things to a hiring manager:

  • You’re comfortable converting messy data into something usable
  • You are willing to deviate from "stock" datasets and approach problems with creativity
  • You have a genuine passion for the field and are invested in the craft of the role

As a bonus, if you can find a dataset that’s relevant to the industry you're applying to, you'll also prove that you have relevant domain knowledge, which is truly a rarity among technically inclined candidates.

So, before your next interview, take a look at your portfolio. If it's full of projects using perfectly clean data, consider spending some time creating a new end-to-end build that starts messier.

You don’t have to build a custom data warehouse from scratch. In fact, even a simple project that involves scraping a Wikipedia table with Pandas can demonstrate additional effort that goes beyond downloading and reading a CSV.
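
As a rough idea of the effort involved, here is a sketch that pulls rows out of a Wikipedia-style HTML table and cleans up the usual mess (footnote markers, thousands separators, placeholder values). It uses only the standard library and a made-up sample table so it runs offline; the city names, column names, and helper names are all invented for illustration. In a real project, pandas' read_html function can replace the hand-written parser.

```python
import re
from html.parser import HTMLParser

# Made-up snippet standing in for a Wikipedia table, so the sketch runs offline.
SAMPLE_HTML = """
<table class="wikitable">
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>Springfield[1]</td><td>1,234,567</td></tr>
  <tr><td>Shelbyville</td><td>n/a</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, one list per <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Text outside a cell (indentation, newlines) is ignored.
        if self._in_cell:
            self.rows[-1][-1] += data


def clean_cell(text):
    """Strip surrounding whitespace and footnote markers such as [1]."""
    return re.sub(r"\[\d+\]", "", text).strip()


def to_int(text):
    """Parse '1,234,567' as an int; return None for placeholders like 'n/a'."""
    digits = text.replace(",", "")
    return int(digits) if digits.isdigit() else None


def scrape_table(html, numeric_cols=("Population",)):
    """Return the table as a list of dicts keyed by the header row."""
    parser = TableParser()
    parser.feed(html)
    header, *body = [[clean_cell(cell) for cell in row] for row in parser.rows]
    records = [dict(zip(header, row)) for row in body]
    for record in records:
        for col in numeric_cols:
            record[col] = to_int(record[col])
    return records


records = scrape_table(SAMPLE_HTML)
```

If pandas is installed, passing the resulting list of dicts to pd.DataFrame gives you a dataframe to work with from there.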

In the end, the best source of data is yourself.

Read the original story here.

Thanks for ingesting,

-Zach Quinn

Reaching 20k+ readers on Medium and nearly 3k learners by email, I draw on my 4 years of experience as a Senior Data Engineer to demystify data science, cloud and programming concepts while sharing job hunt strategies so you can land and excel in data-driven roles. Subscribe for 500 words of actionable advice every Thursday.