[ETR #26] 4 Bespoke Data Tools Saved Me 100 Hrs In ‘24


Extract. Transform. Read.

A newsletter from Pipeline.

Hi past, present or future data professional!

This holiday season will be a little less bright thanks to my lack of personal GitHub commits.

Like you, I began 2024 full of ideas and motivation that, let’s be honest, were depleted by the end of Q1, when I was cranking out enough code at work to please even notorious code volume stickler Elon Musk.

Despite my lackluster output, I managed to hunker down and create useful “bespoke” (a.k.a. self-created) data tools that save hours of tedious work. Even if you don’t take inspiration from the code or create something similar, my hope is that you’ll come away with one takeaway:

Coding for function == coding for fun.

As a beginner it might make sense to spend hours processing available datasets to refine coding techniques and get comfortable with data manipulation; however, as you gain experience you’ll realize that some hobby projects can hit the trifecta of programming satisfaction.

  • Solve a small problem in your personal life
  • Teach or reinforce a new skill
  • Provide a sense of satisfaction or, dare I say, fun

You might not think “fun” when you think of parsing Walmart receipts to rein in grocery spending, but that’s what I spent 1-2 hours coding up once I realized my Walmart Plus usage was getting a bit “spendy.” The added bonus is that the parser can either total up grocery spending or, in a much more relevant application, help me determine which items were missing from my order (the double-edged sword of passive shopping); read more.
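As a rough illustration of the idea (not the actual parser), here is a minimal sketch that totals a receipt and flags items that were ordered but not delivered. The line format `<item> <qty> $<unit price>` and the function names are assumptions made up for this example; a real Walmart receipt would need its own pattern.

```python
import re
from decimal import Decimal

# Hypothetical receipt line format: "<item name> <qty> $<unit price>"
LINE_RE = re.compile(r"^(?P<name>.+?)\s+(?P<qty>\d+)\s+\$(?P<price>\d+\.\d{2})$")


def parse_receipt(text):
    """Return (name, qty, unit_price) tuples for lines that look like items."""
    items = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            items.append(
                (match["name"], int(match["qty"]), Decimal(match["price"]))
            )
    return items


def total_spend(items):
    """Sum qty * unit price across all parsed items."""
    return sum((qty * price for _, qty, price in items), Decimal("0"))


def missing_items(ordered, received):
    """Names that appear in the order but not in the delivered receipt."""
    received_names = {name for name, _, _ in received}
    return [name for name, _, _ in ordered if name not in received_names]
```

Feeding the same function the order confirmation and the delivery receipt is what makes the "what was missing" check a simple set comparison.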

Once I discovered and gained experience with the PDF parser PyPDF, I began to see a variety of applications, including parsing and ingesting 6-8 page credit card statements. Since financial data is notoriously difficult to obtain, being able to cleanly and reliably process credit card statements provides transparency into my spending habits.
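A statement parser along these lines might look like the sketch below. It assumes the current `pypdf` package (`PdfReader` and `page.extract_text()`), and it assumes a made-up transaction layout of `MM/DD DESCRIPTION AMOUNT`; every issuer formats statements differently, so the regular expression is illustrative only.

```python
import re
from datetime import datetime
from decimal import Decimal

# Hypothetical transaction line: "03/14 COFFEE SHOP 4.75"
TXN_RE = re.compile(r"^(\d{2}/\d{2})\s+(.+?)\s+(-?[\d,]+\.\d{2})$")


def extract_statement_text(path):
    """Concatenate the text of every page using pypdf (pip install pypdf)."""
    from pypdf import PdfReader  # lazy import: the parser below works without it

    reader = PdfReader(path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def parse_transactions(text, year):
    """Turn matching lines into dicts; skip headers, footers and other noise."""
    rows = []
    for line in text.splitlines():
        match = TXN_RE.match(line.strip())
        if not match:
            continue
        date_str, description, amount = match.groups()
        rows.append(
            {
                "date": datetime.strptime(f"{date_str}/{year}", "%m/%d/%Y").date(),
                "description": description,
                "amount": Decimal(amount.replace(",", "")),
            }
        )
    return rows
```

Keeping text extraction separate from line parsing means the parsing logic can be checked against pasted sample text without a PDF on hand.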

In an earlier edition of this newsletter I mentioned iterative backfilling as one of my top recommendations for small-scale automation; I literally spent 4 hours at work running a script that functions nearly identically to what I outline here. As a bonus, I didn’t even have to be in the room.
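For readers who missed that edition, one common shape for an iterative backfill is a loop that walks a date range and loads one day at a time. The sketch below is an assumption about the general pattern, not the script from work; `load_day` is a hypothetical callable standing in for whatever fetches and writes a single day of data.

```python
from datetime import date, timedelta


def date_range(start, end):
    """Yield each date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def backfill(start, end, load_day):
    """Call load_day once per date; record failures instead of stopping the run."""
    failed = []
    for day in date_range(start, end):
        try:
            load_day(day)
        except Exception as exc:  # keep going so one bad day doesn't end the job
            failed.append((day, str(exc)))
    return failed
```

Returning the failed dates rather than raising is what lets a job like this run unattended: the list can be retried afterward.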

Although Python served me well for projects this year, I’ve developed an appreciation for Google Sheets’ data connections. Instead of having to update a static sheet of sources I reference in my published work and in these notes to you, I created a very niche search engine using only a Sheets connection and a bit of SQL.

If you’re even an occasional BigQuery user, check it out.
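To show the kind of SQL such a search might rely on, here is a small stand-in that swaps the Sheets connection and BigQuery for Python's built-in `sqlite3`, so it runs anywhere. The `sources` table, its columns and the example rows are hypothetical; the point is only the case-insensitive `LIKE` filter.

```python
import sqlite3


def build_index(rows):
    """Load (title, url, topic) rows into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sources (title TEXT, url TEXT, topic TEXT)")
    conn.executemany("INSERT INTO sources VALUES (?, ?, ?)", rows)
    return conn


def search(conn, term):
    """Case-insensitive keyword match against title and topic."""
    pattern = f"%{term.lower()}%"
    cursor = conn.execute(
        "SELECT title, url FROM sources "
        "WHERE LOWER(title) LIKE ? OR LOWER(topic) LIKE ? "
        "ORDER BY title",
        (pattern, pattern),
    )
    return cursor.fetchall()
```

In a spreadsheet setup, the search term would typically come from a cell rather than a function argument, but the query logic would be similar.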

While I’ve enjoyed coding these backend builds, in 2025 I’d like to venture out of my comfort zone and create simple UIs for builds I access repeatedly or would like to share in a neater format.

If you’d like to read the expanded, published version of this newsletter, you can do so here.

Thanks for ingesting,

-Zach Quinn

Reaching 20k+ readers on Medium and over 3k learners by email, I draw on my 4 years of experience as a Senior Data Engineer to demystify data science, cloud and programming concepts while sharing job hunt strategies so you can land and excel in data-driven roles. Subscribe for 500 words of actionable advice every Thursday.
