[ETR #24] Thanks(less)giving


Extract. Transform. Read.

A Newsletter From Pipeline

Hi past, present or future data professional!

Since today marks Thanksgiving in the US, I hope this reaches you before your eyes glaze over from the tryptophan-induced turkey coma we all inevitably slip into.

While today is a day of gratitude, from a data engineering perspective I’d like to focus instead on the under-the-radar tasks that can make a difference at this time of year, even if they don’t earn you any recognition at work.

The reason you should consider doing (or assigning) these tasks in the near future is that, for many data teams that support businesses, Q4 is when things slow down.

With many companies closing for the last week of December, it’s important to complete “housekeeping” work to keep interruptions minimal so you don’t have to fix a pipeline after an eggnog or two.

Update These Documentation Areas

Ahh, documentation… Everyone needs it, but no one wants to write it. Attempting to overhaul several repos’ or products’ worth of documentation is daunting. But to simplify an already thankless task, at a minimum ensure your documentation clearly identifies the following (a minimal template follows the list):

  • Escalation contact: Who do we call when this thing breaks?
  • Schedule/Trigger: When does it run and what is the trigger?
  • Data source/API documentation: If a field suddenly gets deleted upstream, how do we troubleshoot?
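
To make that concrete, here’s a minimal sketch of what those three items might look like as a header docstring in a Python pipeline script. Every name, contact and URL below is hypothetical:

    """Pipeline: orders_daily_load (hypothetical example).

    Escalation contact: #data-eng-oncall Slack channel; fallback jane.doe@example.com.
    Schedule/Trigger: Daily at 06:00 UTC, triggered by the Airflow DAG orders_daily.
    Data source/API docs: Orders REST API v2 (https://example.com/api/docs).
        If a field disappears upstream, check the API changelog first, then
        compare against the schema contract in schemas/orders.json.
    """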

Delete Unused Resources

In many orgs, Q4 represents “pencils down” for the financial divisions responsible for tracking revenue and, more importantly, expenses. I’ve said in the past that if what you’re working on isn’t directly generating revenue, the next best thing is optimizing costs/resource consumption.

Deleting unused resources like SQL tables must first happen at the individual level. I can’t tell you how many tables I have in production right now with the “_test” suffix. Since the workday inevitably gets busy, it may be a good idea to institute a quarterly deletion sweep.

And it’s an even better idea to automate the calculation of your savings.
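
If you’re on BigQuery (swap in the equivalent for your warehouse), a rough sketch of both ideas might look like the following. The project and dataset names are hypothetical, and the storage rate is approximate, so check current pricing:

    # A minimal sketch, assuming BigQuery and the google-cloud-bigquery library.
    # Flags "_test" tables older than 90 days and estimates the monthly
    # storage cost you'd recover by dropping them.
    from datetime import datetime, timedelta, timezone

    from google.cloud import bigquery

    client = bigquery.Client()
    cutoff = datetime.now(timezone.utc) - timedelta(days=90)
    recoverable_bytes = 0

    for item in client.list_tables("my-project.analytics"):  # hypothetical dataset
        table = client.get_table(item.reference)
        if table.table_id.endswith("_test") and table.created < cutoff:
            recoverable_bytes += table.num_bytes or 0
            print(f"Deletion candidate: {table.full_table_id}")

    # ~$0.02/GB/month approximates active logical storage pricing; verify yours.
    print(f"Estimated savings: ${recoverable_bytes / 1e9 * 0.02:.2f}/month")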

Enable Logs

I’d bet that if your production scripts are missing one thing, it’s a consistent form of logging. Simply writing a “Beginning script” and “data loaded” message isn’t going to cut it.

If you’re stuck after logging.info(), read my perspective on how to effectively communicate execution steps using your logs.
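
As a starting point, here’s a sketch of the difference. The pipeline name and numbers are purely illustrative, but notice how each record carries the counts, timings and context a future debugger can act on:

    # A sketch of logs that communicate execution steps, not just milestones.
    import logging

    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s - %(message)s",
    )
    logger = logging.getLogger("orders_pipeline")  # hypothetical pipeline name

    logger.info("Extract started: pulling orders updated since %s", "2024-11-01")
    logger.info("Extract finished: %d rows in %.1fs", 12430, 8.2)
    logger.warning("Dropped %d rows with null order_id before load", 17)
    logger.info("Load finished: wrote %d rows to %s", 12413, "analytics.orders")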

If you’re working on cloud-based infrastructure, you’ll want to ensure your logs are synced with your SaaS/PaaS provider. It’s not very helpful if you can only see error logs locally.
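
How you do that depends on your provider. On Google Cloud, for example, the google-cloud-logging library can forward standard-library log records to Cloud Logging; a minimal sketch:

    # Routes stdlib logging through Cloud Logging so records are visible
    # in the console, not just in a local file or terminal.
    import logging

    import google.cloud.logging

    client = google.cloud.logging.Client()
    client.setup_logging()  # attaches a Cloud Logging handler to the root logger

    logging.info("This record now appears in Cloud Logging")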

You may not get promoted for submitting 100 PRs adding logging to hastily written production scripts, but you will save yourself and your team significant debugging time.

Ensure Your Privacy Policy Is Updated

The holidays usher in a time of year when companies are unusually generous, offering parties, gifts and bonuses. The last thing you want to do is get slapped with a fine for failing to comply with a growing number of international data privacy regulations.

This slow period is a good time to volunteer to audit your highly sensitive data sources and, if necessary, leverage your cloud provider’s data protection tools to encrypt and/or flag problematic fields.
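
As one example of the “flag” half, Google Cloud’s DLP API can scan a field for common PII types. This is a minimal sketch with a hypothetical project ID and a small sample of info types, not a full audit:

    # A minimal sketch of flagging PII in a text field with Google Cloud DLP.
    from google.cloud import dlp_v2

    def flag_pii(project_id: str, text: str) -> list[str]:
        client = dlp_v2.DlpServiceClient()
        response = client.inspect_content(
            request={
                "parent": f"projects/{project_id}",
                "inspect_config": {
                    "info_types": [{"name": "EMAIL_ADDRESS"}, {"name": "PHONE_NUMBER"}],
                    "min_likelihood": dlp_v2.Likelihood.POSSIBLE,
                },
                "item": {"value": text},
            }
        )
        return [f.info_type.name for f in response.result.findings]

    print(flag_pii("my-project", "Reach me at jane@example.com"))  # hypothetical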

Knowing your data is sufficiently encrypted and your builds are extensively documented should allow you to sleep much more soundly during your post-feast nap.

Here are this week’s links:

Questions? I’ll try to answer between bites: zach@pipelinetode.com

Happy Thanksgiving and thanks for ingesting,

-Zach Quinn

Extract. Transform. Read.

Reaching 20k+ readers on Medium and nearly 3k learners by email, I draw on my 4 years of experience as a Senior Data Engineer to demystify data science, cloud and programming concepts while sharing job hunt strategies so you can land and excel in data-driven roles. Subscribe for 500 words of actionable advice every Thursday.
