Can Your Data Pipelines Be Dangerous?


Extract. Transform. Read.

A newsletter from Pipeline: Your Data Engineering Resource

Hi past, present or future data professional!

Data engineering can be dangerous. OK, not physically, but by building and maintaining data infrastructure, data engineers are given a surprising amount of access and responsibility. Every commit, table alteration and deletion must be made with care. It took 2 years, but I finally learned a shortcut to make developing SQL staging tables less risky and more efficient.

Even seemingly minor mistakes like joining on the wrong key can result in losing days or months of valuable data, which can equal hundreds of thousands or millions of dollars in revenue visibility. Outside of code mistakes, not paying attention to logistical factors like vendor contracts and API usage can result in downtime and, in a worst-case scenario, an all-out blackout.
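To make the wrong-key risk concrete, here is a minimal Python sketch of a join that refuses to silently drop or duplicate rows. The helper name safe_join and the sample orders and customers tables are hypothetical, invented for illustration; this is one possible guard rail, not a prescribed method.

```python
def safe_join(left, right, left_key, right_key):
    """Join each left row to exactly one right row (hypothetical helper).

    Raises ValueError when a left row has zero or several matches, which
    is a common symptom of joining on the wrong key.
    """
    # Index the right-hand rows by their join key.
    lookup = {}
    for row in right:
        lookup.setdefault(row[right_key], []).append(row)

    joined = []
    for row in left:
        matches = lookup.get(row[left_key], [])
        if len(matches) != 1:
            raise ValueError(
                f"expected exactly 1 match for {left_key}={row[left_key]!r}, "
                f"found {len(matches)}"
            )
        # Left-hand values win if both rows share a column name.
        joined.append({**matches[0], **row})
    return joined


# Made-up sample data, for illustration only.
orders = [
    {"order_id": 101, "customer_id": 1, "amount": 20.0},
    {"order_id": 102, "customer_id": 2, "amount": 35.5},
    {"order_id": 103, "customer_id": 1, "amount": 12.0},
]
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]

# Correct key: every order finds its one customer.
enriched = safe_join(orders, customers, "customer_id", "customer_id")
```

Calling safe_join(orders, customers, "order_id", "customer_id") instead raises a ValueError on the first order, because no customer shares an order's ID. A plain inner join on that key would have quietly returned zero rows.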

If the stakes sound ominous, I’d suggest examining the root of your hesitation so you can work more confidently and efficiently; the cause may even be the code itself.

There is a happy medium between freely building data pipelines and using the appropriate guardrails. As long as you take your time and don’t commit code directly to the main branch, you can do data engineering safely and avoid bursting your pipelines.

For those who are anti-virus minded, here are this week’s links as plain text:

P.S. Want to learn how to go from code to automated pipeline? Take advantage of my 100% free email course:

Deploy Google Cloud Functions In 5 Days.

Thanks for ingesting,

-Zach

Pipeline To DE

Top data engineering writer on Medium & Senior Data Engineer in media; I use my skills as a former journalist to demystify data science/programming concepts so beginners to professionals can target, land and excel in data-driven roles.
