Can Your Data Pipelines Be Dangerous?


Extract. Transform. Read.

A newsletter from Pipeline: Your Data Engineering Resource

Hi past, present or future data professional!

Data engineering can be dangerous. OK, not physically, but by building and maintaining data infrastructure, data engineers are given a surprising amount of access and responsibility. Every commit, table alteration and deletion must be made with care. It took 2 years, but I finally learned a shortcut to make developing SQL staging tables less risky and more efficient.

Even seemingly minor mistakes like joining on the wrong key can cost days or months of valuable data, which can equal hundreds of thousands or millions of dollars in revenue visibility. Outside of code mistakes, neglecting logistical factors like vendor contracts and API usage can result in downtime and, in a worst-case scenario, an all-out blackout.
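To make the join-key risk concrete, here is a minimal sketch of a pre-join sanity check. It assumes two hypothetical tables, orders and customers, in an in-memory SQLite database; the table names, columns and the helper unmatched_rows are all invented for illustration and are not taken from any real pipeline. The idea is to count how many left-side rows would find no partner before trusting the join.

```python
import sqlite3


def unmatched_rows(conn, left, right, left_key, right_key):
    """Count rows in `left` with no matching row in `right` on the given keys.

    Table and column names are interpolated directly, so only pass
    trusted identifiers (this is a sketch, not hardened code).
    """
    query = (
        f"SELECT COUNT(*) FROM {left} AS l "
        f"LEFT JOIN {right} AS r ON l.{left_key} = r.{right_key} "
        f"WHERE r.{right_key} IS NULL"
    )
    return conn.execute(query).fetchone()[0]


def build_demo():
    """Create a tiny, hypothetical dataset in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
        CREATE TABLE customers (customer_id INTEGER, name TEXT);
        INSERT INTO orders VALUES (1, 101, 20.0), (2, 102, 35.5), (3, 103, 12.0);
        INSERT INTO customers VALUES (101, 'Ada'), (102, 'Grace'), (103, 'Edsger');
        """
    )
    return conn


conn = build_demo()

# Intended key: every order finds its customer.
right_key_misses = unmatched_rows(conn, "orders", "customers", "customer_id", "customer_id")

# Wrong key: order_id is compared to customer_id, so nothing matches.
wrong_key_misses = unmatched_rows(conn, "orders", "customers", "order_id", "customer_id")

print(right_key_misses, wrong_key_misses)
```

With an inner join, those unmatched rows would silently disappear from the staging table, so a check like this is one possible guard rail to run before promoting a query.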

If the stakes sound ominous, I’d suggest examining the root of your hesitation so you can work more confidently and efficiently; the cause may even be the code itself.

There is a happy medium between freely building data pipelines and using the appropriate guard rails. As long as you take your time and don’t commit code directly to the main branch, you can do data engineering safely and avoid bursting your pipelines.

For those who are anti-virus minded, here are this week’s links as plain text:

P.S. Want to learn how to go from code to automated pipeline? Take advantage of my 100% free email course:

Deploy Google Cloud Functions In 5 Days.

Thanks for ingesting,

-Zach

Extract. Transform. Read.

Reaching 20k+ readers on Medium and nearly 3k learners by email, I draw on my 4 years of experience as a Senior Data Engineer to demystify data science, cloud and programming concepts while sharing job hunt strategies so you can land and excel in data-driven roles. Subscribe for 500 words of actionable advice every Thursday.
