Extract. Transform. Read.
A newsletter from Pipeline

Hi past, present or future data professional!

In 2024 I published roughly 75 stories, mostly about data engineering or technology. Understandably, given the pace of life and media, you most likely missed something I hope you’ll find valuable and actionable.

In keeping with one of my core beliefs, that data-driven tools should both enrich you professionally and reduce personal problems, my methodology for picking stories out of that stack is informed by data I ingested daily from the unofficial Medium API. Instead of focusing on ambiguous metrics like views or read time, and to avoid noisy data from readers who clap multiple times, I counted only the individual accounts that reacted to my work. From there, it was a simple matter of writing a quick query against data stored in my BigQuery project.

To respect your time, I’ll present the links as plain text with a one-sentence summary. For a more fully rendered version of this list, along with some important framing, you can read the full story.

How I Reduced My Query’s Run Time From 30 Min. To 30 Sec. In 1 Hour
A shared responsibility of data engineers, regardless of organizational hierarchy, is to ensure that teammates and fellow data analysts, data scientists and data consumers are querying efficiently; this story, published in Learning SQL, the publication I co-edit, takes you from my receiving a monstrous query to the point of optimization.

Pandas’ 2.0 Release Deprecated Your Favorite Method. What Now?
In 2023 Pandas deprecated one of its users’ favorite methods, one that I used in nearly every data pipeline; learn what was deprecated and what workarounds are available.

Why Your Data Pipelines Will Fail On These 10 Days Every Year — And What To Do About It
Building robust pipelines doesn’t end with ensuring your code can run using production dependencies; it ends when you ensure your work functions within the constraints of Earth time.
SQL Developers: Take These 5 Create Table Steps To Improve Performance
In a piece for Learning SQL, I explain why table performance doesn’t begin with your query, but with the act of table creation itself.

How Not To Annoy Senior Developers — Sincerely, A Senior Data Engineer
After being promoted to a senior position, I wrote, from a senior’s perspective, about how not to annoy your senior data scientists, engineers or developers.

2023 In 12 Data Engineering Errors That Ultimately Advanced My Skills
Before I wrote a year-end wrap-up like this, I took an unconventional route: reflecting not on my successes as a developer and engineer in a given year, but on errors that resonated with me so much that I thought about them all year. Typically, these errors fall into broad categories based on the technology, i.e. Python, SQL and Airflow.

Not Getting Data Science Job Interviews? You Have A Visibility Problem
Data science job applicants are facing more competition than ever; learn how to create and publicize compelling work to stand out the right way.

I’ve blogged about data science and data engineering concepts for as long as I’ve been a data engineer; somehow, 2025 marks 4 years of working in an engineering role in tech. I’m continuously humbled by you, the reader, who takes time to read, engage and reach out. The one-time journalist in me is thrilled to have had the opportunity to develop and engage with a loyal audience.

I appreciate you reading in 2024 and look forward to our conversations in ‘25.

Happy New Year and thanks for ingesting,
-Zach Quinn
Reaching 20k+ readers on Medium and over 3k learners by email, I draw on my 4 years of experience as a Senior Data Engineer to demystify data science, cloud and programming concepts while sharing job hunt strategies so you can land and excel in data-driven roles. Subscribe for 500 words of actionable advice every Thursday.