
Pipeline To DE

Top data engineering writer on Medium & Senior Data Engineer in media; I use my skills as a former journalist to demystify data science/programming concepts so beginners to professionals can target, land and excel in data-driven roles.

Featured Post

[ETR #40] Data Engineering Uncontained

Extract. Transform. Read. A newsletter from Pipeline. For a STEM discipline, there is a lot of abstraction in data engineering, evident in everything from temporary SQL views to complex, multi-task Airflow DAGs. Perhaps most abstract of all is the concept of containerization: the process of running an application in a clean, standalone environment, which is the simplest definition I can provide. Since neither of us has all day, I won’t get too into the weeds on containerization,...

Hi past, present or future data professional! From 2014 to 2017 I lived in Phoenix, Arizona and enjoyed the state’s best resident privilege: No daylight saving time. If you’re unaware (and if you're in the other 49 US states, you’re really unaware), March 9th was the start of daylight saving time, when we spring forward an hour. If you think this messes up your microwave and oven clocks, just wait until you check on your data pipelines. Even though data teams...
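To make the pipeline concern concrete, here is a minimal sketch (not taken from the newsletter) that measures how many real hours a local calendar day contains. It assumes the March 9th in question is the 2025 date, uses Python's standard `zoneinfo` module, and relies on time zone data being available on the machine; the helper name `local_day_length_hours` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def local_day_length_hours(year, month, day, tz_name):
    """Return the elapsed hours between local midnight and the next local midnight."""
    tz = ZoneInfo(tz_name)
    start = datetime(year, month, day, tzinfo=tz)
    # Adding a timedelta moves the wall clock, so this is the next local midnight.
    end = start + timedelta(days=1)
    # Convert both ends to UTC so the subtraction measures real elapsed time.
    elapsed = end.astimezone(timezone.utc) - start.astimezone(timezone.utc)
    return elapsed.total_seconds() / 3600


# Assumed date: March 9, 2025, the start of US daylight saving time that year.
new_york_hours = local_day_length_hours(2025, 3, 9, "America/New_York")
phoenix_hours = local_day_length_hours(2025, 3, 9, "America/Phoenix")
print(new_york_hours, phoenix_hours)
```

A job that assumes every day has 24 hourly partitions would come up one short for the first zone and not for the second, which is one way a clock change can surface in a pipeline.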

Hi past, present or future data professional! As difficult as data engineering can be, 95% of the time there is a structure to data that originates from external streams, APIs and vendor file deliveries. Useful context is provided via documentation and stakeholder requirements. And specific libraries and SDKs exist to help speed up the pipeline build process. But what about the other 5% of the time when requirements might be structured, but...

Hi past, present or future data professional! To clarify the focus of this edition of the newsletter, the reason you shouldn’t bother learning certain data engineering skills comes down to one of two scenarios: (1) you won’t need them, or (2) you’ll learn them on the job. You won’t need them: generally, these are peripheral skills that you *technically* need but will hardly ever use. One of the most obvious skills, for most data engineering teams, is any...

Hi past, present or future data professional! One of the dirty secrets about my job is how easy it can be to fix broken pipelines. Often I’m retriggering a failed DAG task or, if using a code-less pipeline, literally hitting refresh. In fact, “refresh” is a great example of one of the more abstract data engineering concepts: State. And, specifically, the maintenance of state under any condition. This is the definition of an important...
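One way to picture "hitting refresh" safely is a load step that leaves the stored state the same no matter how many times it is retriggered. The sketch below is an illustration under that reading, not the newsletter's own code; the function names, the in-memory stores, and the sample rows and run date are all hypothetical.

```python
def append_load(table, rows):
    """Naive load: every run appends, so a retry duplicates rows."""
    table.extend(rows)
    return table


def overwrite_load(partitions, run_date, rows):
    """Rerun-safe load: each run replaces the partition for its run date."""
    partitions[run_date] = list(rows)
    return partitions


rows = [{"id": 1}, {"id": 2}]  # made-up sample data

naive = []
append_load(naive, rows)
append_load(naive, rows)  # simulated retry of the same task
print(len(naive))  # 4: the retry doubled the data

safe = {}
overwrite_load(safe, "2025-01-01", rows)
overwrite_load(safe, "2025-01-01", rows)  # simulated retry of the same task
print(len(safe["2025-01-01"]))  # 2: the retry changed nothing
```

Keying the write on the run date is the design choice that matters here: the retry targets the same slot instead of adding to whatever is already stored.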

Hi past, present or future data professional! Software engineers can package anything, including buzzwords. “Learn new, industry-relevant skills” was compressed to “upskilling.” And while I’m a proponent of continuous learning, especially when it helps you avoid stagnation, at the end of the day, upskilling is a lot of work. Without proper structure or a mandate from a school or employer, it’s difficult to remain engaged, no matter how...

Hi past, present or future data professional! Winter in the northern hemisphere is grim. Even in sunny Florida, where I write from, we’ve experienced weeks of gray skies and plunging temperatures. In the corporate world, winter (Q1) presents another grim reality: Layoffs. Unfortunately, no position, no matter how “critical to the organization,” is layoff-proof. Even your CEO can be let go; hence, the “golden parachute” many executives build...

Hi past, present or future data professional! For data engineering, a profession built on principles of automation, it can be counterintuitive to suggest that any optimizations or “shortcuts” could be negative. But, as someone who was once a “baby engineer”, I can tell you that a combination of temptation and overconfidence will inevitably drive you to say “I could do without x development step.” Doing so increases reputational risk (loss...

Hi past, present or future data professional! A peer of mine once revealed the reason they were sleep deprived: They were up past midnight writing ad hoc SQL queries with a C-suite leader literally hovering over their shoulder. The visibility of the products built by data analysts (like the one in the anecdote) and data scientists, such as dashboards and ML models, means they are often the first on a Business Intelligence team to be bothered when something...

*Today's edition was initially published on Medium on 12/10/24. Hi past, present or future data professional! I’ve recently been honing a data engineering skill that might not occur to you: drawing. When I first started my data engineering job 3+ years ago, any description or information related to my code would be in written form. This meant that everything from README documentation to illegible legal pad scribbles was all I had to inform...