Extract. Transform. Read.
A newsletter from Pipeline

Hi past, present or future data professional!

I dreaded entering the job market after my data science master's. I felt like I knew more than a data analyst but less than a professional data scientist. I've since realized my program was more effective than I thought, but it couldn't prepare me for key areas like cloud deployments and real-world problem-solving, which I had to learn on the job as a data engineer. And I’ve noticed these gaps in many other data science curricula, not just my own.

My hope is that this can help you manage your expectations if you're considering a data science MS and identify areas where you need to upskill. Most importantly, I want to reassure you that no program can teach you everything.

Gaps In The Cloud

My entire grad school experience was constrained to a local environment. My coding was done in Jupyter Notebook VMs, but the focus was on the code itself, not the infrastructure it ran on. The only kind of development process I knew of was GitHub, and even that wasn’t entirely professional, because we each had our own individual repos.

This left me unprepared for cloud deployments. I had no idea how to provision a virtual machine or orchestrate a simple pipeline.

The good news is that you can quickly close this gap. Both GCP and AWS offer free tiers and resources. At a minimum, you should learn how to provision a VM, orchestrate simple pipelines (with Airflow, for example), schedule processes, and visualize data from a data warehouse.

Beyond Basic SQL

While my program introduced database fundamentals, it put little emphasis on SQL. We used SQLite and PostgreSQL, but never really dove into intermediate or advanced concepts. I was largely self-taught in SQL, and it was the skill I was least confident in when I started my job.

The reality is that production-level work requires you to write and debug large, complex queries. My advice is to focus on problem-solving over memorizing functions.
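
To make the SQL advice concrete, here is a minimal sketch that answers one question (total spend per customer) two ways, using Python's built-in sqlite3 module. The orders table, its columns, and the sample rows are hypothetical and exist only for illustration; the two queries are one possible "bloated vs. leaner" pair, not a general rule about performance.

```python
import sqlite3

# Hypothetical sample data; table and column names are illustrative only.
ORDERS = [
    ("ana", 20.0),
    ("ana", 35.0),
    ("ben", 10.0),
    ("ben", 5.0),
    ("cy", 50.0),
]

# Bloated: a correlated subquery is evaluated for every row of orders,
# and DISTINCT then throws away the duplicate results.
BLOATED_SQL = """
SELECT DISTINCT
    o.customer,
    (SELECT SUM(i.amount) FROM orders AS i WHERE i.customer = o.customer) AS total
FROM orders AS o
ORDER BY o.customer
"""

# Leaner: aggregate each customer's rows once with GROUP BY.
LEAN_SQL = """
SELECT customer, SUM(amount) AS total
FROM orders
GROUP BY customer
ORDER BY customer
"""


def run_query(sql):
    """Load the sample rows into an in-memory SQLite database and run sql."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", ORDERS)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Both queries should return the same rows; only the amount of work differs.
    print(run_query(BLOATED_SQL))
    print(run_query(LEAN_SQL))
```

Both queries return identical rows on this sample, which is the point of the exercise: the result is the same, and what you practice is spotting the redundant work in the first version.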
Learn to write bloated queries to understand what not to do, and efficient ones to understand optimization.

From "Mad Libs" to Real-World Requirements

My biggest critique of school coding courses is the "color-by-number" approach, where you’re just filling in blanks or, more accurately, Jupyter Notebook cells. While this is great for learning syntax, it completely misses the point of coding, which is the process of creative problem-solving.

School projects often provide a perfectly defined problem, but in the real world, you'll be translating vague business requirements into a technical solution.

To prepare, you can simulate this process in your personal projects. Before you write a single line of code, ask yourself: What data do I need? Who is the end user? How will I accomplish this? When does it need to be done? And most importantly, why am I using this method?

Answering these questions will distinguish you in job interviews and give you a huge head start.

Before You Go…

If you’re still unsure of how to make your projects more professional, I just finished (writing, not reading!) a new ebook I plan to release within the next month. Its goal is to lend structure to the project development process so you can create work that makes you proud and hiring managers salivate.

Join the waitlist to get exclusive early access.

Thanks for ingesting,
-Zach Quinn
Reaching 20k+ readers on Medium and nearly 3k learners by email, I draw on my 4 years of experience as a Senior Data Engineer to demystify data science, cloud and programming concepts while sharing job hunt strategies so you can land and excel in data-driven roles. Subscribe for 500 words of actionable advice every Thursday.