Extract. Transform. Read. A newsletter from Pipeline

Hi past, present or future data professional!

I don’t know about you, but this email won’t be the only thing I scroll through today. Incidentally, one of the most infinitely scrollable (and doom-scrollable) sites, Reddit, was my source for a data project demonstrating alerting systems.

And the best part? You can and absolutely should steal this approach to learn how to monitor ingested data and trigger alerts. The project demonstrates practical skills in data extraction, processing, and automation, and it's easier than you might think.

The TL;DR of my use case for extracting content from the Reddit API: monitoring a watch sale subreddit for a particular model (an Omega Seamaster 2531.80, if you happen to be a fellow watch enthusiast). To do this, I ingested the 100 latest posts from the subreddit, used a string match to determine whether each post included the model number, and triggered an automated email with the seller’s link, which was ultimately delivered to my Gmail.

Here's the basic process:

Extract Data
Define Triggers
Connect to Email
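
The three steps above can be sketched in a few dozen lines of Python using only the standard library. This is a minimal sketch under stated assumptions, not the code from the full walkthrough: it assumes Reddit's public JSON listing endpoint (`/r/<subreddit>/new.json`) rather than an authenticated client, and Gmail's SMTP server with an app password. The function names, the User-Agent string, and the subreddit argument are all hypothetical placeholders.

```python
import json
import smtplib
import urllib.request
from email.message import EmailMessage

REDDIT_BASE = "https://www.reddit.com"


def fetch_latest_posts(subreddit, limit=100):
    """Step 1 (extract): return the newest posts from a subreddit as dicts."""
    url = f"{REDDIT_BASE}/r/{subreddit}/new.json?limit={limit}"
    # Assumption: a descriptive User-Agent; this value is a placeholder.
    request = urllib.request.Request(
        url, headers={"User-Agent": "watch-alert-sketch/0.1"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        listing = json.load(response)
    # Each child of the listing wraps one post under its "data" key.
    return [child["data"] for child in listing["data"]["children"]]


def find_matches(posts, keyword):
    """Step 2 (trigger): keep posts whose title or body mentions the keyword."""
    needle = keyword.lower()
    matches = []
    for post in posts:
        text = f"{post.get('title', '')} {post.get('selftext', '')}".lower()
        if needle in text:
            matches.append(post)
    return matches


def build_alert_email(matches, keyword, sender, recipient):
    """Step 3a (email): format the matching posts into one message."""
    message = EmailMessage()
    message["Subject"] = f"{len(matches)} new post(s) mentioning {keyword}"
    message["From"] = sender
    message["To"] = recipient
    # Reddit permalinks are relative paths, so prepend the site root.
    lines = [f"{post['title']}\n{REDDIT_BASE}{post['permalink']}" for post in matches]
    message.set_content("\n\n".join(lines))
    return message


def send_alert(message, username, app_password):
    """Step 3b (email): deliver through Gmail's SMTP server over SSL."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(username, app_password)
        server.send_message(message)


def run_alert(subreddit, keyword, sender, recipient, app_password):
    """Run all three steps; send nothing when there is no match."""
    matches = find_matches(fetch_latest_posts(subreddit), keyword)
    if matches:
        message = build_alert_email(matches, keyword, sender, recipient)
        send_alert(message, sender, app_password)
    return matches
```

Calling `run_alert` with a subreddit name, the string "2531.80", and your own Gmail address and app password would run one pass. Scheduling it (with cron, for example) is what turns the check into an ongoing alert. Keeping the match and the message formatting separate from the network calls also makes the trigger logic easy to test with a handful of fake posts.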
Although this is a simple application, a custom alerting system like this has real-world uses. Including this or a similar monitoring project in your portfolio shows not only that you’re dedicated to creating data pipelines, but also that you have the professional instinct to be accountable for the output. Creating alerts for specific events (errors, delays, unusual data) helps ensure smooth and efficient data systems.

For a complete walkthrough, including code, read the story on Medium: https://medium.com/pipeline-a-data-engineering-resource/engineering-custom-email-alerts-for-reddit-in-3-steps-710e5191bd3f

Happy alerting and thanks for ingesting,
-Zach Quinn
Reaching 20k+ readers on Medium and nearly 3k learners by email, I draw on my 4 years of experience as a Senior Data Engineer to demystify data science, cloud and programming concepts while sharing job hunt strategies so you can land and excel in data-driven roles. Subscribe for 500 words of actionable advice every Thursday.