[ETR #48] Impress With This 3-Step Alerting Project


Extract. Transform. Read.

A newsletter from Pipeline

Hi past, present or future data professional!

I don’t know about you, but this email won’t be the only thing I scroll through today.

Incidentally, one of the most infinitely scroll-able (and doom scroll-able) sites, Reddit, was my source for a data project demonstrating alerting systems.

And the best part?

You can, and absolutely should, steal this approach to learn how to monitor ingested data and trigger alerts. This project demonstrates practical skills in data extraction, processing, and automation – and it's easier than you might think.

The TL;DR of my use case: extract content from the Reddit API to monitor a watch sale subreddit for a particular model (an Omega Seamaster 2531.80, if you happen to be a fellow watch enthusiast).

To do this, I ingested the 100 latest posts from the subreddit, used a string match to determine whether the model number appeared in a post, and triggered an automated email with the seller’s link, which was ultimately delivered to my Gmail inbox.

Here's the basic process:

Extract Data

  • Get the Reddit info you want by making a single API call
  • Make specific requests to this endpoint: “https://oauth.reddit.com/{subreddit}/{new/hot/best}”
  • Filter the data: Are you looking at certain subreddits? Keywords? Users?
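The extract step above can be sketched with Python's standard library alone. This is a minimal sketch, not the story's exact code: it assumes you already have an OAuth access token from Reddit's auth flow, and the user-agent string and helper names are placeholders of my own.

```python
import json
import urllib.request


def fetch_new_posts(subreddit: str, token: str, limit: int = 100) -> list[dict]:
    """Pull the newest posts from a subreddit via Reddit's OAuth API."""
    url = f"https://oauth.reddit.com/r/{subreddit}/new?limit={limit}"
    req = urllib.request.Request(url, headers={
        "Authorization": f"bearer {token}",   # token obtained via Reddit's OAuth flow
        "User-Agent": "watch-alert-bot/0.1",  # Reddit asks for a descriptive user agent
    })
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    # In a listing response, each post sits under data -> children -> data
    return [child["data"] for child in payload["data"]["children"]]


def filter_posts(posts: list[dict], keyword: str) -> list[dict]:
    """Keep only posts whose title mentions the keyword (case-insensitive)."""
    return [p for p in posts if keyword.lower() in p.get("title", "").lower()]
```

Separating the network call from the filtering keeps the filter easy to test without hitting the API.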

Define Triggers

  • Set the rules that determine when you get an alert
  • Create a regular expression to search a field for a given string (psst: AI agents can help make regex writing less tedious)
  • Example: Email me when the “post” field in r/dataengineering mentions "Airflow."
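For my watch use case, the trigger can be a single compiled pattern. A sketch of what that might look like, with the `title` and `selftext` field names being assumptions about which keys of Reddit's post JSON you'd search:

```python
import re

# Escape the dot so "2531.80" doesn't also match strings like "2531X80"
MODEL_PATTERN = re.compile(r"2531\.80")


def matching_posts(posts: list[dict]) -> list[dict]:
    """Return posts whose title or body mentions the target model number."""
    hits = []
    for post in posts:
        text = f"{post.get('title', '')} {post.get('selftext', '')}"
        if MODEL_PATTERN.search(text):
            hits.append(post)
    return hits
```

Note the escaped dot: an unescaped `.` matches any character, which is exactly the kind of subtle regex bug worth catching early.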

Connect to Email

  • Automate the email send when your rules are met
  • Establish a custom subject and body containing the link or links to relevant posts
  • If you prefer Gmail, you can read the story (linked at the bottom) for a template to create and send a simple email
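For the Gmail route, the standard library's smtplib and email modules cover the send. This is a sketch under my own assumptions, not the story's exact template: the sender address and app password are placeholders you'd supply yourself (Gmail's SMTP login requires an app password, not your account password).

```python
import smtplib
from email.message import EmailMessage


def build_alert(links: list[str], recipient: str, sender: str) -> EmailMessage:
    """Compose the alert email with one seller link per line."""
    msg = EmailMessage()
    msg["Subject"] = "Reddit alert: Omega Seamaster 2531.80 spotted"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("New matching posts:\n\n" + "\n".join(links))
    return msg


def send_alert(msg: EmailMessage, app_password: str) -> None:
    """Send via Gmail's SMTP server over SSL."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(msg["From"], app_password)
        server.send_message(msg)
```

Keeping message construction separate from the send means you can unit test the body and subject without ever opening a network connection.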

Although this is a simple project, a custom alerting system has real-world value. Including this or a similar monitoring project in your portfolio shows you’re not only dedicated to creating data pipelines, but that you have the professional instinct to be accountable for their output.

By creating alerts for specific events (errors, delays, unusual data), you can:

  • Catch problems early.
  • Reduce downtime.
  • Keep your data reliable.

This helps ensure smooth and efficient data systems.

For a complete walkthrough, including code, read the story on Medium: https://medium.com/pipeline-a-data-engineering-resource/engineering-custom-email-alerts-for-reddit-in-3-steps-710e5191bd3f

Happy alerting and thanks for ingesting,

-Zach Quinn

Extract. Transform. Read.

Reaching 20k+ readers on Medium and nearly 3k learners by email, I draw on my 4 years of experience as a Senior Data Engineer to demystify data science, cloud and programming concepts while sharing job hunt strategies so you can land and excel in data-driven roles. Subscribe for 500 words of actionable advice every Thursday.
