Extract. Transform. Read. A newsletter from Pipeline

Hi past, present or future data professional!

For years, a start-up cliche was being the “Uber” of (product, service, etc.). Now, it seems like every content platform wants to be the “TikTok” of a given subject area. Case in point for the latter: a fun app I came across called, fittingly, “Gittok.”*

Like TikTok, Gittok feeds users an endless stream of distraction, but instead of dance challenges it serves up a random GitHub repository, like geo-data-viewer, an HTML-based VS Code plug-in for codeless analysis of geographic and spatial data. Linger too long and Gittok can be just as distracting as TikTok, which undoubtedly inspired the tagline “Get addicted to code.”

Which got me thinking: what actually makes a repo compelling enough to click on? In other words, what makes a GitHub repo, possibly your portfolio, interesting and discoverable at a glance, whether in a TikTok-esque stream or elsewhere?

Because GitHub profiles are often submitted explicitly as part of a review process, like an application or interview, they aren’t designed to “grab” a user’s attention. And this might be a mistake, or at least a missed opportunity.

Gittok reduces your repo to three elements; beside each I’ll provide a suggestion to make them more compelling in your work.
Optimizing the design elements assumes you’ve done the work to create well-conceived, well-executed, domain-specific projects that would impress even the most cynical engineering manager. Because the last thing you want is for them to keep scrolling.

Thanks for ingesting,

-Zach Quinn

*I am not affiliated with, nor am I being compensated by, Gittok or its featured contributors; I simply admire the platform.
Top data engineering writer on Medium & Senior Data Engineer in media; I use my skills as a former journalist to demystify data science/programming concepts so beginners to professionals can target, land and excel in data-driven roles.
For a STEM discipline, there is a lot of abstraction in data engineering, evident in everything from temporary SQL views to complex, multi-task Airflow DAGs. Perhaps most abstract of all is the concept of containerization: the process of running an application in a clean, standalone environment, which is the simplest definition I can provide. Since neither of us has all day, I won’t get too into the weeds on containerization,...
Hi past, present or future data professional! From 2014-2017 I lived in Phoenix, Arizona and enjoyed the state’s best resident privilege: No daylight saving time. If you’re unaware (and if you’re in the other 49 US states, you’re really unaware), March 9th marked the start of daylight saving time, when we spring forward an hour. If you think this messes up your microwave and oven clocks, just wait until you check on your data pipelines. Even though data teams...
Hi past, present or future data professional! As difficult as data engineering can be, 95% of the time there is a structure to data that originates from external streams, APIs and vendor file deliveries. Useful context is provided via documentation and stakeholder requirements. And specific libraries and SDKs exist to help speed up the pipeline build process. But what about the other 5% of the time when requirements might be structured, but...