The glamour of data science and analytics has been seductive, but now that you have seen behind the scenes, you know that there is a ton of janitorial work to be done for every pound of cool algorithms and tools. Most of your time is spent building and maintaining Rube Goldberg machines to manage big data processing tasks, probably in Python and various scripting tools. Where’s the fun in that?
Gabel presents his experiences in going another direction. First, he describes ActionHero (https://www.actionherojs.com/), a lesser-known but interesting framework for processing in Node.js. ActionHero is an open source API server that integrates Node’s clustering and background task execution. Second, he describes some approaches to managing complex workflows that layer new features on top of ActionHero. Last, he provides demonstrations and results showing the utility of these tricks and techniques applied to ETL workflows, geo-intelligence, and data sampling.