Articles by Dan Crosta

Dan joined Magnetic in August 2012, just as the company was beginning to build its Python-based real-time bidding system, and helped scale the platform from tens to hundreds of thousands of requests per second. Dan is currently the Director of Magnetic Labs, where he works on predictive modeling and automated optimization algorithms.

  1. Demystifying Logistic Regression

    For our hackathon this week, I, along with several co-workers, decided to re-implement Vowpal Wabbit (aka “VW”) in Go as a chance to learn more about how logistic regression, a common machine learning approach, works, and to gain some practical programming experience with Go.

    Though our hackathon project focused on learning Go, in this post I want to spotlight logistic regression, which is far simpler in practice than I had previously thought. I’ll use a very simple (perhaps simplistic?) implementation in pure Python to explain how to train and use a logistic regression model.

  2. Embarrassingly Serial

    The past decade has seen a surge in technologies around “big data,” claiming to make it easy to process large data sets quickly, or at least scalably, by distributing work across a cluster of machines. This is not a story of success with a big data framework. This is a story of a small data set suffering at the hands of big data assumptions, and a warning to developers to check what your big data tools are doing for you.

  3. Click Prediction with Vowpal Wabbit

    At the core of our automated campaign optimization algorithms lies a difficult problem: predicting the outcome of an event before it happens. With a good predictor, we can craft algorithms to maximize campaign performance, minimize campaign cost, or balance the two in some way. Without a good predictor, all we can do is hope for the best.

  4. Optimize Python with Closures

    Magnetic’s real-time bidding system, written in pure Python, needs to keep up with a tremendous volume of incoming requests. On an ordinary weekday, our application handles about 300,000 requests per second at peak volumes, and responds in under 10 milliseconds. It should be obvious that at this scale optimizing the performance of the hottest sections of our code is of utmost importance. This is the story of the evolution of one such hot section over several performance-improving revisions.

  5. Good Test, Bad Test

    A good test suite is a developer’s best friend — it tells you what your code does and what it’s supposed to do. It’s your second set of eyes as you’re working, and your safety net before you go to production.

    By contrast, a bad test suite stands in the way of progress — whenever you make a small change, suddenly fifty tests are failing, and it’s not clear how or why the cases are related to your change.

  6. Magnetic’s Inaugural Hackathon

    On Wednesday and Thursday, Magnetic hosted our first quarterly internal hackathon. We were lucky enough to be able to bring the whole team into the New York office, from London, Russia, and Slovakia.

    Engineers gather to kick off the hackathon in Magnetic’s kitchen