1. # Finding a Confidence Interval for Lift

The motivation for this blog post is simple: I was having trouble searching Google for a simple formula for the confidence interval of lift. Lift is a very important metric in our industry, and after all the work I put into researching it I want to make sure the next person to google ‘confidence interval of lift’ has an easier time.

2. # Distributed Metrics for Conversion Model Evaluation

At Magnetic, we use logistic regression and Vowpal Wabbit to estimate the probability that a given impression results in a click or a conversion. To decide which variables to include in our models, we need objective metrics that tell us whether we are doing a good job. Of these metrics, only the computation of lift quality (in its exact form) is not easily parallelizable. In this post, I will show how the computation of lift quality can be reordered to make it distributable.

3. # Computing Distributed Groupwise Cumulative Sums in PySpark

When we work on modeling projects, we often need to compute the cumulative sum of a given quantity. At Magnetic, we are especially interested in making sure that our advertising campaigns spend their daily budgets evenly throughout the day. To do this, we need to compute cumulative sums of dollars spent throughout the day in order to identify the moment at which a given campaign has delivered half of its daily budget. Another example where a cumulative sum comes in handy is transforming a probability density function into a cumulative distribution function.

Because we deal with large quantities of data, we need to be able to compute cumulative sums in a distributed fashion. Unfortunately, most of the algorithms described in online resources do not work well when groups are either large (in which case we can run out of memory) or unevenly distributed (in which case the largest group becomes the bottleneck).
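To make the budget example concrete, here is a minimal single-machine sketch of finding the moment at which a campaign has delivered half of its daily spend. It is not the distributed PySpark approach the post describes; the function name `half_budget_time` and the `(timestamp, dollars)` input format are assumptions made for illustration only.

```python
from itertools import accumulate


def half_budget_time(spend_events):
    """Return the timestamp at which cumulative spend first reaches
    half of the day's total, or None if there are no events.

    spend_events: list of (timestamp, dollars) tuples sorted by timestamp
    (a hypothetical input format, chosen for this sketch).
    """
    total = sum(dollars for _, dollars in spend_events)
    # Running total of dollars spent, in timestamp order.
    running = accumulate(dollars for _, dollars in spend_events)
    for (timestamp, _), cumulative in zip(spend_events, running):
        if cumulative >= total / 2:
            return timestamp
    return None


# Hour of day and dollars spent in that hour (made-up numbers).
events = [(9, 10.0), (12, 30.0), (15, 40.0), (20, 20.0)]
print(half_budget_time(events))  # prints 15
```

Everything here fits in memory on one machine, which is exactly the assumption that breaks down for large or unevenly sized groups.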

4. # Mag-a-thon

Our engineers from all offices participated in the latest hackathon, cleverly titled “Straight Outta Mag-a-Thon”.

5. # Demystifying Logistic Regression

For our hackathon this week, I, along with several co-workers, decided to re-implement Vowpal Wabbit (aka “VW”) in Go as a chance to learn more about how logistic regression, a common machine learning approach, works, and to gain some practical programming experience with Go.

Though our hackathon project focused on learning Go, in this post I want to spotlight logistic regression, which is far simpler in practice than I had previously thought. I’ll use a very simple (perhaps simplistic?) implementation in pure Python to explain how to train and use a logistic regression model.
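As a rough illustration of how small such an implementation can be, here is a sketch of logistic regression trained with stochastic gradient descent in pure Python. This is not the code from the post or from Vowpal Wabbit; the function names, the learning rate, the number of epochs, and the toy data are all assumptions chosen for this example.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, features):
    """Predicted probability that the label is 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def train(examples, n_features, learning_rate=0.1, epochs=200):
    """Fit weights with plain stochastic gradient descent on log loss.

    examples: list of (features, label) pairs, with label 0 or 1.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for features, label in examples:
            # Gradient of the log loss with respect to the score.
            error = predict(weights, features) - label
            for i, x in enumerate(features):
                weights[i] -= learning_rate * error * x
    return weights


# Toy data: the first feature is a constant bias term, the second is the input.
examples = [([1.0, 0.0], 0), ([1.0, 1.0], 0), ([1.0, 3.0], 1), ([1.0, 4.0], 1)]
weights = train(examples, n_features=2)
print(predict(weights, [1.0, 0.0]), predict(weights, [1.0, 4.0]))
```

After training, the model should assign a low probability to the first input and a high probability to the last one.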

6. # VIRBs and Sampling Events from Streams

VIRB (Variable Incoming Rate Biased) reservoir sampling is a streaming sampling algorithm that stores a representative fixed-sized sample of events from the recent past (the user specifies the desired mean age of samples), even when the incoming rate varies. It is heavily inspired by reservoir sampling.
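For background, here is a minimal sketch of classic reservoir sampling (often called Algorithm R), the technique that inspired VIRB. It is not VIRB itself: it keeps a uniform sample over the whole stream, with no bias toward recent events and no handling of a varying incoming rate. The function name and the optional `rng` parameter are assumptions made for this example.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable
    of unknown length, using O(k) memory."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1),
            # evicting a uniformly chosen existing entry.
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(range(10_000), k=5, rng=random.Random(42))
print(sample)
```

Passing a seeded `random.Random` makes the sample reproducible, which is handy for testing.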

7. # PySpark Carpentry: How to Launch a PySpark Job with Yarn-cluster

Using PySpark to process large amounts of data in a distributed fashion is a great way to gain business insights. However, the machine from which tasks are launched can quickly become overwhelmed. This article will show you how to run PySpark jobs so that the Spark driver runs on the cluster, rather than on the submission node.

8. # Detecting Brands in User Search Queries

Capturing user intent with brands can be valuable, especially in online advertising. In the online advertising domain, brand detection can help capture user interests and improve user modeling, which, in turn, can lead to an increase in precision of user targeting with ads relevant to their interests and needs.

9. # Bloom Filter-Assisted Joins with PySpark

One of the most attractive features of Spark is the fine-grained control it gives you over what to broadcast to every executor, with very simple code. When I first studied broadcast variables, my thought process centered around map-side joins and other obvious candidates. I’ve since expanded my understanding of just how much flexibility broadcast variables can offer.

10. # The Magnetic Engineering Manifesto

Creating a sustainable and consistent engineering culture means answering some fundamental questions:

• What do we believe in?
• How do we align all our ideas into a vision that is easy to understand?
• How do we turn that vision into something long-lived and actionable that can be used to drive our cultural growth?

To address these questions, we recently released a Magnetic Engineering Manifesto that we believe will help us along this path. We’re sharing it here in the hope that it will inspire other companies to take the time to write down their own thoughts on culture.

11. # Real Time Facial Recognition in Python

Last month we had another instance of our quarterly hackathon. I had an urge to experiment a bit with computer vision, despite not having done anything related before.

Our hackathons are around 48 hours long, which I hoped would be long enough to do some simple facial recognition. My goal ...

12. # Embarrassingly Serial

The past decade has seen a surge in technologies around “big data,” claiming to make it easy to process large data sets quickly, or at least scalably, by distributing work across a cluster of machines. This is not a story of success with a big data framework. This is a story of a small data set suffering at the hands of big data assumptions, and a warning to developers to check what your big data tools are doing for you.

13. # Installing Spark 1.5 on CDH 5.4

If you have not tried processing data with Spark yet, you should. It’s the next happening framework, centered around processing data up to 100x more efficiently than Hadoop, while leveraging existing Hadoop components (HDFS and YARN). Since Spark is evolving rapidly, in most cases you will want to run the latest version released by the Spark community, rather than the version packaged with your Hadoop distribution. This guide will walk you through what it takes to get the latest version of Spark running on your cluster.

14. # Measuring Statistical Lift on Search Categories

One of the most popular features of the Magnetic Insight platform is our category rankings for an advertiser’s audience of page visitors. The rankings give a completely unbiased look into which search categories are the most popular amongst the users that visit a customer’s different web pages.

15. # Information Theoretic Metrics for Multi-Class Predictor Evaluation

How do you decide if a predictive model you have built is any good? How do you compare the performance of two models? As time goes on, data changes and you have to rebuild your models — how do you compare the new model’s behavior on the new data with the old model’s behavior on the old data?

16. # One-Pass Distributed Random Sampling

One of the important factors that affects the efficiency of our predictive models is the recency of the model. The earlier our bidders get a new version of the prediction model, the better decisions they can make. Delays in producing the model result in lost money due to incorrect predictions.

The slowest steps in our modeling pipeline are those that require manipulating the full data set, which spans multiple weeks’ worth of data. Our sampling process has historically required two full passes over the data set, and so was an obvious target for optimization.

17. # Click Prediction with Vowpal Wabbit

At the core of our automated campaign optimization algorithms lies a difficult problem: predicting the outcome of an event before it happens. With a good predictor, we can craft algorithms to maximize campaign performance, minimize campaign cost, or balance the two in some way. Without a good predictor, all we can do is hope for the best.

18. # Real-time Ad Targeting with Apache Kafka

Here at Magnetic, as a search-retargeting company, our core business model is to provide relevant ads to viewers. Our platform performs this task well, matching viewers up with related ads through various methods including page visits, search queries, and data analytics of each. It currently takes about 15 minutes on average for us to be able to react to new events in our core targeting infrastructure. If we could reduce this time, we could make our engineers, product management, ad operations, and our CEO really happy.

19. # Optimize Python with Closures

Magnetic’s real-time bidding system, written in pure Python, needs to keep up with a tremendous volume of incoming requests. On an ordinary weekday, our application handles about 300,000 requests per second at peak volumes, and responds in under 10 milliseconds. It should be obvious that at this scale optimizing the performance of the hottest sections of our code is of utmost importance. This is the story of the evolution of one such hot section over several performance-improving revisions.

20. # SKIP, The Search Keyword Intent Predictor

Magnetic specializes in search retargeting, thus we really need to understand our users’ searches — it is our bread and butter. We need to recognize what a user’s search means in an understandable way for both humans and computers. This is why we map each search to a category (e.g. “Automotive”), brand (e.g. “BMW”), or other intent data. Our keyword categorization service and Search Keyword Intent Predictor (SKIP) is our core technology which addresses this need.

21. # Segment Size Forecasting with “Will it Work?”

The idea for the Hackathon was simple. We all got together on a Wednesday morning and the bravest among us pitched their ideas for great new products. The rest of us jumped on board with those projects that seemed most worthy or fun and we were off.

For our project, we decided to predict the future, or more specifically, a specific aspect of the future — the expected number of users an advertising campaign will target.

22. # Good Test, Bad Test

A good test suite is a developer’s best friend — it tells you what your code does and what it’s supposed to do. It’s your second set of eyes as you’re working, and your safety net before you go to production.

By contrast, a bad test suite stands in the way of progress — whenever you make a small change, suddenly fifty tests are failing, and it’s not clear how or why the cases are related to your change.

23. # Magnetic’s Inaugural Hackathon

On Wednesday and Thursday, Magnetic hosted our first quarterly internal hackathon. We were lucky enough to be able to bring the whole team into the New York office, from London, Russia, and Slovakia.

Engineers gather to kick off the hackathon in Magnetic’s kitchen

24. # Search Query Categorization at Scale

Classification of short text into a predefined hierarchy of categories is a challenge. The need to categorize short texts arises in multiple domains: page keywords and search queries in online advertising, improvement of search engine results, analysis of tweets or messages in social networks, etc.

The meetup garnered a large audience.