The Ensemble

The Ensemble team was created by merging two large groups of Netflix Prize competitors: Grand Prize Team and Vandelay Industries!. Below we share the history of the two teams, based on the Ensemble home page.



Grand Prize Team

Grand Prize Team, captained by Gábor Takács of Gravity R&D, has been one of the leading teams in the Netflix Prize competition throughout 2009. At one point this spring, the slimmest of possible margins (0.01%) separated Grand Prize Team in third place from the two teams tied for the lead. Leading shareholders in Grand Prize Team (a.k.a. GPT) include Joe Sill, Ces Bertino and the members of the two teams who founded GPT, Gravity and Dinosaur Planet.

GPT was founded on the notion that collaboration was the key to victory in the Netflix Prize competition. Gravity and Dinosaur Planet had already shown what collaboration could accomplish. They had previously joined forces to form the team When Gravity and Dinosaurs Unite, rising to first place on the leaderboard on the day before the deadline for submissions for the first Progress Prize of the competition in October 2007. They were edged out by the AT&T team KorBell in the final hours, but the power of partnerships had been demonstrated in dramatic fashion.

Gravity is a team of four researchers from Hungary: Gábor Takács (Szechenyi Istvan University, Gyor) and István Pilászy, Bottyán Németh, and Domonkos Tikk (Budapest University of Technology and Economics). They form not only a Netflix Prize team but also the core of a company, Gravity R&D. Gravity held the top spot on the leaderboard for several months during the first year of the competition. Dinosaur Planet, another leading Netflix Prize team during the first year, was formed by three students from the Princeton class of 2007: Lester Mackey, David Weiss, and David Lin. Mackey and Weiss have since moved on to computer science PhD programs at UC Berkeley and the University of Pennsylvania, respectively, while Lin works in finance in New York.

Gravity and Dinosaur Planet decided in January 2009 to take the concept of collaboration to a new level by creating a team which issued a standing invitation to any other Netflix Prize competitor to submit techniques and results to be assessed for the potential to boost the score of the combined team. The deal offered to the rest of the competitors seemed eminently fair. At the time of formation, the union of the two founding teams had already achieved 9% of the 10% improvement required to win the Netflix Prize, but that final 1% was such a daunting challenge that the founders were willing to offer a two-thirds share of the prize ($666,666 USD) to additional collaborators. Shares would be granted in proportion to the size of the contribution each collaborator made towards that elusive 1%. Thus, an improvement in the leaderboard score of 0.0001, or just one basis point (0.01%), was likely to be worth nearly seven thousand dollars. Gábor Takács was chosen to captain the new team, which was named Grand Prize Team out of a spirit of optimism.
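The per-basis-point figure quoted above follows from simple division. The sketch below takes the text's framing at face value: a $666,666 pool spread evenly over the 100 basis points that make up the final 1%. The function name share_for and the way it is called are hypothetical, used only to illustrate the proportional-share rule.

```python
# Illustrative arithmetic only. POOL_USD and the 100-basis-point gap come from
# the text above; share_for() is a hypothetical helper, not the team's code.
POOL_USD = 666_666          # two-thirds of the $1M prize offered to collaborators
GAP_BASIS_POINTS = 100      # the final 1%, expressed in basis points of 0.01% each


def share_for(basis_points: float) -> float:
    """Dollar value of a contribution, in proportion to its size."""
    return POOL_USD * basis_points / GAP_BASIS_POINTS


per_point = share_for(1)
print(f"One basis point is worth about ${per_point:,.2f}")
```

Under these assumptions a single basis point comes to a little under $6,700, which matches the "nearly seven thousand dollars" quoted in the text.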

Bottyán Németh of Gravity designed a server which could quickly yet rigorously analyze files produced by models from GPT applicants, searching for indications that the applicant's submission could improve the team's score by providing something complementary to the models the team had already developed. Anyone could upload submissions at any time of day or night and get a quick response from the server indicating whether the submission was promising. Applicants were also invited to send modeling software to GPT for evaluation for possible synergies. Submitting a result which could help GPT was a difficult task, given the high position on the leaderboard which the team already held. Many applicants were unable to demonstrate any ability to improve upon the set of models GPT had already developed.

Nonetheless, the "open invitation" strategy quickly paid off when Ces Bertino, a software engineer working in San Diego, submitted results to GPT. Bertino already had one of the best single-person teams in the entire competition and had held a top 10 position on the leaderboard for many months. Bertino's submissions provided an improvement of 21 basis points (0.21%) - an enormous jump in a competition where leading teams would rejoice when making improvements of just a few basis points.

Grand Prize Team then collected additional important contributions from Netflix Prize competitors from around the world, such as Wojtek Kulik, Bill Roberts and Willem Mestrom. Kulik is a predictive modeling researcher and entrepreneur in Warsaw, Poland. Roberts is a researcher in statistical signal processing and a part-time faculty member at George Washington University in Washington DC. Mestrom is a computational scientist and software engineer in the Netherlands.

Joe Sill, a machine learning PhD from Caltech with dotcom and finance experience, submitted technology in February which proved highly promising when evaluated by Takács. After refining the software Sill submitted, Takács found that it could boost GPT's score by 14 basis points, another major jump. Then Dan Nabutovsky, an algorithm designer from Israel, contributed a substantial improvement of 6 basis points, and the possibility that GPT might challenge for the top spot on the leaderboard began to look more likely. Sill continued to submit enhancements and accrued 8 more basis points of improvement, eked out a few basis points at a time. As a result, GPT rose to within just one basis point of the top position on the leaderboard in May, achieving a score of 0.8597 while Pragmatic Theory and BellKor in Big Chaos (two teams who would later merge to form BellKor's Pragmatic Chaos) were tied in first, each with a score of 0.8596. When BellKor's Pragmatic Chaos broke the million dollar barrier in late June, Grand Prize Team had a score of 0.8594, the best score of any team not participating in the BellKor's Pragmatic Chaos coalition.

After the breaking of the million dollar barrier, only 30 days remained for other teams to attempt to catch BellKor's Pragmatic Chaos. For the final push, GPT brought on board David Purdy, who is finishing a statistics PhD at UC Berkeley. Purdy contributed a number of analyses which had not yet been pursued by other GPT members and a wide-ranging perspective on the statistics literature.

In the closing days of the competition, GPT decided that it wasn't finished pursuing a cooperative approach to the Netflix Prize. Talks with another leading team, Opera Solutions and Vandelay United, led to the formation of The Ensemble.



Vandelay Industries!

On January 1st, 2009, dreamhost.com had a 95% off sale. You could purchase a two-year web hosting plan that included shell access to a Linux server, unlimited users, and unlimited storage space for $20.00. Greg McAlpin (OfADifferentKind), a software developer in the Houston, TX area, bought a two-year subscription with vague notions of setting up a website some day.

On February 25, 2009, Greg's Probe File Exchange website went online. It was an invitation-only website where members could upload their probe prediction files and see how their score in the Netflix Prize contest might improve if they were to combine their results. Probe prediction files are files that competitors could use to measure the effectiveness of their algorithms. Netflix supplied a suggested set of probe data. Since we were all using the data Netflix suggested, it was easy to compare our results.

Greg invited six people to join the Probe File Exchange. They were not chosen because they had the lowest scores. They were chosen because they were all active on the Netflix forum and their posts were consistently helpful, friendly, and funny. They were chosen because they are the sort of people that you want to work with. The Netflix forum (http://www.netflixprize.com//community/) is the place where competitors could ask questions and help each other. There has been amazing openness in the forum. People have shared everything from ideas to source code. Five of the six people who were invited on February 25th are now members of The Ensemble.

On the first day that the Probe File Exchange was online, Bo Yang (Newman !) proposed to Greg that they create a new joint team. Bo and Greg went on to form the team "Newman and George !". They hoped that a submission of their combined files would have an RMSE lower than 0.8712 (the 2007 progress prize RMSE). RMSE, or Root Mean Squared Error, is a way of measuring the average error for a set of predictions. On February 27th, Newman and George ! made their first submission with an RMSE of 0.8689.
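As a concrete illustration of the RMSE measure described above, here is a minimal sketch in Python. The function name and the sample ratings are made up for the example; only the formula itself (the square root of the mean of the squared errors) is standard.

```python
import math


def rmse(predicted, actual):
    """Root Mean Squared Error between two equal-length sequences of ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up ratings on a 1-5 star scale, purely for illustration.
example_predictions = [3.5, 4.0, 2.0, 5.0]
example_actual = [4, 4, 1, 5]
example_score = rmse(example_predictions, example_actual)
print(f"RMSE = {example_score:.4f}")
```

Because the errors are squared before averaging, one prediction that is off by a full star hurts the score more than two predictions that are each off by half a star.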

In order to share files, members created directories on the same Linux server that was hosting the website (on dreamhost.com). That original setup grew into the infrastructure that allowed Vandelay Industries ! to easily support many members.

Bill Bame (clueless) began uploading files the first day that the Probe File Exchange was online. He has always had extremely creative ideas and unique approaches. The files that he uploaded to the Probe File Exchange combined extremely well with those of Newman and George !. On February 26th, Bill was invited to a new team named "Newman, George, and Peterman !".

The Probe File Exchange had its own private forum where members could share ideas. Chris Hefele (chef-ele) posted some information about the non-linear ways that he used to combine files. The most common way for competitors to combine files is linear regression. That's a mathematical way of taking many points and finding the line that passes nearest to all of the points. Nonlinear regression is much more complex. It attempts to find a curve that passes closest to all of the points. The results that Chris achieved were extremely impressive. On March 12, 2009, Chris was invited to join the team. He was going to be "Bania", but the team name was growing too long and the "Newman and ... !" teams were all on the front page of the leaderboard.
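To make the idea of combining files by linear regression concrete, here is a minimal sketch that finds least-squares weights for two prediction files against known probe ratings. It is not the team's actual blending code: the function names, the two-file restriction, the omission of an intercept term, and the sample numbers are all assumptions made to keep the example short.

```python
def blend_weights(pred_a, pred_b, actual):
    """Least-squares weights (w_a, w_b) minimising sum((w_a*a + w_b*b - y)**2)."""
    # Entries of the 2x2 normal equations (X^T X) w = X^T y.
    s_aa = sum(a * a for a in pred_a)
    s_ab = sum(a * b for a, b in zip(pred_a, pred_b))
    s_bb = sum(b * b for b in pred_b)
    s_ay = sum(a * y for a, y in zip(pred_a, actual))
    s_by = sum(b * y for b, y in zip(pred_b, actual))
    det = s_aa * s_bb - s_ab * s_ab
    if det == 0:
        raise ValueError("predictions are collinear; weights are not unique")
    # Solve the 2x2 system with Cramer's rule.
    w_a = (s_ay * s_bb - s_by * s_ab) / det
    w_b = (s_by * s_aa - s_ay * s_ab) / det
    return w_a, w_b


def sse(pred, actual):
    """Sum of squared errors; lower is better, like RMSE."""
    return sum((p - y) ** 2 for p, y in zip(pred, actual))


# Made-up probe ratings and two made-up prediction files.
actual = [4, 3, 5, 1, 2]
file_a = [3.6, 3.4, 4.1, 1.9, 2.5]
file_b = [4.3, 2.5, 4.6, 1.2, 2.9]

w_a, w_b = blend_weights(file_a, file_b, actual)
blended = [w_a * a + w_b * b for a, b in zip(file_a, file_b)]
print(f"weights: {w_a:.3f}, {w_b:.3f}")
print(f"SSE a={sse(file_a, actual):.3f} b={sse(file_b, actual):.3f} "
      f"blend={sse(blended, actual):.3f}")
```

On the data used to fit the weights, the blend can never score worse than either input file, because using one file alone is just the special case of weights (1, 0) or (0, 1). Whether the gain carries over to unseen ratings is a separate question.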

So a new team "Vandelay Industries !" was formed. The name "Vandelay Industries !" is of course a whimsical reference to "Seinfeld", as is our goal to become a coalition "for the rest of us" who are not at the top of the leaderboard. Chris continued to develop his blending techniques and he continued to produce amazing results. He is one of the main blenders on The Ensemble.

In March, George Tsagas of Feeds2 was invited to join Vandelay Industries !. He answered "not yet". He was already part of one of the leading teams and he said that there would be time to make collaborations when the leaders' improvement neared the 10% mark. Feeds2 is now a member of The Ensemble.

During March and the beginning of April, Vandelay Industries ! continued to make almost daily progress. In May, Bo made huge improvements in his personal score. With his improvements Vandelay Industries !, made up of four people working in their spare time, reached 15th place on the leaderboard among 5000+ teams.

Vandelay Industries ! was started by sending out emails to strangers asking if they wanted to work together. The team made contacts with other top teams and started dialogs with them. The person who gave the most help and encouragement was Larry Ya Luo (Dace). Larry/Dace is also the highest ranked single-member team on the Netflix Prize leaderboard. There was some disagreement about how Vandelay Industries ! should recruit new members. Some thought that we should contact teams lower than us on the leaderboard. They would be more likely to work with us. And we had already seen that a few people with no previous experience could achieve quite a bit by working together. There was hesitation about contacting the top teams on the leaderboard because Vandelay Industries ! really had nothing to offer them.

But the possibility that someone might turn us away has never deterred the team. We asked Larry if he would mind downloading our probe files and seeing how they mixed with his. He accepted, downloaded our files, and did significant analysis of them. Even though our files could barely improve his own, he offered suggestions for how we could make improvements and what he thought we needed to do to reach the top 10 on the leaderboard. Each time that Vandelay Industries ! made a significant improvement, Larry would look at our files and try to help us.

In June, Jeff Howbert (team Howbert) contacted Bo about combining efforts, and Jeff joined Vandelay Industries !. The team was preparing a new submission and was quietly confident that Vandelay Industries ! would get into the top 10 on the leaderboard for the first time. Then BellKor's Pragmatic Chaos made a submission that achieved a 10.05% improvement.

Immediately Vandelay Industries ! began sending emails to all of the top teams, inviting them to join or cooperate with Vandelay Industries !. Larry was one of the first to agree to join our team. Others followed. The infrastructure that we had in place made it simple for us to add more teams. People were able to quickly integrate into the team and become productive.

As the final moments of the competition approached on July 24th 2009, Greg McAlpin (OfADifferentKind) and Christopher Hefele (chef-ele) reflected on what the contest meant to them, the unique qualities Vandelay Industries ! offered The Ensemble, and what's next for the group.

"Joining with Grand Prize Team to create The Ensemble put us in the incredible position of making a 10% improvement over the Cinematch program that Netflix uses," said Greg McAlpin. Larry has said it well: our goal was to make a 10% improvement. When we do that, we'll have finished successfully with a job well done. "A million dollars isn't why we've worked so hard," Greg says. "At the beginning of the contest, a lot of people said that it would be impossible for anyone to reach the 10% improvement. From February until now, in six months, this group has done the impossible." Greg continued: "If we come in second place or last place, it has been fun and it's been an awesome experience working with the great and brilliant people on this team."

"The merged team's name 'The Ensemble' not only refers to the large group of team members that's been merged together, but it's also a reference to 'ensemble methods,'" says Chris/chef-ele, "which is the term researchers use for the techniques we're using to combine our individual predictions into a group prediction that is better than any of the individuals."

"Next, although some of our teammates have formal backgrounds in machine learning, they're working side-by-side with many others who were drawn to this problem as an interesting hobby or puzzle," Chris/chef-ele says. "It's like a data-miner's Rubik's Cube... very addictive."

"It's my opinion that the success of this team is not only being driven by the technologies we're using to combine our data, but also by our ability to combine many people together and create a cohesive, functioning team in less than 30 days," Chris continued. "So our successes will be not only technological, but also organizational. It will be interesting to see if a large group of underdogs can defeat a small group of the leaders."



Updates on Netflix Prize: the strategy of collaborative teamwork

Gravity founded the Grand Prize Team (GPT) with our former collaboration partner, Dinosaur Planet, in January 2009 in order to gather top-ranked teams into a collaborative effort to win the Grand Prize. The team started with a 9.04% improvement over Cinematch, Netflix's own recommender solution. Soon after, thanks to the very useful contributions of other teams, GPT became a major player in the final period of the Netflix Prize contest. The GPT team leader, Gábor Takács, founder of Gravity, was interviewed by a Netflix competitor when GPT reached second place in the contest with a 9.64% improvement, just 0.01% behind the leading team.

On June 26, BellKor's Pragmatic Chaos (BPC), a joint effort of three teams outside GPT (Pragmatic Theory, BigChaos, and BellKor), passed the magic 10% limit by reaching 10.05%. This started the last 30 days of the competition, and we began to collaborate even more closely within GPT. We created an internal forum to direct the work and discussion of the many team members. As a result, we reached 9.91% within a few days. At the same time, other top contenders in the contest also began to collaborate more, creating joint teams based on the GPT pattern. With only about two weeks of the contest remaining, we started to negotiate with the biggest of these conglomerate teams, Vandelay Industries!, and we created a merged team named The Ensemble, also led by Gábor Takács from Gravity. With the help of some noisified submissions, it turned out that The Ensemble could get really close to BPC or even overtake it. The Ensemble made its first submission with only one day of the competition to go, which put the team at the top of the leaderboard with a 10.09% improvement. This overtook BPC by a bare 0.01%, since BPC had improved to 10.08% during those 29 days. The final day was truly memorable. Both teams worked extremely hard to squeeze some more improvement out of their algorithms, and both succeeded. With 25 minutes to go, a BPC submission (10.09%) tied The Ensemble, but with only 4 minutes to go we submitted an even better predictor, which remained at the top of the leaderboard: 10.10%.

However, the contest did not end with the final countdown. Netflix will announce the winner after validating the teams’ submissions, poring over the submitted code, design documents and other materials. This process can run for months. Whatever the final result of the contest, Gravity, as a founder of GPT and a member of The Ensemble, is proud to have made a valuable contribution to the team that ended up at the top of the leaderboard of the most exciting machine learning contest ever.

 


Gravity @ Netflix Prize

Our team started to work on the Netflix Prize problem in late October 2006. Our first top-40 position dates back to 9th December 2006 (RMSE 0.9116, improvement 4.18%). We were among the top 10 competitors on 28th December 2006 (RMSE 0.9017, improvement 5.22%), and reached the top on 18th January 2007 (RMSE 0.8887, improvement 6.59%). With a few short breaks and one longer one (29th March, 6th April, 26th April-12th May, 15-16th May, 22nd May) we stayed at the top until 8th June 2007. Later we returned to the lead with our joint team "When Gravity and Dinosaurs Unite" on 30th September 2007 (RMSE 0.8717, improvement 8.38%), unfortunately only for one day. Then we returned there on 7th February 2008 (RMSE 0.8691, improvement 8.65%) and stayed there until 1st March 2008.
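The improvement percentages quoted next to each RMSE above are measured relative to Netflix's own Cinematch system. The sketch below shows that conversion. It assumes a baseline RMSE of 0.9514 for Cinematch on the quiz set; that figure is not stated in the text, but it reproduces the percentages listed above. The function name is hypothetical.

```python
# Assumption: 0.9514 is Cinematch's RMSE on the quiz set, the baseline that
# leaderboard improvement percentages are measured against. It does not
# appear in the text above.
CINEMATCH_QUIZ_RMSE = 0.9514


def improvement_percent(rmse: float, baseline: float = CINEMATCH_QUIZ_RMSE) -> float:
    """Relative reduction in RMSE versus the baseline, as a percentage."""
    return 100.0 * (baseline - rmse) / baseline


# The RMSE values quoted in the paragraph above.
for score in (0.9116, 0.9017, 0.8887, 0.8717, 0.8691):
    print(f"RMSE {score:.4f} -> {improvement_percent(score):.2f}% improvement")
```

Under the same assumption, the 10% target for the Grand Prize corresponds to an RMSE of about 0.8563.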

Since early January 2007, our team has always been among the top 5 individual teams. As detailed above, Gravity participated in a few collaborative teams. We finished in first place with The Ensemble, in 3rd place with GPT, in 14th place as Gravity (5th individual team), and in 27th place with When Gravity and Dinosaurs Unite. The graph below summarizes our performance at the Netflix Prize contest. Here you can check out the leaderboard.

  

We have published a few papers on our algorithms:

  1. G. Takács, I. Pilászy, B. Németh, and D. Tikk. On the Gravity Recommendation System.
    In Proc. of KDD Cup Workshop at SIGKDD'07, 13th ACM Int. Conf. on Knowledge Discovery and Data Mining,
    pp. 22-30, San Jose, CA, USA, August 12-15, 2007. [Article, Bibtex]
  2. G. Takács, I. Pilászy, B. Németh, and D. Tikk. Major components of the Gravity Recommendation System.
    ACM SIGKDD Explorations Newsletter, 9(2), pp. 80-83, 2007. [Article, Bibtex]
  3. G. Takács, I. Pilászy, B. Németh, and D. Tikk. A Unified Approach of Factor Models and Neighbor Based Methods for Large Recommender Systems.
    In 1st IEEE Workshop on Recommender Systems and Personalized Retrieval, Ostrava, Czech Republic, August 4, 2008. [Article, Bibtex]
  4. G. Takács, I. Pilászy, B. Németh, and D. Tikk. Investigation of Various Matrix Factorization Methods for Large Recommender Systems.
    In 2nd Netflix-KDD Workshop, Las Vegas, NV, USA, August 24, 2008. [Article, Bibtex]
  5. G. Takács, I. Pilászy, B. Németh, and D. Tikk. Matrix Factorization and Neighbor Based Algorithms for the Netflix Prize Problem.
    In 2nd ACM International Conference on Recommender Systems, Lausanne, Switzerland, October 25, 2008. [Article, Bibtex]
  6. G. Takács, I. Pilászy, B. Németh, and D. Tikk. Scalable collaborative filtering approaches for large recommender systems, Journal of Machine Learning Research, 10 (2009), 623-656. March 31, 2009. [Article, Bibtex]
  7. I. Pilászy and D. Tikk. Computational Complexity Reduction for Factorization-Based Collaborative Filtering Algorithms, EC-Web 2009, Accepted. [Article, Bibtex]
  8. I. Pilászy and D. Tikk. Recommending New Movies: Even a Few Ratings Are More Valuable Than Metadata, In 3rd ACM International Conference on Recommender Systems, New York, NY, October 22-25, 2009, Accepted. [Article, Bibtex]

Other resources:


Check out the source code and binaries of our program that calculates linear regression.

We use a particular 1/10 of the Probe set to evaluate our methods, which we call Probe10. An interesting property of Probe10 is that methods trained on all data excluding Probe10 get almost the same RMSE on Probe10 and Quiz. Here is a Perl script that creates Probe10 from probe.txt.