3 Kaggle alternatives for collaborative data science
What’s the best way to get a good answer to a tough question? Ask a bunch of people, and make a competition out of it. That’s long been Kaggle’s approach to data science: Turn tough missions, like making lung cancer detection more accurate, into bounty-paying competitions, where the best teams and the best algorithms win.
Now Kaggle is rolling into Google, and while all signs point to it being kept as-is for now, there will be jitters about the long-term prospects for a site with such a devoted community and an idiosyncratic approach.
Here are three other sites that share a similar mission, even if they don’t explicitly follow in Kaggle’s footsteps. (Note that some sites, like CrowdAnalytix, may consider accepted solutions in contests as works for hire and thus their property.)
CrowdAI
A product of the École Polytechnique Fédérale de Lausanne in Switzerland, CrowdAI is an open source platform for hosting open data challenges and gaining insight into how the problems in question were solved. The platform is quite new, with only six challenges offered so far, but the tutorials derived from those challenges are detailed and valuable, providing step-by-step methodologies to reproduce that work or create something similar. The existing exercises cover common frameworks like Torch or TensorFlow, so it’s a good place to get hands-on experience with them.
DrivenData
DrivenData, created by a consultancy that deals in professional data problems, hosts online challenges lasting a few months. Each is focused specifically on pressing problems facing the world at large, like predicting the spread of diseases or mining Yelp data to improve restaurant inspection processes. Like Kaggle, DrivenData also has a data science jobs listing board — a feature people are worried might go missing from Kaggle post-acquisition.
CrowdAnalytix
Backed by investors from Accel Partners and SAIF Partners, CrowdAnalytix focuses on hosting data-driven problem-solving competitions, rather than sharing the information that results from them. Contests are offered for finding solutions to problems in categories like modeling, visualization, and research, and each has bounties in the thousands of dollars. Some previous challenges include predicting the real costs of workers’ compensation claims or airline delays. Other contests, though, aren’t hosted for money, but offer a competitive way to learn a related discipline, such as the R language.
Source: InfoWorld Big Data