Business Understanding

Challenge description

In this competition, you’ll develop AI systems to efficiently learn new skills and solve open-ended problems, rather than depend exclusively on systems trained with extensive datasets. The top submissions will show improvement toward human-level reasoning.

Create a system that is able to efficiently learn to do new tasks. It's all about efficiency because we don't have as much compute as OpenAI had when they announced o3.

Evaluation

This competition evaluates submissions on the percentage of correct predictions. For each task, you should make 2 attempts to predict the exact outputs for every test input grid contained in the task. (Tasks can have more than one test input that needs a predicted output.) Each task test output has one ground truth. For a given task test output, if any of the 2 predicted outputs matches the ground truth exactly, you score 1 for that task test output, otherwise 0. The final score is the sum of the highest score per task test output divided by the total number of task test outputs.
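The scoring rule above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `score_submission` and the data layout (a dict from task id to a list with one entry per test input, each entry holding up to 2 attempted grids) are hypothetical and are not the official submission format.

```python
from typing import Dict, List

Grid = List[List[int]]


def score_submission(
    predictions: Dict[str, List[List[Grid]]],
    ground_truth: Dict[str, List[Grid]],
) -> float:
    """Fraction of task test outputs matched exactly by at least one attempt.

    predictions:  task id -> one list of attempts (at most 2 grids) per test input.
    ground_truth: task id -> one grid per test input.
    """
    correct = 0
    total = 0
    for task_id, truths in ground_truth.items():
        task_predictions = predictions.get(task_id, [])
        for index, truth in enumerate(truths):
            # Only the first 2 attempts count; a missing prediction scores 0.
            attempts = task_predictions[index][:2] if index < len(task_predictions) else []
            correct += int(any(attempt == truth for attempt in attempts))
            total += 1
    return correct / total if total else 0.0


# Toy example: task "a" has 1 test output, task "b" has 2.
example_truth = {
    "a": [[[1, 2], [3, 4]]],
    "b": [[[0]], [[5, 5]]],
}
example_predictions = {
    "a": [[[[0, 0], [0, 0]], [[1, 2], [3, 4]]]],  # second attempt is correct
    "b": [
        [[[1]], [[2]]],        # both attempts wrong
        [[[5, 5]], [[5]]],     # first attempt is correct
    ],
}
example_score = score_submission(example_predictions, example_truth)
```

In the toy example 2 of the 3 task test outputs are matched, so the score is 2/3. Note that the score is averaged over test outputs, not over tasks, so a task with two test inputs weighs twice as much as a task with one.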

Assess situation

The challenge ends on 4 November 2025, so we have 7 months to try new ideas. The grand prize is 700k$, whereas the progress prizes would add up to 75k$ in the best-case scenario (25k$ first prize and 50k$ best paper). The prizes are therefore designed to encourage the development of new and bold approaches, so I should mainly work on novel and promising ideas.

At Veridas I can devote half of my time to the ARC challenge and use the compute resources of the cluster. I could also use up to 15k$ in cloud computing if needed.

We can make one submission a day, and each submission is worth about 50$ in compute, which adds up to around 10k$ over the whole competition. I believe I should use it as much as possible to gain information. Maybe I could use the development set as my test set and the public test set as my validation set. However, be aware that making a lot of submissions could prevent merging teams with other people later in the competition.
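The 10k$ figure can be checked with some quick arithmetic. This is a rough sketch that assumes 30-day months and a submission on every single day; the constant names are only illustrative.

```python
# Rough estimate of the compute value of submitting every day.
MONTHS_LEFT = 7
DAYS_PER_MONTH = 30           # assumption: 30-day months
SUBMISSIONS_PER_DAY = 1
COST_PER_SUBMISSION_USD = 50

max_submissions = MONTHS_LEFT * DAYS_PER_MONTH * SUBMISSIONS_PER_DAY
total_compute_usd = max_submissions * COST_PER_SUBMISSION_USD

print(f"{max_submissions} submissions, about {total_compute_usd}$ in compute")
```

Under these assumptions that is 210 submissions and 10500$ of compute, in line with the estimate of around 10k$.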