Motivation for contest
This math puzzle is related to an interesting cryptographic primitive, for a "proof of time", which Chia Network is planning to use for a cryptocurrency they are launching. Although honestly, I just find the math and history behind quadratic forms fascinating.
Chia Network previously held two programming contest related to this same topic. So a natural question may be: Why another contest?
The short answer is that this contest has a different focus. The previous contest incentives largely drove improvements for large integer operations on specific x86 processors. This contest aims to avoid the hardware specific details and dig into the core math puzzle and explore for some higher level algorithmic improvements.
"High level overview of hope for algorithm improvments" would be another way of framing this section.
All current best methods of calculating a form composition require calculating one extended gcd. The result however has large numbers, and so to prevent the size of the numbers exploding with repeated application, a form "reduction" is performed (finding in a sense the smallest form equivalent to the form just calculated). This reduction is in many ways similar to the euclidean algorithm. This is where most of the computation time goes, in these two "gcd" like calculations: one for getting the composition, the other for reducing.
Naively, composing two cubes solves three form compositions. However, a cube with holds the result of two of the compositions, necessarily then already holds a solution to the third. So for composing two arbitrary cubes, the expected cost should be at least two xgcd, and hopefully only one reduction like operation on the cube.
However, we are interested in a very special case:

we are trying to compose a cube with itself

that cube has
form1 = form2
, and soform3 ~ form1^2 = form2^2
Therefore in the resulting cube, we already know the answer to two of the form compositions. If we can fit that into the cube, we get the third for free. In this ultra idealistic case, repeated squaring may not even require an xgcd.
On the other end of the spectrum from idealistic hopes, we know, because the algorithms exists (and are included in the contest library), that we can construct a cube from scratch with form1 = form2 = anything with a single xgcd.
So inbetween, in the conservative but hopeful case, is that there is some algebraic solution to the cube composition, such that our special conditions help, and we do not need to completely toss the cube and start from scratch each time. Therefore, still requiring an xgcd, but in values roughly squareroot the typical values of A1,B1,C1. And then a reduction operation on the cube, again operating on mostly already reduced size values.
That hope feels plausible to me. And this feels especially plausible when you realize the current best algorithm for squaring in form composition, called NUDUPL, can be rephrased as constructing a cube, and the main savings is from doing most of the reduction operation in cube form, before calculating the resulting forms (which then use some other algorithm to reduce the rest of the way).
In short, the bet is there are further improvements to be made if we just "stay in the cube representation". It feels there should be a better way than each time constructing the cube from scratch, using it for some speed up, then tossing the cube away, only to require constructing another one from scratch in the next step and so on.
THIS is the kind of general algorithmic improvement that would lead to speedups regardless of the hardware architecture, even in design of ASIC devices.