Our May AIM workshop on Algebraic Vision was an absolutely fascinating experience. I will eventually write more about things that I learned there, but what I will write about first is the part that is most interesting from a sociological perspective. It sheds some light on the culture gap between computer vision and pure mathematics.

Forsyth

In 1993, David Forsyth published a paper at the Fourth International Conference on Computer Vision entitled “Recognizing algebraic surfaces from their outlines”. The key result is the following. (I have copied the statement of the theorem verbatim.)

Theorem. The equation of its outline in a perspective image completely determines the projective geometry of an algebraic surface of degree 2 or greater, for a generic view of a generic algebraic surface.

What does the statement mean? Let me define various terms that come up.

  • Algebraic surface is easy: this means a projective algebraic surface $X \subset \mathbb{P}^3$.
  • View means: linear projection $\pi : \mathbb{P}^3 \dashrightarrow \mathbb{P}^2$.
  • Perspective image means: image in a view (as defined above).
  • Generic is undefined in the paper, but we know what it means: true on a Zariski-open locus.
  • Outline is the most interesting one. Here is how to define it: we assume our linear projection $\pi$ has base locus at a point not on $X$. We thus get an induced finite morphism $\pi|_X : X \to \mathbb{P}^2$. This morphism ramifies along a curve $R \subset X$, and the outline is the image of $R$ in $\mathbb{P}^2$.
  • Projective geometry of $X$ means “$X$ up to automorphism of $\mathbb{P}^3$”.
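To make the definition of the outline concrete, here is a small worked example. It is not taken from Forsyth's paper: the coordinates $[x:y:z:w]$, the center of projection $p$, and the names $F$, $R$, $B$ are all choices made here for illustration.

```latex
% Surface: the smooth quadric X = V(F) in P^3, coordinates [x:y:z:w] (illustrative choice).
\[ F = x^2 + y^2 + z^2 - w^2 \]
% View: projection from p = [0:0:1:0]. Since F(p) = 1 \neq 0, p does not lie on X.
\[ \pi : [x:y:z:w] \longmapsto [x:y:w] \]
% Ramification curve: points of X where the line through p is tangent to X,
% i.e. where the derivative of F in the z-direction vanishes.
\[ R = X \cap \{\partial F / \partial z = 0\} = X \cap \{ z = 0 \} \]
% Outline: the image of R in the image plane.
\[ B = \pi(R) = \{ x^2 + y^2 - w^2 = 0 \} \subset \mathbb{P}^2 \]
```

Equivalently, $B$ is cut out by the discriminant of $F$ viewed as a quadratic in $z$, which here is $-4(x^2 + y^2 - w^2)$. For a surface of degree $d$ the same recipe intersects $X$ with a polar surface of degree $d-1$, so the outline has degree $d(d-1)$.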

We can restate the theorem like this.

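Theorem (restated). Suppose $X \subset \mathbb{P}^3$ is a generic smooth projective surface of degree at least $2$ and $\pi : \mathbb{P}^3 \dashrightarrow \mathbb{P}^2$ is a generic linear projection that is regular along $X$. Let $R \subset X$ be the ramification curve of the restriction $\pi|_X$. Then $\pi(R)$ uniquely determines $X$, up to automorphism of $\mathbb{P}^3$.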

(This is close enough to a rigorous statement that it would pass muster with almost any algebraic geometer, even though some number theorists might not like the lax use of genericity. In fact, I don’t like it this way either. We should be careful and say that we are looking at a generic pair $(X, \pi)$.)

Chisini

When stated this way, it looks awfully similar to the following, which I will state as a conjecture.

Conjecture. A generic finite morphism $f : X \to \mathbb{P}^2$ is uniquely determined by its branch divisor.

This conjecture is simpler (it removes linear projections from the picture) and harder (it removes linear projections from the picture!). But now it has transformed into a standard problem in algebraic geometry, called Chisini’s conjecture, first stated in 1944. In fact, Chisini also assumed for his conjecture that the degree of $f$ is at least $5$. And the word “generic” allows us to assume that all ramification of $f$ is simple (i.e., $f$ is locally around source and target given analytically as $(x, y) \mapsto (x^2, y)$), that the ramification curve $R$ in $X$ is smooth, that its image $B$ in $\mathbb{P}^2$ has only nodes and cusps, and that the induced map $R \to B$ is birational.

Chisini himself stated a solution to some cases of this conjecture in [1], but his proof was incorrect. (This is briefly explained by Antonio Lanteri in his MathSciNet review of Catanese’s paper [2]. The error seems to be related to the degeneration types of the curves that appear.)

For an algebraic geometer, the immediately intuitive way to approach a problem like this is to study the fundamental group of the complement of the branch divisor in $\mathbb{P}^2$. It looks like one is trying to prove that if the monodromy actions of loops around the divisor are a given set of transpositions, then the monodromy representation is uniquely determined. This is pretty close to wanting to know a presentation of the fundamental group. For example, is it abelian with a specific set of generators?
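In symbols, the monodromy picture looks roughly as follows. This is the standard dictionary for a degree-$d$ cover with simple ramification, written in notation chosen here ($B$ for the branch curve, $b$ for a base point, $\gamma_i$ for small loops around $B$). It is a sketch, not a statement quoted from any of the cited papers.

```latex
% A degree-d cover branched along B gives a permutation representation of the
% fundamental group of the complement; simple ramification means that each
% small loop \gamma_i around B swaps exactly two sheets.
\[ \rho : \pi_1(\mathbb{P}^2 \setminus B,\, b) \longrightarrow S_d,
   \qquad \rho(\gamma_i) \text{ a transposition} \]
% Near a node of B, the two local loops map to disjoint transpositions, so they commute.
\[ \rho(\gamma_1)\,\rho(\gamma_2) = \rho(\gamma_2)\,\rho(\gamma_1) \]
% Near a cusp of B, the two transpositions share one sheet, giving a braid relation.
\[ \rho(\gamma_1)\,\rho(\gamma_2)\,\rho(\gamma_1) = \rho(\gamma_2)\,\rho(\gamma_1)\,\rho(\gamma_2) \]
```

In this language, the question is whether $B$ determines $\rho$ up to conjugation in $S_d$, which is why a presentation of the fundamental group would be so useful.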

Indeed, Zariski conjectured that if a plane curve has only nodes, then its complement has abelian fundamental group. This was proven by Fulton (in the sense of profinite fundamental groups, although this was improved by Deligne in the complex case) [3] in 1980. (Note: knowing that such curves generally degenerate to unions of lines plays an important role, echoing the issues with Chisini’s original incorrect proof.)

So we’re done, right? Unfortunately, no: the branch curve of a projection need not be nodal. In fact, a simple dimension count argument shows that it can’t be, in general.
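One classical way to see that cusps are unavoidable is a genus check, sketched below. This may not be the dimension count alluded to above. The formulas are the classical ones for a generic projection of a smooth surface of degree $d$ in $\mathbb{P}^3$, and $n$, $\delta$, $\kappa$ are notation introduced here for the degree, the number of nodes, and the number of cusps of the branch curve $B$.

```latex
% Classical invariants of the branch curve B of a generic projection:
\[ n = d(d-1), \qquad \kappa = d(d-1)(d-2), \qquad \delta = \tfrac{1}{2}\, d(d-1)(d-2)(d-3) \]
% The ramification curve R is a smooth complete intersection of type (d, d-1) in P^3, so
\[ g(R) = 1 + \tfrac{1}{2}\, d(d-1)(2d-5) \]
% Because R -> B is birational, this must agree with the genus formula for a
% plane curve of degree n with \delta nodes and \kappa cusps:
\[ g(R) = \tfrac{1}{2}(n-1)(n-2) - \delta - \kappa \]
% Check for d = 3: n = 6, \kappa = 6, \delta = 0, and 4 = 10 - 0 - 6.
```

For $d = 3$ this gives a sextic with six cusps and no nodes, so Fulton's theorem does not apply. Indeed, Zariski showed that the complement of such a sextic has a non-abelian fundamental group. More generally, $\kappa > 0$ as soon as $d \geq 3$.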

Moishezon

As it turns out, a number of authors have studied Chisini’s conjecture over the years, including Catanese, Kulikov, Nemirovskii, and Moishezon. Moreover, in 1981 Moishezon solved the precise case encountered by Forsyth. The hard part of Chisini’s problem is that it is stated very generally, with no assumptions on the structure of the morphism. Things are much easier for linear projections, although Kulikov still has to work hard for projections from higher-dimensional spaces, as opposed to projections from 3-space.

Moishezon’s proof is embedded in a larger work that studies problems related to fundamental groups of curve complements [4]. So my mathy gut instinct was “right” in a certain vague sense, but it is hard to implement.

What did Forsyth do?

Forsyth’s argument is much easier. It relies heavily on input from Debarre, who was at Iowa at the same time as Forsyth (the early 1990s), consisting of a few simple cohomology calculations and some clever observations about generators of ideals in various degrees (related to the degree of the surface). At some point, I intended to sketch it here, but that is stopping me from actually finishing this. Instead, I encourage you to read the paper, not only to see how the argument works, but also to get a taste of the comparison between the cultures of algebraic geometry and computer vision.

What happened?

The main point I want to make here is that Forsyth, a computer vision researcher, seems to have rediscovered the results of Chisini, etc., without being aware of the algebro-geometric literature. There’s one small wrinkle: Forsyth got help from Debarre, a well-known algebraic geometer. Chisini’s conjecture and the work of Catanese and Moishezon seem to have been obscure enough within the subject that Debarre didn’t immediately point Forsyth toward those results. Or perhaps Debarre realized that to answer Forsyth’s immediate question, much simpler techniques would suffice. Or maybe Forsyth went to Debarre with a specific bit of the question, and Debarre didn’t know about the bigger context. Enough speculating! Regardless of how this came to be, Forsyth and Debarre cooked up a very nice and simple proof in the case of linear projections.

There is a lot that computer vision researchers and algebraic geometers have to learn from one another. But that is the subject of another post.

References

[1] Chisini, “Sulla identità birazionale delle funzioni algebriche di due variabili dotate di una medesima curva di diramazione.” Ist. Lombardo Sci. Lett. Cl. Sci. Mat. Nat. Rend. (3) 8(77) (1944), 339–356.

[2] Catanese, “On a problem of Chisini.” Duke Math. J. 53 (1986), no. 1, 33–42.

[3] Fulton, “On the fundamental group of the complement of a node curve.” Ann. of Math. (2) 111 (1980), no. 2, 407–409.

[4] Moishezon, “Stable branch curves and braid monodromies.” Algebraic geometry (Chicago, Ill., 1980), pp. 107–192, Lecture Notes in Math., 862, Springer, Berlin-New York, 1981.