Identifying art through machine learning

Given years of experience and some diligent research, identifying each work of art in an old exhibition photo doesn’t sound so hard, does it? Now imagine you have tens of thousands of photos, dating back to 1929. MoMA’s Digital Media team and Google Arts & Culture Lab set out to face this daunting challenge—or at least get a head start—using machine learning and computer vision technology.

Our collaborators at Google Arts & Culture Lab used an algorithm to comb through over 30,000 exhibition photos, looking for matches with the more than 65,000 works in our online collection. In total, it recognized over 20,000 artworks in these images, and we used those results to create a vast network of new links between our exhibition history and online collection. These connections bolster an already unparalleled resource for the millions of visitors to moma.org.

Now a photo from a 1929 painting exhibition opens a window into an iconic work by Paul Cézanne; a 1965 shot of Robert Rauschenberg prints connects you to those same works in MoMA’s 2017 Rauschenberg retrospective; and one corner of a 2013 design exhibition becomes a portal into poster art across two centuries. While hardly comprehensive, it’s a great start—and a remarkable feat given the sheer volume of information involved.

Why weren’t there more matches?

While many works were identified, many others currently have nothing in our online collection to be matched against. Exhibitions often contain both works from our collection and items loaned from elsewhere, so the installation images likewise contain many works that aren’t part of MoMA’s collection. In addition, some works that are in the Museum’s collection have not yet been added to our constantly evolving online collection, or have been added but don’t yet include an image.

From the outset, our main concern was keeping incorrect matches to a minimum. So we sacrificed quantity for quality: Google Arts & Culture Lab designed the algorithm to declare something a match only when it was very “confident.” We learned that, like anyone, an algorithm has strengths and weaknesses. At present, the algorithm is very good at identifying static, two-dimensional images. Sculptures; moving image, installation, and sound works; and text-based artworks proved far more challenging. Unsurprisingly, multiples and editions of the same or very similar images also led to some “false positives.” Simply put, a machine can’t always tell one soup can from another. The algorithm may also incorrectly match photographs when the work on view was actually a different print of the same (or a very similar) image.
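
For readers curious about what "only when it was very confident" can look like in practice, here is a minimal sketch of the general idea. It is not the code Google Arts & Culture Lab wrote. The `Candidate` record, the `confident_matches` function, the identifiers, the scores, and the 0.9 cutoff are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical proposed link between an exhibition photo and an artwork."""
    photo_id: str
    artwork_id: str
    confidence: float  # assumed score between 0 and 1; higher means more certain


def confident_matches(candidates, threshold=0.9):
    """Keep only candidates whose confidence meets the threshold.

    A higher threshold yields fewer matches but also fewer wrong ones.
    """
    return [c for c in candidates if c.confidence >= threshold]


# Invented example data; the IDs and scores are placeholders, not real results.
candidates = [
    Candidate("photo-1929-001", "artwork-A", 0.97),
    Candidate("photo-1929-001", "artwork-B", 0.55),
    Candidate("photo-1965-014", "artwork-C", 0.91),
    Candidate("photo-2013-203", "artwork-D", 0.42),
]

accepted = confident_matches(candidates, threshold=0.9)
for match in accepted:
    print(match.photo_id, "->", match.artwork_id)
```

With these made-up scores, a cutoff of 0.9 keeps two of the four proposed links, while lowering it to 0.5 would keep three. That is the trade of quantity for quality described above: a stricter cutoff discards some correct matches in order to avoid more incorrect ones.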

Today, the work begun by this machine learning project has been taken up by MoMA staff. Thus far, what we gave up in processing speed has been made up for by the flexibility of human brains and the thoroughness of our curators’ research.

What if I notice an incorrect match?

If you notice something amiss, please let us know by emailing [email protected]. Your input can help us offer an even more accurate resource in the future.