The pie is ready. You guys like swarms of things, right? — Bender, Robot Chef, Futurama

A running gag in the television show Futurama is that the robot character, Bender, loves to cook but is terrible at it because he doesn’t have a sense of taste or smell. He also doesn’t seem to have much understanding of what those things are. He often prepares meals for other non-robot characters, and everybody suffers, including Bender, when he realizes what a bad cook he is.

Although Bender’s inabilities seem both reasonable and hilarious, recent research suggests that they might go down in history as one of our culture’s most bigoted misconceptions of artificial intelligence. Machine learning programs have recently authored some unusual, but surprisingly tasty recipes.

Salad: the Low-Hanging Fruit

A salad, credit: Jessica Spengler

The study — which is a collaboration between researchers at Davidson College and the Museum of Mathematics — introduces a program that can “autonomously generate” good recipes for salad. “We limit our attention to salads,” the paper reads, “to avoid having to model the complex chemical transformations that the cooking process can introduce.” For the purposes of this study a “recipe” is just a list of ingredients, without their proportions or method of preparation, although the authors highlight these spaces as avenues for future research.

First, the researchers took all the salad recipes on www.allrecipes.com with five or more reviews, and extracted the ingredients. They removed ingredients that were misspelled, branded, inedible, or did not appear in 6 or more unique recipes. Each recipe came with a score from 1 to 5, based on how much people who reviewed the recipe liked it. Because most of the recipes on the site had a rating of 4 or 5, the researchers also randomly generated a bunch of combinations of 6 to 12 ingredients. They identified the ones they were sure would receive the lowest reviews — eliminating those that seemed “even remotely palatable”:


Very obviously bad recipes; Cromwell et al.
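The filtering and random-generation steps described above can be sketched roughly as follows. This is a minimal illustration, not the researchers' code: the function names, the dictionary layout, and the toy corpus are all assumptions made up for the example. Only the thresholds (five or more reviews, six or more unique recipes, 6 to 12 ingredients) come from the description above.

```python
import random
from collections import Counter


def filter_corpus(recipes, min_reviews=5, min_recipe_count=6):
    """Keep well-reviewed recipes, then drop rarely used ingredients."""
    reviewed = [r for r in recipes if r["reviews"] >= min_reviews]
    # Count how many distinct recipes each ingredient appears in.
    counts = Counter(ing for r in reviewed for ing in set(r["ingredients"]))
    vocab = {ing for ing, n in counts.items() if n >= min_recipe_count}
    cleaned = [
        {**r, "ingredients": sorted(set(r["ingredients"]) & vocab)}
        for r in reviewed
    ]
    return vocab, cleaned


def random_combinations(vocab, how_many, rng, min_size=6, max_size=12):
    """Draw random lists of 6 to 12 distinct ingredients from the vocabulary."""
    pool = sorted(vocab)  # sorted so that a seeded run is reproducible
    return [
        rng.sample(pool, rng.randint(min_size, max_size))
        for _ in range(how_many)
    ]


# Made-up toy corpus (hypothetical data, not from allrecipes.com).
names = [f"ingredient_{i}" for i in range(14)]
toy = [{"ingredients": names, "reviews": 10, "rating": 4.5} for _ in range(6)]
toy.append({"ingredients": names[:3] + ["rare_item"], "reviews": 8, "rating": 4.0})
toy.append({"ingredients": names, "reviews": 2, "rating": 5.0})  # too few reviews

vocab, cleaned = filter_corpus(toy)
candidates = random_combinations(vocab, 5, random.Random(0))
print(len(vocab), len(cleaned), [len(c) for c in candidates])
```

The manual steps (removing misspelled, branded, or inedible ingredients, and hand-picking the combinations that were clearly unpalatable) are not modeled here, since they depended on human judgment.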

The researchers assigned these obviously bad recipes a score of 1 and added them to the corpus. The researchers used this data set to build and train a classifier that could take the list of ingredients in two salads and correctly predict which had a higher score 82% of the time.
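To make the 82% figure concrete, here is one way a pairwise accuracy like that could be measured. It is a sketch under assumptions: the post does not say how the classifier works internally, so `toy_scorer` is a hypothetical stand-in that simply counts familiar salad ingredients, and the recipes and ratings are invented.

```python
from itertools import combinations


def pairwise_accuracy(recipes, true_scores, predict_score):
    """Fraction of recipe pairs whose relative order is predicted correctly."""
    correct = total = 0
    for i, j in combinations(range(len(recipes)), 2):
        if true_scores[i] == true_scores[j]:
            continue  # a tie in the true scores carries no ordering information
        total += 1
        # A predicted tie is treated as "first recipe is not better".
        predicted_first = predict_score(recipes[i]) > predict_score(recipes[j])
        actual_first = true_scores[i] > true_scores[j]
        correct += predicted_first == actual_first
    return correct / total if total else float("nan")


# Hypothetical scorer: counts ingredients from a small hand-picked set.
FAMILIAR = {"lettuce", "tomato", "feta"}


def toy_scorer(recipe):
    return len(set(recipe) & FAMILIAR)


# Invented recipes and ratings, for illustration only.
recipes = [
    ["lettuce", "tomato"],
    ["lettuce", "marshmallow"],
    ["tomato", "feta", "lettuce"],
    ["marshmallow", "anchovy"],
]
true_scores = [5, 2, 4, 1]

accuracy = pairwise_accuracy(recipes, true_scores, toy_scorer)
print(f"pairwise accuracy: {accuracy:.2f}")
```

Measuring accuracy on pairs rather than on absolute scores matches the task as described: the classifier only has to say which of two salads people would like more.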

The classifier relied on assigning ingredients features in a “flavor network”, based on the one built by Lada Adamic in 2012. The basic principle behind a flavor network is that tastier ingredient combinations tend to appear together more frequently, in higher-scoring recipes. As NPR summarized:

“Those [ingredients] that frequently show up together [in unique recipes] — milk and butter, nutmeg and cinnamon, basil and rosemary — sit close to each other in the network, but those that rarely appear in the same dish, such as coconut and parsley, are far from each other.”


Ingredient complement network; Adamic et al.
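The core idea of a flavor network, that ingredients are linked by how often they share a recipe, can be sketched with plain co-occurrence counts. This is a simplification for illustration: the published network may weight its edges differently, and the recipes below are invented.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(recipes):
    """Count how many recipes each unordered ingredient pair appears in."""
    pair_counts = Counter()
    for ingredients in recipes:
        # Sort so each pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(ingredients)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def neighbors(pair_counts, ingredient):
    """Ingredients ranked by how often they share a recipe with `ingredient`."""
    linked = Counter()
    for (a, b), n in pair_counts.items():
        if a == ingredient:
            linked[b] += n
        elif b == ingredient:
            linked[a] += n
    return linked.most_common()


# Invented toy recipes, for illustration only.
toy_recipes = [
    ["milk", "butter", "nutmeg"],
    ["milk", "butter"],
    ["nutmeg", "cinnamon"],
    ["coconut", "milk"],
]

counts = cooccurrence_counts(toy_recipes)
print(neighbors(counts, "milk"))
print(counts[("coconut", "parsley")])  # pairs never seen together count as 0
```

In this toy data, milk and butter end up strongly linked, while coconut and parsley have no link at all, which mirrors the "close" and "far" pairs in the quote above.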

A Good Salad is Not Hard to Find

The researchers were satisfied that they had built a program that could discriminate between human recipes, with a good degree of success. If it generated its own, it would hypothetically be able to judge how tasty the recipe would be to humans.

Again, they randomly generated new ingredient combinations — 2,400 of them. This time, instead of manually labeling the bad ones, they had the program assign the recipes scores based on how they compared to the good and bad recipes in the training set.

They expected good recipes to be sparse, and expected to have to use their classifier to iteratively improve the computer-generated recipes. But it turned out that the top 20 new recipes, as ranked by the computer, had a better average score, and a better maximum score, than the top 20 recipes in the training set. From this they concluded that palatable salads are relatively easy to find.

“This suggests that ingredient combinations that make for good salads are distributed rather densely throughout the ingredient space,” the authors write, “[and] finding good salad recipes is not a problem of looking for a needle in a haystack. Phrased differently: creating novel salads does not seem to require particularly great creative leaps.”
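The generate-then-rank procedure described above can be sketched as follows. Everything specific here is an assumption for illustration: `APPEAL` is a made-up table of per-ingredient weights and `toy_score` is a hypothetical stand-in for the trained classifier. Only the 2,400 candidates, the 6 to 12 ingredient range, and the top-20 comparison come from the post.

```python
import random
from statistics import mean

# Hypothetical per-ingredient weights; the real system used a trained classifier.
APPEAL = {
    "lettuce": 0.9, "tomato": 0.8, "feta": 0.7, "olive": 0.6,
    "cucumber": 0.8, "walnut": 0.5, "apple": 0.6, "onion": 0.4,
    "radish": 0.3, "raisin": 0.2, "anchovy": 0.1, "mint": 0.5,
}


def toy_score(recipe):
    """Stand-in scorer: average appeal of the ingredients in the recipe."""
    return mean(APPEAL[ingredient] for ingredient in recipe)


def generate(rng, how_many, min_size=6, max_size=12):
    """Draw random ingredient lists, with no attempt to make them sensible."""
    pool = sorted(APPEAL)
    return [
        rng.sample(pool, rng.randint(min_size, max_size))
        for _ in range(how_many)
    ]


def top_k_summary(recipes, score, k=20):
    """Rank recipes by score and summarize the k best."""
    ranked = sorted(recipes, key=score, reverse=True)[:k]
    scores = [score(r) for r in ranked]
    return ranked, {"mean": mean(scores), "max": max(scores)}


rng = random.Random(42)
candidates = generate(rng, 2400)
best, stats = top_k_summary(candidates, toy_score)
print(stats)
```

Running the same summary over the top 20 recipes in a training set would give the two pairs of numbers (average and maximum) that the researchers compared.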

The Human Element

Human recipes outperformed computer recipes in taste, but not in novelty; Cromwell et al.

In the last part of the study, the researchers randomly selected three of the top 20 human recipes, and three of the best computer-generated recipes, and submitted them to a blind taste test. Chefs were instructed to prepare the computer-generated salads according to three rules:

“Every ingredient listed was to be used.

No extra ingredients were to be added.

Every ingredient was to make “its presence felt”, i.e., its flavor was to be clearly discernible”

Thus, although the proportions of the ingredients and the method of preparation were all up to the human, no ‘bad’ ingredients in the computer recipes could be ‘hidden’.

Although the computer-generated salads fared worse than the human ones, all the recipes scored relatively well, and the top-scoring computer-generated salad was competitive with the human ones. It was mistaken for a human salad 56% of the time. Across the six salads, only 9 out of 62 tasters were able to correctly distinguish all of the computer-generated salads from the human ones. In short, this first attempt at the Salad Test of computational creativity (something Alan Turing probably never dreamed up) went shockingly well.

Salad #3 was the best-liked computer-generated salad; Cromwell et al.

The tasters also rated the computer-generated salads as more novel than the human-generated ones.

As the researchers note, there’s still a long way to go before the first robot chef earns a Michelin star. The program does not prescribe relative quantities of ingredients, nor methods of preparation. But there’s also a lot of room to improve the algorithm, possibly into something robust enough to tackle these other recipe features, and other kinds of recipes.

“In the future, we would like to expand the scope of the project to a wider class of recipes,” they conclude in the paper, “soups, drinks, desserts, etc.”

This post was written by Rosie Cima.