How Recipeas chooses recipes

Recipeas crawls recipes from chefs and food bloggers around the web — the live catalog now serves over 4.3 million accepted recipes. Not all of them are equally good. This page explains, in plain language, how we decide which to surface and which to keep behind search. The detailed numbers below are a snapshot from June 4, 2026, when the catalog was smaller; they refresh whenever the page is rebuilt.

1,880,582

Recipes accepted into the catalog

35%

Accept rate (3,526,233 discarded at ingest)

762,271

Recipes scored so far (rolling pass)

Two questions every recipe has to answer

Before a recipe even enters the catalog, our crawler checks three things: Does it have a real photo? Does it have at least 3 ingredients and 2 instructions? Does the title look like a recipe and not a blog header? About 35% of what we scrape clears that bar; the rest is discarded immediately.
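As a rough illustration of the ingest checks described above, here is a minimal Python sketch. The `ScrapedRecipe` shape, the `passes_ingest` name, and the `BLOG_HEADER_TITLES` set are all hypothetical, and the photo and title checks are simple stand-ins, since this page does not spell out how the crawler decides that a photo is "real" or that a title looks like a recipe.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ScrapedRecipe:
    """Hypothetical record shape; field names are assumptions for illustration."""
    title: str
    photo_url: Optional[str] = None
    ingredients: List[str] = field(default_factory=list)
    instructions: List[str] = field(default_factory=list)


# Assumed stand-in for "looks like a blog header"; the real heuristic is not documented here.
BLOG_HEADER_TITLES = {"home", "about", "blog", "recipes", "archive"}


def passes_ingest(recipe: ScrapedRecipe) -> bool:
    """Return True only if the recipe clears every ingest check."""
    title = recipe.title.strip()
    has_photo = bool(recipe.photo_url)                    # stand-in for "a real photo"
    enough_ingredients = len(recipe.ingredients) >= 3     # at least 3 ingredients
    enough_instructions = len(recipe.instructions) >= 2   # at least 2 instructions
    title_ok = bool(title) and title.lower() not in BLOG_HEADER_TITLES
    return has_photo and enough_ingredients and enough_instructions and title_ok
```

A recipe failing any single check is rejected, which matches the all-or-nothing "clears that bar" wording above.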

The accepted recipes then get a discovery score from 0 to 100, which decides whether they show up in the browse feed. Recipes with low scores stay searchable — you can always find them by name or ingredient — they just don't lead the browse feed.

How the discovery score works

It's a small, transparent formula. We're not trying to be clever; we're trying to be honest about what makes a recipe worth recommending.

discovery_score = 50 (baseline)
    + up to 25 if the host is a famous American food brand
    + up to 15 for how many ingredients we successfully canonicalized
    + up to 10 for a clean English title
    − up to 25 for photo problems (placeholder, logo, tiny thumb, broken)
    − up to 10 for ingredient lines that didn't parse cleanly
    − up to 10 for a corrupted (mojibake) title

show_in_feed = discovery_score ≥ 55
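The formula can be sketched in Python as follows. This is an illustration under assumptions, not the production scorer: the function and parameter names are hypothetical, and because this page does not say how each component is scaled within its cap, the sketch takes already-computed component values and only enforces the caps.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Keep value within [low, high]."""
    return max(low, min(high, value))


def discovery_score(
    brand_bonus: float = 0.0,       # up to 25: famous American food brand host
    canonical_bonus: float = 0.0,   # up to 15: canonicalized ingredients
    title_bonus: float = 0.0,       # up to 10: clean English title
    photo_penalty: float = 0.0,     # up to 25: placeholder, logo, tiny thumb, broken
    parse_penalty: float = 0.0,     # up to 10: ingredient lines that didn't parse
    mojibake_penalty: float = 0.0,  # up to 10: corrupted title
) -> float:
    """Baseline of 50 plus capped bonuses minus capped penalties."""
    score = 50.0
    score += clamp(brand_bonus, 0, 25)
    score += clamp(canonical_bonus, 0, 15)
    score += clamp(title_bonus, 0, 10)
    score -= clamp(photo_penalty, 0, 25)
    score -= clamp(parse_penalty, 0, 10)
    score -= clamp(mojibake_penalty, 0, 10)
    return score


def show_in_feed(score: float) -> bool:
    """A recipe leads the browse feed at a score of 55 or more."""
    return score >= 55
```

With these caps, a recipe with no bonuses or penalties sits at the baseline of 50 and stays out of the feed, while full credit for canonical ingredients alone (50 + 15 = 65) is enough to clear the threshold of 55.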

A few specific notes about that formula:

The American bonus is a list of household names — Allrecipes, Food Network, Serious Eats, Bon Appétit, America's Test Kitchen, food.com, Smitten Kitchen, etc. It's a curated lift, not a punishment for everyone else.
Foreign-language sites are not penalized for being foreign. A clean Russian recipe with a great photo and canonical ingredients scores the same as a clean American one minus the boost. What does penalize them is corrupted text (mojibake), missing photos, and ingredient names the parser couldn't recognize — all of which hurt recipes in any language.
Search ignores all of this. If you can name what you want, you can find it.

Is it actually working? The data.

The whole point of the score is to put the bad-looking recipes in the hidden bucket and the good-looking ones in the shown bucket. The fastest test is to look at known quality signals — corrupted titles, ALL-CAPS shouting, broken image URLs — and compare the rates.

Hidden from browse

616,647

recipes still searchable, hidden from feed

Corrupted (mojibake) titles	0.6%
ALL-CAPS titles	8.9%
Suspicious image URLs	0.7%

Shown in browse

145,624

recipes leading the discovery feed

Corrupted (mojibake) titles	0.0%
ALL-CAPS titles	0.5%
Suspicious image URLs	0.0%

If the score weren't doing anything useful, those percentages would be similar between the two columns. Today the gap is roughly 59× on mojibake (the shown-bucket rate is small enough to round to 0.0% in the table), so we're correctly funneling broken text away from the feed.
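The comparison above is simple arithmetic, sketched here in Python using only figures published on this page. The `rate_ratio` helper is a hypothetical name. The example uses the ALL-CAPS row because both of its rounded rates are nonzero; the underlying flagged counts are not published here.

```python
from typing import Optional

# Bucket sizes and rounded rates as published on this page.
HIDDEN_TOTAL = 616_647
SHOWN_TOTAL = 145_624
SCORED_TOTAL = 762_271

ALL_CAPS_HIDDEN_PCT = 8.9
ALL_CAPS_SHOWN_PCT = 0.5


def rate_ratio(hidden_pct: float, shown_pct: float) -> Optional[float]:
    """How many times more common a defect is in the hidden bucket.

    Returns None when the shown rate is zero, since the ratio is undefined.
    """
    if shown_pct == 0:
        return None
    return hidden_pct / shown_pct


# The two buckets together account for every scored recipe.
buckets_add_up = HIDDEN_TOTAL + SHOWN_TOTAL == SCORED_TOTAL

# ALL-CAPS titles are about 17.8 times more common in the hidden bucket.
all_caps_gap = rate_ratio(ALL_CAPS_HIDDEN_PCT, ALL_CAPS_SHOWN_PCT)
```

A ratio near 1 would mean the score is not separating the buckets on that signal; the larger the ratio, the more of that defect is being kept out of the feed.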

Which hosts lead each bucket

Top hosts in browse

www.allrecipes.com	8,913
www.food.com	8,016
www.americastestkitchen.com	2,628
www.justapinch.com	1,683
jamiegeller.com	1,669
sunset.com	1,668
www.greatbritishchefs.com	1,648
tasty.co	1,567
taste.co.za	1,544
www.bbcgoodfood.com	1,537

Top hosts in hidden

www.povarenok.ru	47,279
eatsmarter.de	4,178
varecha.pravda.sk	2,742
migusto.migros.ch	2,643
pt.petitchef.com	2,353
www.kochbar.de	2,147
dobruchut.aktuality.sk	2,092
www.cuisinelolo.fr	1,968
www.kotikokki.net	1,805
www.lecremedelacrumb.com	1,801

Languages in the catalog

We accept recipes from chefs and food bloggers around the world. The app shows recipes in your phone's language by default; a one-tap "Original Version" toggle in the recipe view flips back to the source language.

Language	Accepted recipes
en	1,342,358
(unknown)	418,692
ru	118,381
es	274
de	133
id	103
pt	73
zh-CN	72
it	54
hi-Latn	45

What we're still working on

About 14% of accepted titles still carry mojibake corruption (UTF-8 read as Latin-1). We have a verified one-pass repair (99% accuracy on a 5,000-row sample) and are rolling it out across titles, descriptions, ingredient lines, and instructions.
The photo-size signal in the score currently comes from a small HEAD-request pass; the full-corpus warm-up is in progress and will sharpen the photo penalty as it completes.
The canonical-ingredient catalog covers about 100 base ingredients; we keep adding the most-frequent gaps so more recipes parse cleanly on first ingest.
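For readers curious about the mojibake item above: text that was stored as UTF-8 but decoded as Latin-1 can often be recovered by reversing that step. The following Python sketch shows the general idea using only the standard library. It is not our production repair; `repair_mojibake` is a hypothetical name, and a real pass needs extra safeguards for text that was damaged in other ways.

```python
def repair_mojibake(text: str) -> str:
    """Undo a 'UTF-8 bytes decoded as Latin-1' mix-up when possible.

    Re-encoding as Latin-1 recovers the original bytes; decoding those
    bytes as UTF-8 recovers the intended text. If either step fails, the
    text was not this kind of mojibake, so it is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Simulate the corruption: correct text, encoded as UTF-8, misread as Latin-1.
original = "Crème brûlée"
garbled = original.encode("utf-8").decode("latin-1")
repaired = repair_mojibake(garbled)
```

Clean text passes through untouched: already-correct accented Latin text fails the UTF-8 decode step, and text in scripts outside Latin-1 (Cyrillic, for example) fails the Latin-1 encode step, so both are returned as they were.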