Nice writeup. The Exists subquery approach is definitely the cleanest.
One thing worth mentioning: if you're hitting this problem frequently, it might be worth reconsidering the query patterns themselves. We had a similar issue at work where we kept adding `.distinct()` everywhere, and eventually realized we were doing the filtering wrong upstream.
The PostgreSQL-specific `distinct(*fields)` with the ORDER BY restriction is one of those things that trips people up. The error message isn't great either. "SELECT DISTINCT ON expressions must match initial ORDER BY expressions" is technically correct but doesn't explain why or what to do about it.
Good call recommending Exists as the default approach. It's more explicit about intent too.
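For anyone skimming, here's roughly why the join approach produces duplicates and an EXISTS filter doesn't. This is only a sketch in plain SQL via Python's `sqlite3`, with a made-up author/book schema (the table and column names are my own assumption, not from the article). As far as I understand, a Django `Exists` subquery ends up as something shaped like the second query.

```python
import sqlite3

# Hypothetical schema: authors and their books (names are illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, year INTEGER);
    INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book VALUES (1, 1, 2020), (2, 1, 2021), (3, 2, 1999);
""")

# Filtering through a join yields one row per matching book,
# so an author with two matching books shows up twice.
joined = conn.execute("""
    SELECT author.name FROM author
    JOIN book ON book.author_id = author.id
    WHERE book.year >= 2020
    ORDER BY author.id
""").fetchall()

# EXISTS only asks "is there at least one matching book?",
# so each author is returned at most once and no DISTINCT is needed.
exists = conn.execute("""
    SELECT author.name FROM author
    WHERE EXISTS (
        SELECT 1 FROM book
        WHERE book.author_id = author.id AND book.year >= 2020
    )
    ORDER BY author.id
""").fetchall()

print(joined)  # [('Ann',), ('Ann',)]
print(exists)  # [('Ann',)]
```

The second query also says what you mean ("authors that have a recent book") instead of joining and then cleaning up the duplicates afterwards.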
Good read, TIL!
That being said, I've used Django daily for 10 years, but I don’t understand the ORM beyond basic CRUD. Even a simple GROUP BY looks weird.
Writing plain SQL feels easier and more maintainable in the long run.