this post was submitted on 17 Aug 2023
7 points (100.0% liked)
PostgreSQL
637 readers
3 users here now
The world's most advanced open source relational database
Project
- About (history)
- Docs
- Donate to PostgreSQL
- Wiki
- Planet PostgreSQL
- IRC
- Mailing lists:
- pgsql-announce
- pgsql-hackers (developers)
- pgsql-general
- pgsql-jobs
- User Groups
Events
- SEAPUG Summer BBQ, 6 July in Seattle
- SFBA PostgreSQL Meetup, 12 July
- Chicago PostgreSQL Meetup, 19 July
- PGDay UK 2023, 12 September in London
- PGConf 2023, 3-5 October in New York City
- PGDay Israel 2023, 19 October
- PGConf.EU 2023, 12-15 December in Prague
Podcasts
Related Fediverse communities
- c/SQL on programming.dev
- #sql on Mastodon
- #postgresql on Mastodon
founded 1 year ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
I think you can use RANK as a window function and specify TOP 1000
Ok, I'm doing some reading: https://medium.com/@amulya349/how-to-select-top-n-rows-from-each-category-in-postgresql-39e3cfebb020
Good results with this approach. I hadn't considered the RANK OVER PARTITION BY criteria_a values and it works like a champ. It moves the ORDER BY into the realm of focus (criteria_a) and performance seems decent enough... and it isn't difficult to read the short statement.
Gives me the expected results over the 5+ million test rows I ran it against.
If you could elaborate on your idea of TOP, please do. I'm hoping there might be a way to wall the LIMIT 1000 into the inner query and not have the outer query need to WHERE filter rank on so many results?
Glad this is working for you. Using TOP probably was a bad idea and I think the way you used RANK <=1000 is a better approach.
If there was a way to safely exclude any of the records - like if you knew that when published was older than X days/months/years it would never make it into the final results, you could filter them out before ranking them. That might squeeze a little more performance out of the query, but could be risky if the data isn't predictable enough.
Thank you