Find Gaps And Islands In Sequence Auto Increment PgSQL MySQL
The problem of finding gaps in sequential data, or auto-incremented values, or more commonly, Gaps an Islands problem can be resolved with some simple JOINs. There are solutions which use sub-queries, but these tend to be expensive.
To demonstrate simple database is required. In this example, two databases will be used, PgSQL and MySQL to show that the solution works well across both.
PgSQL
MySQL
With the table created in either PgSQL or MySQL, some data is needed. Lets use a list of Australian Native animals to get started. If you are wondering which ones are dangerous, use the simple rule of wildlife in Australia.
If it is not you, it can kill you.
A simple query to show they are all in place.
id | name ----+--------------------- 1 | dingo 2 | wombat 3 | platypus 4 | kangaroo 5 | kookaburra 6 | wallaby 7 | echidna 8 | koala 9 | cassowary 10 | kiwi 11 | Funnel Web 12 | emu 13 | bandicoot 14 | steve irwin 15 | bilby 16 | blue tongue lizard 17 | tasmanian devil 18 | cockatoo 19 | quokka 20 | frilled neck lizard 21 | numbat 22 | kowari 23 | brolga 24 | bogong moth 25 | galah 26 | yabby 27 | wallaroo 28 | goanna 29 | thylacine 30 | cuscus 31 | phascogale
Now lets create some gaps and islands
A quick query again to see what is left.
id | name ----+--------------------- 1 | dingo 2 | wombat 4 | kangaroo 6 | wallaby 8 | koala 9 | cassowary 10 | kiwi 11 | Funnel Web 12 | emu 13 | bandicoot 14 | steve irwin 18 | cockatoo 19 | quokka 20 | frilled neck lizard 21 | numbat 22 | kowari 23 | brolga 24 | bogong moth 25 | galah 30 | cuscus 31 | phascogale
Now the job is to find gaps, or islands.
Using a subselect, the start and stop ID's of a gap can be found.
SELECT start, stop FROM ( select m.id + 1 as start, (select min(id) - 1 from animals as x where x.id > m.id) as stop from animals as m left outer join animals as r on m.id = r.id - 1 where r.id is null ) as x WHERE stop IS NOT NULL;
The resulting records would show the starting and ending IDs of a gap.
start | stop -------+------ 3 | 3 5 | 5 7 | 7 15 | 17 26 | 29
But the use of subqueries, and correlated subqueries is generally expensive, especially if your optimizer creates temp tables. So, the same result can be obtained using JOINs.
SELECT l.id + 1 AS start, min(fr.id) - 1 AS stop FROM animals as l LEFT OUTER JOIN animals AS r ON l.id = r.id - 1 LEFT OUTER JOIN animals AS fr ON l.id < fr.id WHERE r.id IS NULL AND fr.id IS NOT NULL GROUP BY l.id, r.id;
Another option
SELECT a.id+1 AS gap, MIN(b.id) - 1 AS To FROM animals AS a, animals AS b WHERE a.id < b.id GROUP BY a.id HAVING a.id < MIN(b.id) - 1;