Range query problems are generally solved by either Binary Indexed Tree or Segment Tree. But sometimes when total queries are very large then you need some another data structure such that query time is almost constant and prefetching or pre-processing time is N log (N). The data structure is Sparse Tables.
Assume a range query problem where 10^7 questions will be asked.
For example: an array of size 10^5 and questions of type – Find minimum of all numbers in range L – R then segment tree will fail so we need sparse tables.
In sparse table we use concept of dynamic programming (if you don’t know then also its ok, it’s just traversing an array intelligently).
The concept is to break down the complete linear array in chunks of powers of 2. For example – let array indexes be 0,1,2,3,4 then break the array in chunks of powers of two as below —-
Row 0 —– >0 – 0,0 – 1,0 – 3
Row 1 —– >1 – 1,1 – 2,1 – 4
Row 2 —– >2 – 2,2 – 3
Row 3 —– >3 – 3,3 – 4
Row 4 —– >4 – 4
These ranges denote that we will be calculating the answer for only these ranges only and the answer for any other range can be calculated from these many ranges only.
The main thing is how to break into these ranges and calculate the data efficiently for this and store.
To store values we will use a 2 Dimensional array where A[i][j] means the answer for range
i to i + 2j- 1
Simulate the above formula and you will get the values which are given above.
We already know the value for A[i] because i + 2
0 – 1 = i itself it means we are asking the a[i] element as our answer.
We will be calculating answer in column major which means A then A then A….and so on. Then after that A A A etc. and similarly for all other values.
So when you are going to calculate A[i][j] you already will be knowing answer for A[i][j-1]. So I will break the asked range L to R in the form of two ranges for which I would have calculated answers in my table.
Suppose the query is find minimum element of array in range L to R. Let there exist two numbers a and b with following conditions —–
L + 2
R – 2
Basically these conditions mean that the given range is broken into two ranges such that for both the ranges I have calculated answer in my table. For example let the range is 7 to 19 . Then a and b can be 3 and 3 so that the range 7 to 19 is broken into ——
7 to 7+8-1 i.e. 7 to 14
19-8+1 to 19 i.e. 12 to 19
We are doing this to ensure that both the ranges are overlapping to each other and also cover entirely the asked range 7 to 19. Here a = b and this condition always holds. (Why ? this you need to think. Comment for help). Say a = b = 2 then the range would have been 7 to 10 and 16 to 19 so what about the numbers of range 11 to 15 . Thus dividing the range is a crucial step.
IMPLEMENTATION and Few Formulas –
The implementation is very easy once you understand its concept. The answer for A[i] is very clear as explained above. If you are calculating A[i][j] in column major order then you are always ensured that before calculating answer for range i to j i.e. A[i][j] you will have answer for A[i][j-1] and also answer for A[i+2
j-1][j-1] because columns are one less than j and in calculating column wise we have already visited them.
Thus for calculating A[i][j]
A[I][J] = MIN(A[I][J-1],A[I+2
(J-1)][j-1] which in simple language can be decoded as
Answer for range i to i+2
j-1 is calculated from answer for i to i+2(j-1)-1 and i+2(j-1) to i+2(j-1)+2(j-1) (which is equal to i + 2j)
Remember MIN can be replaced by any range function as per your problems.
For breaking ranges into chunks of two I am leaving this as a thinking and brain application for Reader. You can get help in comments or from my code.
Code for Sparse table –