Segment Tree

April 27, 2023

Segment Tree is a data structure used to efficiently answer range queries on an array or list of elements. Range queries are queries that ask for information about a contiguous subsequence of elements in the array. Examples of range queries include finding the sum, minimum, maximum, or the number of elements that satisfy a given condition in a subsequence of the array.

The Segment Tree algorithm is particularly useful when the array is static, meaning that once it is created, the elements in the array do not change. In such cases, the Segment Tree can be preprocessed in linear time and can then answer range queries in logarithmic time.

Segment Trees have been used for a wide range of applications, including image processing, geographic information systems, and computer graphics.

Brief history and development

Segment Trees were first introduced by Jon Bentley in his book “Programming Pearls” in 1986. Bentley used the example of finding the minimum value in a range of elements in an array to motivate the need for a data structure that could efficiently answer range queries.

The Segment Tree data structure has since been refined and extended by various researchers. One important extension is the use of lazy propagation, which allows for more efficient updates to the array while maintaining the logarithmic time complexity for range queries.

Key concepts and principles

The key concept behind the Segment Tree is to precompute and store information about ranges of elements in the array. Each node in the tree represents a subrange of the array and stores some precomputed information. The root of the tree represents the whole array.

The precomputed information stored in each node depends on the type of range query being performed. For example, if the query asks for the sum of elements in a range, each node stores the sum of the elements in its corresponding subrange. If the query asks for the minimum value in a range, each node stores the minimum value in its corresponding subrange.

To answer a range query, the algorithm traverses the tree and combines the precomputed information stored in the nodes that cover the range being queried. The combining operation depends on the type of query. For example, to find the sum of elements in a range, the algorithm adds the precomputed sums of the relevant nodes. To find the minimum value in a range, the algorithm takes the minimum of the minimum values stored in the relevant nodes.

Pseudocode and implementation details

The following is the pseudocode for building a Segment Tree for an array of n elements:

buildTree(arr, tree, node, start, end):
    if start equals end:
        tree[node] = arr[start]
    else:
        mid = (start + end) / 2
        buildTree(arr, tree, 2 * node, start, mid)
        buildTree(arr, tree, 2 * node + 1, mid + 1, end)
        tree[node] = tree[2 * node] + tree[2 * node + 1]

The above pseudocode defines a recursive function buildTree that takes in an array arr, a list tree to store the Segment Tree, the index of the current node node, the starting index of the current node’s subrange start, and the ending index of the current node’s subrange end.

The function recursively builds the Segment Tree by dividing the current node’s subrange in half and calling buildTree on each half. The precomputed information stored in each node is the sum of the precomputed information stored in its children.

To answer a range query, the following pseudocode can be used:

queryRange(tree, node, start, end, queryStart, queryEnd):
    if start >= queryStart and end <= queryEnd:
        return tree[node]
    elif end < queryStart or start > queryEnd:
        return 0
    else:
        mid = (start + end) / 2
        leftSum = queryRange(tree, 2 * node, start, mid, queryStart, queryEnd)
        rightSum = queryRange(tree, 2 * node + 1, mid + 1, end, queryStart, queryEnd)
        return leftSum + rightSum

The above pseudocode defines a recursive function queryRange that takes in the Segment Tree tree, the index of the current node node, the starting index of the current node’s subrange start, the ending index of the current node’s subrange end, the starting index of the query range queryStart, and the ending index of the query range queryEnd.

The function recursively traverses the Segment Tree and returns the precomputed information stored in the relevant nodes to answer the range query.

5. Examples and use cases

The Segment Tree algorithm is used in various applications that involve range queries on arrays. Some examples include:

  • Finding the sum, minimum, or maximum value in a range of elements in an array.
  • Counting the number of elements that satisfy a given condition in a range of elements in an array.
  • Computing the GCD or LCM of a range of elements in an array.

6. Advantages and disadvantages

The Segment Tree algorithm has several advantages:

  • It allows for efficient range queries on static arrays, with a time complexity of O(log n) for each query.
  • It can be used for a wide range of range query applications.
  • It can be extended with lazy propagation to support efficient updates to the array.

However, the Segment Tree algorithm also has some disadvantages:

  • It requires extra space to store the precomputed information in the tree, which can be a drawback when the array is large.
  • It is not suitable for dynamic arrays, where the elements in the array change frequently.

There are several related algorithms and variations of the Segment Tree, including:

  • Interval Tree: A variant of the Segment Tree that supports interval queries instead of contiguous subsequence queries.
  • Binary Indexed Tree (BIT): Another data structure that can be used to answer range queries on arrays. The BIT has a simpler implementation than the Segment Tree but can only support certain types of range queries.