If you are a programmer, you probably know such data structures as
java.util.ArrayList of Java and/or
std::vector of C++. Both are simple arrays under the hood, and whenever there is no room for appending a new element, the implementation creates a larger array, copies the contents of the current array to it, deletes the current array, sets the newly created array as current, and, finally, appends the element normally. Basically, as long as there is room for appending, the running time of such an operation is constant. Obviously, if there is no room, the running time is . This is by no means informative, and we would like to understand the efficiency of such a data structure better. Now, let amortized analysis come to rescue. The idea behind amortized analysis is that you sum the costs of operations appearing in a sequence, and divide the sum by the number of operations in the sequence. Formally, if the cost sequence of an operation sequence is , the amortized running time is simply the average of them:
Now, we have two schemes for expanding the underlying array:
- arithmetic: choose a positive integer and every time you have to expand the array, make it array components longer;
- geometric: choose a real number , and every time you have to expand the array of length , make it array components long. (If is the initial array length, we require .)
Suppose the initial array capacity is and is given. Also we are given a sequence of appending operations of length . Now the total work is
Above, is the total work of expanding and filling the array sufficiently many times in order to accommodate elements, and denotes the number of elements we could have put before expanding once again.
Now, let us show that the amortized cost is linear in . We start from the lower bound:
Next, we need to show that there exist and such that for all . Let us choose . Now we have
Since is a constant, so is . Next, let us derive an upper bound:
At this point we choose , which leads to
If we choose , we have that for all , which is the very definition of , and so, we have that .
Next, we will show that appending with geometric strategy runs in amortized time. If we manage to show that the upper bound is constant, we can omit proving the lower bound. Suppose we start with the empty array of capacity . Also, we are given the expansion factor . As a slight technicality, we must choose such that ; otherwise, the very first expansion will have no effect, and, thus, by induction, none of them. Assuming the same unit costs for copying and appending elements like in the arithmetic strategy, the total work of appending elements is
We require to be the smallest integer such that , which leads us to the following inequalities:
Since is required to be the smallest integer satisfying the above inequality, we can set . Also,
Since , we have
Now we have that
which is constant since and are constants. Hence, appending an element to an array-based data structure using geometric array expansion strategy runs in amortized constant time.