next up previous
Next: Entropy Up: Chaotic Measures and Dynamical Previous: Integer Lyapunov Exponent

Integer Lyapunov Evaluation

Our first numerical experiments focused on the evaluation of Lyapunov exponents for a variety of iterators and initial hash functions, including variations of linear and quadratic probing, and double hashing. The results of these experiments were, not surprisingly, inconclusive. The relationship between iterator and Lyapunov exponent appeared to be a complex one. All commonly used hash iterators had an integer Lyapunov exponent which depended on table size, yet in many cases the exponent was not directly related to the actual performance of the function for random key values. In some cases a higher integer Lyapunov exponent was associated with a poorer performing hash function. It was concluded that the evaluation of integer Lyapunov exponent alone was not a sufficient measure of hash function performance.

An analytical evaluation of double hashing provides some insight. We can rewrite equations (5) and (6) as

  equation142

where N and N-2 are prime numbers. As discussed in Section 2.2, this function produces unique probe sequences for each unique value of tex2html_wrap918 , and all probe sequences will be of full length. An estimate of the integer Lyapunov exponent can be determined analytically. If one starts with an initial key tex2html_wrap919 , one can analytically perform the summation in equation (10). First, observe that the expressions tex2html_wrap_inline928 and tex2html_wrap_inline930 are both equal to the original key k for k<N-3. In this case, hash function (12) reduces to

equation148

Evaluating the individual terms of the sum for equation (10) yields

eqnarray151

Clearly it is difficult to bound this expression because of the modular reduction operations. However, rough bounds can be established by noting that only when tex2html_wrap_inline936 and tex2html_wrap_inline938 are in different epochs modulo N will the difference between the values be greater than (i+2). Furthermore, for many values of i > N/2 the distance tex2html_wrap_inline946 will be substantially less than i because the distance is measured modulo N. Overall, a very rough expected value for the integer error distance tex2html_wrap_inline946 is of order i for the i-th iteration of this hash function:

displaymath922

where tex2html_wrap_inline958 denotes the expected value of the Lyapunov exponent for m iterations, starting with key tex2html_wrap_inline962 . This expectation is easily verified empirically. For example, for N=1823 the measured Lyapunov exponent is 6.00 versus 6.50 predicted as an upper bound above, measured over the entire problem space. This rough bound is sufficient for the analysis.

An important observation can be drawn from this analysis. Recall that the Lyapunov exponent is a measure of sensitivity to initial conditions. It tells us how quickly data initially close together will be distributed widely in the hash table. From the analysis of the double hashing function, it can be seen that values that start close in the table will differ by no more than about tex2html_wrap_inline970 slots in the table after iteration i. This is a significant result if one would like collisions resolved after only a few probe iterations. Consider that even for large tables with high load factors, the average number of probes rarely exceeds 30. Therefore data elements which are clustered initially will likely remain clustered for the first several iterations. This has significant implications for the performance of double hashing of nonuniformly distributed data. A hash function which distributes the data uniformly in the hash table from the very first iteration more closely approximates the uniform hashing property.

The problem described above is inherent because of the linearity modulo N of the most popular choices for hash functions. Because of the linear relationship modulo N between i and tex2html_wrap_inline982 in double hashing (see equation (4)) close key values are separated little during the first probe iterations, unless they happen to cross a modulo N boundary. Correcting this problem requires either the use of nonlinear hash functions modulo N for tex2html_wrap_inline836 and tex2html_wrap_inline834 or a nonlinear probe function modulo N.


next up previous
Next: Entropy Up: Chaotic Measures and Dynamical Previous: Integer Lyapunov Exponent

Greg Heileman
Sat Nov 2 14:24:01 MST 1996