STEMM Institute Press
Science, Technology, Engineering, Management and Medicine
Quantitative Analysis of the Avalanche Effect of Different Hash Functions Based on Chi-Squared and T-Tests
DOI: https://doi.org/10.62517/jbdc.202601227
Author(s)
Liyuan Zhao
Affiliation(s)
Applied Statistics, Anhui University, Hefei, Anhui, China
Abstract
As the demand for information security continues to grow, the avalanche effect of hash functions has become a critical metric for evaluating their resistance to cryptographic attacks. Conventional analyses of the avalanche effect, however, rely predominantly on qualitative observation and lack systematic quantitative assessment and robust statistical underpinning, which impedes objective and reproducible comparison across algorithms. This paper constructs a quantitative analysis framework for the avalanche effect of hash functions based on chi-squared tests and paired-sample t-tests. The framework assesses hash algorithm performance from two perspectives: output distribution uniformity and output variability. Using a controlled-variable methodology, the experiment constructs statistical samples by applying two independent, random-position bit flips to various types of input strings, generating perturbed outputs whose Hamming distances are then calculated. The effectiveness of the framework is validated by statistically testing the classical hash algorithms MD5, SHA-1, and SHA-256. The experimental results align closely with the known security characteristics of these algorithms. In summary, the avalanche-effect framework presented in this paper holds theoretical value and practical implications for the design and security comparison of cryptographic algorithms.
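The sampling and testing procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the 32-byte random messages, the choice to measure each Hamming distance against the hash of the unperturbed input, and the byte-frequency form of the chi-squared uniformity check are all assumptions made for illustration. Only the Python standard library is used, so the sketch returns test statistics rather than p-values.

```python
import hashlib
import random
import statistics
from collections import Counter


def flip_bit(data: bytes, pos: int) -> bytes:
    """Return a copy of data with the bit at index pos inverted."""
    out = bytearray(data)
    out[pos // 8] ^= 1 << (pos % 8)
    return bytes(out)


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def sample(algo: str, n: int, msg_len: int = 32, seed: int = 0):
    """Build n paired observations for one hash algorithm.

    Assumption: each message gets two independent single-bit flips, and
    each distance is taken between the original digest and a perturbed one.
    """
    rng = random.Random(seed)
    pairs, digests = [], []
    for _ in range(n):
        msg = bytes(rng.randrange(256) for _ in range(msg_len))
        base = hashlib.new(algo, msg).digest()
        h1 = hashlib.new(algo, flip_bit(msg, rng.randrange(msg_len * 8))).digest()
        h2 = hashlib.new(algo, flip_bit(msg, rng.randrange(msg_len * 8))).digest()
        pairs.append((hamming(base, h1), hamming(base, h2)))
        digests.extend([h1, h2])
    return pairs, digests


def paired_t(pairs) -> float:
    """Paired-sample t statistic for the difference d1 - d2."""
    diffs = [a - b for a, b in pairs]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / len(diffs) ** 0.5)


def chi_squared_bytes(digests) -> float:
    """Chi-squared statistic of digest byte values against a uniform
    distribution over 0..255 (255 degrees of freedom)."""
    counts = Counter(b for d in digests for b in d)
    total = sum(counts.values())
    expected = total / 256
    return sum((counts.get(v, 0) - expected) ** 2 / expected for v in range(256))


if __name__ == "__main__":
    for algo in ("md5", "sha1", "sha256"):
        pairs, digests = sample(algo, 500)
        mean_d = statistics.mean(d for p in pairs for d in p)
        print(algo, round(mean_d, 2), round(paired_t(pairs), 3),
              round(chi_squared_bytes(digests), 1))
```

Under an ideal avalanche effect, the mean Hamming distance should sit near half the digest length in bits (for example, about 128 for SHA-256). Converting the statistics into p-values needs the t and chi-squared distributions, which a library such as SciPy provides (`scipy.stats.t.sf`, `scipy.stats.chi2.sf`).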
Keywords
Hash Function; Avalanche Effect; Chi-Squared Test; t-Test