Mapping Blockchain and Data Science to the Cyber Threat Intelligence Lifecycle: Collection, Processing, Analysis, and Dissemination

Authors

I. Ahmed, R. Mia, and N. A. F. Shakil

Keywords:

Blockchain, Cyber Threat Intelligence, Data Science, Lifecycle Integration, Security, Threat Analysis, Trust

Abstract

Cyber Threat Intelligence (CTI) is central to proactive cybersecurity, which seeks to detect threats before they escalate into disruptive or debilitating incidents. The CTI process is usually described as four consecutive yet interconnected phases: Collection, Processing, Analysis, and Dissemination. Together they support an end-to-end security posture, but they differ in technical process and operational focus. Integrating blockchain technology and data science into this lifecycle offers promising benefits, though not without trade-offs. The integration can improve integrity, automation, and insight, but long-standing limitations (threats to data integrity, limited trust between parties, and real-time scalability) still pose challenges that require careful architectural consideration. Blockchain's distributed ledger, backed by cryptography, supports immutability and auditability of threat information, so that no single party can unilaterally modify or censor sensitive intelligence. Smart contracts may automate specific procedures with reduced human input. Data science techniques such as machine learning, data mining, and predictive modeling process large volumes of diverse threat data. Matrix factorization, eigenvalue decomposition, and vector-space embeddings underpin many of these methods, providing a formal basis for anomaly detection, classification, and clustering. This work systematically maps and integrates blockchain and data science across the four stages of CTI. During the Collection phase, blockchain stores secure, tamper-evident records of ingested data, while data science pipelines normalize multi-channel inputs and perform early-stage entity extraction. In Processing, provenance tracking and on-chain validation complement rigorous data normalization and feature engineering. Analysis draws on trusted blockchain data stores for machine learning operations ranging from Support Vector Machines to spectral clustering. Lastly, Dissemination provides tamper-evident, verifiable sharing of intelligence, complemented by data-driven adaptive warnings and customized reporting.
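
The tamper-evident record keeping mentioned for the Collection phase can be illustrated with a hash chain, the basic structure behind a blockchain ledger. The following is a minimal sketch and not the authors' implementation: the function names (`record_hash`, `append_record`, `verify_chain`), the record layout, and the all-zero genesis hash are assumptions chosen for illustration, and the sketch omits consensus, signatures, and distribution across nodes.

```python
import hashlib
import json

# Assumed placeholder for the "previous hash" of the first record.
GENESIS_HASH = "0" * 64


def record_hash(payload: dict, prev_hash: str) -> str:
    """Hash a payload together with the hash of the preceding record."""
    body = json.dumps({"payload": payload, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain: list, payload: dict) -> None:
    """Append a payload to the chain, linking it to the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append(
        {
            "payload": payload,
            "prev": prev_hash,
            "hash": record_hash(payload, prev_hash),
        }
    )


def verify_chain(chain: list) -> bool:
    """Return True only if every link and every stored hash is consistent."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != record_hash(entry["payload"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash covers the previous one, changing any ingested record after the fact causes `verify_chain` to fail for that record. In this sketch, detection relies on a verifier holding a trusted copy of the latest hash; a distributed ledger supplies that through replication and consensus.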

Published

2023-03-04

How to Cite

[1]
I. Ahmed, R. Mia, and N. A. F. Shakil, “Mapping Blockchain and Data Science to the Cyber Threat Intelligence Lifecycle: Collection, Processing, Analysis, and Dissemination”, JACAIDMS, vol. 13, no. 3, pp. 1–37, Mar. 2023, Accessed: Jan. 28, 2026. [Online]. Available: https://sciencespress.com/index.php/JACAIDMS/article/view/2023-03-04