Cryptanalytic Extraction of Deep Neural Networks with Non-Linear Activations

17 Dec 2025, 02:00 PM – 03:00 PM · SPMS-LT5 (SPMS-03-08) · Audience: Current Students

Description

Deep neural networks (DNNs) are today's central machine-learning engines, yet their parameters represent valuable intellectual property that can be extracted through black-box queries. While existing cryptanalytic attacks have primarily targeted ReLU-based architectures, this work extends model-stealing techniques to a broad class of non-linear activation functions, including GELU, SiLU, SELU, Sigmoid, and others. We present the first universal black-box attack capable of recovering both weights and biases from networks whose activations converge to linear behavior outside narrow non-linear regions. Our method generalizes prior geometric approaches by leveraging higher-order derivatives and analysis of adjacent linear zones, removing the reliance on the non-differentiable activation points that earlier attacks exploited.
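The key structural property above is that smooth activations such as GELU become effectively linear away from a narrow band around the origin, where curvature is concentrated. A minimal sketch (not the attack itself; the helper names and finite-difference probe are illustrative assumptions) showing this for GELU:

```python
import math

def gelu(x):
    # Exact GELU via the Gaussian CDF: x * Phi(x)
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def second_derivative(f, x, h=1e-3):
    # Central finite-difference estimate of f''(x); in a black-box
    # setting this would be built from queries to the target model.
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

# Outside a narrow band around 0 the second derivative vanishes,
# so the activation behaves linearly (identity for large positive
# inputs, zero for large negative inputs) and queries there see
# an effectively piecewise-linear network.
for x in (-6.0, -0.5, 0.0, 0.5, 6.0):
    print(f"x={x:+.1f}  gelu={gelu(x):+.4f}  gelu''~{second_derivative(gelu, x):+.4f}")
```

Probing curvature this way locates the narrow non-linear region of each neuron; the surrounding linear zones are what the geometric recovery of weights and biases operates on.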

We show that, for several activations, neuron signatures can be recovered more easily than in the ReLU case, and we further demonstrate that the activation functions themselves can be identified when they are not publicly known. Our results broaden the scope of cryptanalytic model extraction, revealing that neither the secrecy of activation functions nor the smoothness of their nonlinearities provides effective protection against black-box recovery attacks.
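One way to see how an unknown activation could be identified from queries: different smooth activations leave distinct curvature profiles. A toy sketch (an illustrative assumption, not the paper's procedure; `fingerprint` and `identify` are made-up helpers) that matches a black-box scalar function against known candidates by comparing finite-difference second derivatives at a few probe points:

```python
import math

def gelu(x):    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
def silu(x):    return x / (1.0 + math.exp(-x))
def sigmoid(x): return 1.0 / (1.0 + math.exp(-x))

def fingerprint(f, xs=(-2.0, -1.0, 0.0, 1.0, 2.0), h=1e-3):
    # Curvature profile: second derivative sampled at a few points.
    return tuple(round((f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h), 3)
                 for x in xs)

def identify(black_box, candidates):
    # Pick the candidate whose curvature profile is closest
    # (in max-abs distance) to that of the unknown function.
    fp = fingerprint(black_box)
    best_name, _ = min(
        candidates,
        key=lambda nf: max(abs(a - b) for a, b in zip(fingerprint(nf[1]), fp)))
    return best_name

candidates = [("gelu", gelu), ("silu", silu), ("sigmoid", sigmoid)]
print(identify(silu, candidates))  # → silu
```

In the real attack the probes go through an entire network rather than a single scalar function, but the underlying idea is the same: smoothness does not hide which activation is in use.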