Digital Steganography
Steganography is a powerful tool for concealing sensitive data. Unlike cryptography, which encrypts messages to render them unreadable, steganography aims to conceal the very existence of a message, allowing it to travel undetected in plain sight. For this reason steganography can complement, rather than replace, encryption. The carrier medium can be almost anything: image files (the focus of this post), text, network packets, or audio recordings.
Steganography does not respect Kerckhoffs's principle, which states that a system should remain secure even under the assumption that the adversary knows how the system works, although interpretations differ on whether "the system" includes knowledge of the cover source.
Applications
Some applications are:
- Covert channels: undetectable communication pathways within a system. By manipulating seemingly innocuous data transmissions, covert channels enable data exchange between authorised entities while evading network monitoring mechanisms.
- Copyright control of materials. Steganography finds extensive use in digital watermarking, where imperceptible alterations are made to media files to assert ownership or authenticity. This facilitates content tracking and rights management, and is usually done by inserting a small but repeated pattern into the carrier, which helps the watermark survive compression.
- TCP/IP packets: the ToS bits, TCP initial sequence numbers (ISNs) and the IP ID field can all be used as steganographic carriers.
Network steganography is receiving attention from the research community, not just for header-based carriers like these but also for covert timing channels, which use the delays between network packets to embed the payload.
Another practical example provides two levels of protection for mammograms: hiding the patient's medical information and masking the image content. The first level embeds the patient's information in the unused (black) background area. The second level applies a watermarking technique that masks the contents of the mammogram, protecting it against illegal access and malevolent modification; the watermark can be removed to reveal the masked mammogram once authorisation for viewing is granted.
(a) Original Mammogram; (b) Segmented Mammogram; (c) Dilated Mammogram; (d) Block Separation; (e) stego-Mammogram; (f) Masked Mammogram.
Methods and techniques
Some basic points must be followed to create a stego-image:
- Delete the original image after generating the stego-image, to prevent direct comparison
- The stego-image must not show visual artefacts (smooth areas should be avoided; noisy backgrounds and edges should be targeted)
One problem with many algorithms is the possibility of errors when extracting the embedded information, which causes complete loss of the payload if it was encrypted before embedding. To prevent this, some researchers propose adding a small non-encrypted payload carrying extra information (e.g. the order of data elements for extraction).
(a) original image (b) stego image
Let’s see some of the most famous and interesting techniques:
Basic and insecure methods
Both methods in this section are very easy to implement but unreliable, because they are easily detected:
- EOF: consists of appending the secret message 'behind' the JPEG image, after its EOF (end of file) marker:
C:\> copy Image.jpg /b + Secret_message.txt /b Final.jpg
- EXIF: using the same idea as the previous method, hidden data is appended to the image's exchangeable image file format (EXIF) section, which normally contains metadata such as the camera model, time and location.
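The EOF trick in the `copy` command above can be sketched in a few lines of Python (file names are illustrative). JPEG viewers stop rendering at the EOI marker (`FF D9`), so any bytes appended after it are carried along invisibly:

```python
def eof_embed(cover_path: str, message: bytes, stego_path: str) -> None:
    # Append the secret message behind the JPEG's EOI marker.
    with open(cover_path, "rb") as f:
        cover = f.read()
    with open(stego_path, "wb") as f:
        f.write(cover + message)  # message rides behind FF D9

def eof_extract(stego_path: str) -> bytes:
    # Everything after the last EOI marker is the hidden payload.
    with open(stego_path, "rb") as f:
        data = f.read()
    return data[data.rfind(b"\xff\xd9") + 2:]
```

The stego file still opens normally in any viewer, but its size gives the trick away immediately, which is exactly why the method is considered insecure.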
LSB (Least Significant Bit) substitution
In digital images, each pixel is usually represented by a combination of colour channels (Red, Green, Blue). Each channel uses 8 bits per pixel, allowing 256 intensity levels (0 to 255). To hide a message within an image, each character of the message is first converted into its binary representation.
We can define two different LSB approaches:
- LSB replacement: the least significant bit is replaced with the secret data. This replacement can be made in two different ways:
  - Sequential embedding inserts the data bits in order until the whole message is embedded.
  - Randomised embedding disperses the positions of the secret data bits.
- LSB matching: each secret data bit is compared with the least significant bit of the corresponding cover byte; if the two compared bits match, no change is made, while in the case of a mismatch the cover byte is incremented or decremented at random. $$ S_{i}=\begin{cases} C_{i} & \text{if } M_{i}=C_{i} \\ C_{i}-1 & \text{if } M_{i}\ne C_{i} ~\text{and}~ C_{i}\ne 0 \\ C_{i}+1 & \text{if } M_{i}\ne C_{i} ~\text{and}~ C_{i}=0 \end{cases} $$
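A minimal sketch of both approaches, operating on a flat list of cover bytes (e.g. one colour channel) and assuming the message has already been converted to a list of bits:

```python
import random

def lsb_replace(cover: list[int], bits: list[int]) -> list[int]:
    # Sequential LSB replacement: overwrite the lowest bit of each byte.
    stego = cover.copy()
    for i, b in enumerate(bits):
        stego[i] = (stego[i] & ~1) | b
    return stego

def lsb_match(cover: list[int], bits: list[int]) -> list[int]:
    # LSB matching: leave the byte alone when its LSB already equals the
    # message bit; otherwise move it +/-1 at random (staying in 0..255).
    stego = cover.copy()
    for i, b in enumerate(bits):
        if stego[i] & 1 != b:
            if stego[i] == 0:
                stego[i] += 1
            elif stego[i] == 255:
                stego[i] -= 1
            else:
                stego[i] += random.choice((-1, 1))
    return stego

def lsb_extract(stego: list[int], n: int) -> list[int]:
    # Both variants decode the same way: read back the first n LSBs.
    return [v & 1 for v in stego[:n]]
```

Note that extraction is identical for both variants; the difference is only in how the cover bytes are disturbed, which is what makes matching statistically harder to detect.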
DCT algorithms
A more sophisticated method of hiding something in an image would be to hide it inside the discrete cosine transform (DCT) coefficients of the JPEG images.
F5 is an example of a DCT algorithm. To hide information it adds 1 to coefficients with a positive value and subtracts 1 from coefficients with a negative value; zero-valued coefficients are not modified, as changing them would alter the statistics of the image significantly. This introduces a communication problem: the receiver extracts the message by reading the LSBs of the non-zero coefficients, so if modifying a coefficient with value 1 or -1 turns it into zero, the receiver loses that bit. F5 solves this by hiding the same bit again in the next coefficient; while this fixes the communication problem, it considerably increases the number of modified coefficients. F5 also uses matrix embedding (subtraction and matrix encoding), reducing the number of changes made to the DCT coefficients and allowing a larger payload.
The encoding takes n DCT coefficients and hashes them to k bits. If the hash value equals the message bits, the next n coefficients are taken; otherwise one of the n coefficients is modified and the hash is recalculated.
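As an illustration (not F5's exact implementation), the same effect can be obtained with the binary Hamming code behind (1, n, k) matrix embedding, which hides k message bits in n = 2^k - 1 cover LSBs while flipping at most one of them:

```python
def matrix_embed(lsbs: list[int], msg: list[int]) -> list[int]:
    # (1, n, k) matrix embedding with n = 2**k - 1: hide k message bits
    # in n cover LSBs by flipping at most one of them.
    n, k = len(lsbs), len(msg)
    assert n == 2 ** k - 1
    # Syndrome: XOR of the (1-based) positions holding a 1 bit.
    syndrome = 0
    for i, bit in enumerate(lsbs, start=1):
        if bit:
            syndrome ^= i
    target = int("".join(map(str, msg)), 2)
    flip = syndrome ^ target  # position to flip (0 means no change)
    out = lsbs.copy()
    if flip:
        out[flip - 1] ^= 1
    return out

def matrix_extract(lsbs: list[int], k: int) -> list[int]:
    # The receiver just recomputes the syndrome.
    syndrome = 0
    for i, bit in enumerate(lsbs, start=1):
        if bit:
            syndrome ^= i
    return [int(c) for c in format(syndrome, f"0{k}b")]
```

With k = 2 this embeds 2 bits in 3 coefficients at the cost of at most one change, which is the efficiency gain the paragraph above describes.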
Adaptive steganography
Adaptive (model-based) steganography examines global statistical features of the image before interacting with its LSB/DCT coefficients, in order to decide where to make changes. Factors considered include the distribution of pixel values, the correlation between neighbouring pixels, and the sensitivity of human visual perception to image modifications.
Detection
Steganalysis is the science of attacking steganography. We can define two different types:
- Active steganalysis attempts to destroy any trace of secret communication contained in the media carrier (e.g. compression, changing image format, changing LSBs).
- Passive steganalysis attempts to detect the existence of stego-images and retrieve their payload. Currently the best image steganalyzers are built using feature-based steganalysis and machine learning.
Visual steganalysis
Consists of analysing the suspicious image by eye to identify any differences. In a typical image there are approximately as many pixels with even values (LSB 0) as with odd values (LSB 1). When text is converted to binary and inserted into an image with a poor algorithm, the LSB plane often ends up with noticeably more 0s than 1s.
If the original image is available, differences in file size or in the number of unique colours are easy to spot.
Signature steganalysis
The goal here is to search for repetitive signatures (patterns). One method used for JPEG divides the image into 8x8 blocks and analyses the Discrete Cosine Transform (DCT) coefficients of each block. The quantization matrix is then compared with the standard JPEG quantization table to check for incompatible blocks.
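One simple compatibility check along these lines (an illustrative sketch, not a complete detector): after JPEG quantization, every dequantized DCT coefficient is an exact multiple of its quantization step, so a non-zero remainder is evidence that the block was modified after decompression.

```python
def block_incompatible(dequant_coeffs: list[int], qtable: list[int]) -> bool:
    # A genuine JPEG block's dequantized DCT coefficients are exact
    # multiples of the corresponding quantization steps; any remainder
    # means the block was altered after decompression (e.g. by
    # spatial-domain embedding followed by re-saving).
    return any(c % q != 0 for c, q in zip(dequant_coeffs, qtable))
```

A real detector would run this over every 8x8 block and every plausible quantization table, but the core test is just this divisibility check.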
Statistical steganalysis
Chi-square is a non-parametric statistical test used to detect whether the intensity levels are scattered in a uniform distribution across the image surface or not. This method has a low success rate when the embedding is randomised.
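A bare-bones version of the pairs-of-values chi-square statistic (a sketch; a practical detector would also derive a p-value and scan the image in sliding windows):

```python
from collections import Counter

def chi_square_statistic(pixels: list[int]) -> float:
    # Pairs-of-values test: sequential LSB replacement tends to equalise
    # the frequencies of each value pair (2i, 2i+1). A statistic close to
    # zero therefore hints that the LSB plane carries an embedded message.
    hist = Counter(pixels)
    stat = 0.0
    for i in range(128):
        even, odd = hist[2 * i], hist[2 * i + 1]
        expected = (even + odd) / 2
        if expected > 0:
            stat += ((even - expected) ** 2 + (odd - expected) ** 2) / expected
    return stat
```

This also makes the limitation above concrete: randomised embedding spreads changes thinly, so the pair frequencies never fully equalise and the statistic stays large.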
(a)pixel difference (b)pixel differences using modified pixel value differencing steganography
A more reliable method, RS steganalysis, divides the image into groups and measures the noise in each group by flipping LSBs, classifying groups as regular or singular depending on the pixel noise. The process is then repeated with the dual type of flipping.
GEFR (Gradient Energy-Flipping Rate) detection is based on the gradient energy, calculated by summing the squares of the differences between the intensity values of neighbouring pixels in both the horizontal and vertical directions. The algorithm calculates the gradient energy of both the cover and the stego image; by analysing its variation and the resulting curve it is possible to detect the secret message and estimate its length.
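The gradient energy itself is straightforward to compute; here is a naive sketch over a 2-D list of grayscale intensities:

```python
def gradient_energy(img: list[list[int]]) -> int:
    # Sum of squared differences between horizontally and vertically
    # neighbouring pixels; LSB embedding adds noise, which raises this value.
    h, w = len(img), len(img[0])
    ge = 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                ge += (img[y][x + 1] - img[y][x]) ** 2
            if y + 1 < h:
                ge += (img[y + 1][x] - img[y][x]) ** 2
    return ge
```

GEFR compares how this quantity changes as LSBs are flipped at different rates, rather than using the raw value directly.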
As we saw before, LSB matching makes statistical detection harder. Many of the algorithms able to detect LSB-matched media use Markov chains or co-occurrence matrices, but we're going to discuss those in another post.
Transform domain steganalysis
These types of algorithms work against transform domain techniques, such as Discrete Cosine Transform (DCT), that we saw before, Fast Fourier Transform (FFT) and Discrete Wavelet Transform (DWT).
Many algorithms in this category use artificial neural networks to determine whether an image is a stego image or a clean one.
The first algorithm tackles steganography detection in JPEG images. It builds 2-dimensional arrays from the magnitudes of the quantized block DCT coefficients and computes the differences between neighbouring elements along the horizontal, vertical and diagonal directions. A Markov process then models these difference arrays, capitalising on their second-order statistics to unmask the alterations introduced by hidden messages.
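A sketch of the feature construction, under simplifying assumptions (horizontal direction only, a clipping threshold T = 3 for the difference values):

```python
def markov_features(coeffs: list[list[int]], T: int = 3) -> list[list[float]]:
    # Build the horizontal difference array of |DCT| magnitudes, clipped
    # to [-T, T], then estimate the (2T+1) x (2T+1) transition matrix
    # P[i][j] = Pr(next difference = j-T | current difference = i-T),
    # the second-order statistic the detector feeds to a classifier.
    clip = lambda v: max(-T, min(T, v))
    diffs = [
        [clip(abs(row[x]) - abs(row[x + 1])) for x in range(len(row) - 1)]
        for row in coeffs
    ]
    size = 2 * T + 1
    counts = [[0] * size for _ in range(size)]
    for row in diffs:
        for a, b in zip(row, row[1:]):
            counts[a + T][b + T] += 1
    # Normalise each row into conditional probabilities.
    return [[c / max(1, sum(r)) for c in r] for r in counts]
```

The full method repeats this for the vertical and diagonal directions and concatenates the matrices into one feature vector.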
The second algorithm introduces a new feature set specifically designed for steganalysis of JPEG images. The features are based on the first-order statistics of quantized noise residuals obtained by decompressing the JPEG image using the 64 kernels of the Discrete Cosine Transform (DCT). This yields features with lower computational complexity and dimensionality than more intricate models; despite this simplicity, they maintain competitive performance.
Universal or blind steganalysis
Universal steganalysis is the most advanced approach: it tries to detect secret messages regardless of the steganographic technique applied to the image. For this reason it is also the most useful, because when hunting for stego images outside a laboratory you don't know which embedding algorithm was used.
Some algorithms use the Mahalanobis distance from a test Centre of Mass to the training distribution to identify stego images; others apply the local binary pattern texture operator to examine pixel texture patterns within neighbourhoods across the colour planes, with an artificial neural network as classifier.
Deep learning techniques in universal steganalysis appear to be very effective and, when trained on a large dataset, tend to achieve high detection accuracy.
Conclusion
Steganography and steganalysis are two fascinating and complex topics; hundreds of different algorithms try to hide or detect data efficiently.
While reading about steganography, it felt like something far removed from real-world deployment: it works very well in a lab environment but easily runs into problems outside it, and for every real-world example I could think of, another, more reliable security tool would do the job.
Two other interesting aspects pointed out in this paper are:
- Multiple-object streams, which are particularly relevant to hiding in network channels, where communication is repeated. The open problems are how to adapt steganography to embed one object across many, and how to allocate payload between multiple objects.
- Key exchange: a key broadcast is not possible in the initial setting, and encoding a key as semantic content of a cover is practically impossible because the datagrams are sensitive to even a single bit error. A further problem is that sender and receiver do not know when the key is being transmitted: since every transmission includes noise, both sides must treat the noise of every image as a potential public key and send a reply, which makes the amount of computation required enormous.
The situation for steganalysis seems similar. Apart from universal approaches, the majority of algorithms are created ad hoc with knowledge of the embedding algorithm, the cover source and the objects to examine, which is unrealistic in a real-world scenario. Right now, in 2024, the literature seems mostly focused on grayscale images, with video and network traffic analysis taking a back seat.