
Researchers Expose Image-Based Attack That Hacks AI Models

Security researchers have presented a new attack technique that uses manipulated images to steal confidential user data. It targets the AI models that are now built into applications everywhere.

Downscaling reveals hidden text

The approach was developed by Kikimora Morozova and Suha Sabi Hussain of the security firm Trail of Bits. The method builds on a concept presented in 2020 in an academic paper from TU Braunschweig, which described the possibility of so-called “image scaling attacks” against machine learning systems. Trail of Bits has now transferred this principle to current AI applications and demonstrated it in practice.

The attack exploits the fact that AI systems usually downscale uploaded images automatically in order to save computing power and costs. Common resampling algorithms such as nearest neighbor, bilinear, or bicubic interpolation are used for this. These methods can cause patterns that were hidden in the original image to become visible once the image is scaled down.
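To illustrate the principle, here is a minimal, hypothetical sketch (not the researchers’ tool): it plants a pattern only at the pixel positions that a nearest-neighbor downscale will sample, so the pattern is drowned out in the full-size image but reconstructed exactly in the reduced one. The scale factor, image sizes, and the explicit downscaling function are illustrative assumptions.

```python
# Minimal sketch: hide a per-pixel "message" at exactly the positions a
# nearest-neighbor downscale will sample, so it is lost in the noise of the
# full-size image but dominates the reduced version.
import numpy as np

rng = np.random.default_rng(0)
FACTOR = 8                      # downscale factor assumed by the attacker
SMALL = 32                      # side length of the downscaled image
BIG = SMALL * FACTOR            # side length of the uploaded image

# "Hidden message": a binary pattern standing in for text pixels.
hidden = (rng.random((SMALL, SMALL)) > 0.5).astype(np.uint8) * 255

# Innocuous-looking high-resolution carrier image (random light texture).
carrier = rng.integers(180, 220, size=(BIG, BIG), dtype=np.uint8)

# Plant the message only at the coordinates nearest-neighbor sampling hits:
# destination pixel (i, j) reads source pixel (i*FACTOR + off, j*FACTOR + off).
off = FACTOR // 2
ii, jj = np.meshgrid(np.arange(SMALL), np.arange(SMALL), indexing="ij")
carrier[ii * FACTOR + off, jj * FACTOR + off] = hidden

def nearest_downscale(img: np.ndarray, factor: int) -> np.ndarray:
    """Explicit nearest-neighbor downscale: keep one source pixel per block."""
    o = factor // 2
    return img[o::factor, o::factor]

recovered = nearest_downscale(carrier, FACTOR)
# Only 1 in FACTOR**2 pixels of the upload were touched, yet the reduced
# image reproduces the hidden pattern exactly.
print("message recovered:", np.array_equal(recovered, hidden))
```

Bilinear and bicubic resampling average over neighboring pixels instead of picking single ones, so a real attack has to tune its perturbations against the exact interpolation weights, which is what makes the hidden content so hard to spot at full resolution.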

A manipulated image can contain seemingly invisible instructions that only appear in the reduced version: barely recognizable to the human eye, but readable for a language model. In one example, dark areas turned red during downscaling, which made hidden black text visible. The AI model then interpreted this text as a legitimate part of the user input. From the user’s point of view, everything looks inconspicuous; in the background, however, the system can carry out harmful commands, in particular ones that exfiltrate sensitive data. In one test, the researchers managed to forward calendar data from a Google account to an external email address via the Gemini CLI tool.
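The color shift follows from how averaging resamplers work. The simplified sketch below is an illustration only, not the published attack: it skips the stealth of the real perturbations and merely shows how small red-channel tweaks spread over each block become a clearly readable black-on-red contrast after an area-average reduction.

```python
# Simplified sketch of the color-shift idea (illustrative only; the published
# attack crafts perturbations against the exact interpolation kernel so the
# full-size upload looks clean, which this toy version does not attempt).
import numpy as np

FACTOR = 8
SMALL = 64
BIG = SMALL * FACTOR

# Hidden "text" mask at the downscaled resolution: True where a letter pixel
# should stay black, False for background that should turn red.
text_mask = np.zeros((SMALL, SMALL), dtype=bool)
text_mask[20:24, 8:56] = True          # stand-in for a line of instructions

# Full-size upload: a uniformly dark picture (RGB around 20/20/20) ...
img = np.full((BIG, BIG, 3), 20.0)

# ... with the red channel raised in every source pixel of a background block.
red_boost = np.kron(~text_mask, np.ones((FACTOR, FACTOR))).astype(bool)
img[..., 0] += red_boost * 35.0

def area_downscale(x: np.ndarray, f: int) -> np.ndarray:
    """Average each f-by-f block, a stand-in for bilinear/bicubic reduction."""
    h, w, c = x.shape
    return x.reshape(h // f, f, w // f, f, c).mean(axis=(1, 3))

small = area_downscale(img, FACTOR)
# After averaging, background blocks are reddish (R = 55) while text blocks
# stay dark (R = 20): black text on a red field, readable by an OCR-capable
# language model as part of the prompt.
print("background red:", small[0, 0, 0], "text red:", small[22, 30, 0])
```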

Gemini is affected

According to Trail of Bits, several platforms are affected, including Google’s Gemini models (CLI, web interface, API, Vertex AI Studio), Google Assistant on Android, and the Genspark service. To illustrate the risks, the researchers have published an open-source tool called Anamorpher that can craft images specifically for different downscaling methods.

To defend against such attacks, the experts advise limiting image sizes on upload and displaying a preview of the downscaled version that the model actually receives. In addition, security-relevant actions should never be carried out automatically but should always require explicit confirmation, especially when text has been extracted from images. The most important defense, however, is a secure system design that is robust against prompt injection attacks in general, the researchers said. Only systematic protective mechanisms can prevent multimodal AI applications from becoming a gateway for data theft.
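What such checks could look like in an application is sketched below; the size limit, target resolution, and helper names are hypothetical choices, not part of the researchers’ recommendations beyond the general principles.

```python
# Hedged sketch of the suggested mitigations (thresholds and names are
# illustrative): cap upload dimensions, show the user the exact downscaled
# image the model will see, and gate sensitive tool calls behind confirmation.
from PIL import Image

MAX_SIDE = 1024                     # assumed upload limit
MODEL_INPUT_SIDE = 256              # assumed model input resolution

def prepare_upload(path: str) -> Image.Image:
    img = Image.open(path)
    if max(img.size) > MAX_SIDE:
        raise ValueError(f"images larger than {MAX_SIDE}px are rejected")
    # Downscale exactly as the backend would, and return this version so the
    # UI can preview what the model actually receives.
    return img.resize((MODEL_INPUT_SIDE, MODEL_INPUT_SIDE),
                      Image.Resampling.BILINEAR)

def confirm_tool_call(action: str, details: str) -> bool:
    # Never execute security-relevant actions automatically; ask the user,
    # especially when the request originated from text extracted from an image.
    answer = input(f"Model wants to run '{action}' ({details}). Allow? [y/N] ")
    return answer.strip().lower() == "y"
```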
