Cybersecurity Risks in OpenAI ChatGPT Plugins and Emerging Threats


Cybersecurity researchers have uncovered that third-party plugins for OpenAI ChatGPT could become a potent attack vector for cybercriminals aiming to access sensitive data illicitly. Salt Labs’ recent publication highlights the vulnerabilities within ChatGPT and its surrounding ecosystem, which may enable attackers to clandestinely install harmful plugins and compromise accounts on third-party platforms like GitHub.

These ChatGPT plugins are designed to enhance the large language model’s (LLM) capabilities, providing access to real-time information, computational functions, and third-party services. The discovery of security loopholes in this ecosystem, however, has raised significant concerns. OpenAI has since shifted toward GPTs, customized versions of ChatGPT with minimized third-party dependencies, and as of March 19, 2024 the installation of new plugins and the creation of new conversations with existing plugins are discontinued.

Salt Labs identified a critical flaw exploiting the OAuth workflow, allowing attackers to manipulate users into installing unauthorized plugins. This is mainly because ChatGPT does not confirm whether the user actually initiated the plugin installation process. Consequently, this loophole can be exploited to intercept and siphon off sensitive data from the victims, potentially including proprietary information.
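Salt Labs has not published exploit code, but the class of bug described (a plugin installed without the user ever initiating the flow) is characteristic of an OAuth implementation that omits or fails to validate the state parameter. A minimal sketch of the missing check, with hypothetical function names and session handling:

```python
import hmac
import secrets

def new_auth_request(session: dict) -> str:
    """Begin an OAuth authorization flow: bind a fresh, single-use state
    value to this user's session before redirecting to the provider."""
    state = secrets.token_urlsafe(32)
    session["oauth_state"] = state
    return state

def handle_callback(session: dict, returned_state: str) -> bool:
    """Complete the flow only if the state echoed back matches the one
    this session generated. Skipping this check is what lets an attacker
    hand a victim a crafted callback URL and link the attacker's plugin
    credentials to the victim's account."""
    expected = session.pop("oauth_state", None)  # single use
    if expected is None:
        return False  # this session never initiated an OAuth flow
    return hmac.compare_digest(expected, returned_state)
```

An attacker-forged callback carries a state value the victim’s session never issued, so a `handle_callback`-style check rejects it before any plugin is installed.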

Furthermore, the investigation revealed vulnerabilities in PluginLab, a framework for developing ChatGPT plugins, posing risks of zero-click account takeover. These flaws could enable cybercriminals to seize control of an organization’s accounts on third-party websites such as GitHub and access their source code repositories. Security researcher Aviad Carmel explained that PluginLab’s authorization endpoint lacks proper authentication, allowing an attacker to impersonate a victim and access their resources.

Additionally, several plugins, including Kesem AI, were found to be susceptible to OAuth redirection manipulation. This vulnerability allows attackers to hijack the account credentials related to the plugin through a maliciously crafted link.
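The report does not detail the affected plugins’ implementations, but OAuth redirection manipulation generally hinges on an authorization endpoint accepting an attacker-controlled redirect_uri. A hedged sketch of the standard defense, exact-match validation against a registered allowlist (the URIs below are hypothetical):

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of redirect URIs registered by the plugin developer.
REGISTERED_REDIRECTS = {
    "https://plugin.example.com/oauth/callback",
}

def is_allowed_redirect(redirect_uri: str) -> bool:
    """Exact-match validation. Prefix or substring checks are bypassable
    (e.g. https://plugin.example.com.evil.net/), so the URI must equal a
    registered value exactly and must use HTTPS."""
    if urlsplit(redirect_uri).scheme != "https":
        return False
    return redirect_uri in REGISTERED_REDIRECTS
```

With this check in place, a maliciously crafted link pointing the authorization code at an attacker’s server is refused rather than followed.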

These discoveries are part of a broader trend of identifying security weaknesses in ChatGPT. Imperva had earlier reported two cross-site scripting (XSS) vulnerabilities that could be exploited to gain control over any ChatGPT account. Moreover, in December 2023, Johann Rehberger demonstrated how malicious entities could develop custom GPTs to phish for user credentials and transmit them to external servers.

A novel side-channel attack on AI assistants, disclosed recently, further accentuates the evolving threat landscape. Researchers from Ben-Gurion University’s Offensive AI Research Lab uncovered an attack that leverages the lengths of the tokens LLMs emit to covertly recover encrypted responses. Despite encryption, the side channel uses the size of each data packet to deduce the length of the underlying token, enabling attackers to infer sensitive information from private AI assistant conversations.

To mitigate such risks, the research suggests implementing countermeasures like random padding to mask token lengths, bundling tokens during transmission, and sending complete responses in one go. These strategies aim to balance security, usability, and performance, underscoring the intricate challenges in safeguarding AI-driven platforms against sophisticated cyber threats.

Technical In-Depth: Token Privacy Vulnerabilities in AI Assistants

Recent research conducted by Yisroel Mirsky and his team at the Offensive AI Research Lab, Ben-Gurion University, has shed light on significant privacy risks in AI assistants, including OpenAI’s ChatGPT. Their findings reveal that, despite the use of encryption, flaws in how these services stream their traffic leave private conversations vulnerable to eavesdropping. The issue extends beyond OpenAI, affecting most major chatbots, with Google Gemini a notable exception.

The core of this vulnerability lies in what the researchers have termed the “token-length sequence” side-channel. AI assistants, in their quest for real-time interaction, transmit responses as they are generated, in the form of tokens (akin to words). Each token’s length remains consistent in both its encrypted and plaintext forms, inadvertently revealing information about the content of the message. This allows a passive observer on the same network, such as a Wi-Fi or LAN, to potentially decipher the content of encrypted messages without detection by the service provider or the user.
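The reason encryption does not hide token lengths is that the stream ciphers used for transport (for example AES-GCM in TLS) produce ciphertext exactly as long as the plaintext, plus a fixed overhead. The toy cipher below is not real cryptography; it exists only to illustrate this length-preserving property:

```python
import hashlib

def keystream(key: bytes, n: int) -> bytes:
    """Toy keystream (NOT real cryptography): chain SHA-256 over the key
    to produce n pseudo-random bytes, standing in for a stream cipher."""
    out, block = b"", key
    while len(out) < n:
        block = hashlib.sha256(block).digest()
        out += block
    return out[:n]

def encrypt(key: bytes, data: bytes) -> bytes:
    """XOR with the keystream. Note the defining property the side
    channel exploits: len(ciphertext) == len(plaintext)."""
    return bytes(b ^ k for b, k in zip(data, keystream(key, len(data))))
```

Because XOR is its own inverse, applying `encrypt` twice decrypts, and every token’s ciphertext is exactly as long as the token itself, which is all the eavesdropper needs.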

The attack methodology is akin to a puzzle-solving exercise: the attacker first measures the size of each encrypted token and then maps the sequence of token lengths onto the phrases or sentences they could represent. Because a given length sequence could correspond to a vast number of possible phrases, the initial output of this side-channel analysis is noisy and imprecise.

To refine this attack, the researchers developed a token inference attack strategy. This method employs two specially trained large language models (LLMs) to process the raw data obtained from the side-channel. These LLMs are adept at recognizing patterns and long-range dependencies in text, enabling them to reconstruct sentences with surprising accuracy even though no plaintext characters are visible.

This ability to predict and interpret the token sequence allows attackers to reconstruct the encrypted conversation with a high degree of accuracy. The attack exploits the stylistic and repetitive nature of AI-generated text, enabling the LLMs to deduce the content effectively. This is particularly effective in known-plaintext attack scenarios, where the attacker has some knowledge of the plaintext.

The research highlights the potential for LLMs to reverse-engineer encrypted traffic into readable text, using training data from publicly available chats or directly from the AI service as a paying customer. This vulnerability was not only demonstrated in OpenAI’s ChatGPT-4 but also in other services like Microsoft Bing AI (Copilot), indicating a widespread issue across AI assistant platforms.

The study emphasizes the need for a more secure method of transmitting tokenized responses from AI assistants to prevent unauthorized access to private conversations. It suggests that the current encryption methods, while effective in obscuring direct text, are insufficient to protect against more sophisticated side-channel attacks that exploit the token-length sequences. This discovery calls for immediate attention to enhance the privacy and security of AI communication channels.

Anatomy of an AI Chatbot

In the realm of natural language processing (NLP), tokens serve as the fundamental building blocks of text, encapsulating meaning within even the most nuanced grammatical constructs. Beyond mere words, tokens encompass not only lexical elements but also punctuation and spatial indicators. For instance, consider the following dialogue:

“Oh no! I’m sorry to hear that. Try applying some cream.”

When processed by cutting-edge models such as GPT-3.5, GPT-4, LLaMA-1, or LLaMA-2, the dialogue above is transformed into a structured sequence of tokens; the exact token boundaries differ from model to model, since each uses its own tokenizer vocabulary.

Such tokenization methodologies form the bedrock of Artificial Intelligence (AI) assistants, each designed with a distinct tokenizer to dissect text into digestible fragments. The algorithms underpinning major AI assistants often reveal their tokenizer rules as part of accessible APIs, enabling developers to harness these mechanisms for various applications.
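A real BPE tokenizer is far more sophisticated, but the idea can be approximated with a crude word-and-punctuation splitter that, like GPT-style tokenizers, keeps the leading space attached to each word. What matters for the attack is the resulting sequence of token lengths:

```python
import re

def toy_tokenize(text: str) -> list[str]:
    """Crude stand-in for a real BPE tokenizer: words (with their leading
    space attached, GPT-style) and punctuation become separate tokens."""
    return re.findall(r" ?[\w']+|[^\w\s]", text)

tokens = toy_tokenize("Oh no! I'm sorry to hear that. Try applying some cream.")
lengths = [len(t) for t in tokens]
# tokens starts ["Oh", " no", "!", " I'm", ...]; lengths starts [2, 3, 1, 4, ...]
```

It is this per-token length sequence, not the words themselves, that leaks through the encrypted channel.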

However, tokens serve a dual purpose, not just during the execution of Large Language Models (LLMs) but crucially in their training phase. LLMs undergo rigorous training processes, ingesting vast corpora of tokenized data. This corpus serves as the crucible wherein LLMs learn the probabilities associated with particular tokens following specific sequences. Through this iterative learning, LLMs refine their ability to predict the subsequent token in an ongoing conversation, lending a semblance of human-like responsiveness.
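The core training objective, learning how likely each token is to follow what came before, can be miniaturized as a bigram table. A real LLM conditions on long contexts with a neural network, but this toy version captures the same idea:

```python
from collections import Counter, defaultdict
from typing import Optional

def train_bigrams(corpus):
    """Count, for every token, which tokens follow it and how often.
    Real LLMs condition on long contexts with neural networks; a bigram
    table is the one-token-of-context version of the same objective."""
    table = defaultdict(Counter)
    for seq in corpus:
        for cur, nxt in zip(seq, seq[1:]):
            table[cur][nxt] += 1
    return table

def predict_next(table, token: str) -> Optional[str]:
    """Return the most frequently observed next token, or None if unseen."""
    if token not in table:
        return None
    return table[token].most_common(1)[0][0]
```

Given the corpus `[["see", "a", "doctor"], ["see", "a", "nurse"], ["see", "a", "doctor"]]`, the table predicts "a" after "see" and "doctor" after "a", the bare-bones analogue of next-token prediction.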

Conversations within AI chatbot ecosystems can be dissected into two primary message categories: prompts and responses. Prompts constitute user inputs, spanning from inquiries to statements, initiating the conversational exchange with an LLM. Represented as token sequences, prompts carry the essence of user intent within the digital dialogue. Conversely, LLM-generated responses, also tokenized sequences, encapsulate the AI’s synthesized output in response to user prompts.

Crucially, LLMs retain a granular awareness of dialog history, enabling them to contextualize responses within the continuum of preceding inputs and AI-generated outputs. This contextualization enriches the conversational flow, imbuing AI chatbots with a semblance of continuity and coherence reminiscent of human interactions.

In their seminal paper, researchers elucidate the structural dynamics of AI chatbot conversations:

Prompt (P): A prompt signifies the user’s input, typically manifesting as a question or statement that serves as the catalyst for interaction with the LLM. This input is represented as a token sequence, denoted as P = [p1, p2,…, pm] for pi ∈ K, where K represents the set of all possible tokens.

Response (R): In response to the prompt, the LLM crafts a corresponding output, also expressed as a sequence of tokens denoted as R = [r1, r2,…, rn] for ri ∈ K. These tokens encapsulate the synthesized response, tailored to align with the preceding dialog history and user input.

This intricate interplay between tokens, dialog history, prompts, and responses constitutes the intricate anatomy of AI chatbots, showcasing the fusion of linguistic precision and computational prowess that underpins modern conversational AI systems.

Not Ready for Real Time

In the landscape of chat-based Large Language Models (LLMs), the transmission of tokens is a critical aspect that directly impacts security and privacy. While some platforms, such as Google Gemini, have implemented measures to address certain risks, the majority of widely available LLMs opt for immediate token transmission after generation. This real-time approach, while responsive, introduces a significant side-channel vulnerability.

The essence of this vulnerability lies in the individual transmission of tokens, where each token is sent separately and can be intercepted by adversaries with passive Attack-in-the-Middle (AitM) capabilities. This allows attackers to measure the length of tokens irrespective of encryption, potentially compromising user privacy.

For instance, consider an AI assistant generating the phrase “You should see a doctor.” When transmitted as individual tokens, each word becomes a separate packet, revealing their respective lengths in the payload size: 3, 6, 3, 1, 6 (excluding static overhead). Although the content remains encrypted, the pattern of token lengths provides insights into the order and approximate length of words, which can be analyzed by adversaries.

In contrast, when tokens are transmitted collectively, the payload size reflects the total length of the message, making it challenging for attackers to discern individual token lengths in real time. This mechanism not only protects user prompts sent to the AI but also mitigates the risk of attackers deciphering token specifics.

Researchers emphasize that in real-time communication settings, the immediate transmission of tokens (denoted ri) after generation exposes a clear correlation between token length (ti) and character count (|ri|). This correlation can be exploited through payload length differentials in cumulative messages, revealing token lengths and potentially breaching the privacy of conversations.
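That correlation can be sketched directly: whether the eavesdropper sees one packet per token (with a fixed record overhead, assumed known here) or cumulative message sizes, simple arithmetic recovers the token-length sequence:

```python
def lengths_from_packets(payload_sizes, overhead):
    """One packet per token: subtracting the fixed per-record overhead
    (headers, auth tag, etc.) leaves each token's length."""
    return [size - overhead for size in payload_sizes]

def lengths_from_cumulative(cumulative_sizes):
    """Each packet carries the whole response so far: successive payload
    size differences are the token lengths."""
    prev, lengths = 0, []
    for size in cumulative_sizes:
        lengths.append(size - prev)
        prev = size
    return lengths
```

With a hypothetical 5-byte overhead, payload sizes of 8, 11, 8, 6, 11 yield exactly the 3, 6, 3, 1, 6 sequence of the “You should see a doctor.” example above.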

Furthermore, these vulnerabilities extend beyond mere privacy breaches to potential exploitation of user prompts. This can occur either directly through the repetition of prompts by the AI or indirectly through contextual inference based on token lengths and patterns.

The intricate interplay between token transmission methods, encryption, and side channel vulnerabilities underscores the necessity for robust security protocols in chat-based LLMs. As the AI landscape continues to evolve, addressing these nuanced vulnerabilities becomes imperative to safeguard user privacy and uphold the integrity of conversational AI platforms.

Table 1 in the paper breaks down chatbots from various AI providers to show which ones were, or remain, vulnerable to the attack; the table itself is not reproduced here.

Figure: Attack framework overview. (1) Encrypted traffic is intercepted, and (2) the start of the response is identified. (3) The token-length sequence T is extracted, and (4) a heuristic is used to partition T into ordered segments (T0, T1, …). Finally, (5) each segment is used to infer the text of the response: (A) two specialized LLMs predict each segment sequentially based on prior outputs, (B) multiple options are generated for each segment and the best (most confident) result is selected, and (C) the predicted response R̂ is resolved by concatenating the best segments.

A Complete Breach of Confidentiality

Recent developments in the realm of artificial intelligence (AI) and chatbot technology have unveiled concerning vulnerabilities that pose a significant threat to user privacy and confidentiality. Attack methodologies targeting chatbots, particularly those employing large language models (LLMs), have demonstrated both successes and failures, shedding light on the urgent need for enhanced security measures in the digital landscape.

One of the pivotal aspects of these attacks is the utilization of cosine similarity (φ) as a metric for measuring success. A cosine similarity of φ > 0.5 is deemed a successful attack, emphasizing the nuanced nature of breaching confidentiality in AI-driven conversations. Researchers, such as Weiss et al., have underscored the efficacy of this approach despite initial accuracy limitations. While an attack may exhibit only 29 percent perfect accuracy and 55 percent high accuracy, it can still compromise confidentiality significantly. This is achieved by leveraging a sentence transformer model to compute the cosine similarity between predicted and actual responses, transcending mere word accuracy assessments like ROUGE.
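The success metric itself is straightforward to compute: φ is the cosine similarity between embeddings of the predicted and actual responses. The sketch below takes the embedding vectors as given (in the paper they come from a sentence transformer model):

```python
import math

def cosine_similarity(u, v):
    """phi = (u . v) / (||u|| * ||v||); 1.0 means identical direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def attack_succeeded(u, v, threshold=0.5):
    """The paper's criterion: the attack counts as a success when the
    embeddings of predicted and actual responses have phi > 0.5."""
    return cosine_similarity(u, v) > threshold
```

Because the comparison happens in embedding space, a prediction can miss individual words yet still count as a breach if it captures the response’s meaning.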

Mirsky, in an online interview, highlighted the attack’s proficiency in deciphering responses to common queries while acknowledging challenges in handling arbitrary content. Training LLMs to accurately predict word placements within token sequences has also been a daunting task, further complicating the defense against such attacks.

A critical aspect of these vulnerabilities lies in the ability to intercept and analyze packets sent by chatbots to end-users. While this is relatively feasible in a localized network environment, it becomes increasingly challenging in cross-network scenarios, particularly without nation-state resources or granular ISP access. Encryption, intended to thwart such interception, does not prevent the attack, necessitating robust mitigation strategies.

Two proposals have been put forth to mitigate these vulnerabilities.

  • First, adopting a batched packet transmission approach, as Google does, so that tokens are sent together and individual token lengths are never visible on the wire.
  • Second, applying “padding”: appending random-length filler to each packet so that packet sizes no longer track token lengths.

However, these measures may impact user experience, leading to potential delays or increased traffic.
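The padding idea can be sketched as follows: every token is padded with random bytes up to a fixed bucket size before transmission, so short and long tokens become indistinguishable on the wire. The block size and length-prefix framing below are hypothetical choices, not what any provider actually deployed:

```python
import secrets

BLOCK = 16  # hypothetical padding granularity, in bytes

def pad_token(token: bytes) -> bytes:
    """Prefix the true length (one byte), then pad with random bytes up
    to the next multiple of BLOCK, so every token of 1-16 bytes produces
    an identical-looking payload at the cost of extra traffic."""
    if not 0 < len(token) <= 255:
        raise ValueError("token length must fit in one length byte")
    target = -(-len(token) // BLOCK) * BLOCK  # round up to BLOCK
    return bytes([len(token)]) + token + secrets.token_bytes(target - len(token))

def unpad_token(payload: bytes) -> bytes:
    """Receiver strips the padding using the length prefix."""
    return payload[1 : 1 + payload[0]]
```

After padding, “ a” and “ doctor” produce payloads of identical size, which is precisely the traffic-versus-privacy trade-off the researchers describe.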

Recent responses from industry players such as OpenAI, Cloudflare, and Microsoft highlight ongoing efforts to address these vulnerabilities. OpenAI and Cloudflare have implemented padding mitigations, acknowledging the urgency of securing chatbot interactions. Microsoft has emphasized the importance of addressing vulnerabilities promptly to safeguard customer data.

As the landscape of chat-based LLMs continues to expand, stakeholders must prioritize understanding and mitigating these vulnerabilities. This research serves as a pivotal reference for industry professionals involved in the development and deployment of AI-driven chatbot systems, underscoring the imperative of proactive security measures to safeguard user confidentiality and privacy in the digital age.

Figure: A sample of attack successes and failures on R0. A cosine similarity of φ > 0.5 is considered a successful attack.
