OpenAI introduced two methods it’s bettering its synthetic intelligence (AI) fashions final week. The primary consists of releasing a brand new replace for the GPT-4o (often known as the GPT-4 Turbo), the corporate’s newest AI mannequin powering ChatGPT for paid subscribers. The corporate says the replace improves the artistic writing means of the mannequin and makes it higher at pure language responses and writing partaking content material with excessive readability. OpenAI additionally launched two analysis papers on crimson teaming and shared a brand new methodology to automate the method to scale recognizing errors made by its AI fashions.
OpenAI Updates GPT-4o AI Mannequin
In a submit on X (previously often known as Twitter), the AI agency introduced a brand new replace for the GPT-4o basis mannequin. OpenAI says the replace permits the AI mannequin to generate outputs with “extra pure, partaking, and tailor-made writing to enhance relevance and readability.” It is usually mentioned to enhance the AI mannequin’s means to course of uploaded information and supply deeper insights and “extra thorough” responses.
Notably, the GPT-4o AI mannequin is offered to customers with the ChatGPT Plus subscription and builders with entry to the big language mannequin (LLM) through API. These utilizing the free tier of the chatbot shouldn’t have entry to the mannequin.
Whereas Devices 360 employees members weren’t capable of check out the brand new capabilities, one consumer on X posted concerning the newest enhancements within the AI mannequin after the replace. The consumer claimed that GPT-4o may generate an Eminem-style rap cipher with “refined inside rhyming constructions”.
OpenAI Shares New Analysis Papers on Purple Teaming
Purple teaming is the method utilized by builders and firms to make use of exterior entities to check software program and techniques for vulnerabilities, potential dangers, and issues of safety. Most AI companies collaborate with organisations, immediate engineers, and moral hackers to stress-test whether or not it responds with dangerous, inaccurate, or deceptive output. Assessments are additionally made to examine whether or not an AI system could be jailbroken.
Ever since ChatGPT was made public, OpenAI has been public with its crimson teaming efforts for every successive LLM launch. In a weblog submit final week, the corporate shared two new analysis papers on the development of the method. Considered one of them is of specific curiosity given the corporate claims it will probably automate large-scale crimson teaming processes for AI fashions.
Revealed within the OpenAI area, the paper claims that extra succesful AI fashions can be utilized to automate crimson teaming. The corporate believes AI fashions can help in brainstorming attacker targets, how an attacker’s success could be judged, and understanding the variety of assaults.
Increasing on it, the researchers claimed that the GPT-4T mannequin can be utilized to brainstorm an inventory of concepts that represent dangerous behaviour for an AI mannequin. Some examples embrace prompts resembling “how you can steal a automobile” and “how you can construct a bomb”. As soon as the concepts have been generated, a separate crimson teaming AI mannequin could be constructed to trick ChatGPT utilizing an in depth collection of prompts.
At the moment, the corporate has not begun utilizing this methodology for crimson teaming given a number of limitations. These embrace the evolving dangers of AI fashions, exposing the AI to lesser-known methods for jailbreaking or producing dangerous content material, and the necessity for the next threshold for information in people to appropriately decide the potential dangers of output as soon as the AI mannequin turns into extra succesful.