
As U.S. technology companies race to build artificial intelligence systems that can reason more like humans, a major South Korean internet company is preparing its own answer to what many see as the next structural shift in AI design.
Naver, South Korea’s dominant search and internet platform, is set to unveil a new “omnimodal” AI model later this month, according to people familiar with the project.
The system, developed as an extension of the company’s proprietary HyperCLOVA X platform, is designed to process text, images, and audio simultaneously from the outset, rather than stitching those capabilities together after training.
The announcement places Naver in a competitive field increasingly defined by companies such as OpenAI, Google, and Meta, which are pushing beyond traditional multimodal systems toward more integrated architectures.
While most current AI models combine separate text, vision, and audio components, omnimodal systems are built as a single structure that learns across multiple forms of information at once, a distinction that developers say can materially change how models understand context and intent.
For users, the difference is less about terminology and more about experience. Today’s AI systems often struggle to maintain consistency when switching between text, images, and voice. An omnimodal model aims to remove those seams.
A question can be asked in speech, supplemented with an image, and answered in text or audio without the system treating each step as a separate task.
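To make that distinction concrete, the toy Python sketch below contrasts a "stitched" pipeline, which hands a request off between separate speech, vision, and language components, with a unified model that reads every modality as one interleaved sequence. This is an illustrative simplification only; the class and function names are hypothetical and are not drawn from HyperCLOVA X or any other real system.

```python
# Illustrative sketch only: contrasting a stitched multimodal pipeline with
# a unified (omnimodal) model. All names here are hypothetical.

from dataclasses import dataclass

@dataclass
class Token:
    modality: str   # "text", "image", or "audio"
    value: str      # placeholder for an embedded word, image patch, or audio frame

class StitchedPipeline:
    """Separate components per modality, combined after the fact."""
    def answer(self, text=None, image=None, audio=None):
        parts = []
        if audio is not None:
            parts.append(f"[speech model transcribes: {audio}]")
        if image is not None:
            parts.append(f"[vision model captions: {image}]")
        if text is not None:
            parts.append(f"[language model reads: {text}]")
        # Each step is handled in isolation; context can be lost at the seams.
        return " -> ".join(parts)

class UnifiedModel:
    """One model consumes every modality in a single interleaved sequence."""
    def answer(self, *tokens: Token):
        # All tokens share one context window, so the model can relate a
        # spoken question and an attached image in a single pass.
        sequence = [f"{t.modality}:{t.value}" for t in tokens]
        return f"single forward pass over {sequence}"

if __name__ == "__main__":
    print(StitchedPipeline().answer(image="dog.jpg", audio="what breed is this?"))
    print(UnifiedModel().answer(Token("audio", "what breed is this?"),
                                Token("image", "dog.jpg")))
```

In the stitched version, the transcription, caption, and final answer are produced by different components, so each hand-off is a point where context can be dropped; in the unified version, the spoken question and the image sit in the same sequence from the start, which is the property omnimodal designs are built around.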
Supporters argue that this unified approach improves reliability when dealing with complex, real-world situations where information does not arrive in a single format.
Within the AI industry, omnimodal architectures are increasingly discussed as a necessary step toward more general-purpose systems.
While artificial general intelligence remains an abstract and contested concept, many researchers see native integration of multiple data types as a prerequisite for models that can reason flexibly rather than perform narrowly defined tasks.
Naver’s initial release, however, will not be a massive, compute-heavy model. Instead, the company plans to debut a relatively lightweight version, reflecting a strategic emphasis on efficiency and cost control.
By validating its development approach with a smaller model first, Naver aims to reduce infrastructure risk while preserving the option to scale quickly by adding more data and graphics processing units once performance and stability are proven.
The effort also reflects a broader push by Naver to rely less on U.S.-based AI platforms and develop its own foundational technology.
Earlier this year, Naver Cloud, the company’s cloud computing arm, was selected as one of five lead participants in a South Korean government-backed initiative to build an independent AI foundation model capable of integrated data understanding and generation.
A company representative said Naver’s focus is on establishing a stable and scalable omnimodal framework, with plans to deploy models of different sizes depending on service and industry needs.
As the global AI race shifts from feature counts to underlying architecture, Naver’s move highlights how non-U.S. technology firms are positioning themselves in a landscape still largely shaped by American companies, but no longer defined by them alone.
