Product Introduction
- Shisa.AI develops open-source bilingual Japanese-English (JA/EN) large language models (LLMs) focused on high-performance natural language processing for Japanese-language tasks.
- The core value of Shisa.AI lies in delivering state-of-the-art Japanese-English bilingual LLMs that rival frontier models such as GPT-4o and DeepSeek-V3 while remaining fully open source and transparent.
Main Features
- Shisa V2 405B, the flagship model, is built on Meta’s Llama 3.1 405B architecture, optimized for Japanese language understanding through extensive fine-tuning on curated JA/EN datasets.
- The product provides full open access to model weights, training datasets, and inference code: datasets and code are released under the Apache 2.0 license, while model weights carry their respective base-model licenses (the Llama 3.1 Community License for the 405B release), permitting both commercial and research use.
- Integrated chat demo interfaces and vLLM-optimized inference pipelines are included for immediate deployment on GPU clusters, including AMD MI300X configurations (a minimal vLLM sketch follows this list).
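As a rough illustration of the vLLM path, the sketch below loads the model for offline inference on a single 8-GPU node. The Hugging Face repository id and sampling settings are assumptions for illustration; check the actual release names before running anything.

```python
# Minimal sketch of offline inference with vLLM on one 8-GPU node.
# The repository id below is an assumption -- verify the published name.
from vllm import LLM, SamplingParams

llm = LLM(
    model="shisa-ai/shisa-v2-llama3.1-405b",  # assumed Hugging Face repo id
    tensor_parallel_size=8,                   # shard the 405B weights across 8 GPUs
)

params = SamplingParams(temperature=0.7, max_tokens=256)
prompts = ["日本語で自己紹介してください。"]  # "Please introduce yourself in Japanese."
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```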
Problems Solved
- Addresses the scarcity of high-quality Japanese-optimized LLMs by providing specialized models that outperform general-purpose models on tasks like translation, summarization, and question answering.
- Targets developers and enterprises requiring bilingual (JA/EN) NLP capabilities, particularly in industries like customer support, content localization, and academic research.
- Enables cost-effective deployment of Japanese-language AI solutions without dependence on proprietary APIs, as demonstrated in use cases such as automated document analysis and real-time chat systems (see the client sketch after this list).
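For teams replacing a proprietary API, one common pattern is to serve the model behind vLLM's OpenAI-compatible endpoint and keep existing client code unchanged. The sketch below assumes a server already started with `vllm serve` on localhost; the URL and model name are placeholders for your own deployment.

```python
# Minimal sketch: calling a self-hosted Shisa endpoint through the
# OpenAI-compatible API that vLLM exposes (e.g. started via `vllm serve ...`).
# The base URL and model name are placeholders, not a published endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="shisa-ai/shisa-v2-llama3.1-405b",  # whatever name the server registered
    messages=[{"role": "user", "content": "この文書を3行で要約してください。"}],  # "Summarize this document in 3 lines."
    temperature=0.2,
)
print(response.choices[0].message.content)
```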
Unique Advantages
- Unlike most bilingual models, which prioritize English, Shisa V2 405B achieves parity between Japanese and English performance through training techniques such as evolutionary model merging and culture-specific alignment.
- The model incorporates awareness of Japan-specific legal and cultural context, and its training data was assembled in compliance with Japanese copyright law, whose allowance for AI training was reaffirmed by the Ministry of Education, Culture, Sports, Science and Technology (MEXT) in April 2023.
- Competitive advantages include benchmark scores surpassing GPT-4o on the JGLUE (Japanese General Language Understanding Evaluation) and JAQKET benchmarks, alongside roughly 40% lower inference costs through vLLM optimizations.
Frequently Asked Questions (FAQ)
- How does Shisa V2 405B compare to GPT-4o for Japanese tasks? Shisa V2 405B outperforms GPT-4o on standardized Japanese benchmarks while offering full model control through open-source deployment, avoiding API latency and data-privacy concerns.
- Can Shisa models be used commercially? Yes. The training datasets and code are released under the Apache 2.0 license, and model weights carry their respective base-model licenses (the Llama 3.1 Community License for Shisa V2 405B), which permit commercial use, modification, and redistribution of derivative models.
- What hardware is required for deployment? The 405B-parameter model runs efficiently on 8x AMD MI300X or comparable NVIDIA H100 clusters, with quantized versions available for smaller-scale deployments (a quantized-deployment sketch follows this FAQ).
- How is Japanese cultural context handled? Training datasets include legally compliant Japanese copyright materials and culture-specific alignment layers, ensuring appropriate responses to honorifics and societal norms.
- Are custom fine-tuning capabilities supported? Users can leverage the released datasets and PyTorch-based training scripts to adapt models for domain-specific applications such as legal document analysis or technical support (a fine-tuning sketch follows this FAQ).
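On the quantization point, vLLM can either load a pre-quantized checkpoint or apply FP8 quantization at load time, which reduces the memory footprint of a deployment. The sketch below is illustrative only; the checkpoint name, quantization scheme, and GPU count are assumptions, not a published configuration.

```python
# Illustrative sketch of a quantized deployment with vLLM.
# Checkpoint name, quantization scheme, and GPU count are assumptions.
from vllm import LLM, SamplingParams

llm = LLM(
    model="shisa-ai/shisa-v2-llama3.1-405b",  # or a pre-quantized release, if one is published
    quantization="fp8",       # vLLM also loads AWQ/GPTQ checkpoints via this argument
    tensor_parallel_size=8,   # FP8 roughly halves memory versus BF16 on the same node
)

print(llm.generate(["こんにちは"], SamplingParams(max_tokens=64))[0].outputs[0].text)
```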
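For the fine-tuning question, one common route (not necessarily the project's own training scripts) is supervised fine-tuning with Hugging Face TRL on one of the smaller Shisa V2 checkpoints. The repository id, data file, and hyperparameters below are placeholders for illustration under those assumptions.

```python
# Hedged sketch: supervised fine-tuning of a smaller Shisa V2 checkpoint
# with Hugging Face TRL. Repo id, data file, and hyperparameters are
# placeholders, not the project's official training recipe.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# JSONL file with a "text" field holding formatted training examples.
dataset = load_dataset("json", data_files="domain_sft.jsonl", split="train")

config = SFTConfig(
    output_dir="./shisa-v2-domain-sft",
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=1e-5,
    bf16=True,
)

trainer = SFTTrainer(
    model="shisa-ai/shisa-v2-llama3.1-8b",  # assumed smaller Shisa V2 repo id
    args=config,
    train_dataset=dataset,
)
trainer.train()
```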