Product Introduction
- Definition: SERA (Soft-verified Efficient Repository Agents) is a family of open-source coding models (8B, 14B, 32B parameters) built on the Qwen 3 architecture. It specializes in agentic coding tasks like debugging, refactoring, and pull request generation.
- Core Value Proposition: SERA drastically reduces coding agent training costs via "soft-verified generation" (SVG), enabling affordable private codebase adaptation for organizations and researchers.
Main Features
- Soft-Verified Generation (SVG): Generates synthetic training data from partially correct code patches instead of fully verified solutions, using a taxonomy of 51 common bug patterns to diversify the data. This eliminates costly test infrastructure, cutting data generation costs by 26–57× vs. RL methods (see the first sketch after this list).
- Repository Specialization: Adapts models to internal codebases via targeted synthetic data. Training on 8,000 samples per repo (about $1,300) lets 32B models outperform 100B+ generalists (e.g., GLM-4.5-Air) on domain-specific tasks.
- NVIDIA-Optimized Inference: Supports BF16/FP8/NVFP4 precision on Hopper/Blackwell GPUs. Achieves 8,600 tokens/sec on 4x B200 GPUs with NVFP4. Compatible with Claude Code for seamless integration.
- Extended Context Handling: Trained at 32K context and scaled to 256K via RoPE scaling (see the second sketch after this list). Solves 54.2% of SWE-Bench Verified tasks at 64K context, rivaling Devstral Small 2 (50.0%).
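The SVG loop described in the first feature above can be illustrated with a minimal Python sketch. This is not the SERA pipeline itself: the three-entry taxonomy stands in for the 51-pattern menu, `inject_bug` and `run_agent` are placeholders for the teacher-model and agent calls, and the soft check is reduced to "the repaired file still parses," so no per-repo test suite is needed.

```python
import ast
import random
from typing import Callable

def generate_svg_samples(
    sources: list[str],
    inject_bug: Callable[[str, str], str],  # teacher-model call (placeholder)
    run_agent: Callable[[str], str],        # agent rollout (placeholder)
    n_samples: int,
) -> list[dict]:
    """Collect SFT samples via soft verification: keep a repair if it
    passes a cheap syntactic check, not a full test suite."""
    # Stand-in for the 51-entry bug-pattern menu used to diversify data.
    taxonomy = [
        "off-by-one in loop bound",
        "swapped function arguments",
        "missing None check",
    ]
    samples: list[dict] = []
    while len(samples) < n_samples:
        src = random.choice(sources)
        bug_type = random.choice(taxonomy)
        buggy = inject_bug(src, bug_type)   # plant a known bug pattern
        repaired = run_agent(buggy)         # let the agent attempt a fix
        try:
            ast.parse(repaired)             # soft check: must parse, nothing more
        except SyntaxError:
            continue                        # discard; soft check failed
        samples.append(
            {"bug_type": bug_type, "prompt": buggy, "completion": repaired}
        )
    return samples
```

The soft check is the cost lever: partially correct patches are accepted as training signal, so no test infrastructure has to be stood up per repository.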
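For the context-extension feature, the second sketch shows a standard RoPE-scaling configuration via Hugging Face transformers. The YaRN scaler, the 8× factor (32K → 256K), and the repo id `example-org/SERA-32B` are illustrative assumptions; the section above does not state which RoPE scaling method SERA actually ships.

```python
from transformers import AutoConfig, AutoModelForCausalLM

MODEL_ID = "example-org/SERA-32B"  # hypothetical Hugging Face repo id

cfg = AutoConfig.from_pretrained(MODEL_ID)
# An 8x factor takes the 32K training context to the 256K target.
# YaRN is an assumption here; SERA may use a different RoPE scaler.
cfg.rope_scaling = {
    "rope_type": "yarn",
    "factor": 8.0,
    "original_max_position_embeddings": 32768,
}
cfg.max_position_embeddings = 262144  # 256K target window
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, config=cfg)
```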
Problems Solved
- Pain Point: Existing coding agents (e.g., Devin, SWE-agent) lack knowledge of private APIs and codebases, and customizing them requires expensive RL pipelines ($500k+).
- Target Audience:
  - Software Teams: Adapts to internal stacks (e.g., Django, SymPy) for automated maintenance.
  - ML Researchers: Lowers SOTA reproduction cost to $400 (vs. $12,000 for industry equivalents).
  - Indie Developers: Runs on 2x NVIDIA RTX PRO 6000 Blackwell GPUs (40 GPU days for SERA-32B).
- Use Cases:
  - Debugging proprietary financial systems using internal data.
  - Generating verified patches for open-source projects.
  - Low-cost fine-tuning for academic AI labs (a fine-tuning sketch follows this list).
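For the academic-lab use case above, a plain supervised fine-tune over SVG samples is all that is required (no RL stage). The sketch below is one way to do it with Hugging Face transformers; the Qwen3 base id, the `svg_samples.jsonl` file name, and the flat `text` schema are illustrative assumptions.

```python
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE = "Qwen/Qwen3-8B"  # SERA builds on Qwen 3; the exact base id is an assumption

tok = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16)

# Assumed schema: each SVG sample rendered to a single "text" field
# (buggy context + agent trajectory + repair).
ds = load_dataset("json", data_files="svg_samples.jsonl", split="train")
ds = ds.map(
    lambda ex: tok(ex["text"], truncation=True, max_length=32768),
    remove_columns=ds.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="sera-repo-sft",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=2,
        learning_rate=1e-5,
        bf16=True,
    ),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```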
Unique Advantages
- Differentiation: Outperforms SkyRL and SWE-smith at 26× lower training costs. Matches Devstral Small 2 performance with pure SFT (no RL needed).
- Key Innovation: SVG decouples workflow simulation from code correctness, enabling high-fidelity synthetic data from any repository. Combined with the bug-type menu, it makes data generation up to 100× cheaper than hard-verified methods.
Frequently Asked Questions (FAQ)
- How does SERA reduce coding agent training costs?
  SERA uses soft-verified generation (SVG) to create synthetic data without full test verification, cutting the cost of SOTA replication to $400 vs. $12,000 for comparable models.
- Can SERA adapt to my company’s private codebase?
  Yes. SERA fine-tunes on 8,000 synthetic samples per repository (about $1,300), specializing 32B models to outperform 100B+ generalists like GLM-4.5-Air on internal code.
- What hardware is needed to run SERA-32B?
  SERA is optimized for NVIDIA Hopper/Blackwell. It trains on 2x H100 GPUs (BF16) and achieves 8,600 tokens/sec on 4x B200 GPUs (NVFP4) for inference (a serving sketch follows this list).
- How does SERA compare to Devstral Small 2?
  At 64K context, SERA-32B solves 54.2% of SWE-Bench Verified tasks vs. Devstral’s 50.0%, with 57× lower training costs and no RL dependency.
- Is SERA compatible with existing AI tools?
  Yes. SERA integrates with Claude Code and ships open weights and data on Hugging Face. Deployment requires 2 CLI commands.
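To make the deployment answers concrete, here is a hedged serving sketch using vLLM's Python API. The Hugging Face repo id is hypothetical, and the tensor-parallel degree and context length simply mirror the figures quoted above; the "2 CLI commands" refer to the project's own published instructions, which are not reproduced here.

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="example-org/SERA-32B",  # hypothetical Hugging Face repo id
    tensor_parallel_size=4,        # e.g. 4x B200, matching the throughput figure
    max_model_len=65536,           # 64K context, as in the SWE-Bench numbers
)
params = SamplingParams(temperature=0.2, max_tokens=2048)
outputs = llm.generate(
    ["Locate and fix the failing import in utils/io.py:\n..."], params
)
print(outputs[0].outputs[0].text)
```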
