A Community That Met Me Halfway
The people, projects, and conversations that turned a moment of change into a community I could give back to.
Open Science Community
Research notes, technical essays, and personal stories from the Cohere Labs Community
Research notes, stories, and ideas from people shaping AI together—transparent, collaborative, and community-led.
A 2,312-prompt, 23-language benchmark for child–AI conversations that evaluates four production models and validates the LLM-as-judge pipeline with five independent judges (Cohen's κ up to 0.71).
What happens to a multilingual model's safety guardrails when you fine-tune it on harmful data and probe it with code-mixed inputs, and why current binary benchmarks can't tell you.
A community lead reflects on three years of learning, research, and building programs inside the Cohere Labs Open Science Community.
A Hungarian cultural riddle benchmark shows that fluent multilingual models can still miss local knowledge and hallucinate culturally plausible answers.