LATAM Applied AI Engineer - Remote

  • Santiago del Estero Province, Argentina
  • Isaac
  • Full-Time
  • Remote

Job Description:

LATAM APPLIED AI ENGINEER

Build agents for the entertainment industry and creator economy. We are hiring an Applied AI Engineer dedicated to agent reliability.

The mandate is deliberately narrow. We don't need another full-stack engineer or a generic senior architect. We need someone whose hands stay in the eval pipeline and who turns the multi-agent system from "works in demos" into "survives non-deterministic LLM output at production scale."

You will partner with the technical co-founder on architecture decisions where reliability is at stake, including HITL escalation, thread context collapse, and multi-channel routing. The team owns the long-term evals vision, and this hire builds and runs the pipeline day-to-day.

WHAT YOU'LL OWN

• Instrument the scheduling agent end-to-end with traces on top of the existing Langfuse deployment

• Build eval datasets from real production traffic across the agent layer (scheduling, notetaker, iMessage, Gmail)

• Stand up the Braintrust scoring pipeline, including both quality scorers and robustness scorers (did the system survive the LLM output, did HITL trigger when it should)

• Own the feedback loop: eval results drive prompt and architecture changes, which are then re-evaluated. As the system matures, this loop will include DSPy and DPO

• Partner with the technical co-founder on agent architecture decisions where reliability is at stake

• Operate end-to-end on your workstream: dev, test, deploy, and on-call support
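
To make the split between quality scorers and robustness scorers concrete, here is a minimal sketch in plain Python. It is an illustration only and does not use the Braintrust SDK. The `AgentRun` record, its field names, the JSON output shape, and the rule "unusable output should escalate, usable output should not" are all assumptions made for this example, not a description of the production system.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentRun:
    """One recorded scheduling-agent turn (hypothetical shape)."""
    raw_llm_output: str       # text the model returned
    escalated_to_human: bool  # whether the HITL path fired
    expected_slot: str        # labeled correct meeting slot


def parse_slot(raw: str) -> Optional[str]:
    """Return the proposed slot, or None when the output is not usable."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    slot = payload.get("slot")
    return slot if isinstance(slot, str) else None


def quality_score(run: AgentRun) -> float:
    """Quality: did the agent propose the labeled slot?"""
    slot = parse_slot(run.raw_llm_output)
    return 1.0 if slot is not None and slot == run.expected_slot else 0.0


def robustness_score(run: AgentRun) -> float:
    """Robustness: did HITL fire exactly when the output was unusable?

    Assumed rule for this sketch: unusable output must escalate,
    usable output must not.
    """
    should_escalate = parse_slot(run.raw_llm_output) is None
    return 1.0 if run.escalated_to_human == should_escalate else 0.0


# Three hypothetical runs: a clean answer, free text that was escalated,
# and malformed JSON that slipped through without escalation.
runs = [
    AgentRun('{"slot": "2025-03-04T15:00"}', False, "2025-03-04T15:00"),
    AgentRun("Sure! How about Tuesday?", True, "2025-03-04T15:00"),
    AgentRun('{"slot": 42}', False, "2025-03-04T15:00"),
]

for run in runs:
    print(quality_score(run), robustness_score(run))
```

The second run is the interesting case: the answer is wrong on quality, yet the system behaved correctly because it escalated. The third run fails both scorers, which is the kind of failure that only shows up under real traffic.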

ABOUT THE COMPANY

The company builds agents for the global entertainment industry and creator economy. Professionals at WME, UTA, Netflix, Night, and Live Nation use the product today in private beta.

The company has a massive trust moat: a proprietary distribution network of 270K+ verified entertainment professionals. It was founded by Vince Morales (2x founder, UTA Ventures, Elevate Ventures), Warner Bailey (2x founder, WME, Live Nation), and Ryan McCaffrey (WMG, Hebbia AI, Robinhood), with technical talent from NVIDIA, Intuit, and HubSpot.

The company oversubscribed its $400K round with angels from Coatue, Ramp, Plug and Play, FanFix, Temple Hill, Outshine Talent, and Undercurrent Talent. It is now raising $2M, with $500K+ already committed.

MUST-HAVE REQUIREMENTS

1. Real production experience with non-deterministic LLM systems. Evidence of shipping, debugging, and maintaining agent systems under real traffic (not side projects or OpenAI API experimentation)

2. Has built or operated an eval pipeline before. If not Braintrust specifically, the intuition for scorer design has to be there

3. Product-first mindset applied to evals. Can take scorer design and feedback loops and drive a real product forward (not just build models in isolation)

4. Strong Python backend fluency. The stack is FastAPI, Django, Celery, and LangGraph

5. Comfortable with observability tooling (Langfuse or equivalent) and logs/metrics stacks like Grafana or Loki

6. Senior enough to own a workstream end-to-end, from dev to test to deploy to on-call support, without relying on pair programming

7. Works async, writes clearly, and is fluent in English

NICE-TO-HAVE

• DSPy, DPO, or other LLM optimization experience

• Prior agentic systems work with LangGraph or similar (tool use, planning, multi-step)

• Time at a prestige-tier LATAM company (Nubank, Itaú, iFood) or strong YC/US startup

• Exposure to multi-model stacks (GPT-4.1/4o, Claude, Whisper) in production

Salary: $84,000 - $120,000 per year | Experience: 3+ years | Stage: Pre-seed