Role: Data Architect (with Snowflake experience)
Location: Mexico - Remote
Rate: $30/hr
Job description:
* Develop and automate evaluation frameworks to benchmark Gemini and other LLMs across accuracy, grounding, hallucination rate, latency, and contextual coherence.
* Design, run, and analyze controlled experiments (A/B testing, prompt variations) to measure the performance impact of prompt tuning and parameter adjustments.
* Use Python (pandas, NumPy, SciPy) to clean, transform, and analyze datasets; apply statistical testing, regression, and hypothesis validation for model comparisons.
* Use Google Colab, Jupyter, Gemini CLI, Vertex AI, and AI Studio proficiently to run reproducible experiments and maintain model evaluation pipelines.
* Build dashboards and visual reports (Matplotlib, Plotly, Looker Studio) to communicate insights and performance trends effectively to stakeholders.
* Work closely with ML engineers, architects, and AI product teams to interpret results, refine prompts, and guide model retraining strategies.
* Maintain clear experiment logs, reproducible notebooks, and result repositories for internal validation and audit.
* Ph.D. (preferred) or master's degree in computer science, data science, ML, or applied mathematics, with 6–10 years of relevant experience.
* Hands-on exposure to LLM evaluation, NLP benchmarking, or AI experimentation is essential.
* Experience with Gemini models is strongly preferred.
* Familiarity with RAG frameworks, evaluation agents, or LLMOps.
* Exposure to the GCP ecosystem (Vertex AI, BigQuery, AI Studio).
* Strong research and analytical mindset, with clear communication of technical findings.
Job type: Full-time
Salary: $80, $90,000.00 per month
Experience:
* Snowflake: 1 year (required)
* Vertex AI: 1 year (preferred)
* Gemini: 1 year (preferred)
Language:
* Advanced English (required)
Work location: Remote