*Software Engineer, AI: Code Evaluation & Training (Remote)*
*List of accepted countries and locations*
Help train large language models (LLMs) to write production-grade code across a wide range of programming languages:
- *Compare & rank multiple code snippets*, explaining which is best and why.
- *Repair & refactor AI-generated code* for correctness, efficiency, and style.
- *Inject feedback* (ratings, edits, test results) into the RLHF pipeline and keep it running smoothly.
*End result*: the model learns to propose, critique, and improve code the way _you_ do.
*RLHF in one line*
Generate code ➜ expert engineers rank, edit, and justify ➜ convert that feedback into reward signals ➜ reinforcement learning tunes the model toward code you’d actually ship.
*What you’ll need*:
- *4+ years of professional software engineering experience* in *Java*
(constraint programming experience is a bonus, but not required)
- *Strong code-review instincts*: you can spot logic errors, performance traps, and security issues quickly.
- *Extreme attention to detail and excellent written communication skills.*
Much of this role involves explaining _why_ one approach is better than another. This cannot be overstated.
- You *enjoy reading documentation and language specs* and thrive in an asynchronous, low-oversight environment.
*What you don’t need*:
- No prior RLHF (reinforcement learning from human feedback) or AI training experience.
- No deep machine learning knowledge. If you can review and critique code clearly, we’ll teach you the rest.
*Tech stack*:
We are looking for engineers with a strong command of *Java*.
*Logistics*:
- *Location*: fully remote, work from anywhere
- *Compensation*: from $30/hr to $70/hr, depending on location and seniority
- *Hours*: minimum 15 hrs/week, up to 40 hrs/week available
- *Engagement*: 1099 contract