Data Collection Research Software Engineer at GitHub

Remote | Full-time

Title: Data Collection Research / Software Engineer

Location: Remote – Global

GitHub is seeking a research/software engineer with expertise in programming languages and/or software engineering to join the Copilot team as part of GitHub Next. GitHub Next aims to be a meeting place within GitHub for experimenting with new ideas and for setting the agenda for GitHub’s product several years in advance. Next has a small number of permanent research staff, and this position is part of the core Next team.

This engineer will collaborate with OpenAI and Microsoft Research to collect and process large-scale data sets to improve the OpenAI Codex model that powers Copilot. The ideal candidate will have experience with mining software repositories, large-scale program analysis, and/or creating benchmark sets for machine learning for code tasks.


Responsibilities:

  • Collaborate with OpenAI to improve the quality of the Codex code synthesis model.
  • Undertake short- and medium-term research projects in the area of code synthesis, and ship improvements to the production model.
  • Participate in all activities of GitHub Next: organizing webinar series, evaluating project proposals, and disseminating research results.

Minimum Qualifications:

  • Ability to do innovative research on one of the following topics: mining software repositories, program analysis (static or dynamic), program synthesis, machine learning for code.
  • 3+ years of experience building developer tools in production.
  • Inclination to prototype quickly and to decide quickly when an experiment has failed.
  • A creative mindset and good practical skills are more important than formal experience.

Preferred Qualifications:

  • PhD in computer science or related field, or other evidence of the ability to do independent research.
  • Knowledge of Python or JavaScript and the corresponding ecosystem, or the ability to acquire such knowledge quickly.
  • Experience analyzing and/or mining large software repositories.
  • Ability to communicate complex ideas clearly, both in spoken and written form, for expert as well as novice audiences.
  • Interest in modern AI technologies and program synthesis in particular.
