Lila Sciences

Biotech

Co-Op, Data Extraction

$48–65k (AI estimate) · Cambridge, Massachusetts, United States · Internship

Market Sentiment

HIGH DEMAND

Neural analysis suggests this role is optimal for Mid+ candidates.

The Brief

“Co-Op, Data Extraction at Lila Sciences. Skills: Data extraction, Machine learning, NLP, Computer vision. Contribute to AI systems. Extract and structure knowledge.”

Industry & Context.

Biotech

What They're Looking For.

Must Have

Pursuing a Bachelor's, Master's, or PhD in Computer Science, Chemistry, Materials Science, or a related field; Solid foundation in machine learning fundamentals; Python proficiency

Nice to Have

Coursework or projects involving multimodal models; Coursework or projects involving document understanding; Experience working with messy, real-world datasets; Interest in scientific document parsing

What You'll Do.

Contribute to AI systems

Extract and structure knowledge

Structure unstructured scientific data

Run extraction pipelines

Share work through presentation

Contribute to publication

Contribute to open-source project

How You'll Work.

Team & Collaboration

Alongside research scientists; Alongside engineers

Communication Scope

Document findings clearly; Share work

Full Job Description

Your Impact at LILA

Lila Sciences builds AI systems that accelerate discovery across the physical and life sciences. Within Physical Sciences AI, our team works on turning unstructured scientific knowledge (e.g., literature, patents, technical reports) into structured signals that power downstream Lila applications. As a Data Extraction Co-Op, you will work alongside research scientists and engineers on a focused sub-problem in this stack. You will get hands-on experience fine-tuning and evaluating extraction models, building pipelines for messy real-world data, and shipping work that flows into production systems.

What You'll Be Building

Contribute to AI systems that extract and structure knowledge from scientific literature and patents, focused on a well-defined sub-problem
Fine-tune and evaluate language, multimodal, or specialized models for data extraction, with mentor guidance
Build and test pipelines that structure unstructured scientific data across text, tables, and visuals
Run extraction pipelines, analyze results, and document findings clearly
Share your work through a team presentation, write-up, or contribution to a publication or open-source project

What You'll Need to Succeed

Pursuing a Bachelor's, Master's, or PhD in Computer Science, Chemistry, Materials Science, or a related field
Solid foundation in machine learning fundamentals and Python
Familiarity with NLP or computer vision concepts
Curiosity about scientific data and willingness to learn quickly in a research setting

Bonus Points For

Coursework or projects involving multimodal models or document understanding (OCR, table/figure extraction)
Experience working with messy, real-world datasets
Interest in scientific document parsing

About LILA

Lila Sciences is building Scientific Superintelligence™ to solve humankind's greatest challenges. We believe science is the most inspiring frontier for AI. Rather than hard-coding expert knowledge into tools, LILA builds systems that can learn for themselv

How to Apply on Greenhouse

Create a Greenhouse profile before applying — it saves time across multiple applications.
Upload your resume as a PDF; the parser handles it better than Word.
Answer all knockout questions carefully — wrong answers auto-reject before a human sees you.
Enable email notifications to track application status in real time.
