Paperpile

Backend Engineer

Location: Global · Category: Engineering & Tech · Level: Senior

Posted May 7, 2026

About the Role

Paperpile runs on data at scale, with a literature database of 250M+ academic papers and a growing body of user data accumulated over more than a decade. You'll work across the systems that ingest, process, store, and serve this data reliably: building pipelines, optimizing search, handling PDFs at scale, and exposing clean APIs.

Requirements:

Strong backend engineering background with experience building and operating data-heavy systems in production.
Experience deploying and operating services on AWS.
Experience designing and maintaining data ingestion pipelines handling messy, heterogeneous sources.
Comfortable with web scraping and working with third-party data sources and APIs.
Familiarity with Node.js and TypeScript (experience in Java or Python is also acceptable).
High standards for data quality, including correctness, deduplication, and consistency.
Solid understanding of full-text search systems including indexing strategy, relevance tuning, and query optimization.
Proficient in building reliable REST APIs.

Useful Experience:

Familiarity with academic publishing formats and data sources (PubMed, Crossref, arXiv).
Experience with PDF processing pipelines (extraction, transformation, storage, and delivery at scale).
Experience with LLM-based document processing or ML pipelines for extracting structured data from unstructured text.
Experience with large-scale web crawling and scraping.

Required Skills

Node.js, TypeScript, AWS, Data Pipelines, Web Scraping, Full-text Search, REST APIs, PDF Processing, LLM, Machine Learning
