How It Works

The co-scientist follows a closed-loop research workflow. Each project moves through four stages, and the knowledge gained compounds across projects.

Plan

Define a research question and hypothesis. The co-scientist drafts a RESEARCH_PLAN.md with specific analyses and expected outcomes.

Run

Execute analyses as Jupyter notebooks on BERDL JupyterHub. Query collections via Spark SQL, generate figures, and produce data outputs.

Learn

Synthesize findings into a REPORT.md with literature context. Capture pitfalls, performance tips, and discoveries for the shared knowledge base.

Reuse

Skills, memory, and data products from each project compound. The next project starts with everything the system has learned so far.
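As a rough illustration of the Run stage described above, the sketch below builds a Spark SQL query as a string and runs it on a Spark session. The table name `pangenome.genome`, the column `species_id`, and both helper functions are hypothetical names chosen for this example, not part of BERDL. The sketch also assumes the notebook already has a `SparkSession` available.

```python
import re

# Accept only plain or schema-qualified identifiers such as "db.table".
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def build_genome_count_query(table: str, min_genomes: int, limit: int = 10) -> str:
    """Build a Spark SQL query that counts genomes per species.

    The table and column names are hypothetical placeholders.
    """
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    return (
        "SELECT species_id, COUNT(*) AS n_genomes "
        f"FROM {table} "
        "GROUP BY species_id "
        f"HAVING COUNT(*) >= {int(min_genomes)} "
        "ORDER BY n_genomes DESC "
        f"LIMIT {int(limit)}"
    )


def run_query(spark, sql: str):
    """Run the query on an existing SparkSession and return a pandas DataFrame."""
    return spark.sql(sql).toPandas()


query = build_genome_count_query("pangenome.genome", min_genomes=50)
print(query)
```

In a notebook, `run_query(spark, query)` would return a small pandas DataFrame that can be plotted directly. Keeping query construction separate from execution makes the SQL easy to inspect and reuse across notebooks.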

Skills

Reusable tools the co-scientist invokes during research. Each skill encapsulates domain knowledge and workflow patterns learned from prior projects.

berdl

Query the KBase BERDL (BER Data Lakehouse) databases. Use when the user asks to explore pangenome data, query species information, get genome statisti...

berdl-discover

Discover and document BERDL databases. Use when the user wants to explore a new database, generate documentation for a database, or create a module fi...

berdl-ingest

Ingest a local dataset into the BERDL Lakehouse from a local (off-cluster) machine. Handles data format detection and preparation, MinIO upload, and D...

berdl-minio

Retrieve and use BERDL MinIO credentials and transfer result artifacts between BERDL object storage and the local machine. Use when exported query res...

berdl-query

Run SQL queries from a local machine against a provisioned BERDL Spark cluster using spark_connect_remote. Use when the user wants remote Spark comput...

berdl-review

Run an independent AI review of a project or research plan. Use when you want feedback without the full /submit checklist.

Example: The 5,526 Costly + Dispensable Genes

Research Question: What characterizes genes that are simultaneously burdensome (fitness improves when deleted) and not conserved in the pangenome? Are they mobile elements, recent acquisitions, degraded pathways, or something else?

Plan

RESEARCH_PLAN.md

Run

3 notebooks, 6 figures

Learn

REPORT.md with 547 references

Reuse

0 data products
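The "costly + dispensable" criterion in the research question above can be illustrated with a minimal sketch. It assumes a gene counts as costly when its deletion fitness is above a threshold of 0, and as dispensable when it is not flagged as core in the pangenome. The field names, the threshold, and the toy records are assumptions made for illustration, not the project's actual criteria or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    gene_id: str
    deletion_fitness: float  # fitness change when the gene is deleted; > 0 means fitness improves
    is_core: bool            # True if the gene is conserved across the pangenome


def is_costly(gene: Gene, threshold: float = 0.0) -> bool:
    """A gene is costly if deleting it improves fitness beyond the threshold."""
    return gene.deletion_fitness > threshold


def is_dispensable(gene: Gene) -> bool:
    """A gene is dispensable if it is not conserved in the pangenome."""
    return not gene.is_core


def costly_and_dispensable(genes, threshold: float = 0.0):
    """Return the IDs of genes that satisfy both criteria."""
    return [g.gene_id for g in genes if is_costly(g, threshold) and is_dispensable(g)]


# Toy records, invented for this example only.
genes = [
    Gene("geneA", 0.8, False),   # costly and dispensable
    Gene("geneB", 0.8, True),    # costly but conserved
    Gene("geneC", -1.2, False),  # dispensable but not costly
    Gene("geneD", 0.0, False),   # neutral, so not costly
]

selected = costly_and_dispensable(genes)
print(selected)  # ['geneA']
```

Only genes that meet both conditions are selected, which is why a gene that is costly yet conserved, or dispensable yet neutral, falls outside the set.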


Shared Memory

Knowledge captured during research that helps future projects avoid mistakes and build on prior findings.

Get Started

The co-scientist runs on BERDL JupyterHub with AI assistance via Claude Code and BERIL skills.