docs: propose agentic discovery + inference engineering additions to CV#3
ctr26 wants to merge 10 commits into
65d0ae7 to 06bd620
…); add ShapeEmbed + MIFA to CV pubs
ctr26
left a comment
also add that I'm keen to work more with agentic swarms: using agents with strong inference programming to use tools and counterfactuals to discover novel biology
| ## Education
| **PhD, Engineering** — University of Cambridge, 2014 – 2018
| EPSRC PES-CDT Photonic & Electronic Systems studentship (£120k).
mention microscopy and image analysis
| **MRes, Photonics** — University of Cambridge & UCL, 2013 – 2014
| EPSRC Photonics CDT. Modules in computer vision, quantum mechanics, photonics; thesis on structured-illumination microscopy reconstruction.
| **MSci, Physics (First-Class Honours)** — Nottingham Trent University, 2009 – 2013
| Top physics graduate of cohort.
I don't need two separate CVs
mention Recursion's Maps as a comparison to atlases
| ## Why Sanger, and why now
| This Fellowship sits where my work is heading.
too casual
needs to say that there's a growing call for the virtual cell and that there are two ways to get there: top-down or bottom-up
bottom-up, even with conservative estimates, takes compute on the timescale of the universe without a quantum leap
top-down MIGHT be enough to at least resolve disease without needing to bridge to bottom-up
| Sanger is the only place in the UK where atlas-scale single-cell data (Cellular Genomics), a programme committed to predictive biology (Generative Genomics), and a strategic embrace of AI for science share one building.
| I want to spend three years here.
| I am Senior Machine Learning Scientist at Valence Labs / Recursion Pharmaceuticals.
| I lead components of the Virtual Cell initiative, fine-tuning multi-modal LLMs over knowledge graphs, free text, transcriptomic and phenotypic imaging data to predict cell state and drug response.
| I co-developed TxPert (arXiv:2505.14919), a state-of-the-art transcriptomic perturbation predictor, and contributed to the Boltz2 proteome-scale virtual screening pipeline.
| Before Recursion I led AI engineering at EMBL-EBI's BioImage Archive and Uhlmann Group, supervised six PhD students, and shipped `bioimage_embed` (self-supervised learning for biological images) and `shape_embed` (cell-shape representation, arXiv:2507.01009) as community resources.
I lead projects exploring agnostic representation learning of biology through images, in collaboration with Google Cloud, as well as research efforts like thinking carefully about
…anglicise
- Strip promotional language: drop "state-of-the-art", "unparalleled",
"unique skills land best", "missing piece", "novelty I would bring".
- Anglicise: "place in the UK"→"UK site"; "live in"→"sit in"; "Sincerely,"
→"Yours sincerely,"; "Prof."→"Prof"; double→single quotes; drop
hyphen after -ly adverb ("locally-relevant"→"locally relevant");
"Year two onward"→"Years two and three".
- Replace industry register: "shipped"→"released"/"built"; "went into"
→"moved to"; "target"→"focus on"; "natural adjacencies"→
"complementary directions"; "embrace"→"commitment".
- Strunk pass: drop empty intensifiers ("exactly", "actually",
"explicitly"); fix double-with ("with shared latent space, with
knowledge graphs"→"with a shared latent space, using knowledge
graphs"); fix possessive on "Lotfollahi and Haniffa's groups" (two
separate groups, not joint).
- Standardise "multi-modal"→"multimodal" in writer's prose (Sanger's
own materials use "multimodal"; italicised quoted titles untouched).
- UK quote punctuation: question mark inside the closing quote where
the quotation is itself a question.
Substantive claims preserved: numerics (200M cells, 6 PhD students,
40+ researchers, €5M), all paper IDs (arXiv:2505.14919,
arXiv:2507.01009), institution names, dates, scope, "world's richest
genomics environment", "from scratch" (built light-sheet from raw
optics, not a kit).
- Replace casual opener with virtual-cell framing: top-down vs
bottom-up, with conservative bottom-up compute on the timescale
of the universe (cites Karr et al., Cell 2012, Mycoplasma
whole-cell model); top-down *might* resolve disease without
bridging.
- Add Recursion Maps / Sanger Atlases comparator: perturbation-
response geometry at industrial scale versus cell-type and tissue
coverage at population scale.
- Self-description: "Senior Machine Learning Scientist" →
"ML research scientist".
- Current Recursion work: lead with agnostic representation learning
of biology through images, Google Cloud collaboration, then the
Virtual Cell initiative ("thinking carefully about how to
fine-tune ..." preserving the existing TxPert/Boltz2 lineage).
- Agentic discovery: add agentic-swarm forward direction (ensembles
with strong inference programming, tools, counterfactuals,
cross-agent disagreement as verification signal).
May 9 comments on docs/sanger-gdm-academic-cv.md are already
addressed: the file was deleted in 74faa90 (single canonical CV in
index.md), and index.md already covers microscopy and image analysis.
Adds a markdown proposal under docs/ showing the additions to make to index.md (Core Strengths, Recursion bullet, Skills/ML&AI line). No edits to index.md itself yet — review the proposed text, then merge or hand-apply when you are happy.