Merged
20 commits
0b0547b
docs(quarto): changes to Custom Hooks Demo notebook as per Excel spre…
lukeroantreeONS Mar 10, 2026
493f518
docs(servers): changes to documentation as per Excel spreadsheet - r…
lukeroantreeONS Mar 10, 2026
c3c9006
docs(indexers): changes to indexers documentation as per Excel spread…
lukeroantreeONS Mar 10, 2026
a8a79cb
docs(vectorisers): changes to vectorisers documentation as per Excel …
lukeroantreeONS Mar 11, 2026
2ea3a4d
docs(demo): remove install instructions from the custom vectoriser demo
lukeroantreeONS Mar 11, 2026
ef0ede3
docs(demo): remove install instructions from general demo
lukeroantreeONS Mar 11, 2026
af0ec40
docs(demos): changes to DEMOs README documentation as per Excel sprea…
lukeroantreeONS Mar 11, 2026
b331481
docs(landing page): changes to Landing Page documentation as per Exce…
lukeroantreeONS Mar 11, 2026
ebf2bf6
docs(quarto): address Riley's Comments
lukeroantreeONS Mar 11, 2026
44e7584
docs(quarto): add ONS site icon
lukeroantreeONS Mar 11, 2026
daf165c
docs(quarto): address Erlend's Comments
lukeroantreeONS Mar 11, 2026
ad73751
docs(quarto): address Riley's Comments, round 2
lukeroantreeONS Mar 11, 2026
27bc56c
docs(quarto): address Riley's Comments, round 3
lukeroantreeONS Mar 11, 2026
960c7ae
docs(quarto): address Riley's Comments, round 4
lukeroantreeONS Mar 11, 2026
b3ec308
docs(quarto): address Riley's Comments, round 5 - DEMO installation o…
lukeroantreeONS Mar 11, 2026
dd36c6a
docs(quarto): missed code quotes on error in servers
lukeroantreeONS Mar 11, 2026
311c4ff
docs(quarto): address Riley's Comments, round 6 - embed() return form…
lukeroantreeONS Mar 11, 2026
15ee886
docs(quarto): address Erlend's Comments, round 2 - indexers overview
lukeroantreeONS Mar 11, 2026
daa5c48
added more highlights
rileyok-ons Mar 11, 2026
3df4d82
even more highlighting
rileyok-ons Mar 11, 2026
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -48,7 +48,7 @@ repos:
name: deptry (uv)
language: system
pass_filenames: false # deptry expects a project path, not filenames
entry: uv run deptry .
entry: uv run deptry --per-rule-ignores "DEP003=plum,DEP004=quartodoc|numpydoc" .

- id: forbid-new-init
name: Check if __init__.py is added to the src folder
114 changes: 89 additions & 25 deletions DEMO/README.md
@@ -1,33 +1,14 @@
# Demo for `classifai`
# Overview of Demonstrations & Examples

This directory contains a set of Jupyter notebooks designed to help you understand and use `classifai` effectively.

## Prerequisites

You may wish to download each notebook and the demo dataset individually. Each notebook contains specific installation instructions on how to set up an environment and download the package.

## Running the Demo

To start the demo, launch Jupyter Notebook or JupyterLab from your terminal in this directory:

```bash
jupyter notebook
```

Or, if you prefer JupyterLab:

```bash
jupyter lab
```

Then, open the notebooks in your browser.
We recommend going through the `general_workflow_demo.ipynb` notebook for a broad overview of the package before moving on to the `custom_vectoriser.ipynb` notebook, which covers a more advanced use-case.
---

## Notebooks Overview

This demo includes two Jupyter notebooks:
This demo series includes several Jupyter notebooks:

### 1. `general_workflow_demo.ipynb`
### 1. ✨ ClassifAI Demo - Introduction & Basic Usage ✨ : `general_workflow_demo.ipynb`

This introduces the core features of `classifai`.

@@ -43,7 +24,7 @@ It covers:

This notebook is intended for prospective users to get a quick overview of what the package can do, and as a 'jumping off point' for new projects.

### 2. `custom_vectoriser.ipynb`
### 2. Creating Your Own Vectoriser : `custom_vectoriser.ipynb`

This notebook demonstrates how to create a new, custom Vectoriser by extending the base `VectoriserBase` class.

@@ -55,7 +36,7 @@ It covers:

This notebook is for users who want to implement a vectorisation approach not covered by our existing suite of Vectorisers.

### 3. `custom_preprocessing_and_postprocessing_hooks.ipynb`
### 3. VectorStore pre- and post-processing logic with _Hooks_ 🪝 : `custom_preprocessing_and_postprocessing_hooks.ipynb`

This notebook demonstrates how to add custom Python logic to the VectorStore search pipeline, such as performing spell checking on user input, without breaking the data flow of the ClassifAI VectorStore.

@@ -70,3 +51,86 @@ It covers:
* Examples of different kinds of hooks that can be written - [spellchecking, deduplicating results, adding extra info to results based on result ids]
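
The idea of wrapping a search call with hooks can be sketched in plain Python. This is only an illustration of the pattern, not classifai's API: `run_search`, `fake_search`, `normalise_query` and `deduplicate_results` are hypothetical names, and the real hooks operate on the VectorStore input and output dataclasses rather than on plain strings and dictionaries.

```python
from typing import Callable

# Hypothetical stand-in types for illustration only; classifai's real hooks
# work with its own input/output dataclasses.
Query = str
Result = dict
PreHook = Callable[[Query], Query]
PostHook = Callable[[list[Result]], list[Result]]


def normalise_query(query: Query) -> Query:
    """Preprocessing hook: collapse whitespace and lowercase the query."""
    return " ".join(query.split()).lower()


def deduplicate_results(results: list[Result]) -> list[Result]:
    """Postprocessing hook: keep the first result seen for each id."""
    seen = set()
    unique = []
    for result in results:
        if result["id"] not in seen:
            seen.add(result["id"])
            unique.append(result)
    return unique


def run_search(query, search_fn, pre_hook=None, post_hook=None):
    """Apply the optional hooks before and after a search function."""
    if pre_hook is not None:
        query = pre_hook(query)
    results = search_fn(query)
    if post_hook is not None:
        results = post_hook(results)
    return results


def fake_search(query):
    """Stand-in for a vector search; returns a duplicated id on purpose."""
    return [
        {"id": "A1", "query": query},
        {"id": "A1", "query": query},
        {"id": "B2", "query": query},
    ]
```

Calling `run_search("  Fruit   FARMER ", fake_search, normalise_query, deduplicate_results)` passes the cleaned query to the search and returns one result per id. Keeping each hook as a small function with the same input and output shape is what lets it slot into the pipeline without changing the data flow.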

---

## Installation of classifai

#### *0)* [optional] Create and activate a virtual environment from the command line

##### Using pip + venv

Create a virtual environment:

`python -m venv .venv`

##### Using UV

Create a virtual environment:

`uv venv`

##### Activating your environment

Activate it (macOS / Linux):

`source .venv/bin/activate`

Activate it (Windows, from a Bash-style shell such as Git Bash):

`source .venv/Scripts/activate`

#### *1)* Install the classifai package

##### Using pip

`pip install "https://github.com/datasciencecampus/classifai/releases/download/v0.2.1/classifai-0.2.1-py3-none-any.whl"`

##### Using uv

one-off:

`uv pip install "https://github.com/datasciencecampus/classifai/releases/download/v0.2.1/classifai-0.2.1-py3-none-any.whl"`

add as project dependency:

`uv add "https://github.com/datasciencecampus/classifai/releases/download/v0.2.1/classifai-0.2.1-py3-none-any.whl"`


#### *2)* Install optional dependencies

##### Using pip

`pip install "classifai[<dependency>]"`

where `<dependency>` is one or more of `huggingface`, `gcp`, `ollama`, or `all` to install all of them.

##### Using uv

one-off installation

`uv pip install "classifai[<dependency>]"`

add as project dependency

`uv add "classifai[<dependency>]"`
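
After installing, one way to check the result from Python is to ask whether the package and the modules behind each optional extra can be found on the import path. This is a minimal sketch using only the standard library; the mapping from extra names to module names (`transformers`, `google`, `ollama`) is an assumption made for illustration and may not match what each extra actually installs.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


# Assumed mapping of extras to one module each is expected to provide.
# These module names are hypothetical and not taken from the classifai docs.
ASSUMED_EXTRA_MODULES = {
    "huggingface": "transformers",
    "gcp": "google",
    "ollama": "ollama",
}

if __name__ == "__main__":
    print(f"classifai: {'found' if is_installed('classifai') else 'missing'}")
    for extra, module in ASSUMED_EXTRA_MODULES.items():
        status = "found" if is_installed(module) else "missing"
        print(f"classifai[{extra}] -> {module}: {status}")
```

Using `find_spec` rather than a bare `import` means a missing extra is reported as "missing" instead of raising `ImportError`.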

---

## Prerequisites

You may wish to download each notebook and the demo dataset individually. Each notebook contains specific installation instructions on how to set up an environment and download the package.

## Running the Demo

To start the demo, launch Jupyter Notebook or JupyterLab from your terminal in this directory:

```bash
jupyter notebook
```

Or, if you prefer JupyterLab:

```bash
jupyter lab
```

Then, open the notebooks in your browser.
We recommend going through the `general_workflow_demo.ipynb` notebook for a broad overview of the package before moving on to the `custom_vectoriser.ipynb` notebook, which covers a more advanced use-case.
119 changes: 14 additions & 105 deletions DEMO/custom_preprocessing_and_postprocessing_hooks.ipynb
@@ -71,11 +71,12 @@
"The use of these dataclasses both helps the user of the package to understand what data needs to be provided to the Vectorstore and how a user should interact with the objects being returned by these VectorStore functions. Additionally, this ensures robustness of the package by checking that the correct columns are present in the data before operating on it. \n",
"\n",
"The reverse_search() and embed() VectorStore functions have their own input and output data classes with their own validity column data checks. The names of each set are intuitively:\n",
"| **VectorStore Method** | **Input Dataclass** | **Output Dataclass** |\n",
"\n",
"| **VectorStore Method** | **Input Dataclass** | **Output Dataclass** |\n",
"|-------------------------------|-----------------------------|-----------------------------|\n",
"| `VectorStore.search()` | `VectorStoreSearchInput` | `VectorStoreSearchOutput` |\n",
"| `VectorStore.search()` | `VectorStoreSearchInput` | `VectorStoreSearchOutput` |\n",
"| `VectorStore.reverse_search()` | `VectorStoreReverseSearchInput` | `VectorStoreReverseSearchOutput` |\n",
"| `VectorStore.embed()` | `VectorStoreEmbedInput` | `VectorStoreEmbedOutput` |\n",
"| `VectorStore.embed()` | `VectorStoreEmbedInput` | `VectorStoreEmbedOutput` |\n",
"\n",
"Users of the package can use the schema of each of these input and output dataclasses to understand how to interface with these main methods of the VectorStore class.\n",
"\n"
@@ -145,92 +146,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Installation (pre-release)\n",
"\n",
"`Classifai` is currently in **pre-release** and is **not yet published on PyPI**. \n",
"This section describes how to install the packaged **wheel** from the project’s public GitHub Releases so that you can follow through this DEMO and try the code yourself.\n",
"\n",
"#### 1) Create and activate a virtual environment in command line\n",
"\n",
"##### Using `pip` + `venv`\n",
"Create a virtual environment:\n",
"\n",
"```bash\n",
"python -m venv .venv\n",
"```\n",
"\n",
"##### Using `UV`\n",
"Create a virtual environment:\n",
"\n",
"```bash\n",
"uv venv\n",
"```\n",
"\n",
"Activate the created environment with \n",
"\n",
"(macOS / Linux):\n",
"```bash\n",
"source .venv/bin/activate\n",
"```\n",
"Activate it (Windows):\n",
"```bash\n",
"source .venv/Scripts/activate\n",
"```\n",
"\n",
"---\n",
"\n",
"#### 2) Install the pre-release wheel\n",
"\n",
"##### Using `pip`\n",
"```bash\n",
"pip install \"https://github.com/datasciencecampus/classifai/releases/download/v0.2.1/classifai-0.2.1-py3-none-any.whl\"\n",
"```\n",
"\n",
"##### Using `uv`\n",
"```bash\n",
"uv pip install \"https://github.com/datasciencecampus/classifai/releases/download/v0.2.1/classifai-0.2.1-py3-none-any.whl\"\n",
"```\n",
"\n",
"---\n",
"\n",
"#### 3) Install optional dependencies (`[huggingface]`)\n",
"\n",
"Finally, for this demo we will be using the Hugging Face library to download embedding models - we therefore need an optional dependency of the Classifai package:\n",
"\n",
"##### Using `pip`\n",
"```bash\n",
"pip install \"classifai[huggingface]\"\n",
"```\n",
"\n",
"##### Using `uv pip`\n",
"```bash\n",
"uv pip install \"classifai[huggingface]\"\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Assuming the virtual environment from step one is set up and activated in the terminal, run the following commands to install the classifai package and the huggingface dependencies.\n",
"## PIP\n",
"#!pip install \"https://github.com/datasciencecampus/classifai/releases/download/v0.2.1/classifai-0.2.1-py3-none-any.whl\"\n",
"#!pip install \"classifai[huggingface]\"\n",
"\n",
"## UV\n",
"#!uv pip install \"https://github.com/datasciencecampus/classifai/releases/download/v0.2.1/classifai-0.2.1-py3-none-any.whl\"\n",
"#!uv pip install \"classifai[huggingface]\"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Note! :\n",
"\n",
"You may need to install the ipykernel python package to run Notebook cells with your Python environment"
"#### If you can run the following cell in this notebook, you should be good to go!"
]
},
{
@@ -239,29 +157,20 @@
"metadata": {},
"outputs": [],
"source": [
"#!pip install ipykernel\n",
"from classifai.vectorisers import HuggingFaceVectoriser\n",
"\n",
"#!uv pip install ipykernel"
"print(\"done!\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"\n",
"#### If you can run the following cell in this notebook, you should be good to go!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from classifai.vectorisers import HuggingFaceVectoriser\n",
"#### Alternatively, to test without running a notebook, run the following from your command line:\n",
"\n",
"print(\"done!\")"
"```shell\n",
"python -c \"import classifai\"\n",
"```"
]
},
{
@@ -761,7 +670,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "classifai",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -775,9 +684,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.7"
"version": "3.12.10"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}