Skip to content

chore(indexers): updated vectorstore types#143

Open
rileyok-ons wants to merge 4 commits intomainfrom
142-update-vectorstore-types
Open

chore(indexers): updated vectorstore types#143
rileyok-ons wants to merge 4 commits intomainfrom
142-update-vectorstore-types

Conversation

@rileyok-ons
Copy link
Collaborator

@rileyok-ons rileyok-ons commented Mar 6, 2026

✨ Summary

Fixes the typing for vectorstore to add typehints to the methods and resolve type errors with internal .vectors
Also fixes bug with deptry on pre-commit hook
Also updates server pydantic models to patch out deprecated extra definition (now uses configdict)
Resolves #142

📜 Changes Introduced

  • adds in correct entrypoint for deptry in pre-commit docker

  • adds proper init typehints for vectorstore

  • adds support for pathlike for filename inputs

  • Feature implementation (feat:) / bug fix (fix:) / refactoring (chore:) / documentation (docs:) / testing (test:)

  • Updates to tests and/or documentation

  • Terraform changes (if applicable)

✅ Checklist

Please confirm you've completed these checks before requesting a review.

  • Code passes linting with Ruff
  • Security checks pass using Bandit
  • API and Unit tests are written and pass using pytest
  • Terraform files (if applicable) follow best practices and have been validated (terraform fmt & terraform validate)
  • DocStrings follow Google-style and are added as per Pylint recommendations
  • Documentation has been updated if needed

🔍 How to Test

Run usual demo to check if it inits correctly, then try inputting in Path as file location, check if intellisense looks better

@rileyok-ons rileyok-ons requested a review from a team as a code owner March 6, 2026 16:20
@rileyok-ons rileyok-ons linked an issue Mar 6, 2026 that may be closed by this pull request
@github-actions github-actions bot added the chore label Mar 6, 2026
Copy link
Contributor

@frayle-ons frayle-ons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one small change requested - and can we confirm this will work okay with quartodocs.

self.hooks = {} if hooks is None else hooks
self.vectoriser_class = vectoriser.__class__.__name__
## these are all to be filled in from vectors creation
self.vector_shape: int | None = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are self.num_vectors andself.vector_shape the same as self.vectors in this case? they are defined later so why not just remove the None logic from here and let lines 209 and 210 do the assignment?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can remove these, quarto seems to build fine but it doesn't seem to show constructors, wondering if we'd want to configure for this?

file_name (str | os.PathLike): The name of the input CSV file.
data_type (str): The type of input data (currently supports only "csv").
vectoriser (object): The vectoriser object used to transform text into
vectoriser (vectoriserBase): The vectoriser object used to transform text into
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small note: [V]ectoriserBase (capitalisation)

@lukeroantreeONS
Copy link
Collaborator

This all looks good to me - but before merging, can we check that initialising a VectorStore with a CSV within a GCP Bucket still works with the new checks added? (I know we haven't advertised that functionality yet, but it would be a shame to break it)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update VectorStore types

3 participants