[Benchmarking]: Benchmarking prototype - dataset: CUAD by acarbonetto · Pull Request #228 · awslabs/graphrag-toolkit

acarbonetto · 2026-04-27T15:59:16Z

Issue #, if available:

N/A

Description of changes:

Add an integration test that benchmarks a small prototype dataset.

This PR adds three tests CuadBenchmarkEvaluate, CuadBenchmarkQuery, and CuadBenchmarkBuild, and a small prototype dataset for runnings.

Test Results:

02-CuadBenchmarkEvaluate.PASS.json
01-CuadBenchmarkQuery.PASS.json
00-CuadBenchmarkBuild.PASS.json

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

) Co-authored-by: ianrob <ianrob@amazon.co.uk>

…DB with Neo4j, fix notebooks and scripts (#189) * Consolidate environment configuration - Delete redundant docker/.env.jupyter and docker/.env.template - Consolidate into single notebooks/.env.template (source of truth) - Replace 3 S3 bucket vars with single S3_BUCKET_NAME + prefixes - Align DynamoDB table name (graphrag-toolkit-batch-table) - Fix env var names (INCLUDE_DOMAIN_LABELS, AWS_REGION) - Update models to claude-sonnet-4-6, AWS_PROFILE to default - Add .env and output.log to .gitignore * Replace FalkorDB with Neo4j and standardize Docker - Replace FalkorDB with Neo4j 5.25-community + APOC in dev compose - Rename all services to hybrid convention across all compose files - Add neo4j Python driver and build-essential to dev Dockerfile - Fix dev-reset.sh: run docker compose down before rebuilding - Add mysql cleanup to dev-reset.sh - Update container/volume names in reset scripts - Remove lexical-graph-src mount from main compose (fix dev mode detection) - Fix dev compose mount path and remove non-existent mysql schema mount * Fix AWS setup scripts and add prompt upload - Add bedrock:InvokeModel to batch inference IAM role policy - Align bucket name to graphrag-toolkit-ACCOUNT_ID pattern - Fix AWS profile default from padmin to default - Add S3 prompt file upload (extract from JSON, upload as .txt) - Apply all fixes to both .sh and .ps1 scripts * Standardize and fix notebooks - Fix titles to match numbering (00-Setup through 04-Cloud-Querying) - Standardize dotenv loading (%dotenv magic) - Fix collection_id web-docs to best-practices in notebook 01 - Add Neo4jGraphStoreFactory registration to notebook 03 - Add Bedrock prompt provider section to notebook 04 - Add GPU requirement warning for BGEReranker - Remove hardcoded ARNs, use placeholder comments - Fix hardcoded region to os.environ in notebook 03 - Remove empty trailing cells, clear all outputs * Fix documentation - Update README Quick Start to correct setup order - Add .env sync instructions after setup script - Update model references to claude-sonnet-4-6 - Add RESPONSE_MODEL and EVALUATION_MODEL to docs - Fix BATCH_ROLE_NAME to bedrock-batch-inference-role - Align DynamoDB table name in batch_processing.md and aws_integration.md - Replace padmin with your-profile in setup docs - Replace ccms-rag-extract with graphrag-toolkit bucket names * Update examples/lexical-graph-hybrid-dev/notebooks/.env.template Co-authored-by: Andrew Carbonetto <andrew.carbonetto@improving.com> * Update examples/lexical-graph-hybrid-dev/README.md Co-authored-by: Andrew Carbonetto <andrew.carbonetto@improving.com> * Remove quotes from env vars, add AWS CLI quickstart link Remove quotes from all env var values in README.md and .env.template. Add link to AWS CLI quickstart guide and aws sts get-caller-identity verification command in prerequisites. * Remove redundant .env paths from .gitignore The bare .env pattern on line 31 already matches .env files in all subdirectories, making the explicit paths for local-dev and hybrid-dev notebooks redundant. * Replace duplicated env vars in aws_integration.md with link to .env.template * Remove duplicate .env configuration sections from README Replace repeated env vars listing and setup instructions with a link to notebooks/.env.template as the single source of truth. * Use brackets for optional profile arg in setup docs * Remove hardcoded profile defaults from setup scripts Remove hardcoded 'default' profile from setup-bedrock-batch.sh and .ps1. Scripts now accept an optional profile argument — when omitted, AWS CLI uses its default credential chain (env vars, instance profile, etc.). - .sh: use PROFILE_ARGS conditional variable - .ps1: use @ProfileArgs splatting - Update README.md and setup-bedrock-batch-doc.md accordingly --------- Co-authored-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

Co-authored-by: zollie <>

* update documentation * fix broken mdx files * small comments --------- Co-authored-by: Oussama Hansal <haooussa@amazon.com>

Bumps [astro](https://github.com/withastro/astro/tree/HEAD/packages/astro) from 5.6.2 to 5.18.1. - [Release notes](https://github.com/withastro/astro/releases) - [Changelog](https://github.com/withastro/astro/blob/astro@5.18.1/packages/astro/CHANGELOG.md) - [Commits](https://github.com/withastro/astro/commits/astro@5.18.1/packages/astro) --- updated-dependencies: - dependency-name: astro dependency-version: 5.18.1 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

Co-authored-by: Oussama Hansal <haooussa@amazon.com>

The Neptune `create_config` helper constructed a botocore Config without `max_pool_connections`, which falls back to botocore's default of 10. Under the default `CompositeTraversalBasedRetriever` both `ChunkBasedSearch` and `EntityNetworkSearch` run in parallel with `ThreadPoolExecutor(max_workers=num_workers=10)` each, driving up to 20 concurrent Neptune queries and queuing half of them on the pool. Apply a 32-connection default via `setdefault`, giving ~60% headroom above the code-provable peak while preserving explicit overrides via the existing `config=` kwarg on `GraphStoreFactory.for_graph_store`. Co-authored-by: voidwisp <voidwisp@users.noreply.github.com>

When the caller passes a sized iterable (e.g. a list), compute the total batch count once per pipeline run and include it in the per-batch log message as "batch: N/M". True streaming iterators without __len__ keep the existing "batch: N" format. In non-TTY environments (CloudWatch, ECS, CI/CD) the per-batch log is the only progress signal; knowing "batch 7 of 12" versus just "batch 7" is the difference between estimating completion and guessing. Zero extra materialization cost — only len() is probed. Follow-up to #96 (closes #97). Co-authored-by: voidwisp <voidwisp@users.noreply.github.com>

* feat(opensearch): thread client_kwargs through OpenSearchIndex Expose a `client_kwargs` dict on `OpenSearchIndex.for_index` and `create_opensearch_vector_client` so callers can override OpenSearch-py client settings (e.g. `pool_maxsize`, `timeout`, `max_retries`) without patching the library. The underlying `create_os_client` / `create_os_async_client` helpers already accept `**kwargs`, but nothing upstream was passing values in, so the query clients fell back to urllib3's default pool size of 1. Under concurrent retrieval this causes "connection pool is full" warnings and repeated TLS handshakes. Usage: VectorStoreFactory.for_vector_store( uri, client_kwargs={"pool_maxsize": 10, "timeout": 60}, ) The kwarg is optional; when unset, behaviour is unchanged. * feat(opensearch): default pool_maxsize to 32 urllib3 falls back to pool_maxsize=1 when not set, which serializes concurrent queries behind a single TLS connection. 32 is a sane baseline that matches the typical fan-out of the traversal-based retrievers. Explicit overrides (including the existing ``pool_maxsize=1`` used by ``index_exists``) still win via ``setdefault``. * fix(opensearch): make all hardcoded client kwargs overridable via client_kwargs Addresses reviewer feedback on PR #204. Previously only pool_maxsize used setdefault; use_ssl, verify_certs, connection_class, timeout, max_retries, retry_on_timeout were unconditionally hardcoded as explicit kwargs alongside **kwargs, so any caller-supplied override for those keys triggered: TypeError: OpenSearch() got multiple values for keyword argument ... Move all defaults into a single dict merged with caller kwargs (caller wins). Same fix applied to both create_os_client and create_os_async_client. Adds a test per client that exercises pool_maxsize + timeout override while confirming the remaining defaults still flow through. --------- Co-authored-by: voidwisp <voidwisp@users.noreply.github.com> Co-authored-by: Ian Robinson <ianrob@amazon.com>

Bumps [astro](https://github.com/withastro/astro/tree/HEAD/packages/astro) and [@astrojs/starlight](https://github.com/withastro/starlight/tree/HEAD/packages/starlight). These dependencies needed to be updated together. Updates `astro` from 5.18.1 to 6.1.8 - [Release notes](https://github.com/withastro/astro/releases) - [Changelog](https://github.com/withastro/astro/blob/main/packages/astro/CHANGELOG.md) - [Commits](https://github.com/withastro/astro/commits/astro@6.1.8/packages/astro) Updates `@astrojs/starlight` from 0.32.5 to 0.38.3 - [Release notes](https://github.com/withastro/starlight/releases) - [Changelog](https://github.com/withastro/starlight/blob/main/packages/starlight/CHANGELOG.md) - [Commits](https://github.com/withastro/starlight/commits/@astrojs/starlight@0.38.3/packages/starlight) --- updated-dependencies: - dependency-name: astro dependency-version: 6.1.8 dependency-type: direct:production - dependency-name: "@astrojs/starlight" dependency-version: 0.38.3 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add documentation site URL and PyPI link to the top of the root * address comments --------- Co-authored-by: Oussama Hansal <haooussa@amazon.com>

--------- Co-authored-by: Oussama Hansal <haooussa@amazon.com>

…cs-site (#219) Co-authored-by: Oussama Hansal <haooussa@amazon.com>

…ph (#213) Bumps [python-dotenv](https://github.com/theskumar/python-dotenv) from 1.1.1 to 1.2.2. - [Release notes](https://github.com/theskumar/python-dotenv/releases) - [Changelog](https://github.com/theskumar/python-dotenv/blob/main/CHANGELOG.md) - [Commits](theskumar/python-dotenv@v1.1.1...v1.2.2) --- updated-dependencies: - dependency-name: python-dotenv dependency-version: 1.2.2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

…ts (#223) KeywordVSSProvider and GraphSummary are retrieval-time components but incorrectly default to GraphRAGConfig.extraction_llm. When a user sets response_llm to an active model but leaves extraction_llm as default, traversal-based search fails silently at keyword extraction. This changes both components to use response_llm, consistent with KeywordProvider and QueryModeRetriever. Co-authored-by: Abdellah Ghassel <abdghsl@amazon.com>

…oss examples (#272) * Use .env.template pattern for environment configuration across examples Standardize environment variable handling across lexical-graph notebooks, workshop, and byokg-rag examples to follow the pattern established in lexical-graph-local-dev and lexical-graph-hybrid-dev. Changes: - Add .env.template files for lexical-graph/notebooks, workshop, and byokg-rag - Refactor all 5 byokg-rag notebooks to use dotenv instead of hardcoded values - Add python-dotenv to byokg-rag dependencies - Update READMEs with environment setup instructions - Create byokg-rag README with notebook descriptions and required env vars * Update examples/lexical-graph/workshop/readme.md Co-authored-by: Andrew Carbonetto <andrew.carbonetto@improving.com> * Update examples/byokg-rag/README.md Co-authored-by: Andrew Carbonetto <andrew.carbonetto@improving.com> * Update examples/lexical-graph/workshop/.env.template Co-authored-by: Andrew Carbonetto <andrew.carbonetto@improving.com> * Address PR review feedback - Add placeholder value for GRAPH_IDENTIFIER in byokg-rag .env.template - Restore trailing newlines in notebook JSON files - Remove redundant 'fill in your values' text from READMEs - Remove bare GRAPH_STORE= and VECTOR_STORE= lines from .env.templates to avoid unsetting existing environment variables - Move python-dotenv from main requirements to optional dev dependency * Use <YOUR_GRAPH_IDENTIFIER> placeholder style consistent with hybrid-dev * Simplify env config README sections — remove redundant intro text * Move python-dotenv to notebook pip install cells Remove python-dotenv from pyproject.toml dev dependency. Add it to the pip install cells in each byokg-rag notebook instead, since it is only needed for notebook usage, not the library runtime. * Fix cell ordering in local_graph notebook and add python-dotenv to prerequisites - Swap pip install cell before dotenv usage cell in local_graph notebook - Add python-dotenv to README prerequisites for discoverability --------- Co-authored-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

Extend security-scan.yml to scan lexical-graph/src/ with Bandit, matching the existing byokg-rag pattern. Adds lexical-graph/** to trigger paths and a separate bandit job for lexical-graph.

Bumps [actions/github-script](https://github.com/actions/github-script) from 8 to 9. - [Release notes](https://github.com/actions/github-script/releases) - [Commits](actions/github-script@v8...v9) --- updated-dependencies: - dependency-name: actions/github-script dependency-version: '9' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>

Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from 4 to 7. - [Release notes](https://github.com/astral-sh/setup-uv/releases) - [Commits](astral-sh/setup-uv@v4...v7) --- updated-dependencies: - dependency-name: astral-sh/setup-uv dependency-version: '7' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>

Updates the requirements on [pre-commit](https://github.com/pre-commit/pre-commit) to permit the latest version. - [Release notes](https://github.com/pre-commit/pre-commit/releases) - [Changelog](https://github.com/pre-commit/pre-commit/blob/main/CHANGELOG.md) - [Commits](pre-commit/pre-commit@v3.5.0...v4.6.0) --- updated-dependencies: - dependency-name: pre-commit dependency-version: 4.6.0 dependency-type: direct:development ... Signed-off-by: dependabot[bot] <support@github.com>

Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5 to 6. - [Release notes](https://github.com/actions/setup-python/releases) - [Commits](actions/setup-python@v5...v6) --- updated-dependencies: - dependency-name: actions/setup-python dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>

Bumps [softprops/action-gh-release](https://github.com/softprops/action-gh-release) from 2 to 3. - [Release notes](https://github.com/softprops/action-gh-release/releases) - [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md) - [Commits](softprops/action-gh-release@v2...v3) --- updated-dependencies: - dependency-name: softprops/action-gh-release dependency-version: '3' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>

Bumps [actions/setup-node](https://github.com/actions/setup-node) from 4 to 6. - [Release notes](https://github.com/actions/setup-node/releases) - [Commits](actions/setup-node@v4...v6) --- updated-dependencies: - dependency-name: actions/setup-node dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>

- hash_utils.py: Add usedforsecurity=False to MD5 (used for content IDs, not crypto) - proposition_extractor.py: Add # nosec B615 for HuggingFace downloads (model name is user-configurable, pinning not feasible) - pg_vector_indexes.py: Add # nosec B608 for SQL identifier interpolation (table/column names from internal config, not user input) bandit -r lexical-graph/src/ -ll now exits with code 0 (0 medium+ findings).

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

The validation check used '== 3' which incorrectly rejected exactly 3 parameters (the actual maximum supported by the function signature). Changed to '> 3' to allow the full range of 0-3 parameters. Added unit tests covering all boundary cases: - 0, 1, 2, 3 parameters accepted - 4, 5 parameters rejected - Missing name rejected - Parameters marked required and not hidden

…les fix Bump upload-pages-artifact from v3 to v5 and add include-hidden-files: true to preserve .nojekyll in the docs artifact. Without this flag, v4+ excludes dotfiles by default which could affect GitHub Pages deployment.

Replace 3,402 files with :: in their names (invalid on Windows/NTFS) with ZIP archives (amazon.zip, ecorp.zip, ntsb.zip, wiki-aircraft.zip). This fixes the inability to clone the repository on Windows.

- Add zip_source parameter for explicit ZIP archive reading - Add filename_sanitizer parameter for customizable filesystem-safe names - Include windows_safe_filename helper (replaces :: with __) - Update 2-Querying.ipynb and setup.py to use zip_source - Add Windows compatibility docs to workshop readme - Add 17 unit tests for new functionality

The : character is illegal in Windows/NTFS filenames. The previous implementation only replaced :: with __ but left single : intact, causing NotADirectoryError on Windows. Now replaces :: with __ first, then remaining : with _: aws::abc:1234 → aws__abc_1234

…dows On Windows, open() without encoding defaults to the system codepage (cp1252) which cannot encode Unicode characters like snowman or globe emoji. Adding encoding='utf-8' ensures consistent behavior across platforms.

…rieval, storage, and utils (#300) * [FEATURE] Add unit tests for lexical-graph storage/graph utilities Adds unit test coverage for two undertested modules. Tests only, no source changes. - storage/graph/graph_utils.py: 21.5% -> 97.5% - storage/graph/query_tree.py: 25.6% -> 100% Full-suite coverage: 46.10% -> 46.86% (+488 lines, +71 tests). * [FEATURE] Bump lexical-graph coverage threshold from 45% to 46% Locks in the gain from the new graph_utils and query_tree tests (46.10% -> 46.86%). * [FEATURE] Trim AI-flavoured comments and tighten test names in new graph tests Drop the module/class boilerplate docstrings and restate-the-code inline comments added in the previous coverage commits, keeping only the WHY comments (branch mappings, LIFO ordering). Rename several tests for clarity: fix a typo (dedupd to deduped), drop uppercase AND/OR from identifiers, and reshape the property-based node_id names to describe the outcome instead of the input shape. Fold the dash case into the non-alnum test it overlaps with. * [FEATURE] Bump lexical-graph coverage from 47% to 58% with broad unit-test additions Adds unit tests across retrieval utils, processors, query-context providers, retrievers, storage wrappers, vector index helpers, and shared utils. Pushes the full-suite coverage from 47% to 58.15% (1305 to 1611 tests) and lifts the CI gate from 47% to 56%. Covered modules (each new test file targets one source module): retrieval/utils: statement_utils, chunk_utils, entity_utils, vector_utils, query_decomposition retrieval/processors: zero_scores, truncate_statements, update_chunk_metadata, rerank_statements retrieval/query_context: query_mode, keyword_provider, keyword_nlp_provider, keyword_vss_provider, entity_provider, entity_from_top_statement_provider, entity_vss_provider, entity_context_provider retrieval/retrievers: query_mode_retriever, chunk_cosine_search, chunk_based_search, topic_based_search, entity_based_search, entity_network_search, entity_context_search, composite_traversal_based_retriever, chunk_based_semantic_search, semantic_chunk_beam_search, traversal_based_base_retriever retrieval/summary: graph_summary storage/graph: multi_tenant_graph_store, neptune_graph_stores (helpers + factories) storage/vector: vector_store, multi_tenant_vector_store, read_only_vector_store, neptune_vector_indexes, s3_vector_indexes helpers utils: reranker_utils, bedrock_utils, llm_cache (predict/stream/cache paths) Tests use mocks for external services (graph stores, LLMs, boto3 clients, embedding caches) so no infrastructure is required. pg_vector_indexes and opensearch_vector_indexes are not covered here because psycopg2-binary and llama-index-vector-stores-opensearch are not in the test env; the gate at 56 reflects the actual achievable ceiling on this configuration. * [FEATURE] Remove non-deterministic test_is_unique per reviewer feedback Drop the new_query_var uniqueness assertion. uuid4 collision is theoretically possible (1 in 2^122), so the assertion is not strictly reproducible. The hex body and prefix tests already validate the output shape; an accidental fixed return would still fail those. Per acarbonetto review comment on PR #1. * [FIX] Add license header to retrievers/__init__.py test package The empty __init__.py was failing the license-header-check job, which gates the test job. Recent CI runs showed "unknown%" in the coverage report comment because the artifact never uploaded. * [FEATURE] Address Mykola's review comments on PR #1 test additions * test_reranker_utils: add end-to-end TF-IDF ranking test using the real tfidf_matcher backend (no mocks), confirming relevance ordering on plain English input. * test_rerank_statements: strengthen model and bedrock dispatch tests to verify scores actually land on statements, not just that the scorer was called. * test_entity_searches: add edge-case tests for empty graph-store results, store-side exceptions, and missing entity contexts. * [FEATURE] Enhance tests for composite traversal retriever and neptune vector indexes with pytest fixtures and mock dependencies * [FEATURE] Raise coverage of four lexical-graph modules to >=80% per Andrew's review Add unit tests for the modules acarbonetto flagged below 80% on fork PR #1: rerank_statements (69->99%), keyword_vss_provider (72->100%), entity_context_provider (75->96%), neptune_graph_stores (66->92%). Covers the previously-untested scoring-method bodies in rerank_statements, the topic-content path in keyword_vss_provider, the entity context tree builders in entity_context_provider, and intercept_before_parse plus the NeptuneDatabaseClient query/exception paths called out by Kiro. * [FIX] Return one content string per topic in KeywordVSSProvider._get_topic_content The result-collection loop iterated over each topic's joined statement string character by character, so _get_topic_content returned a list of single characters instead of per-topic content. Append the future result directly and skip topics with no statements.

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

github-actions · 2026-06-08T15:52:31Z

Lexical Graph Coverage Report: The coverage is at 59.55% (target: 80%). Download the HTML report here.

mykola-pereyma

Changes requested inline

mykola-pereyma

Review: duplicates, dead code, and minor issues flagged inline.

acarbonetto added 8 commits April 15, 2026 14:13

Initial commit for benchmarking

b1df5c8

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

Add benchmarking tasks

1fd0437

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

Move evaluation tools

9f42287

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

Fix build-tests.sh

e8eaf68

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

Fix benchmarking tests

9b002ed

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

Add benchmark data

4abc8cb

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

Update profiles to include haiku 4.5

0e95391

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

Rm temp directory

23ac989

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

acarbonetto mentioned this pull request Apr 28, 2026

[RFC] Local Execution Mode for Integration Tests #229

Open

21 tasks

Fix evaluate tests

da81acc

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

acarbonetto marked this pull request as ready for review May 4, 2026 16:31

oussamahansal changed the title ~~Benchmarking prototype - dataset: CUAD~~ [Benchmarking]: Benchmarking prototype - dataset: CUAD May 28, 2026

Ian Robinson and others added 18 commits June 8, 2026 08:41

Only copy source code to test notebook if install URI not supplied (#197

91cbf49

) Co-authored-by: ianrob <ianrob@amazon.co.uk>

Update to latest LlamaIndex dependencies

5b64287

feat(lexical-graph): standardize graph store init lifecycle (#185)

d45183e

Co-authored-by: zollie <>

feat(falkordb): add tenant-safe idempotent index bootstrap (#186)

d3cb22b

Co-authored-by: zollie <>

update documentation (#198)

aafa929

* update documentation * fix broken mdx files * small comments --------- Co-authored-by: Oussama Hansal <haooussa@amazon.com>

Upgrade release version to 3.18.3 (#201)

a422959

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

pass the wheel file instead of the source code (#202)

0b01b06

Co-authored-by: Oussama Hansal <haooussa@amazon.com>

Add documentation site URL and PyPI link to the top of the root (#215)

8d6be62

* Add documentation site URL and PyPI link to the top of the root * address comments --------- Co-authored-by: Oussama Hansal <haooussa@amazon.com>

Clean up documentation files (#212)

03c69de

--------- Co-authored-by: Oussama Hansal <haooussa@amazon.com>

Fix dependabot error from bumping astro and @astrojs/starlight in /do…

d42d9e1

…cs-site (#219) Co-authored-by: Oussama Hansal <haooussa@amazon.com>

mykola-pereyma and others added 23 commits June 8, 2026 08:44

Add lexical-graph to security scan workflow

60cd466

Extend security-scan.yml to scan lexical-graph/src/ with Bandit, matching the existing byokg-rag pattern. Adds lexical-graph/** to trigger paths and a separate bandit job for lexical-graph.

Increase lexical-graph coverage threshold from 30% to 45% (#281)

589b32d

Install thefuzz when running int tests

400145e

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

Update dependency installation

e46f59a

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

Update dependency installation

863b679

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

chore: replace NTFS-incompatible source-data files with ZIP archives

83768e9

Replace 3,402 files with :: in their names (invalid on Windows/NTFS) with ZIP archives (amazon.zip, ecorp.zip, ntsb.zip, wiki-aircraft.zip). This fixes the inability to clone the repository on Windows.

Fix benchmarking tests

adb2d9c

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

Update profiles to include haiku 4.5

77874d6

Signed-off-by: Andrew Carbonetto <andrew.carbonetto@improving.com>

Merge branch 'main' into acarbo/poc-benchmark-cuad

1b3ba1f

oussamahansal requested a review from mykola-pereyma June 8, 2026 16:59

remove dups

4373205

mykola-pereyma reviewed Jun 9, 2026

View reviewed changes

address comments

d13115d

oussamahansal requested a review from mykola-pereyma June 9, 2026 16:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Benchmarking]: Benchmarking prototype - dataset: CUAD#228

[Benchmarking]: Benchmarking prototype - dataset: CUAD#228
acarbonetto wants to merge 93 commits into
mainfrom
acarbo/poc-benchmark-cuad

acarbonetto commented Apr 27, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 8, 2026

Uh oh!

mykola-pereyma left a comment •

edited

Loading

Uh oh!

mykola-pereyma left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

Conversation

acarbonetto commented Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 8, 2026

Uh oh!

mykola-pereyma left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mykola-pereyma left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

acarbonetto commented Apr 27, 2026 •

edited

Loading

mykola-pereyma left a comment •

edited

Loading