feat: add include_row_data parameter for Excel search (issue #55) #61

Merged
k-ibaraki merged 4 commits into main from feat/issue-55-include-row-data on Feb 11, 2026

@k-ibaraki
Member

Summary

Add include_row_data parameter to Excel search to retrieve entire row data for each match in a single call, eliminating N+1 API calls.

Motivation

Problem:

  • Current Excel search returns only cell coordinates and values
  • Getting row context requires N additional cell_range reads (N+1 calls)
  • Example: 23 matches = 24 API calls (1 search + 23 reads)

Solution:

  • include_row_data=True fetches all row data in a single search call
  • Reduces API calls from N+1 to 1
  • Verified savings: ~2,300 tokens for 23-match case
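The call counts above can be checked with a small sketch. `FakeExcelTool` and its methods are hypothetical stand-ins that only count calls; they are not the real MCP tool or its API.

```python
class FakeExcelTool:
    """Hypothetical stand-in that only counts calls; not the real MCP tool."""

    def __init__(self, matching_rows):
        self.matching_rows = list(matching_rows)
        self.calls = 0

    def search(self, include_row_data=False):
        # One call, regardless of how many rows match.
        self.calls += 1
        matches = []
        for row in self.matching_rows:
            match = {"row": row}
            if include_row_data:
                match["row_data"] = f"cells of row {row}"
            matches.append(match)
        return matches

    def read_row(self, row):
        # Stand-in for one cell_range read.
        self.calls += 1
        return f"cells of row {row}"


def calls_with_separate_reads(tool):
    """N+1 pattern: one search, then one read per match."""
    for match in tool.search():
        tool.read_row(match["row"])
    return tool.calls


def calls_with_row_data(tool):
    """Single search call that already carries the row data."""
    tool.search(include_row_data=True)
    return tool.calls


rows = range(1, 24)  # 23 matching rows, as in the example above
print(calls_with_separate_reads(FakeExcelTool(rows)))  # 24
print(calls_with_row_data(FakeExcelTool(rows)))        # 1
```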

Changes

Core Implementation (src/sharepoint_excel.py)

  • Add include_row_data parameter to search_cells method
  • Implement _get_row_data helper to extract non-null cells
  • Avoid a RuntimeError during _cells iteration with 2-phase match collection (collect matches first, then read row data)
  • Support both fast path (_cells) and fallback (iter_rows)
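The 2-phase idea can be sketched with a plain dict standing in for the worksheet's cell store. This is not the PR's code: `read_row`, `scan_single_pass`, and `scan_two_phase` are hypothetical names, and the premise that a row read can add entries to the dict being iterated is an assumption made here to reproduce the error.

```python
def read_row(cells, row, max_col=3):
    """Stand-in for a row read that creates missing cells as a side effect (assumption)."""
    return [cells.setdefault((row, col), None) for col in range(1, max_col + 1)]


def is_match(value, query):
    return value is not None and query in str(value)


def scan_single_pass(cells, query):
    """Naive version: reads row data while still iterating over the dict."""
    matches = []
    for (row, col), value in cells.items():
        if is_match(value, query):
            matches.append({"row": row, "col": col, "row_data": read_row(cells, row)})
    return matches


def scan_two_phase(cells, query):
    """Phase 1 collects coordinates only; phase 2 reads rows after iteration ends."""
    hits = [(row, col) for (row, col), value in cells.items() if is_match(value, query)]
    return [{"row": row, "col": col, "row_data": read_row(cells, row)} for row, col in hits]


cells = {(5, 1): "Category", (5, 2): "Monthly Budget", (6, 1): "Other"}

try:
    scan_single_pass(dict(cells), "Budget")
except RuntimeError as exc:
    print("single pass failed:", exc)

print(scan_two_phase(dict(cells), "Budget"))
```

Collecting coordinates first costs one extra list but keeps the read step free to touch the sheet.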

Server Integration (src/server.py)

  • Expose include_row_data in MCP tool interface
  • Update tool description with LLM usage guidelines
  • Clarify limitations and performance guidance

Testing

  • 5 new parser-level tests
  • 1 new server-level test + 1 updated test
  • All tests passing, 100% coverage of new code
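A server-level pass-through test of the kind listed above might look like the following sketch. `sharepoint_excel_search` is a hypothetical stand-in for the search branch of the tool function, and the keyword layout of `search_cells` is assumed from the parameter names mentioned in this PR; the real tests in tests/test_server.py may differ.

```python
from unittest.mock import MagicMock


def sharepoint_excel_search(parser, file_path, query, sheet_name=None, include_row_data=False):
    """Hypothetical stand-in for the search branch of the MCP tool function."""
    return parser.search_cells(
        file_path, query, sheet_name=sheet_name, include_row_data=include_row_data
    )


def test_include_row_data_is_passed_through():
    parser = MagicMock()
    parser.search_cells.return_value = '{"matches": []}'

    result = sharepoint_excel_search(parser, "book.xlsx", "Budget", include_row_data=True)

    parser.search_cells.assert_called_once_with(
        "book.xlsx", "Budget", sheet_name=None, include_row_data=True
    )
    assert result == '{"matches": []}'


def test_include_row_data_defaults_to_false():
    parser = MagicMock()

    sharepoint_excel_search(parser, "book.xlsx", "Budget")

    parser.search_cells.assert_called_once_with(
        "book.xlsx", "Budget", sheet_name=None, include_row_data=False
    )


if __name__ == "__main__":
    test_include_row_data_is_passed_through()
    test_include_row_data_defaults_to_false()
    print("ok")
```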

Documentation

  • Usage examples with sample responses
  • Performance guidelines (<50, 50-200, >200 matches)
  • Same-row match duplication behavior
  • Header exclusion clarification
  • English and Japanese docs

Behavior

Default: include_row_data=False (backward compatible)

Response format with include_row_data=True:

{
  "matches": [{
    "coordinate": "B5",
    "value": "Monthly Budget",
    "row_data": [
      {"coordinate": "A5", "value": "Category"},
      {"coordinate": "B5", "value": "Monthly Budget"},
      {"coordinate": "C5", "value": 50000}
    ]
  }]
}
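As a sketch of how a caller might consume this response, the snippet below parses the sample and turns each match's row_data into a coordinate-to-value mapping. `rows_by_match` is a hypothetical helper name, not part of the tool.

```python
import json

SAMPLE_RESPONSE = """
{
  "matches": [{
    "coordinate": "B5",
    "value": "Monthly Budget",
    "row_data": [
      {"coordinate": "A5", "value": "Category"},
      {"coordinate": "B5", "value": "Monthly Budget"},
      {"coordinate": "C5", "value": 50000}
    ]
  }]
}
"""


def rows_by_match(response_text):
    """Return (match coordinate, {cell coordinate: value}) pairs."""
    data = json.loads(response_text)
    return [
        (match["coordinate"], {cell["coordinate"]: cell["value"] for cell in match["row_data"]})
        for match in data["matches"]
    ]


for coordinate, row in rows_by_match(SAMPLE_RESPONSE):
    print(coordinate, row)
```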

Key points:

  • Row data includes non-null cells only
  • Headers NOT included (even with frozen_rows)
  • Same-row multiple matches get independent row_data (duplicated)
  • Effective for <200 matches
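The first and third points can be illustrated with a minimal sketch over a `{(row, col): value}` dict. `get_row_data`, `column_letter`, and `search` are hypothetical names modelled on the behavior described above, not the PR's implementation.

```python
def column_letter(index):
    """Convert a 1-based column index to spreadsheet letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def get_row_data(cells, row):
    """Return the non-null cells of one row as coordinate/value dicts."""
    return [
        {"coordinate": f"{column_letter(col)}{row}", "value": value}
        for (r, col), value in sorted(cells.items())
        if r == row and value is not None
    ]


def search(cells, query):
    """Attach a freshly built row_data list to every match."""
    matches = []
    for (row, col), value in sorted(cells.items()):
        if value is not None and query in str(value):
            matches.append({
                "coordinate": f"{column_letter(col)}{row}",
                "value": value,
                "row_data": get_row_data(cells, row),
            })
    return matches


# C5 is empty, and two cells in row 5 contain the query.
cells = {(5, 1): "Budget type", (5, 2): "Monthly Budget", (5, 3): None, (5, 4): 50000}
for match in search(cells, "Budget"):
    print(match["coordinate"], [cell["coordinate"] for cell in match["row_data"]])
```

Both matches carry their own copy of the same row, which is why large result sets with many same-row hits grow quickly.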

Verification

  • ✅ All quality checks passing (ruff, ty, pytest)
  • ✅ Verified with real-world data (23 matches)
  • ✅ Token savings: ~2,300 tokens
  • ✅ Response time: significantly reduced

Related

Closes #55

🤖 Generated with Claude Code

k-ibaraki and others added 4 commits February 11, 2026 15:39
Add include_row_data parameter to search_cells method to retrieve
entire row data for each match in a single call, avoiding N+1 reads.

Changes:
- Add include_row_data parameter to search_cells method
- Update _scan_sheet to collect and attach row data when enabled
- Add _get_row_data helper method to extract non-null cells from a row
- Handle RuntimeError by collecting matches before accessing sheet rows
- Support both fast path (_cells) and fallback path (iter_rows)

Behavior:
- Default: False (backward compatible)
- Row data includes only non-null cells
- Same-row multiple matches get independent row_data (duplicated)
- Single-column sheets handled correctly

Related: #55

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Integrate include_row_data parameter into MCP tool interface with
updated description for LLM usage guidance.

Changes:
- Add include_row_data parameter to sharepoint_excel function
- Pass parameter to search_cells in search mode
- Update tool description with usage guidelines and limitations

Tool description updates:
- Clarify that row_data contains matched rows only (not headers)
- Note same-row match duplication behavior
- Emphasize A1:Z5 header read requirement
- Add performance guidance (<200 matches recommended)

Related: #55

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add test coverage for include_row_data parameter in both parser
and server layers.

Test cases:
- Basic row data inclusion (include_row_data=True)
- Default behavior verification (include_row_data=False)
- Same-row multiple matches with independent row_data
- Null cell exclusion from row_data
- Multi-sheet search with row data
- Server-level parameter passing

Coverage:
- Parser layer: 5 new tests
- Server layer: 1 new test + 1 updated test

All tests passing with 100% coverage of new code paths.

Related: #55

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add comprehensive documentation for include_row_data parameter
including usage examples, performance guidelines, and limitations.

Documentation updates:
- Add parameter description to tool parameters table
- Add usage example with sample response
- Add performance guidelines (<50, 50-200, >200 matches)
- Document same-row match duplication behavior
- Clarify header exclusion and A1:Z5 requirement
- Include verified use case (23 matches, ~2,300 token savings)

Languages:
- English: docs/usage.md
- Japanese: docs/usage_ja.md

Related: #55

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 11, 2026 06:42
@gemini-code-assist
Contributor

Summary of Changes

Hello @k-ibaraki, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the Excel search tool by adding an include_row_data parameter. This allows for the efficient retrieval of complete row context alongside search matches, addressing a common N+1 API call problem. The change improves performance and reduces resource consumption by consolidating multiple data fetches into a single, optimized operation, while maintaining backward compatibility and providing clear usage guidelines.

Highlights

  • New Feature: include_row_data parameter: Introduced a new include_row_data boolean parameter to the Excel search functionality, allowing users to retrieve the entire row data for each matched cell.
  • Performance Optimization: This feature significantly reduces API calls from an N+1 pattern (one search + N cell reads) to a single call, leading to substantial token savings (~2,300 tokens for 23 matches) and reduced response times.
  • Backward Compatibility: The default value for include_row_data is False, ensuring existing integrations remain unaffected.
  • Comprehensive Documentation: Updated usage documentation in both English and Japanese, including examples, performance guidelines, and important notes on row_data behavior (e.g., non-null cells only, no headers, duplication for same-row matches).
  • Robust Testing: Added multiple new unit tests covering various scenarios for the include_row_data parameter, ensuring its correct functionality and integration.

Changelog
  • docs/usage.md
    • Added include_row_data parameter to the sharepoint_excel tool documentation.
    • Included new usage examples demonstrating how to use include_row_data=True.
    • Provided performance guidelines and important notes regarding row_data behavior.
  • docs/usage_ja.md
    • Added Japanese documentation for the include_row_data parameter.
    • Included Japanese usage examples and performance guidelines for the new feature.
  • src/server.py
    • Added include_row_data as a parameter to the sharepoint_excel function.
    • Updated the tool's description to reflect the new include_row_data functionality and LLM usage guidelines.
    • Passed the include_row_data parameter from the server to the SharePointExcelParser.search_cells method.
  • src/sharepoint_excel.py
    • Added include_row_data parameter to the search_cells method signature.
    • Modified _scan_sheet to conditionally include row data based on the include_row_data flag.
    • Implemented a new private helper method _get_row_data to extract non-null cells for a given row.
    • Adjusted the cell scanning logic to collect matches in a two-phase approach to handle _cells iteration safely when include_row_data is true.
  • tests/test_server.py
    • Updated an existing test case to include include_row_data=False in the search_cells mock assertion.
    • Added a new unit test test_excel_search_with_include_row_data to verify that include_row_data=True is correctly passed to the parser.
  • tests/test_sharepoint_excel.py
    • Added test_search_cells_include_row_data_true to verify row data is included when the flag is true.
    • Added test_search_cells_include_row_data_false_default to confirm row data is not included by default.
    • Added test_search_cells_include_row_data_multiple_matches_same_row to test behavior with multiple matches in a single row.
    • Added test_search_cells_include_row_data_null_cells_excluded to ensure null cells are excluded from row_data.
    • Added test_search_cells_include_row_data_multiple_sheets to verify correct functionality across multiple sheets.
Activity
  • The pull request introduces a new feature to optimize Excel search operations by reducing API calls.
  • Core logic in src/sharepoint_excel.py was updated to handle the new parameter and retrieve row data efficiently.
  • The src/server.py file was modified to expose this new functionality through the tool interface.
  • Documentation in both English and Japanese (docs/usage.md, docs/usage_ja.md) has been updated to reflect the new parameter and its usage.
  • Comprehensive unit tests were added and updated in tests/test_server.py and tests/test_sharepoint_excel.py to ensure the new feature works as expected across various scenarios.
  • The author reports that all quality checks (ruff, ty, pytest) are passing and verified token savings and reduced response times with real-world data.

Contributor

Copilot AI left a comment

Pull request overview

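This PR is an excellent improvement that adds an include_row_data parameter to the Excel search feature so that search results can include the full row data for each match. It resolves the N+1 API call problem and greatly improves performance and token efficiency. The implementation is robust and the test coverage is sufficient.

Changes:

  • Adds an include_row_data parameter to Excel search so that full row data can be retrieved in a single call (reducing N+1 calls to 1)
  • Uses two-phase match collection to avoid sheet-access errors while iterating over the _cells dictionary
  • Comprehensive test coverage (5 parser-level tests, 2 server-level tests)
  • Detailed documentation and performance guidelines in both English and Japanese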

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

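File | Description
src/sharepoint_excel.py | Adds the include_row_data parameter to search_cells and _scan_sheet, implements a new _get_row_data helper method, and supports both the fast path (_cells) and the fallback (iter_rows)
src/server.py | Exposes include_row_data in the MCP tool interface and updates the tool description with LLM usage guidelines and limitations
tests/test_sharepoint_excel.py | Adds 5 new tests: basic functionality, default behavior, multiple matches in the same row, null cell exclusion, and multi-sheet support
tests/test_server.py | Adds a new test verifying pass-through of include_row_data=True and updates an existing test to check the default value
docs/usage.md | Adds English documentation covering usage examples, response format, performance guidelines, and important notes
docs/usage_ja.md | Adds Japanese documentation covering usage examples, response format, performance guidelines, and important notes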
Comments suppressed due to low confidence (1)

src/sharepoint_excel.py:48

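  • The docstring is missing a description of the new include_row_data parameter. Please add a description of the include_row_data: bool parameter to the Args section (e.g. "If True, include the full row data (non-null cells only) for each match (default: False)").
        """
        Search cell contents and return the matching positions

        Args:
            file_path: Path to the Excel file
            query: Search keyword
            sheet_name: Name of the sheet to search (when specified, that sheet is searched first; if there are 0 matches, falls back to searching all sheets)

        Returns:
            JSON string (position information for the matched cells)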
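One way the suggested docstring entry could look is sketched below. The function signature is an assumption (in the repository this is a method, and its exact parameter order and defaults are not shown here); only the Args wording follows the reviewer's suggestion.

```python
from typing import Optional


def search_cells(
    file_path: str,
    query: str,
    sheet_name: Optional[str] = None,
    include_row_data: bool = False,
) -> str:
    """
    Search cell contents and return the matching positions

    Args:
        file_path: Path to the Excel file
        query: Search keyword
        sheet_name: Name of the sheet to search (when specified, that sheet is
            searched first; if there are 0 matches, falls back to searching all sheets)
        include_row_data: If True, include the full row data (non-null cells only)
            for each match (default: False)

    Returns:
        JSON string (position information for the matched cells)
    """
    # Docstring sketch only; the search logic is intentionally omitted.
    raise NotImplementedError
```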

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new include_row_data parameter to the Excel search functionality, effectively solving the N+1 problem for row data fetching, which significantly improves performance and reduces token usage. The implementation is robust, incorporating both a fast path using openpyxl's private _cells attribute and a fallback to iter_rows, along with a two-phase match collection to prevent RuntimeError. The changes are well-integrated, include comprehensive unit tests, and are thoroughly documented. Please note that a full security analysis could not be performed for this pull request as the content of src/server.py was not provided. For future reviews, please ensure all relevant file contents are available for a complete security assessment.

@k-ibaraki k-ibaraki merged commit e493d24 into main Feb 11, 2026
7 checks passed
@k-ibaraki k-ibaraki deleted the feat/issue-55-include-row-data branch February 11, 2026 06:50

Development

Successfully merging this pull request may close these issues.

Add include_row_data parameter to Excel search for efficient data retrieval

2 participants