feat: add include_row_data parameter for Excel search (issue #55)#61
feat: add include_row_data parameter for Excel search (issue #55)#61
Conversation
Add include_row_data parameter to search_cells method to retrieve entire row data for each match in a single call, avoiding N+1 reads. Changes: - Add include_row_data parameter to search_cells method - Update _scan_sheet to collect and attach row data when enabled - Add _get_row_data helper method to extract non-null cells from a row - Handle RuntimeError by collecting matches before accessing sheet rows - Support both fast path (_cells) and fallback path (iter_rows) Behavior: - Default: False (backward compatible) - Row data includes only non-null cells - Same-row multiple matches get independent row_data (duplicated) - Single-column sheets handled correctly Related: #55 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Integrate include_row_data parameter into MCP tool interface with updated description for LLM usage guidance. Changes: - Add include_row_data parameter to sharepoint_excel function - Pass parameter to search_cells in search mode - Update tool description with usage guidelines and limitations Tool description updates: - Clarify that row_data contains matched rows only (not headers) - Note same-row match duplication behavior - Emphasize A1:Z5 header read requirement - Add performance guidance (<200 matches recommended) Related: #55 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add test coverage for include_row_data parameter in both parser and server layers. Test cases: - Basic row data inclusion (include_row_data=True) - Default behavior verification (include_row_data=False) - Same-row multiple matches with independent row_data - Null cell exclusion from row_data - Multi-sheet search with row data - Server-level parameter passing Coverage: - Parser layer: 5 new tests - Server layer: 1 new test + 1 updated test All tests passing with 100% coverage of new code paths. Related: #55 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add comprehensive documentation for include_row_data parameter including usage examples, performance guidelines, and limitations. Documentation updates: - Add parameter description to tool parameters table - Add usage example with sample response - Add performance guidelines (<50, 50-200, >200 matches) - Document same-row match duplication behavior - Clarify header exclusion and A1:Z5 requirement - Include verified use case (23 matches, ~2,300 token savings) Languages: - English: docs/usage.md - Japanese: docs/usage_ja.md Related: #55 Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Summary of ChangesHello @k-ibaraki, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the Excel search tool by adding an Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Pull request overview
このPRは、Excel検索機能に include_row_data パラメータを追加し、検索結果に各マッチの行全体のデータを含めることができるようにする優れた改善です。N+1のAPI呼び出し問題を解決し、パフォーマンスとトークン効率を大幅に向上させます。実装は堅牢で、テストカバレッジも十分です。
Changes:
- Excel検索に
include_row_dataパラメータを追加し、1回の呼び出しで行全体のデータを取得可能に(N+1呼び出しを1呼び出しに削減) - 2フェーズマッチ収集により、
_cells辞書のイテレーション中のシートアクセスエラーを回避 - 包括的なテストカバレッジ(5つのパーサーレベルテスト、2つのサーバーレベルテスト)
- 英語と日本語の両方で詳細なドキュメントとパフォーマンスガイドラインを提供
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/sharepoint_excel.py | search_cells と _scan_sheet に include_row_data パラメータを追加、新しい _get_row_data ヘルパーメソッドを実装、高速パス(_cells)とフォールバック(iter_rows)の両方をサポート |
| src/server.py | MCPツールインターフェースで include_row_data を公開、ツール説明を更新してLLM使用ガイドラインと制限事項を追加 |
| tests/test_sharepoint_excel.py | 5つの新しいテストを追加:基本機能、デフォルト動作、同一行の複数マッチ、nullセルの除外、複数シートのサポート |
| tests/test_server.py | include_row_data=True のパススルーを検証する新しいテストを追加、既存テストを更新してデフォルト値を確認 |
| docs/usage.md | 使用例、レスポンス形式、パフォーマンスガイドライン、重要な注意事項を含む英語ドキュメントを追加 |
| docs/usage_ja.md | 使用例、レスポンス形式、パフォーマンスガイドライン、重要な注意事項を含む日本語ドキュメントを追加 |
Comments suppressed due to low confidence (1)
src/sharepoint_excel.py:48
- docstringに新しいパラメータ
include_row_dataの説明が不足しています。Args セクションにinclude_row_data: boolパラメータの説明を追加してください(例: 「Trueの場合、各マッチに行全体のデータ(非nullセルのみ)を含める(デフォルト: False)」)。
"""
セル内容を検索して該当位置を返す
Args:
file_path: Excelファイルのパス
query: 検索キーワード
sheet_name: 検索対象シート名(指定時はまずそのシートを検索し、マッチ0件なら全シート検索にフォールバック)
Returns:
JSON文字列(マッチしたセルの位置情報)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Code Review
This pull request introduces a new include_row_data parameter to the Excel search functionality, effectively solving the N+1 problem for row data fetching, which significantly improves performance and reduces token usage. The implementation is robust, incorporating both a fast path using openpyxl's private _cells attribute and a fallback to iter_rows, along with a two-phase match collection to prevent RuntimeError. The changes are well-integrated, include comprehensive unit tests, and are thoroughly documented. Please note that a full security analysis could not be performed for this pull request as the content of src/server.py was not provided. For future reviews, please ensure all relevant file contents are available for a complete security assessment.
Summary
Add
include_row_dataparameter to Excel search to retrieve entire row data for each match in a single call, eliminating N+1 API calls.Motivation
Problem:
cell_rangereads (N+1 calls)Solution:
include_row_data=Truefetches all row data in single search callChanges
Core Implementation (
src/sharepoint_excel.py)include_row_dataparameter tosearch_cellsmethod_get_row_datahelper to extract non-null cells_cells) and fallback (iter_rows)Server Integration (
src/server.py)include_row_datain MCP tool interfaceTesting
Documentation
Behavior
Default:
include_row_data=False(backward compatible)Response format with
include_row_data=True:{ "matches": [{ "coordinate": "B5", "value": "Monthly Budget", "row_data": [ {"coordinate": "A5", "value": "Category"}, {"coordinate": "B5", "value": "Monthly Budget"}, {"coordinate": "C5", "value": 50000} ] }] }Key points:
row_data(duplicated)Verification
Related
Closes #55
🤖 Generated with Claude Code