Skip to content

Split arrow reader into smaller modules#2358

Merged
CTTY merged 2 commits intoapache:mainfrom
blackmwk:ir-2309
Apr 23, 2026
Merged

Split arrow reader into smaller modules#2358
CTTY merged 2 commits intoapache:mainfrom
blackmwk:ir-2309

Conversation

@blackmwk
Copy link
Copy Markdown
Contributor

Which issue does this PR close?

What changes are included in this PR?

Split arrow reader module into smaller onces so that it would be easier to maintain. I didn't do any extra changes on purpose to make the pr easier to read.

Are these changes tested?

ut.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mbutrovich
Copy link
Copy Markdown
Collaborator

Thanks for tackling this @blackmwk! My only comment is that I noticed PredicateConverter fields are pub(super) with pub(super) column_indices: &'a Vec<usize>. In the original they were pub. This is actually a tightening of visibility, which is fine. I think long term that's the right move, and now I wonder if there's anything else we should tighten visibility on, but for now this looks good.

Copy link
Copy Markdown
Collaborator

@mbutrovich mbutrovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left one minor comment. Thank you @blackmwk!

Copy link
Copy Markdown
Collaborator

@CTTY CTTY left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@CTTY CTTY merged commit 4b0b352 into apache:main Apr 23, 2026
20 checks passed
@blackmwk
Copy link
Copy Markdown
Contributor Author

Thanks for tackling this @blackmwk! My only comment is that I noticed PredicateConverter fields are pub(super) with pub(super) column_indices: &'a Vec<usize>. In the original they were pub. This is actually a tightening of visibility, which is fine. I think long term that's the right move, and now I wonder if there's anything else we should tighten visibility on, but for now this looks good.

Thanks for reporting, I agree that it's fine for now to make it a crate private api.

mbutrovich added a commit to mbutrovich/iceberg-rust that referenced this pull request Apr 24, 2026
Resolve conflict from PR apache#2358 splitting reader.rs into modules.
Port bytes_read/ScanMetrics changes into reader/pipeline.rs:
- FileScanTaskReader struct with ScanMetrics
- CountingFileRead wrapping in open_parquet_file
- ScanResult return type from read()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Split arrow/reader.rs module into smaller modules.

3 participants