-
Notifications
You must be signed in to change notification settings - Fork 4
Add row_schema field to SpaceInfo
#18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| let mut result: Vec<IndexValue<T>> = vec![]; | ||
| for interval in intervals.iter() { | ||
| for index in interval.0..=interval.1 { | ||
| let mut index_records = parse_index_page::<T, PAGE_SIZE>(file, index as u32)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, it looks like for every Link we are currently seeking at page start and re-reading header and then seeking to data start. I think it's not the best way to do that
| Ok(result) | ||
| } | ||
|
|
||
| fn read_links<DataType, const PAGE_SIZE: usize>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part also looks suspicious. I think for big tables it will not work
|
@a-rodin I think we can use iterators approach. You need to implement custom ones for this task. I think there will be two iterators: index iterator and data iterator. Index iterator will load one index page and yield links. When page is read then next page is loaded into memory and parsed. So one page load will give data for many next calls. For data iterator you need to add fn that will read multiple data records from single page. It will have vec of Data iterator is harder as I think. The issue now we have is frequent re-read of data and too much seeks. I think data iterator can have next logic. It will use index iterator and will get vec of
|
|
@Handy-caT I have added three iterators, |
the2pizza
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me with small notes
| use super::{Interval, SpaceInfo}; | ||
|
|
||
| pub fn map_index_pages_to_general<T>(pages: Vec<IndexData<T>>) -> Vec<General<IndexData<T>>> { | ||
| let mut header = &mut GeneralHeader::new(0.into(), PageType::Index, 0.into()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By my compiler warnings it shouldn't be mut
| for index in interval.0..interval.1 { | ||
| let index_page = parse_index_page::<T, PAGE_SIZE>(file, index as u32)?; | ||
| result.push(index_page.inner); | ||
| for index in interval.0..=interval.1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit doubt when I see nested loops. For improving performance I suppose we can merge intervals to reduce amount of iterations.
This PR is a part of work on pathscale/WorkTable#24.