diff --git a/.codeboarding/Core_Model.md b/.codeboarding/Core_Model.md
new file mode 100644
index 0000000..17dba94
--- /dev/null
+++ b/.codeboarding/Core_Model.md
@@ -0,0 +1,87 @@
+```mermaid
+
+graph LR
+
+    Core_Model["Core Model"]
+
+    General_Utilities["General Utilities"]
+
+    Inference_Engine["Inference Engine"]
+
+    Core_Model -- "incorporates functions from" --> General_Utilities
+
+    Core_Model -- "feeds data to" --> Inference_Engine
+
+    click Core_Model href "https://github.com/Genentech/equifold/blob/main/.codeboarding//Core_Model.md" "Details"
+
+```
+
+
+
+[![CodeBoarding](https://img.shields.io/badge/Generated%20by-CodeBoarding-9cf?style=flat-square)](https://github.com/CodeBoarding/GeneratedOnBoardings)[![Demo](https://img.shields.io/badge/Try%20our-Demo-blue?style=flat-square)](https://www.codeboarding.org/demo)[![Contact](https://img.shields.io/badge/Contact%20us%20-%20contact@codeboarding.org-lightgrey?style=flat-square)](mailto:contact@codeboarding.org)
+
+
+
+## Details
+
+
+
+The Core Model component encapsulates the neural network architecture that predicts protein structures. It keeps the model definition separate from other concerns: it incorporates helper functions from General Utilities, and the Inference Engine loads the trained model to run predictions.
+
+
+
+### Core Model [[Expand]](./Core_Model.md)
+
+This component defines the neural network architecture, including its layers, modules, and forward pass logic. It is responsible for learning and predicting protein structures from input features. It uses PyTorch and e3nn to build equivariant neural networks, whose outputs transform consistently when the input 3D structure is rotated or translated.
+
+
+
+
+
+**Related Classes/Methods**:
+
+
+
+- `MLP` (43:43)
+
+- `BesselBasis` (70:70)
+
+- `RadialNN` (93:93)
+
+- `LayerNorm` (139:139)
+
+- `Emb` (172:172)
+
+
+
+
+
+### General Utilities
+
+Provides essential utility functions for calculations within the model's forward pass, such as computing structural metrics and loss functions.
+
+
+
+
+
+**Related Classes/Methods**: _None_
+
+
+
+### Inference Engine
+
+Responsible for loading the trained Core Model and feeding it input data to predict protein structures.
+
+
+
+
+
+**Related Classes/Methods**: _None_
+
+
+
+
+
+
+
+### [FAQ](https://github.com/CodeBoarding/GeneratedOnBoardings/tree/main?tab=readme-ov-file#faq)
\ No newline at end of file
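For readers of `Core_Model.md`: the component list names a `BesselBasis` class without describing it. The sketch below shows one common form of a Bessel radial basis, which expands an interatomic distance into a small feature vector. It is a minimal illustration under stated assumptions: the function name, the default `num_basis` and `cutoff` values, and the normalization are hypothetical, and the repository's `BesselBasis` may differ (for example in cutoff handling or learnable frequencies).

```python
import math


def bessel_basis(r: float, num_basis: int = 8, cutoff: float = 5.0) -> list[float]:
    """Evaluate sqrt(2/c) * sin(n*pi*r/c) / r for n = 1..num_basis.

    Hypothetical helper for illustration; not the repository's API.
    """
    if not 0.0 < r <= cutoff:
        raise ValueError("r must lie in (0, cutoff]")
    prefactor = math.sqrt(2.0 / cutoff)
    return [
        prefactor * math.sin(n * math.pi * r / cutoff) / r
        for n in range(1, num_basis + 1)
    ]


# Expand a distance of 2.5 (same units as the cutoff) into four radial features.
features = bessel_basis(2.5, num_basis=4, cutoff=5.0)
```

Each basis function vanishes at the cutoff, so features fade out smoothly for distant atom pairs. A network such as the listed `RadialNN` would plausibly consume a vector like this, though that wiring is an assumption.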
diff --git a/.codeboarding/Data_Ingestion.md b/.codeboarding/Data_Ingestion.md
new file mode 100644
index 0000000..5566798
--- /dev/null
+++ b/.codeboarding/Data_Ingestion.md
@@ -0,0 +1,109 @@
+```mermaid
+
+graph LR
+
+    Data_Ingestion["Data Ingestion"]
+
+    openfold_light_mmcif_parsing["openfold_light.mmcif_parsing"]
+
+    openfold_light_parsers["openfold_light.parsers"]
+
+    DataHandler["DataHandler"]
+
+    CoreModel["CoreModel"]
+
+    Data_Ingestion -- "comprises" --> openfold_light_mmcif_parsing
+
+    Data_Ingestion -- "comprises" --> openfold_light_parsers
+
+    Data_Ingestion -- "provides processed data to" --> DataHandler
+
+    openfold_light_mmcif_parsing -- "provides parsed structural data to" --> DataHandler
+
+    openfold_light_parsers -- "supplies parsed sequence and alignment data to" --> DataHandler
+
+    DataHandler -- "feeds evolutionary features to" --> CoreModel
+
+    click Data_Ingestion href "https://github.com/Genentech/equifold/blob/main/.codeboarding//Data_Ingestion.md" "Details"
+
+```
+
+
+
+[![CodeBoarding](https://img.shields.io/badge/Generated%20by-CodeBoarding-9cf?style=flat-square)](https://github.com/CodeBoarding/GeneratedOnBoardings)[![Demo](https://img.shields.io/badge/Try%20our-Demo-blue?style=flat-square)](https://www.codeboarding.org/demo)[![Contact](https://img.shields.io/badge/Contact%20us%20-%20contact@codeboarding.org-lightgrey?style=flat-square)](mailto:contact@codeboarding.org)
+
+
+
+## Details
+
+
+
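+This graph shows how raw biological data files are turned into model inputs. Data Ingestion comprises two parsing modules: `openfold_light.mmcif_parsing` for structural data and `openfold_light.parsers` for sequence and alignment data. Both pass their parsed output to the DataHandler, which feeds evolutionary features to the CoreModel.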
+
+
+
+### Data Ingestion [[Expand]](./Data_Ingestion.md)
+
+Responsible for the initial processing of raw biological data, involving parsing various file formats to extract essential information for downstream feature generation.
+
+
+
+
+
+**Related Classes/Methods**: _None_
+
+
+
+### openfold_light.mmcif_parsing
+
+This module parses macromolecular Crystallographic Information File (mmCIF) data. It handles the complex structure of mmCIF files to extract atomic coordinates, chain identifiers, and other structural details of proteins, which is the basis for processing experimentally determined protein structures.
+
+
+
+
+
+**Related Classes/Methods**: _None_
+
+
+
+### openfold_light.parsers
+
+This module provides general parsing for sequence and alignment formats such as FASTA, A3M, and Stockholm. It extracts sequence information, multiple sequence alignments (MSAs), and template hit data, which are needed to generate evolutionary features for protein folding models. It also converts Stockholm format to A3M.
+
+
+
+
+
+**Related Classes/Methods**: _None_
+
+
+
+### DataHandler
+
+This component further processes data after initial ingestion, for example by generating features or creating protein objects.
+
+
+
+
+
+**Related Classes/Methods**: _None_
+
+
+
+### CoreModel
+
+This component is the protein folding model. It receives the evolutionary features prepared by the DataHandler.
+
+
+
+
+
+**Related Classes/Methods**: _None_
+
+
+
+
+
+
+
+### [FAQ](https://github.com/CodeBoarding/GeneratedOnBoardings/tree/main?tab=readme-ov-file#faq)
\ No newline at end of file
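For readers of `Data_Ingestion.md`: the `openfold_light.parsers` description mentions FASTA and A3M parsing without showing what that involves. The sketch below is a minimal, standard-library illustration of both ideas. The function names `parse_fasta` and `strip_a3m_insertions` are hypothetical and are not claimed to match the module's actual API.

```python
def parse_fasta(text: str) -> tuple[list[str], list[str]]:
    """Return (sequences, descriptions) from a FASTA-formatted string."""
    sequences: list[str] = []
    descriptions: list[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A header line starts a new record.
            descriptions.append(line[1:])
            sequences.append("")
        elif sequences:
            # Sequence data may span several lines; join them.
            sequences[-1] += line
    return sequences, descriptions


def strip_a3m_insertions(row: str) -> str:
    """Drop lowercase letters, which mark insertions in A3M rows."""
    return "".join(ch for ch in row if not ch.islower())


example = ">seq1 demo\nMKT\nAYI\n>seq2\nGGS\n"
seqs, descs = parse_fasta(example)
aligned = strip_a3m_insertions("MK-aT")
```

Removing lowercase insertion columns leaves every A3M row the same length as the query, which is what makes the rows usable as a rectangular MSA.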
diff --git a/.codeboarding/Feature_Engineering.md b/.codeboarding/Feature_Engineering.md
new file mode 100644
index 0000000..f640c43
--- /dev/null
+++ b/.codeboarding/Feature_Engineering.md
@@ -0,0 +1,97 @@
+```mermaid
+
+graph LR
+
+    Feature_Engineering["Feature Engineering"]
+
+    Data_Ingestion_and_Parsing["Data Ingestion and Parsing"]
+
+    InferenceEngine["InferenceEngine"]
+
+    Data_Ingestion_and_Parsing -- "provides parsed data to" --> Feature_Engineering
+
+    Feature_Engineering -- "outputs processed features to" --> InferenceEngine
+
+    click Feature_Engineering href "https://github.com/Genentech/equifold/blob/main/.codeboarding//Feature_Engineering.md" "Details"
+
+```
+
+
+
+[![CodeBoarding](https://img.shields.io/badge/Generated%20by-CodeBoarding-9cf?style=flat-square)](https://github.com/CodeBoarding/GeneratedOnBoardings)[![Demo](https://img.shields.io/badge/Try%20our-Demo-blue?style=flat-square)](https://www.codeboarding.org/demo)[![Contact](https://img.shields.io/badge/Contact%20us%20-%20contact@codeboarding.org-lightgrey?style=flat-square)](mailto:contact@codeboarding.org)
+
+
+
+## Details
+
+
+
+The Feature Engineering component, implemented primarily by `openfold_light.data_pipeline`, transforms parsed data into standardized numerical features. It depends on Data Ingestion and Parsing for its inputs and supplies the resulting features to the InferenceEngine.
+
+
+
+### Feature Engineering [[Expand]](./Feature_Engineering.md)
+
+This component transforms the raw data ingested by the `Data Ingestion and Parsing` module into a standardized set of numerical features suitable for the machine learning model. It generates sequence-based features, template features, protein features from structural inputs, and Multiple Sequence Alignment (MSA) features, preparing these as input tensors for the model.
+
+
+
+
+
+**Related Classes/Methods**:
+
+
+
+- <a href="https://github.com/genentech/equifold/blob/main/openfold_light/data_pipeline.py#L38-L62" target="_blank" rel="noopener noreferrer">`openfold_light.data_pipeline:make_template_features` (38:62)</a>
+
+- <a href="https://github.com/genentech/equifold/blob/main/openfold_light/data_pipeline.py#L65-L84" target="_blank" rel="noopener noreferrer">`openfold_light.data_pipeline:make_sequence_features` (65:84)</a>
+
+- <a href="https://github.com/genentech/equifold/blob/main/openfold_light/data_pipeline.py#L87-L120" target="_blank" rel="noopener noreferrer">`openfold_light.data_pipeline:make_mmcif_features` (87:120)</a>
+
+- <a href="https://github.com/genentech/equifold/blob/main/openfold_light/data_pipeline.py#L130-L157" target="_blank" rel="noopener noreferrer">`openfold_light.data_pipeline:make_protein_features` (130:157)</a>
+
+- <a href="https://github.com/genentech/equifold/blob/main/openfold_light/data_pipeline.py#L160-L177" target="_blank" rel="noopener noreferrer">`openfold_light.data_pipeline:make_pdb_features` (160:177)</a>
+
+- <a href="https://github.com/genentech/equifold/blob/main/openfold_light/data_pipeline.py#L180-L213" target="_blank" rel="noopener noreferrer">`openfold_light.data_pipeline:make_msa_features` (180:213)</a>
+
+
+
+
+
+### Data Ingestion and Parsing
+
+Handles the initial reading and parsing of raw data formats (e.g., FASTA, A3M, mmCIF files) and provides pre-processed data structures to other components.
+
+
+
+
+
+**Related Classes/Methods**:
+
+
+
+- <a href="https://github.com/genentech/equifold/blob/main/openfold_light/parsers.py" target="_blank" rel="noopener noreferrer">`openfold_light.parsers`</a>
+
+- <a href="https://github.com/genentech/equifold/blob/main/openfold_light/mmcif_parsing.py" target="_blank" rel="noopener noreferrer">`openfold_light.mmcif_parsing`</a>
+
+
+
+
+
+### InferenceEngine
+
+Consumes the processed feature dictionaries from the Feature Engineering component to perform model predictions or further processing.
+
+
+
+
+
+**Related Classes/Methods**: _None_
+
+
+
+
+
+
+
+### [FAQ](https://github.com/CodeBoarding/GeneratedOnBoardings/tree/main?tab=readme-ov-file#faq)
\ No newline at end of file
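For readers of `Feature_Engineering.md`: the component is described as turning sequences into numerical features, with `make_sequence_features` listed but not shown. The sketch below illustrates the general idea with a one-hot encoding. The function name, the residue ordering, and the extra "unknown" class are assumptions for illustration, not the behavior of `openfold_light.data_pipeline`.

```python
# Twenty standard amino acids; this ordering is an assumption for the sketch.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"
UNKNOWN_INDEX = len(AMINO_ACIDS)  # extra class for any other character


def one_hot_sequence(sequence: str) -> list[list[int]]:
    """Encode each residue as a one-hot vector of length 21."""
    index_of = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    encoded = []
    for residue in sequence.upper():
        row = [0] * (len(AMINO_ACIDS) + 1)
        row[index_of.get(residue, UNKNOWN_INDEX)] = 1
        encoded.append(row)
    return encoded


encoded = one_hot_sequence("AX")
```

A real pipeline would typically store the result as an array or tensor and add further fields such as residue indices and sequence length; those details are left out here.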
diff --git a/.codeboarding/Training_Inference_Orchestration.md b/.codeboarding/Training_Inference_Orchestration.md
new file mode 100644
index 0000000..22667ee
--- /dev/null
+++ b/.codeboarding/Training_Inference_Orchestration.md
@@ -0,0 +1,133 @@
+```mermaid
+
+graph LR
+
+    Training_Inference_Orchestration["Training & Inference Orchestration"]
+
+    ModelRepository["ModelRepository"]
+
+    DataHandler["DataHandler"]
+
+    CoreModel["CoreModel"]
+
+    GeneralUtilities["GeneralUtilities"]
+
+    Training_Inference_Orchestration -- "utilizes" --> DataHandler
+
+    Training_Inference_Orchestration -- "trains" --> CoreModel
+
+    Training_Inference_Orchestration -- "leverages" --> GeneralUtilities
+
+    Training_Inference_Orchestration -- "loads from" --> ModelRepository
+
+    Training_Inference_Orchestration -- "feeds data to" --> CoreModel
+
+    click Training_Inference_Orchestration href "https://github.com/Genentech/equifold/blob/main/.codeboarding//Training_Inference_Orchestration.md" "Details"
+
+```
+
+
+
+[![CodeBoarding](https://img.shields.io/badge/Generated%20by-CodeBoarding-9cf?style=flat-square)](https://github.com/CodeBoarding/GeneratedOnBoardings)[![Demo](https://img.shields.io/badge/Try%20our-Demo-blue?style=flat-square)](https://www.codeboarding.org/demo)[![Contact](https://img.shields.io/badge/Contact%20us%20-%20contact@codeboarding.org-lightgrey?style=flat-square)](mailto:contact@codeboarding.org)
+
+
+
+## Details
+
+
+
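+This graph shows the control flow for training and inference. Training & Inference Orchestration loads model configurations and weights from the ModelRepository, uses the DataHandler to prepare inputs and post-process outputs, trains and feeds data to the CoreModel, and relies on GeneralUtilities for shared helper functions.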
+
+
+
+### Training & Inference Orchestration [[Expand]](./Training_Inference_Orchestration.md)
+
+This component is responsible for orchestrating the entire machine learning workflow, encompassing both model training and inference. For training, it manages data loading, model optimization, loss calculation, and checkpointing. For inference, it handles loading trained models, preparing input data, executing predictions, and post-processing raw model outputs into structured protein data (e.g., PDB files). It acts as the primary control flow for the deep learning operations.
+
+
+
+
+
+**Related Classes/Methods**:
+
+
+
+- <a href="https://github.com/genentech/equifold/blob/main/run_inference.py#L1-L1" target="_blank" rel="noopener noreferrer">`run_inference.py` (1:1)</a>
+
+
+
+
+
+### ModelRepository
+
+Stores model configurations and weights, and saves trained model checkpoints.
+
+
+
+
+
+**Related Classes/Methods**: _None_
+
+
+
+### DataHandler
+
+Handles data loading, preprocessing, and post-processing, including converting sequences to features and generating PDB files.
+
+
+
+
+
+**Related Classes/Methods**:
+
+
+
+- <a href="https://github.com/genentech/equifold/blob/main/utils_data.py#L1-L1" target="_blank" rel="noopener noreferrer">`utils_data.py` (1:1)</a>
+
+- `sequence_to_feats` (1:1)
+
+- `process_one` (1:1)
+
+- `x_to_pdb` (1:1)
+
+
+
+
+
+### CoreModel
+
+Defines the neural network architecture and performs forward and backward passes for protein folding predictions.
+
+
+
+
+
+**Related Classes/Methods**:
+
+
+
+- <a href="https://github.com/genentech/equifold/blob/main/models.py#L1-L1" target="_blank" rel="noopener noreferrer">`models.py` (1:1)</a>
+
+- `NN` (1:1)
+
+
+
+
+
+### GeneralUtilities
+
+Provides common helper functions for tasks like logging, metrics, coarse-graining, and other data transformations.
+
+
+
+
+
+**Related Classes/Methods**: _None_
+
+
+
+
+
+
+
+### [FAQ](https://github.com/CodeBoarding/GeneratedOnBoardings/tree/main?tab=readme-ov-file#faq)
\ No newline at end of file
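For readers of `Training_Inference_Orchestration.md`: the DataHandler is said to generate PDB files (via `x_to_pdb`), but the output format is not described. The sketch below formats a single fixed-column PDB `ATOM` record from predicted coordinates. The function name and its parameters are hypothetical, and it is not claimed to reproduce what `x_to_pdb` does.

```python
def format_atom_line(
    serial: int,
    atom_name: str,
    res_name: str,
    chain_id: str,
    res_seq: int,
    x: float,
    y: float,
    z: float,
    element: str,
    occupancy: float = 1.0,
    b_factor: float = 0.0,
) -> str:
    """Build one PDB ATOM record using the fixed-column layout."""
    # Names shorter than four characters conventionally start in column 14.
    name_field = atom_name if len(atom_name) == 4 else f" {atom_name:<3s}"
    return (
        f"ATOM  {serial:5d} {name_field} {res_name:>3s} {chain_id:1s}"
        f"{res_seq:4d}    {x:8.3f}{y:8.3f}{z:8.3f}"
        f"{occupancy:6.2f}{b_factor:6.2f}          {element:>2s}"
    )


line = format_atom_line(1, "CA", "ALA", "A", 1, 11.104, 6.134, -6.504, "C")
```

Because PDB readers locate fields by column position rather than by whitespace, each value is padded to a fixed width instead of being joined with separators.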
diff --git a/.codeboarding/on_boarding.md b/.codeboarding/on_boarding.md
new file mode 100644
index 0000000..a782c3a
--- /dev/null
+++ b/.codeboarding/on_boarding.md
@@ -0,0 +1,197 @@
+```mermaid
+
+graph LR
+
+    Data_Ingestion["Data Ingestion"]
+
+    Feature_Engineering["Feature Engineering"]
+
+    Protein_Data_Representation["Protein Data Representation"]
+
+    Biophysical_Utilities["Biophysical Utilities"]
+
+    Core_Model["Core Model"]
+
+    Training_Inference_Orchestration["Training & Inference Orchestration"]
+
+    Configuration_Management["Configuration Management"]
+
+    Data_Ingestion -- "provides parsed data to" --> Feature_Engineering
+
+    Feature_Engineering -- "provides processed features to" --> Core_Model
+
+    Protein_Data_Representation -- "relies on" --> Biophysical_Utilities
+
+    Biophysical_Utilities -- "provides constants/helpers to" --> Feature_Engineering
+
+    Biophysical_Utilities -- "provides constants/helpers to" --> Protein_Data_Representation
+
+    Training_Inference_Orchestration -- "orchestrates" --> Core_Model
+
+    Training_Inference_Orchestration -- "utilizes" --> Configuration_Management
+
+    Training_Inference_Orchestration -- "uses" --> Protein_Data_Representation
+
+    Configuration_Management -- "provides settings to" --> Training_Inference_Orchestration
+
+    click Data_Ingestion href "https://github.com/Genentech/equifold/blob/main/.codeboarding//Data_Ingestion.md" "Details"
+
+    click Feature_Engineering href "https://github.com/Genentech/equifold/blob/main/.codeboarding//Feature_Engineering.md" "Details"
+
+    click Core_Model href "https://github.com/Genentech/equifold/blob/main/.codeboarding//Core_Model.md" "Details"
+
+    click Training_Inference_Orchestration href "https://github.com/Genentech/equifold/blob/main/.codeboarding//Training_Inference_Orchestration.md" "Details"
+
+```
+
+
+
+[![CodeBoarding](https://img.shields.io/badge/Generated%20by-CodeBoarding-9cf?style=flat-square)](https://github.com/CodeBoarding/GeneratedOnBoardings)[![Demo](https://img.shields.io/badge/Try%20our-Demo-blue?style=flat-square)](https://www.codeboarding.org/demo)[![Contact](https://img.shields.io/badge/Contact%20us%20-%20contact@codeboarding.org-lightgrey?style=flat-square)](mailto:contact@codeboarding.org)
+
+
+
+## Details
+
+
+
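+The `equifold` project, which covers machine learning model development and inference for computational structural biology, has a modular, data-centric architecture. Analysis of its control flow graph (CFG) and source code shows a clear separation of concerns: Data Ingestion parses input files, Feature Engineering converts the parsed data into model features, and the Core Model predicts structures, while Training & Inference Orchestration drives the workflow using Configuration Management and Protein Data Representation.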
+
+
+
+### Data Ingestion [[Expand]](./Data_Ingestion.md)
+
+This component parses raw biological data from various file formats, including structural data (mmCIF) and sequence/alignment data (A3M, Stockholm, HHR). It extracts essential information such as atomic coordinates, chain identifiers, sequence data, and template hit information, preparing it for feature generation.
+
+
+
+
+
+**Related Classes/Methods**:
+
+
+
+- <a href="https://github.com/genentech/equifold/blob/main/openfold_light/mmcif_parsing.py#L1-L1" target="_blank" rel="noopener noreferrer">`openfold_light.mmcif_parsing` (1:1)</a>
+
+- <a href="https://github.com/genentech/equifold/blob/main/openfold_light/parsers.py#L1-L1" target="_blank" rel="noopener noreferrer">`openfold_light.parsers` (1:1)</a>
+
+
+
+
+
+### Feature Engineering [[Expand]](./Feature_Engineering.md)
+
+This central component transforms the parsed data produced by the `Data Ingestion` module into a standardized set of numerical features suitable for the machine learning model. It generates sequence-based features, template features, and protein features from structural inputs, and prepares these as input tensors for the model.
+
+
+
+
+
+**Related Classes/Methods**:
+
+
+
+- <a href="https://github.com/genentech/equifold/blob/main/openfold_light/data_pipeline.py#L1-L1" target="_blank" rel="noopener noreferrer">`openfold_light.data_pipeline` (1:1)</a>
+
+
+
+
+
+### Protein Data Representation
+
+This component defines the internal data structures for representing protein information, including atoms, residues, and their coordinates. It also provides utilities for converting protein data to and from common formats (e.g., PDB strings) and for constructing protein objects from model predictions, facilitating downstream analysis and visualization.
+
+
+
+
+
+**Related Classes/Methods**:
+
+
+
+- <a href="https://github.com/genentech/equifold/blob/main/openfold_light/protein.py#L1-L1" target="_blank" rel="noopener noreferrer">`openfold_light.protein` (1:1)</a>
+
+
+
+
+
+### Biophysical Utilities
+
+This component serves as a repository for fundamental amino acid properties, stereochemical constants, and utility functions essential for structural calculations, data manipulation, and validation across the project. It provides foundational data and operations for other components.
+
+
+
+
+
+**Related Classes/Methods**:
+
+
+
+- <a href="https://github.com/genentech/equifold/blob/main/openfold_light/residue_constants.py#L1-L1" target="_blank" rel="noopener noreferrer">`openfold_light.residue_constants` (1:1)</a>
+
+
+
+
+
+### Core Model [[Expand]](./Core_Model.md)
+
+This is the heart of the machine learning system: it defines the neural network architecture, an equivariant network built with PyTorch and e3nn. It encapsulates the layers, modules, and forward pass logic responsible for learning and predicting protein structures from the input features.
+
+
+
+
+
+**Related Classes/Methods**:
+
+
+
+- `models.py` (1:1)
+
+
+
+
+
+### Training & Inference Orchestration [[Expand]](./Training_Inference_Orchestration.md)
+
+This component manages the overall training and inference workflows. For training, it handles data loading, optimization, loss calculation, and model checkpointing. For inference, it orchestrates the prediction process, including loading models and running predictions on new data, and post-processes raw model outputs into structured protein data.
+
+
+
+
+
+**Related Classes/Methods**:
+
+
+
+- `openfold_light.train` (1:1)
+
+- `openfold_light.inference` (1:1)
+
+- `openfold_light.run_inference` (1:1)
+
+
+
+
+
+### Configuration Management
+
+This component centralizes the management of all configurable parameters for the project, including model hyperparameters, data paths, training settings, and inference options. It ensures that the system can be easily configured and adapted without modifying source code, promoting reproducibility.
+
+
+
+
+
+**Related Classes/Methods**:
+
+
+
+- `openfold_light.config` (1:1)
+
+
+
+
+
+
+
+
+
+### [FAQ](https://github.com/CodeBoarding/GeneratedOnBoardings/tree/main?tab=readme-ov-file#faq)
\ No newline at end of file