[EASI] merge EASI added benchmarks and models into VLMEvalKit#1433
mzr1996 merged 51 commits into open-compass:main
Conversation
Hi @mzr1996, I have followed the suggestions:
Ready for re-review. Thanks!

Hi @mzr1996

Looks like we need to update the requirements file.
Thanks for the heads-up! I have updated the requirements to include the missing dependencies. However, a new issue has emerged in the latest CI run. I'm currently looking into what might be causing it.

I have invited our QA team to check the CI result. Please wait for our fix.

@PeterWangyi Please rebase onto the main branch. The baseline and image paths have been updated in the main branch. Thanks!

* [Feature] add EASI related spatial bench utils func
* [Style] format using pre-commit
* [Style] format using pre-commit
* [Feature] upgrade unknown image format verify
* [Feature] add mindcube bench
* [Fix] cache path is none error during first download
* [Feature] add embspatial benchmark
* [Feature] add viewspatial benchmark
* [Feature] add mmsi without circular benchmark
* [Feature] enable mmsi && embspatial && viewspatial evaluation
* [Feature] add prepare tsv method to VideoBaseDataset
* [Feature] add vsi bench
* [Feature] add Site Bench
* [Feature] enable vsi && site evaluation
* [Fix] Sitebench category name mismatch

* [Model] add SpatialMLLM support
* [Model] add SpatialMLLM import to __init__.py
* [Style] apply pre-commit check
* [Feature] Add more spatial model
* [Feature] support correct loading of qwen25 derivative models
* [Feature] Add SenseSI series models
* [Feature] add use custom prompt flag to control prompt format.

Co-authored-by: oscarqjh <oscar.jh9@gmail.com>
Co-authored-by: Oscar Qian <91544028+oscarqjh@users.noreply.github.com>

…ce and rename spatial utils folder (open-compass#7)
* [Feature] add EASI related spatial bench utils func
* [Style] format using pre-commit
* [Style] format using pre-commit
* [Feature] upgrade unknown image format verify
* [Feature] add mindcube bench
* [Fix] cache path is none error during first download
* [Feature] add embspatial benchmark
* [Feature] add viewspatial benchmark
* [Feature] add mmsi without circular benchmark
* [Feature] enable mmsi && embspatial && viewspatial evaluation
* [Feature] add prepare tsv method to VideoBaseDataset
* [Feature] add vsi bench
* [Feature] add Site Bench
* [Feature] enable vsi && site evaluation
* [Fix] Sitebench category name mismatch
* [Refactor] Change all EASI related bench tsv download url
* [Refactor] Add caa & mra definition and add site & vsi paper link
* [Refactor] declare no circular is aligned with mmsi official mmsi
* [Refactor] rename spatial utils folder to reduce confusion
* [Refactor] add EASI prompt format explanation
* [Refactor] add EASI prompt format explanation
* [Refactor] switch to new hf url

…mpatibility with latest transformers (open-compass#8)
* [Fix] siteimage wrong url
* [Fix] transformer do not have load_in_8bit param in current version

* [Benchmark] Support RefCOCO (open-compass#1305)
  * Support Qwen3VL Series
  * Support Qwen3-VL Series
  * Support Qwen3-VL Series
  * Support Qwen3-VL Series
  * Support Qwen3-Omni and update Qwen3-VL Series
  * Support Qwen3-Omni and update Qwen3-VL Series
  * Support Qwen3-Omni and update Qwen3-VL Series
  * Support Qwen3-Omni and update Qwen3-VL Series
  * Support Qwen3-Omni and update Qwen3-VL Series
  * support refcoco
  * fix lint
* [Benchmark] Add MindCube Bench (open-compass#1)
  * [Feature] add EASI related spatial bench utils func
  * [Style] format using pre-commit
  * [Style] format using pre-commit
  * [Feature] upgrade unknown image format verify
  * [Feature] add mindcube bench
  * [Fix] cache path is none error during first download
* [Benchmark] Add EASI related image spatial bench (open-compass#2)
  * [Feature] add EASI related spatial bench utils func
  * [Style] format using pre-commit
  * [Style] format using pre-commit
  * [Feature] upgrade unknown image format verify
  * [Feature] add mindcube bench
  * [Fix] cache path is none error during first download
  * [Feature] add embspatial benchmark
  * [Feature] add viewspatial benchmark
  * [Feature] add mmsi without circular benchmark
  * [Feature] enable mmsi && embspatial && viewspatial evaluation
* [Benchmark] Add EASI related video benchmark (open-compass#3)
  * [Feature] add EASI related spatial bench utils func
  * [Style] format using pre-commit
  * [Style] format using pre-commit
  * [Feature] upgrade unknown image format verify
  * [Feature] add mindcube bench
  * [Fix] cache path is none error during first download
  * [Feature] add embspatial benchmark
  * [Feature] add viewspatial benchmark
  * [Feature] add mmsi without circular benchmark
  * [Feature] enable mmsi && embspatial && viewspatial evaluation
  * [Feature] add prepare tsv method to VideoBaseDataset
  * [Feature] add vsi bench
  * [Feature] add Site Bench
  * [Feature] enable vsi && site evaluation
  * [Fix] Sitebench category name mismatch
* [Model] Add SpatialMLLM model (open-compass#4)
  * [Model] add SpatialMLLM support
  * [Model] add SpatialMLLM import to __init__.py
  * [Style] apply pre-commit check
  Co-authored-by: oscarqjh <oscar.jh9@gmail.com>
  Co-authored-by: Oscar Qian <91544028+oscarqjh@users.noreply.github.com>
* [Model] Add Spatial VLM Models (open-compass#5)
  * [Model] add SpatialMLLM support
  * [Model] add SpatialMLLM import to __init__.py
  * [Style] apply pre-commit check
  * [Feature] Add more spatial model
  * [Feature] support correct loading of qwen25 derivative models
  Co-authored-by: oscarqjh <oscar.jh9@gmail.com>
  Co-authored-by: Oscar Qian <91544028+oscarqjh@users.noreply.github.com>
* [Models] Add SenseSI series models (open-compass#6)
  * [Model] add SpatialMLLM support
  * [Model] add SpatialMLLM import to __init__.py
  * [Style] apply pre-commit check
  * [Feature] Add more spatial model
  * [Feature] support correct loading of qwen25 derivative models
  * [Feature] Add SenseSI series models
  * [Feature] add use custom prompt flag to control prompt format.
  Co-authored-by: oscarqjh <oscar.jh9@gmail.com>
  Co-authored-by: Oscar Qian <91544028+oscarqjh@users.noreply.github.com>
* [Refactor] Modify EASI tsv download url and add several paper reference and rename spatial utils folder (open-compass#7)
  * [Feature] add EASI related spatial bench utils func
  * [Style] format using pre-commit
  * [Style] format using pre-commit
  * [Feature] upgrade unknown image format verify
  * [Feature] add mindcube bench
  * [Fix] cache path is none error during first download
  * [Feature] add embspatial benchmark
  * [Feature] add viewspatial benchmark
  * [Feature] add mmsi without circular benchmark
  * [Feature] enable mmsi && embspatial && viewspatial evaluation
  * [Feature] add prepare tsv method to VideoBaseDataset
  * [Feature] add vsi bench
  * [Feature] add Site Bench
  * [Feature] enable vsi && site evaluation
  * [Fix] Sitebench category name mismatch
  * [Refactor] Change all EASI related bench tsv download url
  * [Refactor] Add caa & mra definition and add site & vsi paper link
  * [Refactor] declare no circular is aligned with mmsi official mmsi
  * [Refactor] rename spatial utils folder to reduce confusion
  * [Refactor] add EASI prompt format explanation
  * [Refactor] add EASI prompt format explanation
  * [Refactor] switch to new hf url
* [Fix] SiteImage tsv download url and remove load_in_8bit param for compatibility with latest transformers (open-compass#8)
  * [Fix] siteimage wrong url
  * [Fix] transformer do not have load_in_8bit param in current version
* [Refactor] change SenseNova-SI series models hf dir and fix vsi sitevideo dataset type (open-compass#9)
  * [Fix] siteimage wrong url
  * [Fix] transformer do not have load_in_8bit param in current version
  * [Fix] Specify the dataset type for Sitevideo Vsi.
  * [Refactor] change sensenova_si hf dir
* [Feature] Add VsiBench Debiased subset
* [Feature] Add VsiBench Debiased subset (open-compass#10)
* [Feature] Add cambrian-s model
* [Refactor] Add Requirements guide
* [Fix] delete refcoco due to force push
* [Fix] error when text is empty
* [Fix] spatialmllm inference error during multi images qa
* [Feature] Add VST
* [Feature] automatically specify device
* [Feature] add bagel
* [Refactor] remove unused code and set videoreader num_thread to default=0
* [Feature] Add Bagel Model
* [Feature] Add Bagel Model

Co-authored-by: Junming Lin <114148730+mjuicem@users.noreply.github.com>
Co-authored-by: oscarqjh <oscar.jh9@gmail.com>
Co-authored-by: Oscar Qian <91544028+oscarqjh@users.noreply.github.com>

…ideo dataset type (open-compass#9)
* [Fix] siteimage wrong url
* [Fix] transformer do not have load_in_8bit param in current version
* [Fix] Specify the dataset type for Sitevideo Vsi.
* [Refactor] change sensenova_si hf dir

* [Feature] add sparbench && spatialvizbench && starebench
* [Fix] no cot && cot confit error
* [Fix] no cot && cot confit error
* [Feature] update tsv url && rm useless notes
* [Refactor] upgrade output format
* [Refactor] upgrade output format
* [Feature] Add OmniSpatialBench
* [Refactor] upgrade notes
* [Refactor] remove useless spatial_rel_bench_folder due to rebase and version rollback issue.
* [Refactor] remove useless logic
* [Refactor] more elegant way to choose qwen architecture
* [Refactor] more elegant way to choose qwen architecture
* [Refactor] remove useless model name kwarg
* [Refactor] remove vsi EASI prompt since EASI will not promote ss
* [Refactor] Modify tsv url and remove useless print
* [Feature] added stratified statistical accuracy function
* [Fix] remove top comment
* [Refactor] extract common parts in build prompt for easier understanding
* [Refactor] extract common url
* [Refactor] extract common url
* [Feature] add EASI tsv md5
* [Refactor] remove useless LMUdata import

* Squashed 'vlmeval/vlm/vlm3r/CUT3R/' content from commit 5124436
  git-subtree-dir: vlmeval/vlm/vlm3r/CUT3R
  git-subtree-split: 51244364af3566d6473559f71a81b4accc75c424
* Add VLM3R
* Add VLM3R
* support to download cut3r ckp from official google drive
* add code to build cut3r from the source
* use EASI prompt for vsibench
* rm CUT3R subtree
* Add CUT3R code
* rm unused code in CUT3R
* fix import error
* ignore pth and data
* rm unused spatial encoder and vision encoder
* fix the bug for Siglip vision encoder
* download cut3r pth from HF instead of google drive

…ompass#22)
* [Refactor] Refactor regex answer parsing and improve comments
* [Feature] use last number instead of first to match na options
* [Feature] support English number words in NA matcher
* [Feature] add tools to build options from xlsx rows
* [Feature] support llm matching for both mcq and vqa
* [Feature] support parallel llm judge and support na llm extract
* [Fix] fix corner case when options in multiple lines
* [Feature] add matching func factory
* [Fix] llm extract fetching problem
* [Refactor] improve func naming
* [Feature] support LLM matching
* [Refactor] extract common func
* [Refactor] remove useless content
* [Refactor] modify the content of the copilot check.
* [Fix] construct type error
* [Feature] determine result file name by judge model name
* [Refactor] fix naming of eval mcq func
* [Refactor] improve code style.

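The two matcher changes named in this commit (taking the last number in a response rather than the first, and accepting English number words) can be illustrated with a minimal sketch. This is not the code from the PR: the function name `extract_last_number`, the word table, and both regexes are assumptions made purely for illustration.

```python
import re

# Hypothetical word table; the PR's actual vocabulary is not shown in this log.
NUMBER_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
    "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11, "twelve": 12,
}

# Digits with an optional sign and decimal part; the lookbehind keeps
# a range such as "2-3" from being read as 2 and -3.
DIGIT_RE = re.compile(r"(?<!\d)-?\d+(?:\.\d+)?")
# Whole-word matches only, so "someone" does not count as "one".
WORD_RE = re.compile(r"\b(" + "|".join(NUMBER_WORDS) + r")\b", re.IGNORECASE)


def extract_last_number(text):
    """Return the last number mentioned in `text` as a float, or None."""
    candidates = []  # (position in text, numeric value)
    for match in DIGIT_RE.finditer(text):
        candidates.append((match.start(), float(match.group())))
    for match in WORD_RE.finditer(text):
        candidates.append((match.start(), float(NUMBER_WORDS[match.group().lower()])))
    if not candidates:
        return None
    # The final answer usually follows the reasoning, so prefer the latest mention.
    return max(candidates, key=lambda item: item[0])[1]
```

Preferring the last mention is a heuristic for responses that reason first and state the answer at the end; it would pick the wrong value if a model appends unrelated numbers after its answer.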
…tes consistently (open-compass#26)
* [Feature] Extracting common result file naming logic
* [Refactor] Use single quotes consistently
* [Refactor] Use single quotes consistently
* [Refactor] Use single quotes consistently
* [Refactor] Use single quotes consistently
* [Refactor] Adopt copilot recommendations

* refactor VLM3R: remove folder `vlmeval/vlm/vlm3r/*`, and add `vlmeval/vlm/vlm3r.py`
* remove unused import package
* add the hyper-param in the init func instead of using a fixed value & mv the `VLM3R` into the `spatial_related_models`
* add comments

* [Feature] Extracting common result file naming logic
* [Refactor] Use single quotes consistently
* [Refactor] Use single quotes consistently
* [Refactor] Use single quotes consistently
* [Refactor] Use single quotes consistently
* [Feature] Add a caching mechanism for LLM evaluation
* [Feature] Enable llm cache mechanism
* [Fix] fix bugs according to copilot

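The commit log does not describe how the LLM evaluation cache works. As a rough sketch of one common approach, judge responses can be stored on disk under a hash of the judge model name and prompt, so a rerun skips calls that were already made. The class `JudgeCache`, its JSON file layout, and the `judge_fn` callback are all hypothetical and not taken from the PR.

```python
import hashlib
import json
import os


class JudgeCache:
    """Disk-backed cache for judge responses (illustrative, not the PR's code)."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                self.data = json.load(f)

    @staticmethod
    def make_key(model, prompt):
        # Include the judge model name so different judges never share entries.
        return hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()

    def get_or_call(self, model, prompt, judge_fn):
        key = self.make_key(model, prompt)
        if key not in self.data:
            self.data[key] = judge_fn(prompt)
            # Persist after every new entry so an interrupted run keeps its progress.
            with open(self.path, "w", encoding="utf-8") as f:
                json.dump(self.data, f)
        return self.data[key]
```

Rewriting the whole file on each miss is simple but slow for large evaluations; an append-only log or a small database would scale better.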
* [Refactor] Refactor regex answer parsing and improve comments
* [Feature] use last number instead of first to match na options
* [Feature] support English number words in NA matcher
* [Feature] add tools to build options from xlsx rows
* [Feature] support llm matching for both mcq and vqa
* [Feature] support parallel llm judge and support na llm extract
* [Fix] fix corner case when options in multiple lines
* [Feature] add matching func factory
* [Fix] llm extract fetching problem
* [Refactor] improve func naming
* [Feature] support LLM matching
* [Refactor] extract common func
* [Refactor] remove useless content
* [Refactor] modify the content of the copilot check.
* [Fix] construct type error
* [Feature] determine result file name by judge model name
* [Feature] add ERQA bench
* [Refactor] Add task category and add llm eval
* [Feature] add robospatialbench
* [Feature] add refspatialbench
* [Feature] enable three er bench
* [Feature] Add er benchs
* [Feature] Extracting common result file naming logic
* [Refactor] Use single quotes consistently
* [Refactor] Use single quotes consistently
* [Refactor] Use single quotes consistently
* [Refactor] Use single quotes consistently
* [Feature] Add a caching mechanism for LLM evaluation
* [Feature] Enable llm cache mechanism
* [Feature] backup er bench update
* [Feature] get ready for ERQA bench
* [Feature] use point2dparser to get point coord and get ready for refspatial
* [Feature] add point parser for er benchs
* [Feature] use point2dparser to get points and get ready for robospatial
* [Feature] aligned with qwen3vl prompt format
* [Feature] Add a unified interface to robospatial
* [Feature] fix and refactor according to copilot
* [Refactor] rename ERQA to ERQABench to align with other EASI added benchs
* [Fix] 3DSR result fetch issue

* [Feature] Add SPBench
* [Feature] Update dataset url to hf url
* [Refactor] refactor according to copilot

* [Feature] Add MMSI-Video-Bench
* [Refactor] refactor according to copilot
* [Refactor] refactor according to copilot
* [Feature] Init commit on vsi super bench
* [Feature] Add VsiSuperCount
* [Refactor] refactor according to copilot
* [Refactor] remove * import and enable flake8 check
* [Refactor] refactor according to copilot
* [Fix] lmudata error

…ring download (open-compass#36)
* [Feature] support mmsi video sub bench scores and ignore video.zip while download
* [Feature] print scores use 100 points format
* [Feature] update tsv hf download path
* [Feature] upgrade is_nan_or_none func

* [Feature] Init commit for STI-Bench
* [Feature] add sensenova-si latest models
* [Feature] update stibench tsv hf download path
* [Refactor] upgrade according to copilot
* [Feature] Init commit on DSR bench
* [Feature] modify download func to download video data from easi hf dir
* [Refactor] refactor according to copilot
* [Perf] improve save video frames efficiency

…d_eriq [Benchmark] Add ERIQ bench

@mzr1996 @zhulinJulia24
Feature:
To ensure reproduction accuracy, we have provided a detailed verification report in bench_verify.md, comparing our results with official baselines.