ConceptNet 5.7 (Russian part) extraction scripts and a fast API object for accessing the relations. Note: a simple modification of the preprocessing script makes it possible to build a queryable graph of any other subset of ConceptNet.
pip install ruconceptnet
>>> from ruconceptnet import ConceptNet
>>> cn = ConceptNet()
>>> cn.get_targets("алкоголь")
[('этиловый_спирт', {'Synonym'}), ('спиртной_напиток', {'Synonym'}), ('алкогольный', {'RelatedTo'}),
('алкоголик', {'RelatedTo'}), ('спирт', {'Synonym'}), ('алкоголизация', {'RelatedTo'})]
>>> cn.get_sources("йога")
[('йоги', {'FormOf'}), ('йогу', {'FormOf'}), ('йогический', {'RelatedTo'}), ('йогою', {'FormOf'}),
('йогой', {'FormOf'}), ('йог', {'RelatedTo'}), ('йоге', {'FormOf'})]
>>> cn.check_pair("человек", "зверь")
(['DistinctFrom'], [])
>>> cn.check_pair("зверь", "человек")
([], ['DistinctFrom'])
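The results are plain Python lists of (node, relation set) tuples, so they can be post-processed with ordinary code. The sketch below groups neighbours by relation; it runs on data copied from the `get_targets("алкоголь")` output above rather than on a live `ConceptNet` object, and `group_by_relation` is a hypothetical helper, not part of the package.

```python
from collections import defaultdict

# Sample data copied from the get_targets("алкоголь") output shown above.
targets = [
    ("этиловый_спирт", {"Synonym"}),
    ("спиртной_напиток", {"Synonym"}),
    ("алкогольный", {"RelatedTo"}),
    ("алкоголик", {"RelatedTo"}),
    ("спирт", {"Synonym"}),
    ("алкоголизация", {"RelatedTo"}),
]


def group_by_relation(pairs):
    """Map each relation name to the sorted list of nodes that carry it.

    Hypothetical helper for illustration; it is not part of ruconceptnet.
    """
    grouped = defaultdict(list)
    for node, relations in pairs:
        for relation in relations:
            grouped[relation].append(node)
    return {relation: sorted(nodes) for relation, nodes in grouped.items()}


by_relation = group_by_relation(targets)
print(by_relation["Synonym"])
```

The same helper should work on the output of `get_sources`, since it has the same shape.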
Every relation carries ConceptNet's weight (assertion confidence). Pass
with_weights=True to get {relation: weight} mappings instead of plain sets:
>>> cn.get_targets("алкоголь", with_weights=True)
[('спирт', {'Synonym': 2.0}), ('алкоголизм', {'RelatedTo': 3.5}), ...]
>>> cn.check_pair("человек", "зверь", with_weights=True)
({'DistinctFrom': 0.5}, {})
When several assertions share the same (source, target, relation), the
strongest (maximum) weight is kept. Data built before weights were added
reports 1.0 for every edge.
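The maximum-weight rule can be sketched as follows. This is a minimal illustration of the idea, not the package's actual code: `keep_strongest` is a hypothetical name, and the weights in `raw` are made up.

```python
def keep_strongest(assertions):
    """Collapse duplicate (source, target, relation) triples to their maximum weight.

    Hypothetical helper; the real preprocessing code may be organised differently.
    """
    strongest = {}
    for source, target, relation, weight in assertions:
        key = (source, target, relation)
        # Keep the new weight only if it beats what we have seen so far.
        if weight > strongest.get(key, float("-inf")):
            strongest[key] = weight
    return strongest


# Made-up assertions: each triple appears twice with different weights.
raw = [
    ("человек", "зверь", "DistinctFrom", 0.5),
    ("человек", "зверь", "DistinctFrom", 0.25),
    ("алкоголь", "спирт", "Synonym", 1.0),
    ("алкоголь", "спирт", "Synonym", 2.0),
]

edges = keep_strongest(raw)
for triple, weight in edges.items():
    print(triple, weight)
```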
Please see the prepare_data.sh script. We extract the Russian-Russian pairs of nodes with a simple grep and build
a 3-dimensional array (source, target, relation) stored as a single sparse SciPy matrix.
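The two steps can be sketched in pure Python. The filter relies on the tab-separated ConceptNet assertions format (assertion URI, relation, start node, end node, JSON metadata). The index layout in `encode` is an assumption made for illustration: one plausible way to fit three dimensions into a 2-D matrix, not necessarily the layout prepare_data.sh uses. All function names and sample lines are hypothetical.

```python
def is_ru_pair(line):
    """True when both the start and end nodes of an assertion line are Russian."""
    fields = line.rstrip("\n").split("\t")
    # Columns: assertion URI, relation, start node, end node, JSON metadata.
    return (
        len(fields) >= 4
        and fields[2].startswith("/c/ru/")
        and fields[3].startswith("/c/ru/")
    )


def encode(source_id, target_id, relation_id, n_relations):
    """Flatten a 3-D index into (row, column) for a 2-D sparse matrix.

    Assumed layout: rows are sources, columns interleave targets and relations.
    """
    return source_id, target_id * n_relations + relation_id


def decode(row, col, n_relations):
    """Invert encode(): recover (source_id, target_id, relation_id)."""
    target_id, relation_id = divmod(col, n_relations)
    return row, target_id, relation_id


# Made-up sample lines in the assertions format.
lines = [
    "/a/x\t/r/Synonym\t/c/ru/алкоголь\t/c/ru/спирт\t{\"weight\": 2.0}",
    "/a/y\t/r/RelatedTo\t/c/en/alcohol\t/c/ru/алкоголь\t{\"weight\": 1.0}",
]
ru_lines = [line for line in lines if is_ru_pair(line)]

row, col = encode(source_id=3, target_id=7, relation_id=2, n_relations=5)
print(len(ru_lines), (row, col), decode(row, col, n_relations=5))
```

The resulting (row, column, weight) triples are the kind of input a constructor such as `scipy.sparse.coo_matrix((data, (rows, cols)), shape=...)` accepts.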
# install the package together with the development dependencies
pip install -e ".[dev]"
# run the test suite with coverage (threshold enforced at 80%)
pytest
# lint and format
ruff check .
ruff format .
# optional: enable the git hooks
pre-commit install
Please do not forget to cite the ConceptNet5 paper.
@inproceedings{10.5555/3298023.3298212,
author = {Speer, Robyn and Chin, Joshua and Havasi, Catherine},
title = {ConceptNet 5.5: An Open Multilingual Graph of General Knowledge},
year = {2017},
publisher = {AAAI Press},
booktitle = {Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence},
pages = {4444–4451},
numpages = {8},
location = {San Francisco, California, USA},
series = {AAAI'17}
}
If you use this work, citing the repository is not required but is greatly appreciated.
@misc{ruconceptnet2020alekseev,
title = {{alexeyev/RuConceptNet: /ru/ConceptNet5.7 Python wrapper}},
year = {2020},
url = {https://github.com/alexeyev/RuConceptNet},
language = {english}
}
The code is released under the MIT license (please see the LICENSE file).
This work includes a subset of the data from ConceptNet 5, which was compiled by the Commonsense Computing Initiative. ConceptNet 5 is freely available under the Creative Commons Attribution-ShareAlike license (CC BY SA 3.0) from http://conceptnet.io.
The included data was created by contributors to Commonsense Computing projects, contributors to Wikimedia projects, DBPedia, OpenCyc, Games with a Purpose, Princeton University's WordNet, Francis Bond's Open Multilingual WordNet, and Jim Breen's JMDict.
The complete data in ConceptNet is available under the Creative Commons Attribution-ShareAlike 4.0 license.
For more details, please see "Copying and sharing ConceptNet".