Loaders and utils to work with RDF data using LangChain:
- OntologyLoader: load OWL ontology classes and properties into your vectorstore
- SparqlExamplesLoader: load SPARQL query examples into your vectorstore. The SPARQL queries are retrieved from a SPARQL endpoint, where they are stored using the SHACL ontology along with a human-readable description.
📦️ Installation
This package requires Python >=3.8. Install it from the git repository with:
pip install git+https://github.com/vemonet/langchain-rdf.git
🪄 Usage
Note: Refer to the LangChain documentation to figure out how best to integrate document loaders into your stack, or check our complete notebook examples, which use only open source components, run locally, and include conversation memory.
OWL ontology loader
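As a rough picture of what turning ontology classes and properties into documents can look like, here is a minimal, standard-library-only sketch. It is not the package's implementation: the `extract_documents` helper, the `SAMPLE_OWL` snippet, and the dictionary layout (`page_content`, `metadata`) are all assumptions made for illustration, and it only reads `rdfs:label` from an RDF/XML string.

```python
import xml.etree.ElementTree as ET

# Namespaces commonly used by OWL ontologies serialized as RDF/XML.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}

# Tiny made-up ontology (hypothetical URIs) so the sketch runs offline.
SAMPLE_OWL = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/onto#Protein">
    <rdfs:label>protein</rdfs:label>
  </owl:Class>
  <owl:ObjectProperty rdf:about="http://example.org/onto#hasPart">
    <rdfs:label>has part</rdfs:label>
  </owl:ObjectProperty>
  <owl:Class rdf:about="http://example.org/onto#Unlabelled"/>
</rdf:RDF>
"""


def extract_documents(owl_xml):
    """Return one dict per labelled class or property (hypothetical layout)."""
    root = ET.fromstring(owl_xml)
    about_attr = "{%s}about" % NS["rdf"]
    docs = []
    for kind in ("Class", "ObjectProperty", "DatatypeProperty"):
        for elem in root.findall(f"owl:{kind}", NS):
            label = elem.find("rdfs:label", NS)
            if label is None or not label.text:
                continue  # skip entities without a usable label
            docs.append(
                {
                    "page_content": label.text.strip(),
                    "metadata": {"uri": elem.get(about_attr), "type": kind},
                }
            )
    return docs


docs = extract_documents(SAMPLE_OWL)
print(len(docs))
```

The label is used as the text to embed, while the URI is kept as metadata so that a retrieved match can be mapped back to the ontology term. With the package, loading a full ontology is a single call: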
from langchain_rdf import OntologyLoader
loader = OntologyLoader("https://semanticscience.org/ontology/sio.owl", format="xml")
documents = loader.load()
print(len(documents))
SPARQL query examples
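To make the idea of query examples stored with SHACL and a human-readable description more concrete, the sketch below shows one way such examples could be listed and converted into documents. Everything in it is an assumption for illustration: the `EXAMPLES_QUERY` text (in particular the use of `rdfs:comment` for the description), the `SAMPLE_RESPONSE` payload, and the `bindings_to_documents` helper are hypothetical and do not come from the package. The response is shaped like the standard SPARQL JSON results format, so no network access is needed.

```python
import json

# Hypothetical query for listing stored examples. sh:SPARQLExecutable and
# sh:select are SHACL terms; using rdfs:comment for the description is an
# assumption about how the endpoint stores its examples.
EXAMPLES_QUERY = """
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?comment ?query WHERE {
    ?example a sh:SPARQLExecutable ;
        rdfs:comment ?comment ;
        sh:select ?query .
}
"""

# Made-up response in the SPARQL JSON results format.
SAMPLE_RESPONSE = json.dumps(
    {
        "head": {"vars": ["comment", "query"]},
        "results": {
            "bindings": [
                {
                    "comment": {"type": "literal", "value": "Select all taxa"},
                    "query": {
                        "type": "literal",
                        "value": "SELECT ?taxon WHERE { ?taxon a up:Taxon }",
                    },
                }
            ]
        },
    }
)


def bindings_to_documents(response_text, endpoint_url):
    """Turn each result row into a dict: description as text, query as metadata."""
    data = json.loads(response_text)
    docs = []
    for row in data["results"]["bindings"]:
        docs.append(
            {
                "page_content": row["comment"]["value"],
                "metadata": {
                    "query": row["query"]["value"],
                    "endpoint": endpoint_url,
                },
            }
        )
    return docs


docs = bindings_to_documents(SAMPLE_RESPONSE, "https://sparql.uniprot.org/sparql/")
print(len(docs))
```

Embedding the description rather than the query text lets a natural-language question match an example, with the query itself carried along as metadata. With the package, retrieving the examples from a live endpoint is a single call: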
from langchain_rdf import SparqlExamplesLoader
loader = SparqlExamplesLoader("https://sparql.uniprot.org/sparql/")
documents = loader.load()
print(len(documents))
🧑‍💻 Development setup
This final section of the README is for those who want to run the package in development and contribute code.
📥️ Clone
Clone the repository:
git clone https://github.com/vemonet/langchain-rdf
cd langchain-rdf
🐣 Install dependencies
Install Hatch. It automatically handles virtual environments and makes sure all dependencies are installed when you run a script in the project:
pipx install hatch
☑️ Run tests
Make sure the existing tests still work by running the test suite and linting checks. Note that any pull request to this repository on GitHub will automatically trigger the test suite:
hatch run test
To display all logs when debugging:
hatch run test -s
♻️ Reset the environment
If dependencies are not updating properly, you can reset the virtual environment with:
hatch env prune
To manually trigger installation of the dependencies in a local virtual environment:
hatch -v env create
🏷️ New release process
The deployment of new releases is done automatically by a GitHub Action workflow when a new release is created on GitHub. To release a new version:
- Make sure the PYPI_TOKEN secret has been defined in the GitHub repository (in Settings > Secrets > Actions). You can get an API token from PyPI at pypi.org/manage/account.
- Increment the version number in the pyproject.toml file in the root folder of the repository.
  hatch version fix
- Create a new release on GitHub, which will automatically trigger the publish workflow and publish the new release to PyPI.
You can also build and publish from your computer:
hatch build
hatch publish
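For illustration, the effect of the `hatch version fix` step in the release process above can be sketched as follows. This assumes a plain `major.minor.fix` version string and that the `fix` segment refers to the last of the three numbers. The `bump_fix` helper is hypothetical and far simpler than what Hatch supports.

```python
def bump_fix(version):
    """Increment the last number of a 'major.minor.fix' version string."""
    major, minor, fix = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{fix + 1}"


print(bump_fix("0.1.2"))  # prints 0.1.3
```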