LangChain RDF

Loaders and utils to work with RDF data using LangChain:

OntologyLoader: load the classes and properties of an OWL ontology into your vector store.
SparqlExamplesLoader: load SPARQL query examples into your vector store. The queries are retrieved from a SPARQL endpoint, where they are stored using the SHACL ontology together with a human-readable description.

📦️ Installation

This package requires Python >=3.8. Install it from the git repository with:

pip install git+https://github.com/vemonet/langchain-rdf.git

🪄 Usage

Note

Refer to the LangChain documentation to work out how best to integrate document loaders into your stack, or check our complete notebook examples, which use only open source components, run locally, and include conversation memory.

OWL ontology loader

from langchain_rdf import OntologyLoader

loader = OntologyLoader("https://semanticscience.org/ontology/sio.owl", format="xml")
documents = loader.load()
print(len(documents))
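
To give a feel for what "classes and properties" means in an RDF/XML ontology (the `format="xml"` case above), here is a minimal standard-library sketch that pulls labelled terms out of a tiny inline ontology. It is an illustration only, not how `OntologyLoader` is implemented: the `labelled_terms` helper, the `SAMPLE_OWL` snippet, and the `example.org` IRIs are all hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Namespace IRIs used by RDF/XML serializations of OWL ontologies.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}

# Tiny hypothetical ontology, inlined so the example runs offline.
SAMPLE_OWL = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/onto#Protein">
    <rdfs:label>protein</rdfs:label>
  </owl:Class>
  <owl:ObjectProperty rdf:about="http://example.org/onto#encodes">
    <rdfs:label>encodes</rdfs:label>
  </owl:ObjectProperty>
</rdf:RDF>
"""


def labelled_terms(rdf_xml: str) -> Dict[str, str]:
    """Map the IRI of each top-level OWL class or property to its rdfs:label."""
    root = ET.fromstring(rdf_xml)
    about = "{%s}about" % NS["rdf"]  # fully qualified rdf:about attribute name
    terms = {}
    for tag in ("owl:Class", "owl:ObjectProperty", "owl:DatatypeProperty"):
        for node in root.findall(tag, NS):
            label = node.find("rdfs:label", NS)
            # Skip anonymous nodes and terms without a usable label.
            if about in node.attrib and label is not None and label.text:
                terms[node.attrib[about]] = label.text.strip()
    return terms


print(labelled_terms(SAMPLE_OWL))
```

Each IRI and label pair is the kind of text a loader could turn into a document for embedding. A real ontology such as SIO carries many more annotations than this sketch reads.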

SPARQL query examples

from langchain_rdf import SparqlExamplesLoader

loader = SparqlExamplesLoader("https://sparql.uniprot.org/sparql/")
documents = loader.load()
print(len(documents))
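
As a rough sketch of the idea behind this loader, the snippet below shows one plausible shape for a query that retrieves examples stored with SHACL, and how a response in the standard SPARQL JSON results format could be turned into (description, query) pairs. The query shape (`sh:SPARQLExecutable`, `rdfs:comment`, `sh:select`), the `pair_examples` helper, and the sample response are assumptions made for illustration. The query that `SparqlExamplesLoader` actually sends may differ.

```python
from typing import Dict, List, Tuple

# Assumed query shape: each example is typed sh:SPARQLExecutable, described
# with rdfs:comment, and carries its query text in sh:select.
EXAMPLES_QUERY = """
PREFIX sh: <http://www.w3.org/ns/shacl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?comment ?query WHERE {
    ?example a sh:SPARQLExecutable ;
        rdfs:comment ?comment ;
        sh:select ?query .
}
"""


def pair_examples(results: Dict) -> List[Tuple[str, str]]:
    """Turn a SPARQL JSON result set into (description, query) pairs."""
    pairs = []
    for row in results["results"]["bindings"]:
        # Keep only rows where both variables are bound.
        if "comment" in row and "query" in row:
            pairs.append((row["comment"]["value"], row["query"]["value"]))
    return pairs


# Hypothetical response in the SPARQL 1.1 JSON results format.
SAMPLE_RESULTS = {
    "head": {"vars": ["comment", "query"]},
    "results": {
        "bindings": [
            {
                "comment": {"type": "literal", "value": "List all taxa"},
                "query": {
                    "type": "literal",
                    "value": "SELECT ?taxon WHERE { ?taxon a up:Taxon }",
                },
            }
        ]
    },
}

for description, query in pair_examples(SAMPLE_RESULTS):
    print(description, "->", query)
```

Pairing each query with its human-readable description is what makes the examples useful for retrieval: the description can be embedded and matched against a user question, while the query travels along as the payload.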

🧑‍💻 Development setup

This final section of the README is for those who want to run the package in development and contribute code.

📥️ Clone

Clone the repository:

git clone https://github.com/vemonet/langchain-rdf
cd langchain-rdf

🐣 Install dependencies

Install Hatch. It automatically handles virtual environments and makes sure all dependencies are installed when you run a script in the project:

pipx install hatch

☑️ Run tests

Make sure the existing tests still pass by running the test suite and linting checks. Note that any pull request to this repository on GitHub automatically triggers a run of the test suite:

hatch run test

To display all logs when debugging:

hatch run test -s

♻️ Reset the environment

If dependencies are not updating properly, you can reset the virtual environment with:

hatch env prune

To manually install the dependencies in a local virtual environment:

hatch -v env create

🏷️ New release process

The deployment of new releases is done automatically by a GitHub Action workflow when a new release is created on GitHub. To release a new version:

Make sure the PYPI_TOKEN secret has been defined in the GitHub repository (in Settings > Secrets > Actions). You can get an API token from PyPI at pypi.org/manage/account.
Increment the version number in the pyproject.toml file in the root folder of the repository, for example with a patch-level bump:
```
hatch version fix
```
Create a new release on GitHub. This automatically triggers the publish workflow, which publishes the new release to PyPI.

You can also build and publish from your computer:

hatch build
hatch publish
