
StyMaar/omogen-test

An automated CV filtering system that extracts information from resumes and matches them with job offers using Large Language Models (LLMs).

A Python library that exposes two public functions:

  • extract_job_requirements extracts features from a job description:
    • it takes a job description
    • performs an API call to an LLM service (using the llama.cpp API)
    • it passes the requirement-extraction system prompt stored in the DB to the LLM before passing it the job description
    • the LLM call is set up in a way that forces the LLM to reply with valid JSON containing the following fields (all strings except Experience, which must be an integer):
      1. Skills - a list of technical and soft skills required for the job
      2. Experience - years of experience required
      3. Location
      4. Education - degree requirements
      5. Certifications - a list of professional certification requirements
    • then it stores the response in the database, as a JSON string, linked to the job description
  • match_resume_to_job matches a resume, passed as argument, against a job description:
    • it takes the id of the job description and the resume (plain text)
    • loads the requirements from the database
    • for each requirement, starts a new conversation with the LLM, giving it the resume-parsing system prompt, the resume, and the requirement we're looking for
    • the LLM answers with a JSON containing:
      1. Match Status: an enum representing the match between the resume and the requirement. The enum can take 4 values representing the level of confidence that it's a match (the names of the values are TBD).
      2. Explanation: a string, the human-readable justification for the decision
    • then we compute an overall percentage score: each match status is cast to an integer between 1 and 4, the statuses are averaged, and the average is divided by 4.
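
The scoring step above could be sketched as follows (the enum value names are placeholders, since the README marks them as TBD):

```python
from enum import IntEnum
from statistics import mean

class MatchStatus(IntEnum):
    # Placeholder names: the value names are explicitly TBD in the spec.
    NO_MATCH = 1
    WEAK_MATCH = 2
    LIKELY_MATCH = 3
    STRONG_MATCH = 4

def overall_score(statuses: list[MatchStatus]) -> float:
    """Average of the per-requirement statuses (each an integer 1-4),
    divided by 4, expressed as a percentage."""
    return mean(int(s) for s in statuses) / 4 * 100

# e.g. two strong matches and one weak match:
# (4 + 4 + 2) / 3 / 4 * 100 ≈ 83.3
```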

LLM calls and DB accesses are managed by a helper class that is passed as an argument to both functions, so that it can easily be mocked for testing purposes.
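
A minimal sketch of that dependency-injection pattern (the class name, method names, and system prompt below are hypothetical, not the library's actual API):

```python
import json
from unittest.mock import Mock

class Services:
    """Hypothetical helper wrapping the LLM endpoint and the database,
    injected into the public functions so tests can mock it."""

    def complete(self, system_prompt: str, user_message: str) -> str:
        raise NotImplementedError  # real implementation calls the llama.cpp API

    def save_requirements(self, job_id: int, requirements_json: str) -> None:
        raise NotImplementedError  # real implementation writes to the DB

def extract_job_requirements(job_id: int, job_description: str, services: Services) -> dict:
    # Sketch of the flow described above: prompt the LLM, parse the
    # JSON reply, persist it linked to the job description.
    reply = services.complete("<requirement-extraction system prompt>", job_description)
    requirements = json.loads(reply)
    services.save_requirements(job_id, json.dumps(requirements))
    return requirements

# In tests, the helper is replaced with a mock:
services = Mock(spec=Services)
services.complete.return_value = (
    '{"Skills": ["Python"], "Experience": 3, "Location": "Remote",'
    ' "Education": "BSc", "Certifications": []}'
)
result = extract_job_requirements(1, "Backend developer job...", services)
```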

How to run

To run the test scenario, call: `uv run main.py`.

Caveats

  • So far the implementation has only been tested against mocked LLM calls, not actual LLMs. It's very unlikely to work since we neither pass a JSON schema constraint to the LLM API nor validate that the LLM output is the expected JSON (it should work with llama.cpp if we spawn the llama.cpp process with the schema as a constraint, but we would need to spawn two instances of llama.cpp for that, consuming twice the VRAM)
  • there's barely any error handling, or at least no consistent error handling strategy.
  • there is no logging whatsoever
  • only the job requirements are persisted, not the resume evaluation, which is obviously unsuitable for actual use.
  • there are no tests
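
On the first caveat: until a schema constraint is wired in, a lightweight validation of the LLM reply could look like this (a sketch, not part of the current code; field names follow the schema described above):

```python
import json

# Expected shape of the requirement-extraction reply: all fields are
# strings (or lists) except Experience, which must be an integer.
EXPECTED_TYPES = {
    "Skills": list,
    "Experience": int,
    "Location": str,
    "Education": str,
    "Certifications": list,
}

def validate_requirements_reply(raw: str) -> dict:
    """Parse the LLM output and check it has the expected fields and types.
    Raises ValueError on any mismatch instead of silently storing junk."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"LLM reply is not valid JSON: {exc}") from exc
    for field, expected in EXPECTED_TYPES.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected):
            raise ValueError(f"{field} should be {expected.__name__}")
    return data
```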

Created October 8, 2025
Updated October 8, 2025