GitHunt
HY

hyparam/hylang

A stupidly small and fast programming language detection model

HyLang

HyLang

workflow status
mit license
dependencies
coverage

A stupidly small and fast programming language detection model.

Usage

import { detectLanguage } from 'hylang'

const input = `
  function square(x) {
    return x * x
  }
`
console.log(`Predicted language: ${detectLanguage(input)}`)

Accuracy

Hylang eval is sampled from the starcoderdata dataset.

Hylang is 30.4kb packed and achieves 74.5% accuracy.

Implementation

The language detector is implemented as a bag of words model trained on the starcoderdata dataset.

Training is done in python with torch. Model weights are exported to params.json so they can be used in javascript.

Languages

Python65.7%JavaScript32.4%HTML1.9%

Contributors

MIT License
Created June 20, 2024
Updated November 25, 2024