Soichiro Nishimori
nissymori
PhD student. Interested in Game AI, JAX-based RL, offline RL and exploration.
Repositories (14)
Clean single-file implementation of offline RL algorithms in JAX
No description provided.
A GPU-Accelerated Mahjong Simulator for RL in JAX
No description provided.
No description provided.
Reference implementation for DPO (Direct Preference Optimization)
No description provided.
[NeurIPS 2023] The official code for paper "State Regularized Policy Optimization on Data with Dynamics Shift"
TD-Gammon implementation
A collection of reference environments for offline reinforcement learning
No description provided.
Code for our EMNLP 2020 paper "Multilevel Text Alignment with Cross-Document Attention" by Xuhui Zhou, Nikolaos Pappas, and Noah A. Smith
A simple REINFORCE algorithm implementation in PyTorch
Game server for Japanese Mahjong AI.