ImlyChlung/LSTM-Stock-Predication
End-to-end algorithmic trading system using PyTorch LSTM. Features a custom Cross-Sectional Alpha ranking engine, dynamic portfolio compounding, and event-driven backtesting (+483% return vs SPY).
AI-Powered Quantitative Trading System (LSTM)
Executive Summary
This project implements an end-to-end algorithmic trading system leveraging Deep Learning (LSTM) to predict stock price movements and generate alpha.
Unlike traditional price prediction models, this system focuses on Cross-Sectional Alpha Scoring: ranking stocks based on their relative strength against the market (SPY) and volatility metrics. The project demonstrates a complete quantitative workflow, from data ingestion and complex feature engineering to model training and event-driven backtesting.
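To make the idea of a cross-sectional target concrete, here is a minimal sketch. It assumes the alpha target is a stock's forward return minus SPY's forward return over the same horizon, and it leaves out the volatility adjustment. The function names, the `horizon` value, and the exact definition are assumptions for illustration, not taken from feature_engineering.py.

```python
import pandas as pd

def alpha_target(close: pd.DataFrame, benchmark: pd.Series, horizon: int = 5) -> pd.DataFrame:
    """Forward return of each stock minus the benchmark's forward return.

    close:     one column per ticker, indexed by date (hypothetical layout)
    benchmark: SPY closes on the same date index
    horizon:   look-ahead in trading days (assumed value)
    """
    fwd_stock = close.shift(-horizon) / close - 1.0
    fwd_bench = benchmark.shift(-horizon) / benchmark - 1.0
    # Subtract the benchmark row-wise, so every ticker is compared to the same SPY return.
    return fwd_stock.sub(fwd_bench, axis=0)

def cross_sectional_rank(scores: pd.DataFrame) -> pd.DataFrame:
    """Rank each date's scores across tickers as a percentile in (0, 1]."""
    return scores.rank(axis=1, pct=True)
```

The last `horizon` rows of the target are NaN because no forward price exists for them; they would need to be dropped before training.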
Key Performance Highlight (Backtest):
- Total Return: +483% over the 3-year out-of-sample period (79.9% CAGR), vs. a +23% CAGR for the S&P 500 benchmark.
- Strategy: Dynamic position sizing with compounding capital.
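The total return and CAGR figures are linked by the standard compounding identity. A minimal check follows; the function names are chosen here for illustration.

```python
def cagr(total_return: float, years: float) -> float:
    """Annualized growth rate implied by a cumulative return over `years`."""
    return (1.0 + total_return) ** (1.0 / years) - 1.0

def total_return(cagr_rate: float, years: float) -> float:
    """Cumulative return implied by compounding `cagr_rate` for `years`."""
    return (1.0 + cagr_rate) ** years - 1.0

if __name__ == "__main__":
    # +483% over 3 years, as reported in the backtest summary
    print(f"Strategy CAGR: {cagr(4.83, 3):.1%}")
    # 23% per year compounded over the same 3 years
    print(f"Benchmark total return: {total_return(0.23, 3):.1%}")
```

A +483% total return over 3 years works out to roughly 80% per year, in line with the reported CAGR.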
Performance Analysis
1. Equity Curve vs. Benchmark (see backtest_trades.csv for the individual trades the model made)
The strategy significantly outperformed the S&P 500 benchmark during the out-of-sample testing period. The compounding effect and dynamic position sizing allowed the portfolio to capitalize on high-confidence signals.

(Blue: AI Strategy | Grey: S&P 500 Benchmark)
Strategy Logic:
- Alpha Selection (Top-K): The model screens the S&P 500 universe daily and ranks candidates by predicted Alpha Score. It enters a position only when the score is above 0.2, taking the highest-ranked candidates first. (A score above 0 indicates that the model has more than 50% confidence that the stock will outperform the S&P 500 index.)
- Dynamic Position Sizing (Compounding): The portfolio is capped at 5 positions (20% allocation each). Crucially, trade sizes are dynamically calculated based on Current Total Equity rather than initial capital, allowing the portfolio to compound gains aggressively during winning streaks.
- Condition-Based Exit: Holdings are reviewed weekly (every 7 days). A position is liquidated if the model's predicted score turns negative (Score < 0), ensuring capital is protected from deteriorating trends.
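The three rules above can be sketched as plain functions. This is an illustration under stated assumptions, not the code in backtest.py: the function names, the `scores` dictionary layout, and the handling of tickers with no score are all hypothetical.

```python
from typing import Dict, List

MAX_POSITIONS = 5       # portfolio cap (20% allocation each)
ENTRY_THRESHOLD = 0.2   # enter only when the predicted score is above this
EXIT_THRESHOLD = 0.0    # exit when the predicted score falls below this

def select_entries(scores: Dict[str, float], held: List[str]) -> List[str]:
    """Pick the highest-scoring tickers above the entry threshold for the free slots."""
    free_slots = MAX_POSITIONS - len(held)
    if free_slots <= 0:
        return []
    candidates = [(ticker, score) for ticker, score in scores.items()
                  if score > ENTRY_THRESHOLD and ticker not in held]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [ticker for ticker, _ in candidates[:free_slots]]

def position_size(total_equity: float) -> float:
    """Dollar allocation per new position, based on current equity (compounding)."""
    return total_equity / MAX_POSITIONS

def select_exits(scores: Dict[str, float], held: List[str]) -> List[str]:
    """Weekly review: flag holdings whose predicted score has turned negative.

    Assumption: a holding with no score on the review date is kept.
    """
    return [ticker for ticker in held if scores.get(ticker, 0.0) < EXIT_THRESHOLD]
```

Because `position_size` takes current total equity rather than initial capital, each new trade grows or shrinks with the portfolio, which is what produces the compounding effect.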
2. Risk & Return Distribution
The analysis of closed trades shows a positive skew in returns. The "Realized PnL" chart demonstrates steady capital appreciation with controlled drawdowns.
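A minimal sketch of how such statistics could be computed from closed trades follows. It assumes per-trade returns and an equity series are available as plain sequences; the function names are hypothetical and the column layout of backtest_trades.csv is not assumed.

```python
import numpy as np

def trade_stats(returns):
    """Win rate and sample skewness of per-trade returns."""
    r = np.asarray(returns, dtype=float)
    win_rate = float((r > 0).mean())
    centered = r - r.mean()
    std = centered.std()  # population standard deviation
    # Third standardized moment; positive means a longer right tail (a few large winners).
    skew = float((centered ** 3).mean() / std ** 3) if std > 0 else 0.0
    return {"win_rate": win_rate, "skew": skew}

def max_drawdown(equity):
    """Largest peak-to-trough decline of an equity curve, as a negative fraction."""
    e = np.asarray(equity, dtype=float)
    running_peak = np.maximum.accumulate(e)
    return float(((e - running_peak) / running_peak).min())
```

For example, `max_drawdown([100, 120, 90, 130])` measures the fall from 120 to 90, a 25% drawdown.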
3. Model Interpretability (Feature Importance)
Using Permutation Importance, we identified that long-term trend indicators (sma100_gap) and momentum oscillators (rsi14) were the most critical drivers for the LSTM's decision-making process.
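Permutation importance for a sequence model can be sketched as follows. It assumes inputs shaped (samples, window, features) and any callable `predict` that maps that array to one prediction per sample. The function name, the MSE metric, and the number of repeats are assumptions, not details from analyze_features.py.

```python
import numpy as np

def permutation_importance(predict, X, y, feature_names, n_repeats=5, seed=0):
    """Mean increase in MSE when one feature is shuffled across samples.

    X has shape (n_samples, window, n_features); y has shape (n_samples,).
    """
    rng = np.random.default_rng(seed)
    baseline = np.mean((predict(X) - y) ** 2)
    importances = {}
    for j, name in enumerate(feature_names):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Swap whole windows of feature j between samples, so the time order
            # inside each window is preserved but its link to the target is broken.
            X_perm[:, :, j] = X[rng.permutation(len(X)), :, j]
            increases.append(np.mean((predict(X_perm) - y) ** 2) - baseline)
        importances[name] = float(np.mean(increases))
    return importances
```

A feature the model relies on produces a large error increase when shuffled, while a feature the model ignores scores near zero. With a PyTorch model, `predict` would wrap a forward pass under `torch.no_grad()` and return a NumPy array.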
Project Structure
The codebase is modularized to mimic a production-grade quantitative pipeline:
├── getdata.py # Data Ingestion: Downloads historical data via yfinance
├── indicator.py # Library: Custom implementation of technical indicators (RSI, MACD, BOLL, etc.)
├── feature_engineering.py # ETL Pipeline: Cleans data, generates factors, calculates Alpha Targets
├── train.py # Modeling: PyTorch LSTM implementation with sliding window datasets
├── analyze_features.py # Analysis: Permutation importance to interpret "Black Box" models
├── backtest.py # Simulation: Event-driven backtester with dynamic portfolio management
└── analyze_trade.py # Reporting: Generates visualizations and financial metrics (Sharpe, Win Rate)
Installation & Usage
Prerequisites
pip install torch pandas numpy matplotlib seaborn yfinance scikit-learn tqdm joblib
Workflow
- Download Data:
python getdata.py
- Generate Features:
python feature_engineering.py
- Train Model:
python train.py
- Run Backtest:
python backtest.py
- Analyze Results:
python analyze_features.py
python analyze_trade.py

