# 📈 NEPSE All Scraper

A free, open-source data pipeline for the Nepal Stock Exchange.
It automatically scrapes prices, dividends, right shares, and floorsheet data
for 337 listed companies, and commits the results to this repo every weekday via GitHub Actions.
## 📦 What's Inside

This repo is data-first. Every weekday after NEPSE closes, GitHub Actions scrapes the latest data and commits it directly back to the `data/` folder. No database, no server, just flat CSV files you can plug into anything.
| Data | Where | Updated |
|---|---|---|
| OHLC price history | `data/company-wise/{SYMBOL}/prices.csv` | Locally (run once) |
| Dividend history | `data/company-wise/{SYMBOL}/dividend.csv` | Every weekday ✅ |
| Right share history | `data/company-wise/{SYMBOL}/right-share.csv` | Every weekday ✅ |
| Full daily floorsheet | `data/floorsheet_YYYY-MM-DD.csv` + `.json` | Every weekday ✅ |
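Because everything is plain CSV, consuming the data needs nothing beyond the standard library. A minimal sketch, assuming the `prices.csv` layout documented under "Data Formats" below (the sample rows here are inlined for illustration; in the real repo you would open `data/company-wise/{SYMBOL}/prices.csv`):

```python
import csv
import io

# Inline sample mirroring the prices.csv layout; swap in a real file path.
sample = io.StringIO(
    "date,open,high,low,ltp,percent_change,qty,turnover\n"
    "2024-01-15,1200,1250,1190,1240,+1.5%,3400,4216000\n"
)

rows = list(csv.DictReader(sample))
latest = rows[-1]

# percent_change is stored as a string like "+1.5%"; strip the sign/percent.
change = float(latest["percent_change"].rstrip("%"))
print(latest["date"], latest["ltp"], change)  # 2024-01-15 1240 1.5
```

The same pattern works for `dividend.csv`, `right-share.csv`, and the floorsheet files, since they all share the flat-CSV format.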
## 🗂️ Repository Layout

```
Nepse-All-Scraper/
│
├── .github/
│   └── workflows/
│       └── daily_scraper.yml        # GitHub Actions: runs every weekday at 6:30 PM NPT
│
├── data/
│   ├── company_list.json            # 337 priority company symbols
│   ├── company_id_mapping.json      # Symbol → ShareSansar internal ID
│   ├── floorsheet_YYYY-MM-DD.csv    # Daily floorsheet (all trades)
│   ├── floorsheet_YYYY-MM-DD.json   # Same data as JSON
│   └── company-wise/
│       └── {SYMBOL}/
│           ├── prices.csv           # Full OHLC price history
│           ├── dividend.csv         # Dividend history
│           └── right-share.csv      # Right share history
│
└── scraper/
    ├── run_github_actions.py        # GitHub Actions entry point
    ├── run_daily.py                 # Local price scraper CLI
    └── core/
        ├── daily.py                 # Orchestrates price scraping
        ├── daily_prices.py          # Daily price summary updater
        ├── floorsheet.py            # Floorsheet scraper (merolagani.com)
        └── history.py               # OHLC price history scraper
```
## 🤖 Automation: GitHub Actions

The workflow `.github/workflows/daily_scraper.yml` runs automatically every weekday (Mon-Fri) at 6:30 PM Nepal time (12:45 UTC), right after NEPSE closes.
**GitHub Actions daily run:**

| Step | What it does |
|---|---|
| Dividends | Updates `dividend.csv` (all 337) |
| Right shares | Updates `right-share.csv` (all 337) |
| Floorsheet | Full day's trades from merolagani.com |
| Commit | `git push`: `data/` auto-updated in repo |
Trigger manually: **GitHub → Actions → Daily Scraper → Run workflow**
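For reference, a workflow with this schedule and manual trigger could look roughly like the sketch below. This is an illustrative reconstruction, not the repo's actual `daily_scraper.yml`; the step names, action versions, and commit message are assumptions:

```yaml
# Illustrative sketch of .github/workflows/daily_scraper.yml
name: Daily Scraper

on:
  schedule:
    - cron: "45 12 * * 1-5"   # 12:45 UTC = 6:30 PM NPT, Mon-Fri
  workflow_dispatch: {}        # enables the manual "Run workflow" button

jobs:
  scrape:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install requests beautifulsoup4
      - run: python scraper/run_github_actions.py
      - name: Commit updated data
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add data/
          git commit -m "chore: daily data update" || echo "No changes"
          git push
```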
## ⚡ Quickstart

### Prerequisites

```shell
pip install requests beautifulsoup4
```

### Run the same scrape as GitHub Actions (dividends + right shares + floorsheet)

```shell
# All three in one go
python scraper/run_github_actions.py

# Or individually
python scraper/run_github_actions.py --dividends
python scraper/run_github_actions.py --right-shares
python scraper/run_github_actions.py --floorsheet

# Test floorsheet with limited pages (faster)
python scraper/run_github_actions.py --floorsheet --max-pages 3
```

### Scrape full OHLC price history (local only, first-time)

```shell
# Full history for all 337 companies (takes ~2-4 hours on first run)
python scraper/run_daily.py --full-scrape

# Incremental: only fetches records newer than what's already in prices.csv
python scraper/run_daily.py --incremental

# Only process newly listed companies (new IPOs)
python scraper/run_daily.py --new-only
```

### Why are prices local-only?

Price scraping needs the existing `prices.csv` files to know where to stop (the incremental logic). Run it locally once, push the data, and then the automation handles daily updates.
## 📄 Data Formats

### prices.csv

```
date, open, high, low, ltp, percent_change, qty, turnover
2024-01-15, 1200, 1250, 1190, 1240, +1.5%, 3400, 4216000
```

### dividend.csv

```
fiscal_year, bonus_share, cash_dividend, total_dividend, book_closure_date
2079/80, 10%, 5%, 15%, 2023-12-01
```

### right-share.csv

```
ratio, total_units, issue_price, opening_date, closing_date, status, issue_manager
1:1, 5000000, 100, 2023-11-01, 2023-11-15, Closed, XYZ Capital
```

### floorsheet_YYYY-MM-DD.csv

```
date, sn, contract_no, stock_symbol, buyer, seller, quantity, rate, amount
2024-01-15, 1, 100012345, ADBL, 21, 42, 500, 1240, 620000
```
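As one example of working with the floorsheet file, the sketch below aggregates traded amount per symbol. The two sample trades are invented for illustration and follow the layout above; in practice you would open the day's `floorsheet_YYYY-MM-DD.csv`:

```python
import csv
import io
from collections import defaultdict

# Two sample trades in the floorsheet CSV layout shown above.
sample = io.StringIO(
    "date,sn,contract_no,stock_symbol,buyer,seller,quantity,rate,amount\n"
    "2024-01-15,1,100012345,ADBL,21,42,500,1240,620000\n"
    "2024-01-15,2,100012346,ADBL,13,58,100,1241,124100\n"
)

turnover = defaultdict(float)
for row in csv.DictReader(sample):
    # amount = quantity * rate, so summing amount gives per-symbol turnover
    turnover[row["stock_symbol"]] += float(row["amount"])

print(dict(turnover))  # {'ADBL': 744100.0}
```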
## ⚙️ How Incremental Scraping Works

```
prices.csv                       ShareSansar AJAX
──────────────────────────       ───────────────────────────
Latest date: 2024-01-10    →     Stop fetching when
                                 record date ≤ 2024-01-10

Result: only 1-2 pages fetched instead of 70+
```
`history.py` reads the most recent date from the existing `prices.csv`, passes it as a `stop_date` to the paginator, and halts the moment it hits older data. This makes daily updates take seconds instead of hours.
## 🚀 First-Time Setup

```shell
# 1. Clone the repo
git clone https://github.com/SamirWagle/Nepse-All-Scraper.git
cd Nepse-All-Scraper

# 2. Install dependencies
pip install requests beautifulsoup4

# 3. Run the full price history scrape (one-time, takes 2-4 hours)
python scraper/run_daily.py --full-scrape

# 4. Push everything to the repo
git add data/
git commit -m "chore: initial data load"
git push
```

From that point on, GitHub Actions handles everything automatically every weekday. ✅
## 🗺️ Roadmap

- Phase 1: NEPSE scraper (prices, dividends, right shares, floorsheet)
- Phase 2: GitHub Actions automation + incremental updates
- Phase 3: Frontend / API layer

Want to help build Phase 3? PRs are welcome.
## ⚠️ Disclaimer

This project is for educational purposes only.
Data is sourced from publicly available websites (ShareSansar, Merolagani).
This is not financial advice; do your own research before making investment decisions.

*Made with ❤️ for the Nepali investor community.*