v1.0 — Release

AI-Powered Trading
with Reinforcement Learning

FxMath RL Studio brings professional-grade Q-Learning algorithms to MetaTrader 5. Train intelligent agents that learn from market data and execute trades autonomously.

1,000-State Space
3 Trading Actions
ε-Greedy Exploration Policy
Bellman Q-Update

What is Reinforcement Learning?

Reinforcement Learning (RL) is a branch of machine learning where an agent learns to make decisions by interacting with its environment. Through trial and error, the agent discovers which actions yield the highest cumulative reward — no pre-labeled data required.

The RL feedback loop: the Agent (Q-Table memory) sends an Action (Buy / Sell / Hold) to the Market Environment (MT5 price, RSI, ATR), which returns the next State and a Reward (z-score, RSI, P&L).

How the Learning Loop Works

1. Observe State (s)

The agent reads the current market state: price z-score relative to moving average, RSI binned into 10 levels, and ATR volatility normalized into 10 buckets — producing 1,000 possible states.
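A rough sketch of that binning (the bin edges and helper names are illustrative assumptions, not the product's source):

    import numpy as np

    def bin_value(x, lo, hi, n_bins=10):
        """Clip x to [lo, hi] and map it to an integer bin in [0, n_bins - 1]."""
        frac = (np.clip(x, lo, hi) - lo) / (hi - lo)
        return min(int(frac * n_bins), n_bins - 1)

    def state_index(zscore, rsi, atr_norm):
        """Combine three 10-level features into one of 10 * 10 * 10 = 1,000 states."""
        z_bin   = bin_value(zscore, -3.0, 3.0)    # price z-score vs. moving average
        rsi_bin = bin_value(rsi, 0.0, 100.0)      # RSI binned into 10 levels
        atr_bin = bin_value(atr_norm, 0.0, 1.0)   # normalized ATR volatility
        return z_bin * 100 + rsi_bin * 10 + atr_bin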

2. Choose Action (a)

Using ε-greedy policy: with probability ε, explore a random action; otherwise, exploit the best-known action from the Q-Table. This balances exploration (finding new profitable setups) with exploitation (using what works).
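In code, the policy is a single branch; a minimal sketch, assuming a NumPy Q-Table of shape 1000 × 3:

    import numpy as np

    rng = np.random.default_rng()

    def choose_action(q_table, state, epsilon):
        """ε-greedy: random action with probability ε, else the best-known action."""
        if rng.random() < epsilon:
            return int(rng.integers(0, q_table.shape[1]))  # explore: 0=Buy, 1=Sell, 2=Hold
        return int(np.argmax(q_table[state]))              # exploit the Q-Table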

3. Execute Trade

The agent sends the action to MT5: Buy (open long), Sell (open short), or Hold (close position / do nothing). Position size and stop-loss are managed automatically.
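For illustration, sending such an order through the public MetaTrader5 Python package might look like the sketch below (symbol, lot size, and slippage are assumptions, an initialized terminal connection is required, and this is not necessarily how RL Studio routes orders):

    import MetaTrader5 as mt5

    def execute(action, symbol="EURUSD", lots=0.10):
        """Map an RL action to an MT5 market order (0=Buy, 1=Sell, 2=Hold)."""
        if action == 2:                      # Hold: close / do nothing this bar
            return None
        tick = mt5.symbol_info_tick(symbol)  # current bid/ask
        is_buy = (action == 0)
        request = {
            "action": mt5.TRADE_ACTION_DEAL,
            "symbol": symbol,
            "volume": lots,
            "type":   mt5.ORDER_TYPE_BUY if is_buy else mt5.ORDER_TYPE_SELL,
            "price":  tick.ask if is_buy else tick.bid,
            "deviation": 10,                 # max slippage in points
        }
        return mt5.order_send(request)       # call mt5.initialize() beforehand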

4. Receive Reward (r)

After each bar closes, the agent calculates the reward as equity change minus spread cost: r = Δequity − spread. Positive pips earn positive reward; losses produce negative reward, teaching the agent to avoid unprofitable behavior.
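As a one-line sketch (variable names are assumptions):

    def bar_reward(equity_now, equity_prev, spread_cost):
        """Profit reward for the last bar: equity change net of spread."""
        return (equity_now - equity_prev) - spread_cost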

5. Update Q(s,a)

The Bellman Equation updates the Q-Table entry for this state-action pair, blending the old estimate with the new reality: Q(s,a) ← Q(s,a) + α · [r + γ·maxQ(s′,a′) − Q(s,a)]. Learning rate α = 0.1, discount γ = 0.9.
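The same update in NumPy, as a minimal sketch consistent with the formula above:

    import numpy as np

    ALPHA, GAMMA = 0.1, 0.9  # learning rate α and discount factor γ

    def q_update(q_table, s, a, r, s_next):
        """One Bellman Q-learning update for a (state, action, reward, next-state) step."""
        td_target = r + GAMMA * np.max(q_table[s_next])      # best value reachable from s′
        q_table[s, a] += ALPHA * (td_target - q_table[s, a])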

6. Next Bar → Repeat

On each new bar, the agent observes the new state s′ and the cycle continues. Over thousands of bars, the Q-Table converges toward optimal actions — creating a trading strategy that learns from experience rather than static rules.
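Tying the six steps together, one pass per bar could look like this sketch, reusing the helper functions above; the synthetic random bars merely stand in for real MT5 data:

    import numpy as np

    rng = np.random.default_rng(0)
    q_table = np.zeros((1000, 3))            # 1,000 states × 3 actions
    epsilon, prev_equity = 1.0, 10_000.0

    for t in range(5_000):                   # one iteration per closed bar
        s = int(rng.integers(0, 1000))       # in practice: state_index(z, rsi, atr)
        a = choose_action(q_table, s, epsilon)           # 2. choose action
        # execute(a) would send the order to MT5 here    # 3. trade
        equity = prev_equity + rng.normal(0.0, 1.0)      # placeholder P&L
        r = bar_reward(equity, prev_equity, 0.2)         # 4. reward
        s_next = int(rng.integers(0, 1000))              # 6. observe next state
        q_update(q_table, s, a, r, s_next)               # 5. Bellman update
        epsilon = max(0.01, epsilon * 0.999)             # ε decays 1 → 0.01
        prev_equity = equity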

Bellman Q-Learning Update

Q(s,a) ← Q(s,a) + α · [ r + γ · max Q(s′,a′) − Q(s,a) ]

  • α = learning rate (0.1)
  • γ = discount factor (0.9)
  • ε = exploration rate (decays 1 → 0.01)
  • r = profit reward signal

State Space (1,000 States)

10 Price Z-Score Bins × 10 RSI Bins × 10 ATR Volatility Bins = 1,000 States

Each state-action pair stores a Q-value — the expected future reward for taking that action in that market condition.
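Concretely, such a table can be a plain 1000 × 3 array; a sketch (the state index shown is an arbitrary example):

    import numpy as np

    q_table = np.zeros((1000, 3))  # rows: 1,000 market states; columns: Buy, Sell, Hold

    s = 427                        # e.g. z_bin=4, rsi_bin=2, atr_bin=7
    print(q_table[s])              # Q-values: expected future reward of each action
    best = int(np.argmax(q_table[s]))   # action the agent would exploit in state s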

Why Traditional EAs Fail & RL Succeeds

Most Expert Advisors hardcode fixed rules like "if RSI < 30 then buy." These break when market regimes shift. RL doesn't follow rules — it learns them from data, and adapts as markets change.


Static Rules vs Adaptive Learning

Traditional EAs use fixed thresholds (RSI < 30 = buy) that become obsolete when volatility changes. RL agents continuously update their Q-Table on every new bar, automatically adapting to new regimes — trending, ranging, or high-volatility.


No Overfitting — True Generalization

Backtest-optimized EAs fit noise, not signal, so they fail on unseen data. RL learns a policy (which action is best in each state) rather than parameter values. The Q-Table generalizes across market conditions because it maps situations to decisions, not dates to trades.

Z-Score + RSI + ATR → Q(s,a) → Action

Multi-Factor State Representation

Instead of checking one indicator at a time (if RSI && MA), RL combines price z-score + RSI + ATR volatility into a single state index. The agent learns the joint interaction of all three — for example, "low RSI + low volatility" may signal a stronger buy than "low RSI + high volatility."


Temporal Credit Assignment (γ = 0.9)

A trade might take 5 bars to profit. The discount factor γ = 0.9 lets the agent assign partial credit to earlier actions that led to later rewards. This is impossible in rule-based EAs, where each bar's decision is evaluated independently of past actions.
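A quick worked number: with γ = 0.9, a reward arriving 5 bars after an action is weighted by 0.9⁵ ≈ 0.59, so that action still receives about 59% of the credit; at 20 bars the weight drops to roughly 0.12, so very distant outcomes barely move the estimate.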

Reinforcement Learning vs Traditional EAs

❌ Traditional EA
  • Fixed rules — "if condition then action"
  • Backtest-optimized — fits past noise
  • Breaks in unseen market regimes
  • Single-indicator logic (if RSI && MA)
  • No learning — same mistakes repeated
✅ RL Agent
  • Learns policy — "which action maximizes reward"
  • Online learning — updates on every bar
  • Adapts — Q-Table shifts with market regime
  • Multi-factor state (Z × RSI × ATR = 1000 states)
  • Improves over time — remembers what works

Advantages of RL-Powered Trading

Unlike traditional Expert Advisors that rely on fixed rules, RL Studio continuously adapts to changing market conditions through learned experience.

Adaptive Strategy

The RL agent continuously retrains on fresh market data, adapting to regime changes without manual intervention.

Multi-Timeframe Support

Train on any MT5 timeframe (M1 to MN1) with automatic indicator computation matching the MQL5 source.
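As an illustration, pulling bars for any timeframe via the public MetaTrader5 Python package could look like this (symbol and bar count are assumptions):

    import MetaTrader5 as mt5

    mt5.initialize()                 # attach to a running MT5 terminal
    rates = mt5.copy_rates_from_pos("EURUSD", mt5.TIMEFRAME_M15, 0, 5_000)
    mt5.shutdown()
    # 'rates' is a NumPy structured array: time, open, high, low, close, tick_volume, ...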

Real-Time Visualization

Live Matplotlib training charts, equity curves, and Q-Table heatmaps give full insight into agent behavior.
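A Q-Table heatmap of the kind described takes only a few lines of Matplotlib; an illustrative sketch with random stand-in data (not the product's plotting code):

    import numpy as np
    import matplotlib.pyplot as plt

    q_table = np.random.default_rng(0).normal(size=(1000, 3))  # stand-in for a trained table

    plt.imshow(q_table[:100], aspect="auto", cmap="RdYlGn")    # first 100 states
    plt.colorbar(label="Q-value")
    plt.xticks([0, 1, 2], ["Buy", "Sell", "Hold"])
    plt.xlabel("Action")
    plt.ylabel("State index")
    plt.title("Q-Table heatmap")
    plt.show()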

Persistent Models

Save trained Q-Tables to .pkl files. Load and continue training, or deploy immediately to live markets.
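With Python's pickle, save and load are symmetric; a sketch (the file name and table layout are assumptions):

    import pickle
    import numpy as np

    q_table = np.zeros((1000, 3))                    # stand-in for a trained table

    with open("qtable_eurusd_m15.pkl", "wb") as f:   # hypothetical file name
        pickle.dump(q_table, f)

    with open("qtable_eurusd_m15.pkl", "rb") as f:
        q_table = pickle.load(f)                     # resume training or deploy live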

Proven MQL5 Heritage

State binning, reward shaping, and Bellman updates identical to those in the battle-tested RL_Modified.mq5 EA.

Multi-Instance (Pro)

Run multiple independent agents simultaneously on different symbols, timeframes, or risk profiles.

Free vs Pro

Both editions share the same core RL engine. The Pro edition adds multi-instance capabilities for advanced traders who manage multiple strategies simultaneously.

Feature                        Free Edition     Pro Edition
Q-Learning Engine              ✓                ✓
Historical Backtesting         ✓                ✓
Live Trading (1 Instance)      ✓                ✓
Training Progress Chart        ✓                ✓
Save / Load Q-Tables           ✓                ✓
MT5 Auto-Detect                ✓                ✓
Multi-Instance Trading         ✗                ✓
Trader Profile Manager         ✗                ✓
Q-Table Heatmap Viewer         ✗                ✓
Training Progress Bar          ✗                ✓
Right-Click Context Menus      ✗                ✓
Price                          Free             $199 lifetime

Simple, Lifetime Pricing

One payment. Lifetime license. Free updates. No subscriptions, no hidden fees.

Free Edition

$0

Perfect for getting started with RL trading

  • Full Q-Learning Engine
  • Historical Backtesting
  • 1 Live Trading Instance
  • Real-Time Training Charts
  • Save/Load Models
Download Free

Ready to Transform Your Trading?

Download FxMath RL Studio today and start training intelligent trading agents. The Free edition is fully functional — upgrade to Pro when you need multi-instance power.

Requires Only MetaTrader 5