A trading signal is a quantitative indication that an asset is likely to move in a particular direction over a defined horizon with a probability above chance. Signal generation is the systematic process of finding, validating, and operationalising these indications — distinguishing genuine predictive patterns from statistical noise, estimating the probability and magnitude of predicted moves, and combining multiple signals into a coherent view that accounts for correlations and regime conditions.
We have built signal generation systems for equity markets, forex markets, and cryptocurrency exchanges. That work covers the full pipeline, from data sourcing and cleaning through feature engineering, model training, signal evaluation, and live production deployment, rather than the modelling piece in isolation.
Signal types and approaches
Price and microstructure signals
Signals derived from price and volume data at various frequencies — from tick-level microstructure patterns to monthly factor exposures. At the high-frequency end, we build order flow imbalance models, bid-ask spread dynamics indicators, and short-term momentum signals using raw exchange data. At lower frequencies, we build trend-following models, mean-reversion indicators, and volatility regime signals using daily and weekly price series.
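As a rough illustration of the high-frequency end, the sketch below computes a simple top-of-book volume imbalance and smooths it with an exponential moving average. This is only one of several definitions that go by the name order flow imbalance, and the `Quote` type, field names, and `alpha` default are assumptions made for the example rather than a description of any production model.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Quote:
    """Hypothetical top-of-book snapshot: resting size at best bid and best ask."""
    bid_size: float
    ask_size: float


def book_imbalance(q: Quote) -> float:
    """Return (bid - ask) / (bid + ask), a value in [-1, 1].

    Positive values mean more size is resting on the bid than on the ask.
    An empty book is treated as neutral (0.0).
    """
    total = q.bid_size + q.ask_size
    if total == 0:
        return 0.0
    return (q.bid_size - q.ask_size) / total


def smoothed_imbalance(quotes: Iterable[Quote], alpha: float = 0.2) -> List[float]:
    """Exponentially smooth the raw imbalance; alpha is an illustrative default."""
    ema = None
    out = []
    for q in quotes:
        x = book_imbalance(q)
        # Seed the average with the first observation, then blend new values in.
        ema = x if ema is None else alpha * x + (1 - alpha) * ema
        out.append(ema)
    return out
```

In practice a feature like this would be one input among many to the models described next, not a signal on its own.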
Model architectures depend on the signal frequency and data volume: gradient boosting for feature-rich tabular data at daily frequency, LSTMs and Temporal Fusion Transformers for sequence modelling at intraday frequency, and simpler linear models for regime signals, where interpretability matters for risk management.
Fundamental signals
Quantitative factors derived from financial statement data applied systematically across equity universes: earnings quality metrics, cash flow sustainability indicators, balance sheet strength ratios, management capital allocation track records, and earnings estimate revision trends. We build the financial data processing pipelines that compute these factors in a point-in-time consistent way — using only information that would have been available at the time, not future data that creates look-ahead bias.
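Point-in-time consistency can be sketched as a lookup keyed on publication date rather than fiscal period. The example below is a minimal illustration, assuming each record is a `(publication_date, value)` pair; the class name `PointInTimeSeries` and the choice to treat a value as usable on its publication date are assumptions of the sketch.

```python
from bisect import bisect_right
from datetime import date
from typing import Iterable, Optional, Tuple


class PointInTimeSeries:
    """Hypothetical store of one reported metric, indexed by when it became public."""

    def __init__(self, records: Iterable[Tuple[date, float]]):
        # Sort by publication date so as_of() can use binary search.
        self._records = sorted(records, key=lambda r: r[0])
        self._dates = [r[0] for r in self._records]

    def as_of(self, when: date) -> Optional[float]:
        """Return the latest value published on or before `when`, else None.

        Values published after `when` are never returned, which is what
        prevents look-ahead bias in a backtest.
        """
        i = bisect_right(self._dates, when)
        return self._records[i - 1][1] if i else None


# Example with made-up numbers: two earnings-per-share releases.
eps = PointInTimeSeries([(date(2023, 5, 10), 1.25), (date(2023, 2, 15), 1.10)])
march_view = eps.as_of(date(2023, 3, 1))  # only the February release is visible
```

A fuller version would also track restatements, so that a revised figure only replaces the original from the date the revision was published.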
Factor signal evaluation uses Fama-MacBeth regressions and portfolio sort methodologies to measure factor premium and persistence across different market regimes and sub-universes, providing the statistical evidence needed to determine whether a fundamental signal justifies inclusion in a live model.
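The two-pass idea behind a Fama-MacBeth regression can be sketched for a single factor: run one cross-sectional regression of returns on factor exposures per period, then average the slopes and test whether the mean differs from zero. The version below is deliberately simplified. It uses one factor, plain OLS, and no autocorrelation adjustment to the standard error, and the function names and input layout are assumptions for illustration.

```python
from math import sqrt
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def cross_sectional_slope(exposures: Sequence[float], returns: Sequence[float]) -> float:
    """OLS slope of asset returns on factor exposures within one period."""
    mx, my = mean(exposures), mean(returns)
    sxx = sum((x - mx) ** 2 for x in exposures)
    sxy = sum((x - mx) * (y - my) for x, y in zip(exposures, returns))
    return sxy / sxx


def fama_macbeth(periods: List[Tuple[Sequence[float], Sequence[float]]]) -> Tuple[float, float]:
    """Return (average premium, t-statistic) from per-period slopes.

    `periods` is a list of (exposures, next-period returns) pairs, one per date.
    Needs at least two periods with differing slopes for the t-statistic.
    """
    slopes = [cross_sectional_slope(x, r) for x, r in periods]
    premium = mean(slopes)
    std_err = stdev(slopes) / sqrt(len(slopes))
    return premium, premium / std_err
```

Running the same function on sub-universes or on date ranges split by regime gives a first view of how persistent the premium is.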
Alternative data signals
We have built pipelines for web scraping, news sentiment aggregation, and social media signal extraction — processing raw text and structured data from non-traditional sources into model-ready features. The key challenges for alternative data signals are: data cleaning and normalisation across inconsistent source formats; survivorship bias in historical alternative data (the companies with the best alternative data coverage are not a random sample); and regime sensitivity (alternative data signals often work in specific market regimes and fail in others).
Multi-signal combination and portfolio construction
Individual signals are combined into composite views using optimisation models that account for signal correlations, varying reliability across market regimes, and the constraints of practical portfolio construction — turnover costs, liquidity limits, and concentration constraints. We build signal combination frameworks using both linear combination (IC-weighted composite signals) and ML-based combination (stacked models that learn the optimal combination from historical data).
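A minimal sketch of the linear route: weight each signal by its information coefficient (IC), taken here as the Pearson correlation between signal scores and subsequent returns, then sum the standardised signals. Rank (Spearman) IC is a common alternative. Clipping negative ICs to zero and ignoring correlations between signals are simplifying assumptions of this sketch, and all function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation, used here as the information coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def zscore(xs: Sequence[float]) -> List[float]:
    """Standardise one signal across assets so signals share a common scale."""
    m, s = mean(xs), pstdev(xs)
    if s == 0:
        raise ValueError("signal has no cross-sectional variation")
    return [(x - m) / s for x in xs]


def ic_weights(ics: Dict[str, float]) -> Dict[str, float]:
    """Weights proportional to IC, with negative ICs clipped to zero."""
    clipped = {name: max(ic, 0.0) for name, ic in ics.items()}
    total = sum(clipped.values())
    if total == 0:
        raise ValueError("no signal has a positive IC")
    return {name: ic / total for name, ic in clipped.items()}


def composite(signals: Dict[str, Sequence[float]], weights: Dict[str, float]) -> List[float]:
    """Weighted sum of z-scored signals, one composite score per asset."""
    z = {name: zscore(vals) for name, vals in signals.items()}
    n_assets = len(next(iter(signals.values())))
    return [sum(weights[name] * z[name][i] for name in signals) for i in range(n_assets)]
```

The ML-based route replaces the fixed weights with a model fitted to historical signal values, at the cost of more data and more care over overfitting.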
Signal validation infrastructure
Signal validation is what separates deployable research from curve-fitting. We build evaluation infrastructure covering: out-of-sample backtesting with strict temporal separation between training and test periods; walk-forward analysis that simulates live deployment by rolling the training window forward; decay analysis measuring how signal strength diminishes over time; correlation analysis between new signals and existing portfolio exposures; and realistic transaction cost modelling that measures whether a signal generates sufficient gross alpha to survive implementation costs.
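Two of the checks above can be sketched in a few lines: a rolling walk-forward splitter that always places the test window strictly after the training window, and a net-of-cost return calculation. Window sizes, the linear cost model, and the function names are assumptions chosen for illustration. A fuller setup might also leave an embargo gap between training and test periods.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_obs: int, train_size: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) for a rolling training window.

    Each test window starts immediately after its training window, and the
    whole scheme rolls forward by `test_size` observations per step.
    """
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size


def net_return(gross_return: float, turnover: float, cost_bps: float) -> float:
    """Subtract a linear trading cost: turnover times cost in basis points."""
    return gross_return - turnover * cost_bps / 10_000


# Example: 10 observations, train on 4, test on the next 2, then roll forward.
splits = list(walk_forward_splits(n_obs=10, train_size=4, test_size=2))
```

A signal whose net return is not positive across most test windows is unlikely to survive implementation, whatever its gross performance looks like.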
