SSRN Id4442770

Statistical Predictions of Trading Strategies in
Electronic Markets
Álvaro Cartea3,4, Samuel N. Cohen1,3,4, Rob Graumans2, Saad Labyad1,3,4,
Leandro Sánchez-Betancourt1,3,5,*, and Leon van Veldhuijzen2
1 Alan Turing Institute, London
2 Autoriteit Financiële Markten, Amsterdam
3 Oxford-Man Institute for Quantitative Finance, University of Oxford
4 Mathematical Institute, University of Oxford
5 Department of Mathematics, King’s College London
* Principal Author
May 16, 2023
Abstract
We build statistical models to describe how market participants choose the direction, price, and vol-
ume of orders. Our dataset, which spans sixteen weeks for four shares traded in Euronext Amsterdam,
contains all messages sent to the exchange and includes algorithm identification and member identifi-
cation. We obtain reliable out-of-sample predictions and report the top features that predict direction,
price, and volume of orders sent to the exchange. The coefficients from the fitted models are used to
cluster trading behaviour and we find that algorithms registered as Liquidity Providers exhibit the widest
range of trading behaviour among dealing capacities. In particular, for the most liquid share in our study,
we identify three types of behaviour that we call (i) directional trading, (ii) opportunistic trading, and
(iii) market making, and we find that around one third of Liquidity Providers behave as market makers.
Disclaimer: The views expressed in this article are those of the authors, and not necessarily those of
the Autoriteit Financiële Markten.
1 Introduction
Modelling economic agents is a fundamental part of understanding financial markets. The increasing use
of electronic trading and the vast quantities of data recorded provide opportunities for modelling agents on
a fine scale. However, privacy concerns and data access restrictions have led to limited academic research
using such data. In this paper we employ a unique dataset to understand trading behaviour. The dataset
contains member identification and algorithm identification and comprises all activity (orders, transactions,
cancellations, and amendments) in Euronext Amsterdam. We model the trading behaviour of each indi-
vidual algorithm that sends limit orders (passive and aggressive) to the market. We obtain an accurate
description of how algorithms choose direction, price, and volume and identify important market-observable
and idiosyncratic features that predict this decision. Furthermore, our work provides a detailed empirical
description of how algorithms respond to market conditions, their own inventory, and their presence in the
limit order book (LOB). Lastly, we find clusters of trading behaviour and compare them to their dealing
capacity in Euronext Amsterdam. A dealing capacity is a contractual arrangement between the trading
venue and the trading member, specifying the trading behaviour allowed by the member when acting in that
capacity. This classification is determined by the trading venue, not the regulator.
Electronic copy available at: https://ssrn.com/abstract=4442770

The models we build have a variety of applications, both in academic and regulatory contexts. From
an academic perspective, our findings provide statistical evidence for many microstructural stylised facts
discussed in the literature, and our study is the first to confirm these facts using agent-based models trained
on a dataset that contains both member and algorithm identification. Furthermore, by understanding
the behaviour of algorithms, our models can inform mathematical and algorithmic modelling of market
participants. In an agent-based modelling context, our results provide guidance for calibration of models
of the behaviour of agents at a microstructure level, rather than purely through their aggregate behaviour
in a market. Our results assess the stability of agents through time, and report the key market and agent
variables which need to be modelled.
From a regulatory perspective, our models provide guidance for surveillance practices. In particular,
our models show the most important features affecting individual trading behaviour. Supervisors could use
this information to improve their anomaly detection, focusing on outliers in features that are especially
important predictors of individual trading behaviour. Also, our work is a starting point to build an agent-
based simulator where market dynamics can be replicated at a granular level to understand how each message
to the order book may affect individual behaviour and therefore affect collective dynamics. Our statistical
framework is devised to capture the fine microstructural facts that drive the trading decisions of individual
algorithms, which is key to understanding how trading algorithms react to new market information and to
messages to the LOB. With such a market simulator, firms will be able to test their trading algorithms
and regulators will be able to study the effect of new trading algorithms on the quality of the market.
Similarly, financial authorities will have a tool to build counterfactual trading tapes to gain insights into
market behaviour in the absence of certain trading algorithms or with potential new entrants in the market.
1.1 Insights
We build models to predict the behaviour of individual trading algorithms with a dataset of beyond-level-
three LOB data that contains the activity of all market participants active in Euronext Amsterdam. For each
share and for each algorithm in our study, we propose models to predict the three principal characteristics of
an incoming order: direction (buy or sell), price, and volume. By implication, the price, the time-in-force, and
the state of the LOB determine whether orders are aggressive or passive. We build a comprehensive feature space
comprising information available in the market and information available to the algorithm at the moment of
making decisions. We do not account for the latency of market participants when constructing the features;
instead, we employ information from the time “just before” the order reaches the market.1 Specifically, we
endow each algorithm with 53 features that are updated in continuous time and that describe the state of
the LOB, the algorithm’s past actions, and the member’s past actions. The main contributions of this paper
are:
(i) To obtain high out-of-sample accuracies when one predicts (a) the direction (buy or sell) of limit orders,
(b) the price limit of the limit order (irrespective of its time-in-force), and (c) the volume of the order.
(ii) To provide a detailed description of the features that are important to predict the direction, price, and
volume of limit orders sent by algorithms. For example: (a) the imbalance of the algorithm’s presence
in the LOB, the best quotes, and the intraday accumulated inventory variables, are the most important
features determining the direction of the order; (b) the volumes at the best bid and ask prices, the
imbalance of the algorithm’s presence in the LOB, and the bid-ask spread in the LOB are the most
important features determining the price of the order, and (c) the intraday accumulated inventory
and cash variables of both algorithms and members are the most important features determining the
volume of the order.
(iii) To show three clusters of ‘stylised trading behaviour’. We use regression coefficients to build the clusters
and show how these clusters differ from the registered dealing capacity of market participants.2 For
the most liquid share considered, we find that:
1 We expect that accounting for latency when constructing the features would improve the out-of-sample accuracies we obtain.
See Cartea and Sánchez-Betancourt (2021) for examples of how one can compute implied latency at a market participant level.
2 The three dealing capacities of market participants in our study are (i) Liquidity Provider, (ii) House, and (iii) Client; we
return to this in Subsection 2.2.

(a) Liquidity Providers have the widest range of trading behaviour in the dataset.
(b) One third of the Liquidity Provider algorithms fall within a cluster that we label “market makers”.
These algorithms tend to maintain a balanced provision of liquidity in the LOB, provide liquidity
inside the spread, revert their inventories to zero, and often provide liquidity at the best quotes
available in the book.3 The other two thirds of Liquidity Providers are split roughly in half
according to their trading behaviour and fall in clusters that we call “opportunistic traders” and
“directional traders”.
(c) Across clusters, the wider the bid-ask spread the more likely algorithms are to send eager-to-trade
orders. In this paper, an “eager-to-trade” order is either a buy limit order with limit price higher
than the best bid price, or a sell limit order with limit price lower than the best ask price –
essentially (as we see in Subsection 4.4.7), these are orders that either cross the spread or provide
liquidity inside the spread. Similarly, an increase in the number of messages over the last 1-100
milliseconds reduces the probability of posting at-the-touch in either direction. In this paper, an
“at-the-touch” order is either a buy limit order with a limit price equal to the best bid price,
or a sell limit order with limit price equal to the best ask price. Lastly, for most horizons used
to compute the quadratic variation of transaction prices, the larger the quadratic variation, the
less likely algorithms are to show aggressive behaviour, and the more likely they are to post
liquidity inside the spread.
i. Algorithms in the first cluster (directional traders) have the lowest order to trade ratios,
do not maintain a balanced provision of liquidity in the LOB, seem indifferent to their own
imbalance in the book when they decide to send an eager-to-trade order, have a small baseline
probability of posting at-the-touch orders or improving the bid-ask spread, and often send
orders in the direction of their intraday accumulated inventories (i.e., buy orders when they
have a positive intraday accumulated inventory; sell orders when they have a negative intraday
accumulated inventory). More than 80% of their eager-to-trade orders trade aggressively
against the other side of the LOB. This behaviour is consistent with execution or position
building algorithms; the split of aggressive and passive trading shows how algorithms choose
to deploy their execution schedule.4
ii. Algorithms in the second cluster (opportunistic traders) trade aggressively (73% of their
transactions are liquidity taking, compared to 51% for cluster 1 and 26% for cluster 3),
and similar to algorithms in cluster 1, more than 80% of their eager-to-trade orders trade
aggressively against the other side of the LOB. These algorithms do not mean-revert their
intraday accumulated inventory and their own imbalance in the book is an indicator of their
trading direction. These algorithms have the lowest (across clusters) percentage of eager-
to-trade orders that improve the spread, the lowest baseline probability of posting orders
at-the-touch, and the highest baseline probability of posting limit orders deeper in the LOB.
These algorithms post most of their orders away from the best quotes in an attempt to earn higher
but less frequent profits.
iii. Algorithms in the third cluster (market makers) have the highest order to trade ratio across
clusters, are keen to maintain a balanced provision of liquidity in the LOB, have the highest
baseline probability of posting at-the-touch (when compared with the algorithms in the other
two clusters), and show eagerness to revert their intraday accumulated inventory to zero.
Algorithms in this cluster have the highest baseline probability of sending an order that
provides liquidity inside the spread. More precisely, 81% of the eager-to-trade orders sent by
algorithms in this cluster were limit orders that provided liquidity inside the spread. This
behaviour is consistent with that of market makers who are averse to holding inventories (e.g.,
due to asymmetry of information).5
3 An agent has a balanced provision of liquidity in the LOB if the volume that the agent has posted on the ask side of the
LOB is similar to the volume that the agent has posted on the bid side of the LOB.
4 See Cartea and Jaimungal (2015) for a model of how traders optimally deploy such an execution schedule using aggressive
and passive orders.

5 See the early work Amihud and Mendelson (1980), Ho and Stoll (1981), Kyle (1985), Glosten and Milgrom (1985), and the
more recent work Avellaneda and Stoikov (2008), Guéant et al. (2013), Cartea et al. (2015), and Guéant (2016).

Using the above, we discuss how supervisors can use the results of our study to promote orderly
trading and to oversee the integrity of the market.
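The definitions of “eager-to-trade” and “at-the-touch” orders given in item (c) above can be illustrated with a minimal sketch. The function and class names, and the use of integer prices in ticks, are assumptions made for this illustration; the sketch ignores time-in-force and uses the best quotes observed just before the order arrives.

```python
from enum import Enum


class OrderKind(Enum):
    CROSSES_SPREAD = "eager-to-trade: crosses the spread"
    INSIDE_SPREAD = "eager-to-trade: provides liquidity inside the spread"
    AT_THE_TOUCH = "at-the-touch"
    DEEPER = "deeper in the LOB"


def classify_limit_order(side: str, limit_ticks: int,
                         best_bid_ticks: int, best_ask_ticks: int) -> OrderKind:
    """Classify a limit order relative to the best quotes (prices in integer ticks)."""
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")
    if best_bid_ticks >= best_ask_ticks:
        raise ValueError("expected an uncrossed book (best bid < best ask)")

    if side == "buy":
        if limit_ticks >= best_ask_ticks:   # marketable against the ask side
            return OrderKind.CROSSES_SPREAD
        if limit_ticks > best_bid_ticks:    # improves the best bid
            return OrderKind.INSIDE_SPREAD
        if limit_ticks == best_bid_ticks:   # joins the best bid
            return OrderKind.AT_THE_TOUCH
        return OrderKind.DEEPER

    # side == "sell": mirror image of the buy case
    if limit_ticks <= best_bid_ticks:       # marketable against the bid side
        return OrderKind.CROSSES_SPREAD
    if limit_ticks < best_ask_ticks:        # improves the best ask
        return OrderKind.INSIDE_SPREAD
    if limit_ticks == best_ask_ticks:       # joins the best ask
        return OrderKind.AT_THE_TOUCH
    return OrderKind.DEEPER


def is_eager_to_trade(kind: OrderKind) -> bool:
    """Eager-to-trade orders either cross the spread or post inside it."""
    return kind in (OrderKind.CROSSES_SPREAD, OrderKind.INSIDE_SPREAD)
```

With a best bid of 100 ticks and a best ask of 102 ticks, a buy order at 101 is classified as providing liquidity inside the spread, while a sell order at 102 is at-the-touch.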
1.2 Connection with survey data

The recent article AFM (2023) reports the results of a survey conducted with a sample of Dutch proprietary
trading firms6 using algorithmic trading. The article reports that (i) more than 80% of their trading algo-
rithms use machine learning, (ii) their trading algorithms use features such as quantities in the LOB, price
trends, volatility, and volume imbalances to make short-term predictions, (iii) model parameters or model
weights are updated frequently, (iv) information of the recent past has predictive power in the short future,
(v) the list of possible actions that an algorithm can take after a prediction is limited and hard coded, and
(vi) firms tend to use simple models, such as linear and logistic regressions, to make predictions.
In this paper we find evidence for a number of these items. For example, our analysis demonstrates
that one can predict their trading actions based on features containing quantities in the LOB, price trends,
volatility, and their accumulated intraday inventory. We validate that more recent information has higher
predictive power than older information, and that simple models, such as the one predicting the most used
price range in the past or the most used volume range in the past, have high accuracies. This lends strong
support to the assertion that the actions available to algorithms are hard coded and the number of actions
limited. Lastly, we show that (multinomial) logistic regression models have high out-of-sample accuracies,
which lends support to algorithms using simpler, faster models.
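As an illustration of the kind of simple model mentioned above, the following is a minimal sketch of a baseline that always predicts the class (for example, a price range or a volume range) that an algorithm used most often in the past. The function names and the example labels are hypothetical; the sketch is not the paper's implementation and only shows how such a baseline could be scored out of sample.

```python
from collections import Counter
from typing import Hashable, Iterable, Optional


def most_used_class(past_labels: Iterable[Hashable]) -> Optional[Hashable]:
    """Return the most frequent label in the training history (None if empty)."""
    counts = Counter(past_labels)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def baseline_accuracy(train_labels: Iterable[Hashable],
                      test_labels: Iterable[Hashable]) -> float:
    """Out-of-sample accuracy of always predicting the most used past class."""
    prediction = most_used_class(train_labels)
    test = list(test_labels)
    if not test:
        raise ValueError("test_labels must not be empty")
    hits = sum(1 for label in test if label == prediction)
    return hits / len(test)


# Hypothetical price-range labels for one algorithm.
train = ["at-the-touch"] * 6 + ["inside spread"] * 3 + ["deeper"]
test = ["at-the-touch", "at-the-touch", "inside spread", "at-the-touch"]
accuracy = baseline_accuracy(train, test)
```

If an algorithm's set of actions is small and hard coded, a baseline of this kind can already be accurate, which is why it is a useful benchmark for richer models such as multinomial logistic regression.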
1.3 Existing literature

The LOB plays a crucial role in the microstructure of many modern financial markets. A LOB contains
information available on a specific market and reflects the past trading decisions of its participants. Abergel
et al. (2016) discuss several models for the LOB covering important techniques such as agent-based modelling.
Bouchaud et al. (2002) investigate empirical statistical properties of equity LOBs. Cont et al. (2010) propose
a model for LOB dynamics that recovers aspects of the empirical properties satisfied by LOBs; further work
has been done in this direction, such as Hambly et al. (2020) where the authors provide a probabilistic
description of LOB dynamics. Gould et al. (2013) present a detailed account of LOBs, together with
statistical analyses of historical LOB data. They discuss how LOB models provide insights into a number of
economic questions (e.g., questions regarding market efficiency, price formation, or the rationality of traders),
but note that these models poorly resemble real LOBs and that several empirical facts have yet to be reproduced satisfactorily.
Agent-based models comprise a number of decision-makers (agents) interacting through prescribed rules.
When applied to the economy as a whole, agents can be as diverse as needed — from consumers to policy-
makers to investment banks. These models do not assume that the economy will move towards a predeter-
mined equilibrium state, as other models do; see e.g., Farmer and Foley (2009). Instead, at any given time,
each agent acts according to its current situation, the state of the environment, and the rules governing its
behaviour.
Farmer et al. (2005) apply agent-based modelling to financial markets. They use data from the London
Stock Exchange to test a simple model in which zero intelligence agents place orders to trade at random. The
model recovers simple stylised facts about the arrival rates of orders; see Lehalle et al. (2011) for an overview
of the distinction between a zero-intelligence approach to LOB modelling and agent-based models. Byrd et al.
(2019) introduce the design and implementation of ABIDES, a multi-agent equity market simulator. Assefa
et al. (2020) discuss the importance of having reliable synthetic data generators for financial markets. Cohen
et al. (2021) discuss the role of synthetic data in a modern machine-learning finance context. Vyetrenko
et al. (2020) survey the literature and collect statistics and stylised facts seen in real markets that a market
simulator should be able to recreate. They show that these models explain a large part of the variance in
the bid-ask spread and price diffusion rate. In our work, the coefficients of the regression models open the
door to taking agent-based modelling one step further; we enable the building of a market model with agents
whose actions (direction, price, and volume) are learned from data.
Agent-based modelling can also provide relevant insights for supervisors looking at market manipulation.
Wang et al. (2021) present an agent-based model where agents spoof the LOB to manipulate prices: submit
6 Trading firms dealing exclusively on own account.

spurious orders to mislead market participants. They demonstrate that traders who use historical data to
predict prices improve price discovery and social welfare, but their existence in equilibrium renders a market
vulnerable to manipulation. That is, spoofing strategies can effectively mislead traders, distort prices and
reduce total surplus. Our study provides the foundations to create more realistic agent-based models, because
we show that the trading decisions of agents can be modelled and predicted well. Furthermore, we show
which features are most important in predicting the direction, price and volume of an order; see also Cartea
et al. (2020), Tao et al. (2022), Williams and Skrzypacz (2020).
Roşu (2009) presents a model of an order-driven market in which fully strategic, symmetrically informed
liquidity traders dynamically choose between limit and market orders. The model makes a number of
empirical predictions, such as (i) higher trading activity and higher trading competition cause smaller spreads
and lower price impact, and (ii) market orders lead to a temporary price impact larger than the permanent
price impact. Sirignano and Cont (2019) find evidence of a universal relationship between order flow and the
direction of price moves – here the authors find stable out-of-sample accuracies across shares and periods. Our
results also show stable out-of-sample accuracies when predicting direction, price, and volume of orders sent
to the exchange by market participants. The models we build in this paper can be used to answer questions
in the two aforementioned studies. In particular, we report how the probability of posting liquidity inside the
spread and the probability of crossing the spread change according to key features (e.g., volume imbalances).
We find that the higher the recent activity the more likely algorithms are to submit orders that are eager to
trade.
Bouchaud et al. (2003) use trades and quotes data from the Paris stock market to show that the random
walk nature of traded prices results from an interplay between two opposite tendencies: long-range correlated
market orders and mean reverting limit orders. They define and study a model where the price is the result
of the impact of all past trades. Aı̈t-Sahalia et al. (2022) use machine learning to study the predictability
of ultra high-frequency share returns. They find that, contrary to low-frequency and long horizon returns,
predictability of high-frequency returns is large, systematic, and pervasive over short horizons. Our study
employs a number of the features they use, specifically by taking features over various short time horizons. We
show how these features remain predictive, even when one takes into account algorithm-specific information
that is not visible to the market.
Hendershott et al. (2011) focus on the impact of algorithmic trading on market quality, and report that
algorithmic trading narrows spreads, reduces adverse selection, and reduces trade-related price discovery.
Cartea and Penalva (2012) propose that greater speed could allow fast traders to profitably intermediate
between liquidity demanders and liquidity suppliers. This additional intermediation layer would increase
execution costs and microstructure volatility. Furthermore, Cartea et al. (2019) find that an increase in
ultra-fast machine-driven activity is associated with lower intraday market quality (greater quoted and
effective spreads and lower depth).
For a review on statistical approaches to modelling high-frequency trading data see Dutta et al. (2022);
they focus on modelling the aggregated publicly available data as opposed to individual agents. There are
strands of the literature that find that deeper layers of the order book contain useful information; see Libman
et al. (2021). In our models, the likelihood of algorithms sending eager-to-trade orders when they have more
volume posted deeper in the LOB varies between clusters, suggesting the results of Libman et al. (2021)
are applicable to particular types of market participants. We also find that around one third of Liquidity
Provider algorithms behave as market makers and improve the spread around 86% of the times they send an
eager-to-trade order. In comparison, directional traders and opportunistic traders improve the spread only
17% and 10% (respectively) of the times they send an eager-to-trade order.
This paper also builds upon previous research on the clustering of agents. Mankad et al. (2013) develop
a dynamic machine learning method that buckets 15,686 traders in the E-mini S&P 500 futures into five
persistent categories. They identify the categories of high-frequency traders (14 traders), market makers
(271 traders), opportunistic traders (7,126), fundamental traders (254), and small traders (8,021). Wright
et al. (2022) introduce machine learning methods that cluster active traders in the market. Their methods,
which can be used for intraday classification, require a mixture of times, volumes, and prices at which traders
are willing to transact together with traded prices. Cont et al. (2023) present a granular representation of
the LOB that accounts for the origins of different orders. They seek to describe client order flow from a
large broker, in particular, they segment clients into different clusters, for which they identify representative
prototypes. In our paper we give a complementary analysis, by clustering the trading behaviour of algorithms,

and compare these clusters with the dealing capacity in Euronext Amsterdam.
Megarbane et al. (2017) study the behaviour of high-frequency traders (HFTs) and their role in liquidity
provision under market stress scenarios such as high-volatility periods surrounding news announcements.
Lastly, Hagströmer and Nordén (2013) emphasise the importance of distinguishing different HFT strategies
and their influence on market quality. Using data from Nasdaq-OMX Stockholm, they distinguish between
two types of HFTs: market makers and opportunistic traders. In our paper, we find evidence of Liquidity
Providers having a wide range of trading behaviour. In particular, we find that one third of Liquidity
Providers behave as market makers and the other two thirds are roughly split into opportunistic traders and
directional traders.
2 Data description
We use data from Euronext Amsterdam spanning the period 11 October 2021 to 30 January 2022 (80
trading days in 16 weeks) in four shares with tickers ASML, INGA, AD, TOM2 – we refer to these shares
as ASML, ING, AHOLD, and TOMTOM respectively. The data is labelled with algorithm identification
and member identification, and contains transactions, orders, cancellations, and amendments processed by
Euronext Amsterdam over the relevant period. Orders are matched via price-time-priority and are displayed
in a central LOB.7
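Price-time priority can be illustrated with a short, generic sketch: resting orders on one side of the book are ranked first by price (best price first) and then by arrival time (earliest first). The class and field names below are hypothetical, and the sketch is not a description of Euronext's matching engine.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RestingOrder:
    order_id: str
    side: str          # "buy" or "sell"
    price_ticks: int   # limit price in integer ticks
    arrival_ns: int    # hypothetical arrival timestamp in nanoseconds
    volume: int


def priority_order(orders: Iterable[RestingOrder], side: str) -> List[RestingOrder]:
    """Rank resting orders on one side by price, then by arrival time."""
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")
    same_side = [o for o in orders if o.side == side]
    if side == "buy":
        # Higher bids have priority; ties are broken by earlier arrival.
        return sorted(same_side, key=lambda o: (-o.price_ticks, o.arrival_ns))
    # Lower asks have priority; ties are broken by earlier arrival.
    return sorted(same_side, key=lambda o: (o.price_ticks, o.arrival_ns))


book = [
    RestingOrder("b1", "buy", 100, 5, 10),
    RestingOrder("b2", "buy", 101, 9, 10),
    RestingOrder("b3", "buy", 101, 7, 10),
    RestingOrder("s1", "sell", 103, 1, 10),
    RestingOrder("s2", "sell", 102, 8, 10),
]
bid_queue = [o.order_id for o in priority_order(book, "buy")]
ask_queue = [o.order_id for o in priority_order(book, "sell")]
```

In this example the bid queue is b3, b2, b1: the two orders at 101 ticks rank ahead of the order at 100 ticks, and b3 ranks ahead of b2 because it arrived earlier.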
2.1 Assets studied

In our study, the companies represent different sectors in the capital markets (technology, finance, and
consumer goods) and their market capitalisation ranges from large to small. The shares differ in the number
of trading venues on which they trade, the number of derivatives written on the share, and the number of
indices in which the share is a constituent. Euronext Amsterdam is the principal market in which the four
shares trade. Specifically,
(i) ASML Holding (ticker ASML) specialises in the development and manufacturing of machines that
produce computer chips, its market capitalisation is approximately €200B, it is traded on various
trading venues, and it is a constituent in a number of indices.8 We focus on ASML in the main text of
the paper; results for the other shares are in the Appendix.
(ii) ING Groep N.V. (ticker INGA) is a Dutch multinational banking and financial services corporation
with headquarters in Amsterdam, its market capitalisation is approximately €50B, it is traded on
various trading venues, and it is a constituent in a number of indices.9
(iii) Koninklijke Ahold Delhaize N.V. (ticker AD) is a Dutch multinational retail and wholesaling company,
its market capitalisation is approximately €30B, it is traded on various trading venues, and it is a
constituent in a number of indices.10
(iv) TomTom (ticker TOM2) is a Dutch multinational developer and creator of location technology and
consumer electronics, its market capitalisation is approximately €1B, it is traded on fewer trading
venues, and it is a constituent in a few indices.11
Table 1 presents summary statistics of the four shares between 11 October 2021 and 30 January 2022.
7 See Euronext (2023d, p. 43) for information about the main trading session and pp. 44–46 for market mechanisms.
8 Traded on more than 100 trading venues in 2022 according to AFM MIFID-II transaction data; see Euronext (2023b) for
more information on ASML.
9 Traded on more than 100 trading venues in 2022 according to AFM MIFID-II transaction data; see Euronext (2023c) for
more information on ING.
10 Traded on more than 100 trading venues in 2022 according to AFM MIFID-II transaction data; see Euronext (2023a) for
more information on AHOLD.
11 Traded on more than 50 trading venues in 2022 according to AFM MIFID-II transaction data; see Euronext (2023e) for
more information on TOMTOM.

          avg daily      avg daily traded    lowest price    highest price
          trade count    volume (€)          period (€)      period (€)
ASML      41,847         543,088,447         566.10          770.50
ING       21,469         186,226,078          11.58           13.62
AHOLD     10,535          78,665,021          27.54           31.35
TOMTOM     1,385           4,329,991           6.00            9.23
Table 1: Statistics for the shares in the study between 11 October 2021 and 30 January 2022.
The tick sizes of the shares do not change during the period 11 October 2021 to 30 January 2022. The
tick size of ASML is €0.1, the tick size of ING is €0.002, the tick size of AHOLD is €0.005, and that of
TOMTOM is €0.005. We observe that ASML, ING, and AHOLD have similar tick sizes as a percentage of
their share price (between 1.3 and 1.8 basis points), while TOMTOM has a larger tick size as a percentage
of its share price.12
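The relative tick sizes quoted above can be checked with a short calculation. The sketch below uses the tick sizes stated in the text and the lowest and highest prices from Table 1; evaluating the ratio at those two extreme prices is an assumption of this illustration, since the text does not state which reference price is used.

```python
# Tick sizes (EUR) as stated in the text.
TICK_SIZE_EUR = {"ASML": 0.1, "ING": 0.002, "AHOLD": 0.005, "TOMTOM": 0.005}

# (lowest, highest) price in EUR over the sample period, from Table 1.
PRICE_RANGE_EUR = {
    "ASML": (566.10, 770.50),
    "ING": (11.58, 13.62),
    "AHOLD": (27.54, 31.35),
    "TOMTOM": (6.00, 9.23),
}


def relative_tick_bps(tick: float, price: float) -> float:
    """Tick size as a fraction of the share price, in basis points."""
    return 1e4 * tick / price


def relative_tick_range_bps(ticker: str) -> tuple:
    """Smallest and largest relative tick size over the observed price range."""
    low_price, high_price = PRICE_RANGE_EUR[ticker]
    tick = TICK_SIZE_EUR[ticker]
    # The relative tick is smallest at the highest price and largest at the lowest.
    return relative_tick_bps(tick, high_price), relative_tick_bps(tick, low_price)


for ticker in TICK_SIZE_EUR:
    smallest, largest = relative_tick_range_bps(ticker)
    print(f"{ticker}: {smallest:.2f} to {largest:.2f} bps")
```

Under this assumption, the relative tick sizes of ASML, ING, and AHOLD stay between roughly 1.3 and 1.8 basis points, while that of TOMTOM exceeds 5 basis points throughout the period.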
2.2 Euronext: members and algorithms

In Euronext’s rule book (Euronext (2023d)), a member is any individual, corporation, partnership, associ-
ation, trust, or entity who has been admitted to Euronext Securities Membership or Euronext Derivatives
Membership and whose membership has not been terminated. Only members can trade on Euronext Ams-
terdam. A member can trade on its own account or for clients. The latter includes retail-brokers who process
so-called “retail orders”. A member can trade under one or multiple dealing capacities.
One of the dealing capacities is “retail liquidity provider”. Orders of members acting as retail liquidity
providers can be matched only with orders submitted by “retail” members – see Euronext (2023d, p. 14).
Other dealing capacities (each with its own set of obligations) are (i) Liquidity Provider, (ii) House, and (iii)
Client. This classification is determined by Euronext, not the regulator.
In this study, an “agent” is the concatenation of the membership, the dealing capacity, and the “entity”
that sends the order. We provide a stylised example: suppose company ABC is a member of the trading
venue and trades in two dealing capacities: Liquidity Provider and House. When company ABC trades
as a Liquidity Provider, it employs either the Avellaneda–Stoikov algorithm (AS) or the at-the-touch
Cartea–Jaimungal–Penalva algorithm (CJP). When it trades as a House, it employs the internalisation-
externalisation trading algorithm of Cartea–Betancourt (CB). The agents within company ABC are (i)
ABC-LP-AS, which refers to the AS algorithm within the Liquidity Provider dealing capacity of ABC, (ii)
ABC-LP-CJP, which refers to the CJP algorithm within the Liquidity Provider dealing capacity of ABC,
and (iii) ABC-H-CB, which refers to the CB algorithm within the House dealing capacity of ABC.13
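The agent identifier in the stylised example can be sketched as a simple concatenation. The class name, field names, and the abbreviation "C" for Client are assumptions made for this illustration; only the abbreviations LP and H appear in the example above.

```python
from dataclasses import dataclass

# Abbreviations for dealing capacities; "C" for Client is a hypothetical choice.
CAPACITY_CODES = {"Liquidity Provider": "LP", "House": "H", "Client": "C"}


@dataclass(frozen=True)
class Agent:
    member: str            # trading member, e.g. "ABC"
    dealing_capacity: str  # one of the keys of CAPACITY_CODES
    entity: str            # entity (algorithm) within the member, e.g. "AS"

    @property
    def agent_id(self) -> str:
        """Concatenate member, dealing capacity, and entity into one label."""
        code = CAPACITY_CODES[self.dealing_capacity]
        return "-".join((self.member, code, self.entity))


agents = [
    Agent("ABC", "Liquidity Provider", "AS"),
    Agent("ABC", "Liquidity Provider", "CJP"),
    Agent("ABC", "House", "CB"),
]
agent_ids = [a.agent_id for a in agents]
```

Because the identifier includes the dealing capacity, the same entity trading under two capacities would be treated as two distinct agents.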
Euronext data shows which entity within the member executes a given order, but not whether this entity is
(i) a human trader, or (ii) a trading algorithm. However, by cross-checking with regulatory data we know
that all Liquidity Provider agents, and most of the agents within House and Client, are trading algorithms.
Therefore, we use the terms “agent” and “algorithm” interchangeably.
2.3 Data filters and the orders we study

In this study, we apply a number of filters to the data. First, we focus on messages entered between 09:05:00
and 17:25:00. This excludes the first and last five minutes of the main trading session and also omits messages
sent during the pre-opening phase and closing phase.
Second, we ignore any undisclosed volume in the LOB when constructing the features of a given algorithm.
This consists of volumes in the order book that – similar to iceberg orders – are not visible to market
participants. We note, however, that under 0.1% of all orders in any of our shares include undisclosed
volume.
Third, in creating our features, we ignore all orders, deletions, and amendments by agents classified as
“retail liquidity providers” (retail LPs). This specific dealing capacity has the unusual feature that its
orders, deletions, and amendments (including the resulting transactions and volumes in the order book) are
12 There is an extensive literature on the effect of tick sizes on liquidity; see Verousis et al. (2018) or Penalva and Tapia (2021)
for relevant references.

13 Trading strategy AS is in Avellaneda and Stoikov (2008), trading strategy CJP is in Chapter 10.2.2 of Cartea et al. (2015),
and trading strategy CB is in Cartea and Sánchez-Betancourt (2022). See Cartea et al. (2015) and Guéant (2016).

visible only to “retail” members. Retail clients are able to trade both with retail LPs and the rest of the
order book. In effect, this creates two LOBs, one for retail members and one for all other agents. In our
dataset, we exclude retail LPs’ orders, and all retail member behaviour which matches against these orders,
because we are not concerned with retail trading behaviour. The remaining orders of retail members are
included in our dataset, for the purposes of determining the information available to the market, but we will
not attempt to fit models for the behaviour of retail members.
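The three filters described above can be sketched as a single pass over the message stream. The record layout, field names, and the choice to treat both time bounds as inclusive are assumptions of this illustration; the dataset's actual schema is not described in the text.

```python
from dataclasses import dataclass, replace
from datetime import datetime, time
from typing import Iterable, List

SESSION_START = time(9, 5, 0)    # first message time kept
SESSION_END = time(17, 25, 0)    # last message time kept (inclusive by assumption)


@dataclass(frozen=True)
class Message:
    timestamp: datetime
    dealing_capacity: str
    disclosed_volume: int
    undisclosed_volume: int = 0
    matched_against_retail_lp: bool = False  # hypothetical flag


def apply_filters(messages: Iterable[Message]) -> List[Message]:
    """Apply the time, undisclosed-volume, and retail-LP filters."""
    kept = []
    for msg in messages:
        # Filter 1: keep only messages inside the 09:05:00 to 17:25:00 window.
        if not (SESSION_START <= msg.timestamp.time() <= SESSION_END):
            continue
        # Filter 3: drop retail LP messages and retail activity matched against them.
        if msg.dealing_capacity == "retail liquidity provider":
            continue
        if msg.matched_against_retail_lp:
            continue
        # Filter 2: ignore undisclosed volume when building features.
        kept.append(replace(msg, undisclosed_volume=0))
    return kept
```

Applying the filters in one pass keeps the remaining retail-member orders in the stream, so they still contribute to the information available to the market even though no model is fitted to them.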
2.3.1 Order types

The order types available in Euronext are (i) limit orders, (ii) market orders, (iii) stop orders, (iv) pegged
orders, and (v) mid-point orders. Table 2 shows the number of trades in the first four weeks of data for
ASML for the following combinations: (i) the buy order is a limit order and the sell order is a limit order,
(ii) the buy order is a market order and the sell order is a limit order, (iii) the buy order is a limit order and
the sell order is a market order, and (iv) all other combinations.
order type                          trade count
buy side        sell side           number      percentage (%)
limit order     limit order         752,922     97
market order    limit order           6,686      1
limit order     market order          4,974      1
all other combinations               10,446      1
Table 2: Order types of transactions in ASML during first four weeks of data.
The algorithms in our dataset mostly use limit orders, instead of market orders, to take liquidity.
Approximately 97% of all transactions result from matching two limit orders. Specifically, in 97% of transactions
one leg is a limit order that rests in the LOB and the other leg is an incoming limit order with a limit price
generous enough to match the first leg.
2.3.2 Time-in-force
Orders submitted to Euronext have a validity parameter – also known as the time-in-force of the order. Table
3 shows the trade count grouped by the validity parameters of the buy and sell orders. More precisely, we show
the number of transactions in the first four weeks of data for ASML for the following combinations: (i) the
buy order has validity parameter immediate-or-cancel (“IoC”) and the sell order has validity parameter good-
for-day (“DAY”), (ii) the buy order has validity parameter DAY and the sell order has validity parameter
IoC, (iii) the buy order has validity parameter DAY and the sell order has validity parameter DAY, and (iv)
all other combinations (for example, fill-or-kill).
validity parameter                  trade count
buy side        sell side           number      percentage (%)
IoC             DAY                 283,545     37
DAY             IoC                 280,615     36
DAY             DAY                 195,691     25
all other combinations               15,177      2
Table 3: Validity types of transactions in ASML during first four weeks of data.
In the first two rows of Table 3, liquidity is consumed with an IoC order. In the third row, liquidity is
taken with a DAY order. IoC and DAY orders account for 98% of all transactions. Naturally, in Table 3, it
is always the case that one of the two legs has DAY as a time-in-force, which represents the liquidity resting
in the LOB waiting to be matched; the other leg is IoC in roughly three-quarters of transactions (73%) and
DAY in roughly one-quarter (25%).

2.3.3 The orders that we model
In this paper we model limit orders (including all possible validity parameters). As shown in the two tables
above, these orders account for almost all transactions conducted in Euronext.
Table 4 shows the order and validity types of all limit orders sent during the first four weeks of data for
ASML. This constitutes a four-week snapshot of the orders that we model in this paper.
order type      validity type       order count     percentage (%)
limit order     DAY                 7,389,072       96
                IoC                   274,211        4
                other                     660        0
Table 4: Order and validity types of orders in ASML during first four weeks of data.
Limit orders with validity type DAY are 96% of all limit orders sent, and orders with validity type IoC
account for only 4%. We also observe that IoC orders are often filled by more than one DAY order, which
explains the difference between the figures in Tables 3 and 4.
Table 5 uses the first four weeks of data for ASML and reports the (contribution to) order count, trade
count, and volume traded of groups of algorithms. More precisely, the table summarises the aggregate
contribution to (i) order count, (ii) trade count, and (iii) volume traded of the top five, top ten, top twenty,
top fifty, and all algorithms. The ranking of algorithms is based on the order count of algorithms during the
first four weeks of data. We note that each trade requires the involvement of two participants, leading to a
double-counting of the number of trades in Table 5.
           order count   % of all   trade count   % of all   volume (€)        % of all
Top 5      2,612,307     34         382,040       27          2,332,746,616    19
Top 10     4,055,626     53         478,985       34          2,972,717,749    25
Top 20     5,652,783     74         679,310       48          4,431,976,153    37
Top 50     7,275,195     96         1,047,174     74          8,086,989,709    67
All        7,613,489     100        1,424,453     100        11,998,022,616    100
Table 5: Descriptive statistics for algorithms trading in ASML during first four weeks of data.
The top ten algorithms account for more than 50% of the orders and are involved in more than 30% of
the trades and more than 20% of the total traded volume. While algorithms below the top 50
provide only 4% of the orders, they are involved in 26% of all trades (and 33% of all traded volume by value).
In the Appendix we report the above statistics for ING, AHOLD, and TOMTOM in Tables 20, 21, and 22
respectively.
3 Modelling approach
3.1 Features
In this section, we describe the variables we use in the study. Brogaard et al. (2014) find evidence that public
information influences the direction of the orders submitted by HFTs, for example, news announcements,
price movements, and LOB imbalances. Cont et al. (2023) suggest one should also consider the time of
day and other market conditions such as momentum and volatility. Aït-Sahalia et al. (2022) use three
clocks to construct predictor variables over non-overlapping lookback windows. AFM (2023) indicates that
proprietary trading firms report that their trading algorithms use features such as order book imbalance,
volume in the order book, and price trends, all measured over various time horizons. Furthermore, firms
report that the most recent data has the most predictive power – e.g., messages in the last few milliseconds
have more predictive power than older messages.
In this paper, we use 53 features that reflect (i) the state of the LOB, (ii) recent activity in the LOB, (iii)
the current presence of the algorithm in the LOB, (iv) the current presence of the member in the LOB, (v) the

cash and inventory of the algorithm, and (vi) the cash and inventory of the member. By inventory we mean
a transformation f (x) = sign(x) log(1 + |x|), where | · | is the absolute value, of the intraday accumulated
position measured in number of shares, and the starting value of the accumulated position at the beginning
of each trading day is zero by definition. Similarly, cash is the accumulated expense (in EUR) of purchases
and sales of inventory; we work with expenses (so purchasing inventory increases the cash variable), to ensure
that inventory and cash typically have the same sign. We also apply the above transformation f to the cash
variable. Naturally, there is a strong positive relationship between these two variables. Given our data, we
only consider transactions which occur on Euronext Amsterdam. All features are measured at the time just
before the order enters the market.
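As a minimal sketch of the inventory and cash variables described above, the following accumulates an intraday position and the corresponding expense from a list of fills and applies f(x) = sign(x) log(1 + |x|) to both. The function names and the example fills are hypothetical and are not taken from the dataset.

```python
import math


def signed_log(x: float) -> float:
    """f(x) = sign(x) * log(1 + |x|), applied to both inventory and cash."""
    return math.copysign(math.log1p(abs(x)), x)


def inventory_and_cash(fills):
    """Accumulate the intraday position (shares) and expense (EUR).

    `fills` is a list of (signed_quantity, price) pairs, with a positive
    quantity for a purchase. Both variables start at zero each trading day,
    and a purchase increases both, so they typically share the same sign.
    """
    position, expense = 0.0, 0.0
    for signed_quantity, price in fills:
        position += signed_quantity
        expense += signed_quantity * price
    return signed_log(position), signed_log(expense)


# Hypothetical fills: buy 100 shares at 650.0, then sell 40 shares at 651.0.
inventory, cash = inventory_and_cash([(100, 650.0), (-40, 651.0)])
```

The logarithmic transformation compresses large positions while preserving their sign, so long and short exposures remain distinguishable as features.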
To describe our features more precisely, let A = {a1 , a2 , . . . , aM } be the identifiers for the algorithms
on Euronext, and B = {b1 , b2 , . . . , bN } be the identifiers for the trading members of Euronext – we have
that N ≤ M , i.e., a trading member can trade with multiple trading algorithms. The features we use for
algorithm ai ∈ A from trading member bj ∈ B are: (i) volumes in the LOB excluding those from trading
algorithm ai , (ii) volumes in the LOB posted by trading algorithm ai , (iii) imbalance in the LOB excluding
volumes posted by algorithm ai , (iv) imbalance in the LOB of volume posted by algorithm ai , (v) logarithm
of the volume at the best bid and the best ask, (vi) bid-ask spread, (vii) returns of transaction prices over
various time windows in the past, (viii) volatility of transaction prices over various time windows in the past,
(ix) number of messages over various time windows in the past, (x) number of aggressive buy orders, number
of aggressive sell orders, and net aggressive order flow over various time windows in the past,14 (xi) logarithm
of volume of last transaction, (xii) inventory of trading algorithm ai , (xiii) inventory of trading member bj ,
(xiv) cash of trading algorithm ai , and (xv) cash of trading member bj . In Appendix B we provide formulae
for each of the above features. The categories (i) to (xv) include between one and six variables each and
account for a total of 53 features.
The features we employ are measurable to regulators and trading venues. Even if important variables
such as inventory of algorithm (i.e., intraday accumulated inventory of algorithm) are not used as signals in
the internal logic of an algorithm, it is nonetheless a variable, visible to the regulator or trading venue, that
can be used to predict an algorithm’s trading behaviour. Effectively, our variables can be seen as proxies for
some of the private signals used by trading firms.
There is correlation between many of the above features; for example, it is well-known that there is
positive correlation between the bid-ask spread and any of the measures of volatility we consider; see e.g.,
Grossman and Miller (1988), Glosten and Milgrom (1985), O’Hara (1998). For each share in our study,
we use the first four weeks of data to fit a PCA transformation and project the 53 features down to 30
features. This captures more than 90% of the variation in the standardised features for all shares.15 For
each share, we apply the transformation obtained for the first four weeks to the remaining twelve weeks.
We refer to the features after the PCA transformation as the PCA-transformed features and we refer to the
features before the PCA transformation as the original features. Unless stated otherwise, we work with the
PCA-transformed features.
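The fit-once, apply-later use of PCA described above can be sketched as follows. This is an illustration with NumPy on synthetic data, not the authors' code; the helper names `fit_pca` and `apply_pca` and the random inputs are assumptions.

```python
import numpy as np


def fit_pca(X_train, n_components=30):
    """Standardise with training statistics and keep the leading components."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    Z = (X_train - mean) / std
    # Rows of Vt are the principal directions, ordered by singular value.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return {
        "mean": mean,
        "std": std,
        "components": Vt[:n_components],
        "explained": float(explained[:n_components].sum()),
    }


def apply_pca(pca, X):
    """Project new data with the transformation fitted on the training window."""
    return ((X - pca["mean"]) / pca["std"]) @ pca["components"].T


rng = np.random.default_rng(0)
X_first_four_weeks = rng.normal(size=(500, 53))  # synthetic stand-in for 53 features
X_later_weeks = rng.normal(size=(200, 53))

pca = fit_pca(X_first_four_weeks, n_components=30)
X_later_pca = apply_pca(pca, X_later_weeks)  # 30 PCA-transformed features per order
```

Reusing the training-window mean, scale, and directions on the later weeks keeps the out-of-sample features free of information from the deploy period.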
3.2 Output variable

Consider an algorithm which sends a limit order at time t. We model the choice of (a) direction, (b) limit
price, and (c) volume of the limit order sent to the market. Recall that Table 2 shows that around 97% of
all trading activity employs limit orders. Our models accommodate both provision of liquidity and taking
of liquidity. We return to this point in Subsection 4.4.7 when we consider a conditional model for liquidity
taking activity.
We consider all limit orders regardless of their time-in-force. To formalise the problem, let (Dt , Pt , Vt ; ai , bj )
be a limit order sent by algorithm ai ∈ A and trading member bj ∈ B arriving at the exchange at time t,
where Dt ∈ {−1, 1} is the direction of the order (Dt = 1 if the order is to buy and Dt = −1 if the order is to
sell), $P_t \in \mathbb{R}$ is the price limit attached to the order and $V_t \in \mathbb{R}_+$ its volume.16 The signed difference from
best quote is given by
$$
\tilde{P}_t =
\begin{cases}
P_t - S_t^b & \text{if } D_t = 1, \\
S_t^a - P_t & \text{if } D_t = -1,
\end{cases}
\tag{1}
$$
and
$$
\tilde{V}_t = \log(1 + V_t) \tag{2}
$$
is the log-volume of the order. For an order to buy ($D_t = 1$) we have that $\tilde{P}_t = P_t - S_t^b$, which is: (i) greater
than zero if the order has a limit price that is more generous than (higher than) the best bid price, (ii)
equal to zero if the order has a limit price that is as generous as (equal to) the best bid price, and (iii) less
than zero if the order has a limit price that is less generous than (lower than) the best bid price. Similarly,
for an order to sell ($D_t = -1$) we have that $\tilde{P}_t = S_t^a - P_t$, which is: (i) greater than zero if the order has a
limit price that is more generous than (lower than) the best ask price, (ii) equal to zero if the order has
a limit price that is as generous as (equal to) the best ask price, and (iii) less than zero if the order has a
limit price that is less generous than (higher than) the best ask price.
14 We use the terms “aggressive” and “liquidity taking” interchangeably.
15 The exact percentages of the variation captured with 30 features are: 93% for ASML, 92% for ING, 93% for AHOLD, and
93% for TOMTOM.
16 Accounting for latency is out of the scope of this analysis and is left for future research. However, we mitigate some of the
effects of latency in the features that agents observe, by measuring features over various time-intervals.
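We study three regimes of the signed difference from best quote $\tilde{P}_t$ and use them to define the price bucket variable $\mathcal{P}_t$ as follows:
$$
\mathcal{P}_t =
\begin{cases}
1 & \text{if } \tilde{P}_t \in (0, \infty) \quad \text{(eager-to-trade orders – more generous than current best limit price)}, \\
2 & \text{if } \tilde{P}_t \in \{0\} \quad \text{(at-the-touch orders – at the current best limit price)}, \\
3 & \text{if } \tilde{P}_t \in (-\infty, 0) \quad \text{(orders deeper in the LOB)}.
\end{cases}
\tag{3}
$$
For ASML, we note that $\mathcal{P}_t = 1$ for 9.6% of orders, $\mathcal{P}_t = 2$ for 42.8% of orders, and $\mathcal{P}_t = 3$ for 47.6% of
orders; see Table 12.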
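The signed difference from best quote and the three price buckets can be sketched as below. The function names, the example quotes, and the tolerance `tol` used to compare floating-point prices with zero are assumptions made for illustration.

```python
def signed_difference(direction, limit_price, best_bid, best_ask):
    """Signed difference from best quote: positive means more generous."""
    if direction == 1:  # buy order, measured against the best bid
        return limit_price - best_bid
    if direction == -1:  # sell order, measured against the best ask
        return best_ask - limit_price
    raise ValueError("direction must be +1 (buy) or -1 (sell)")


def price_bucket(signed_diff, tol=1e-9):
    """Map the signed difference to bucket 1, 2, or 3."""
    if signed_diff > tol:
        return 1  # eager-to-trade: more generous than the current best quote
    if signed_diff < -tol:
        return 3  # deeper in the LOB
    return 2  # at the touch


# Hypothetical quotes: best bid 650.0, best ask 650.2.
buy_inside_spread = price_bucket(signed_difference(1, 650.1, 650.0, 650.2))
buy_at_touch = price_bucket(signed_difference(1, 650.0, 650.0, 650.2))
sell_deep = price_bucket(signed_difference(-1, 650.4, 650.0, 650.2))
```

Note that a buy order priced inside the spread and a buy order that crosses the spread both land in bucket 1, which is why this bucket mixes generous liquidity provision with aggressive orders.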
With four weeks of data from 11 October 2021 to 5 November 2021, we compute the (up to nine) unique
population deciles of the log-volume $\tilde{V}_t$, and use these population deciles to define the (up to ten) buckets
associated with this variable. We denote by $v^c$ the number of buckets for the log-volume. The bucket version
of $\tilde{V}_t$ is denoted by $\mathcal{V}_t \in \{1, \dots, v^c\}$. We apply this bucketing to each of the twelve subsequent weeks of
data. The variables we model are
$$
(D_t, \mathcal{P}_t, \mathcal{V}_t), \tag{4}
$$
and we call them (i) direction of the order, (ii) price bucket, and (iii) volume bucket, respectively.
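The decile-based volume bucketing can be sketched as follows. The treatment of values that fall exactly on a decile (assigned to the lower bucket here) and the synthetic volumes are assumptions; the text does not specify the tie convention.

```python
import numpy as np


def fit_volume_edges(train_volumes):
    """Unique deciles of log(1 + V) computed on the training window."""
    log_volume = np.log1p(np.asarray(train_volumes, dtype=float))
    deciles = np.quantile(log_volume, np.arange(1, 10) / 10)
    return np.unique(deciles)  # sorted, up to nine distinct edges


def volume_bucket(volumes, edges):
    """Bucket index in {1, ..., len(edges) + 1} for each order volume."""
    log_volume = np.log1p(np.asarray(volumes, dtype=float))
    # side="left": a value equal to an edge goes to the lower bucket (assumed convention).
    return np.searchsorted(edges, log_volume, side="left") + 1


# Synthetic training volumes concentrated on a few round sizes.
train_volumes = [100] * 50 + [200] * 30 + [500] * 20
edges = fit_volume_edges(train_volumes)
buckets = volume_bucket([50, 100, 200, 500, 5000], edges)
```

When many orders share the same size, several deciles coincide, which is why the number of buckets can be smaller than ten.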
As discussed above, our models capture both aggressive and passive behaviour depending on the price
bucket and the time-in-force of the order. For example, note that all aggressive behaviour falls within price bucket 1.
More precisely, price bucket 1 includes orders that (i) cross the spread and trade against the opposite side of the
LOB (aggressive orders), (ii) post liquidity inside the spread (generous liquidity provision), and (iii) are
cancelled by the exchange upon entry (missed attempts) because their price limit is not generous enough
to trade and their time-in-force precludes them from resting in the LOB; we return to this point below in
Subsection 4.4.7 where we study a conditional model for liquidity taking activity.
3.3 Regression models

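For each share $c$ we employ $W = 16$ datasets $\{\mathcal{D}_k^c\}_{k=1}^{W}$, each corresponding to a given week of orders in the
period 11 October 2021 to 30 January 2022. For each week $k$, the dataset is given by $\mathcal{D}_k^c = \{(x_t^k, y_t^k)\}_{t=T_1^k}^{T}$,
where $x_t^k \in \mathbb{R}^{\tilde{K}}$ are the features, and $y_t^k \in \{-1, 1\} \times \{1, 2, 3\} \times \{1, \dots, v^c\}$ are the order details, given by
$y_t^k = (D_t^k, \mathcal{P}_t^k, \mathcal{V}_t^k)$. The vector $x_t^k \in \mathbb{R}^{\tilde{K}}$ contains the features we use for the regressions, $x_t^k = (x_t^{k,1}, \dots, x_t^{k,\tilde{K}})$.
Here, $\tilde{K} = 30$ when we use PCA-transformed features. Note that the features $x_t^k$ implicitly depend on the
algorithm placing the order. We use logistic regressions to model the decisions of agents, that is
$$
\text{(R1)} \qquad \mathbb{P}(D_t^k = 1) = \frac{1}{1 + \exp\left(\beta^{D,k} + \boldsymbol{\beta}^{D,k} \cdot x_t^k\right)}, \quad \text{on } \mathcal{D}_k^c, \tag{5}
$$
$$
\text{(R}\star\text{)} \qquad \mathbb{P}(Y_t^k = y) = \frac{\exp\left(\beta_y^{Y,k} + \boldsymbol{\beta}_y^{Y,k} \cdot x_t^k\right)}{1 + \sum_{j=1}^{y^c - 1} \exp\left(\beta_j^{Y,k} + \boldsymbol{\beta}_j^{Y,k} \cdot x_t^k\right)}, \quad \text{on } \mathcal{D}_k^c, \tag{6}
$$
where $Y \in \{\mathcal{P}, \mathcal{V}\}$ and the operator $\cdot$ denotes the dot product between two vectors; non-bold $\beta$ terms are scalar intercepts and bold $\boldsymbol{\beta}$ terms are coefficient vectors. When $Y$ is $\mathcal{P}$ then
R$\star$ = R2, $y \in \{1, 2, 3\}$, and $y^c = 3$. Similarly, when $Y$ is $\mathcal{V}$ then R$\star$ = R3, $y \in \{1, \dots, v^c\}$, and $y^c = v^c$.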
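We model $(D_t^k, \mathcal{P}_t^k, \mathcal{V}_t^k)$ with the vector of features $x_t^k$. We also model price bucket and volume bucket
conditional on the direction $D_t^k$; for this, we perform logistic regressions
$$
\text{(R}\ast\text{)} \qquad \mathbb{P}(Y_t^k = y \mid D_t^k = d) = \frac{\exp\left(\beta_y^{Y,k,d} + \boldsymbol{\beta}_y^{Y,k,d} \cdot x_t^k\right)}{1 + \sum_{j=1}^{y^c - 1} \exp\left(\beta_j^{Y,k,d} + \boldsymbol{\beta}_j^{Y,k,d} \cdot x_t^k\right)}, \quad \text{on } \mathcal{D}_k^{c,d}, \tag{7}
$$
where we have the following four combinations: (i) when $Y$ is $\mathcal{P}$ and $d = +1$ then R$\ast$ = R4, $y^c = 3$,
$\mathcal{D}_k^{c,d} = \mathcal{D}_k^{c,+1}$, and $y \in \{1, 2, 3\}$; (ii) when $Y$ is $\mathcal{P}$ and $d = -1$ then R$\ast$ = R5, $y^c = 3$, $\mathcal{D}_k^{c,d} = \mathcal{D}_k^{c,-1}$, and
$y \in \{1, 2, 3\}$; (iii) when $Y$ is $\mathcal{V}$ and $d = +1$ then R$\ast$ = R6, $y^c = v^c$, $\mathcal{D}_k^{c,d} = \mathcal{D}_k^{c,+1}$, and $y \in \{1, \dots, v^c\}$; (iv)
when $Y$ is $\mathcal{V}$ and $d = -1$ then R$\ast$ = R7, $y^c = v^c$, $\mathcal{D}_k^{c,d} = \mathcal{D}_k^{c,-1}$, and $y \in \{1, \dots, v^c\}$. We also have that
$\beta_3^{\mathcal{P},k,+1}$, $\boldsymbol{\beta}_3^{\mathcal{P},k,+1}$, $\beta_3^{\mathcal{P},k,-1}$, $\boldsymbol{\beta}_3^{\mathcal{P},k,-1}$, $\beta_{v^c}^{\mathcal{V},k,+1}$, $\boldsymbol{\beta}_{v^c}^{\mathcal{V},k,+1}$, $\beta_{v^c}^{\mathcal{V},k,-1}$, $\boldsymbol{\beta}_{v^c}^{\mathcal{V},k,-1}$ are all zero. Here we use
$$
\mathcal{D}_k^{c,+1} = \{(x_t^k, y_t^k) \in \mathcal{D}_k^c : D_t^k = 1\}, \tag{8}
$$
$$
\mathcal{D}_k^{c,-1} = \{(x_t^k, y_t^k) \in \mathcal{D}_k^c : D_t^k = -1\}. \tag{9}
$$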
For each algorithm we fit the above models. Section 4.2 reports the out-of-sample accuracies and
outperformance over alternative methods for ASML. The results for the other three shares are reported in
Appendix C.1.
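To make the multinomial form in (6) and (7) concrete, the sketch below evaluates the class probabilities for one feature vector, treating the last class as the reference class whose intercept and coefficients are zero (as stated above for R4 to R7; applying the same convention to R2 and R3 is an assumption). The function name and the numerical values are hypothetical.

```python
import numpy as np


def class_probabilities(x, intercepts, coefs):
    """Probabilities for classes 1, ..., y_c given features x.

    intercepts has shape (y_c - 1,) and coefs has shape (y_c - 1, K); the
    last class is the reference class, with intercept and coefficients zero.
    """
    scores = intercepts + coefs @ x  # one linear score per non-reference class
    exp_scores = np.exp(scores)
    denominator = 1.0 + exp_scores.sum()
    # The reference class contributes exp(0) = 1 to the numerator.
    return np.append(exp_scores, 1.0) / denominator


# Hypothetical price-bucket model with K = 2 features and y_c = 3 buckets.
x = np.array([0.5, -1.0])
intercepts = np.array([0.2, 1.0])
coefs = np.array([[1.5, 0.0], [-0.3, 0.4]])
probabilities = class_probabilities(x, intercepts, coefs)
predicted_bucket = int(np.argmax(probabilities)) + 1
```

The conditional models have the same form; they differ only in being fitted on the subset of orders with the given direction.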
3.4 Machine learning models

3.4.1 Random Forests
We also use random forests – introduced by Breiman (2001) – as a benchmark to assess the performance of
the logistic regressions. Random forests go beyond linear models but lack explainability. We return to this
point below when we discuss data privacy concerns.
We train random forests with fifty trees in the forest and a maximum depth of five for a given tree. To
avoid overfitting and to obtain these parameter values, we tune hyperparameters to balance the in-sample
and out-of-sample accuracies. We employ sklearn.ensemble in Python, and the same features as in our
logistic regressions.
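A random forest with the stated size (fifty trees, maximum depth five) can be set up with `RandomForestClassifier` from `sklearn.ensemble` roughly as follows. The synthetic features and labels, and the fixed `random_state`, are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for 30 PCA-transformed features and a buy/sell label.
X_train = rng.normal(size=(400, 30))
y_train = np.where(X_train[:, 0] + 0.5 * rng.normal(size=400) > 0, 1, -1)
X_deploy = rng.normal(size=(100, 30))

# Fifty shallow trees; limiting the depth restrains overfitting.
forest = RandomForestClassifier(n_estimators=50, max_depth=5, random_state=0)
forest.fit(X_train, y_train)
predicted_direction = forest.predict(X_deploy)
```

Comparing in-sample and out-of-sample accuracy for a few candidate values of `n_estimators` and `max_depth` is one simple way to carry out the tuning described above.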
3.4.2 Algorithm clusters

The regression coefficients from our logistic regressions describe trading behaviour. Indeed, the values of the
regression coefficients determine the modelled probability with which algorithms choose direction, price, and volume. In
this paper, we use the regression coefficients of the models for direction (Dt ) to cluster trading behaviour. For
the clustering exercise we employ hierarchical agglomerative clustering; see Section 21.2 in Murphy (2022).
In particular, we use complete linkage, cosine affinity, and a target of three clusters, to compare with the
dealing capacities of algorithms – recall that we study three dealing capacities in this paper: (i) Liquidity
Provider, (ii) House, and (iii) Client. We employ AgglomerativeClustering from sklearn.cluster.
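The clustering step can be illustrated with complete linkage on cosine distances and a target of three clusters. The sketch below uses `scipy.cluster.hierarchy` as a stand-in rather than `AgglomerativeClustering`, because the name of the distance argument of the latter has changed between scikit-learn releases; the coefficient vectors are synthetic.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)

# Synthetic coefficient vectors: three groups of four algorithms, each group
# pointing in a different direction of the 30-dimensional coefficient space.
centres = np.zeros((3, 30))
centres[0, 0] = centres[1, 1] = centres[2, 2] = 1.0
coefficients = np.repeat(centres, 4, axis=0) + 0.02 * rng.normal(size=(12, 30))

# Agglomerative clustering with complete linkage on cosine distances.
merge_tree = linkage(coefficients, method="complete", metric="cosine")
labels = fcluster(merge_tree, t=3, criterion="maxclust")  # at most three clusters
```

Cosine distance compares the direction of the coefficient vectors and ignores their length, so two algorithms that respond to the same features with different intensity can still be grouped together.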
4 Model fit
4.1 Train and deploy structure
As described in Section 2, we use sixteen weeks of data in the study. The models are trained over four
consecutive weeks of data and applied to the week after. For example, we use weeks 1 to 4 (11 October 2021
to 5 November 2021) to train the models, then obtain out-of-sample predictions for week 5 (8 November
2021 to 12 November 2021), train models from weeks 2 to 5 (18 October 2021 to 12 November 2021), then
obtain out-of-sample predictions for week 6 (15 November 2021 to 19 November 2021), and so on. We refer
to each such run as a ‘train-and-deploy’ exercise.
With the sixteen weeks of data we perform twelve train-and-deploy exercises. Thus, there are twelve
weeks of data for which we perform out-of-sample predictions. For each train-and-deploy exercise we obtain
seven accuracies per algorithm (one for each model in R1 to R7; see Section 3.3). Next, we compute the
order-weighted accuracy of the predictions per algorithm over the twelve weeks of data and report the results
in Table 6. With these accuracies, we compute (i) the average, plus and minus the standard deviation, of
the accuracies across groups of algorithms and we report it in the first five rows, and (ii) the order-weighted
accuracy for all algorithms and report it in the bottom row.
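The rolling train-and-deploy schedule and the order-weighted accuracy can be sketched as follows. The function names and the example accuracies and order counts are hypothetical.

```python
def train_deploy_splits(n_weeks=16, train_length=4):
    """Pairs (training weeks, deploy week), with weeks numbered from 1."""
    return [
        (list(range(deploy - train_length, deploy)), deploy)
        for deploy in range(train_length + 1, n_weeks + 1)
    ]


def order_weighted_accuracy(accuracies, order_counts):
    """Average accuracy weighted by the number of orders behind each value."""
    total_orders = sum(order_counts)
    return sum(a * n for a, n in zip(accuracies, order_counts)) / total_orders


splits = train_deploy_splits()
# splits[0] trains on weeks 1 to 4 and deploys on week 5;
# splits[-1] trains on weeks 12 to 15 and deploys on week 16.

# Hypothetical accuracies of one algorithm in two deploy weeks.
weighted = order_weighted_accuracy([0.6, 0.8], [100, 300])
```

Weighting by order count means that weeks (or algorithms) with more orders contribute proportionally more to the reported figure.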
4.2 Out-of-sample accuracy of predictions

Table 6 reports out-of-sample accuracies of the predictions made by the logistic regression models for twelve
train-and-deploy exercises on ASML. The direction of order Dt can take two values, the price bucket Pt
three values, and the volume bucket Vt can take nine values.
The ordering of algorithms in Table 6 is based on the order count over all deploy-weeks of the train-and-
deploy exercises. For example, the “Top 5” row refers to the top five algorithms that sent most orders during
the twelve deploy-weeks. The same ordering method is applied to Table 7 and the tables for other shares.
                         R1        R2        R3        R4          R5           R6          R7
                         Dt        Pt        Vt        Pt|Dt=1     Pt|Dt=−1     Vt|Dt=1     Vt|Dt=−1
Top 5                    65 ± 7    80 ± 9    60 ± 23   85 ± 6      84 ± 6       61 ± 21     60 ± 22
Top 10                   66 ± 7    86 ± 9    68 ± 22   89 ± 6      89 ± 6       69 ± 21     69 ± 22
Top 20                   68 ± 9    87 ± 12   62 ± 23   89 ± 10     89 ± 10      63 ± 22     63 ± 22
Top 50                   73 ± 12   86 ± 15   60 ± 25   86 ± 18     87 ± 18      59 ± 26     60 ± 25
All                      77 ± 15   80 ± 19   48 ± 25   78 ± 24     77 ± 25      46 ± 25     46 ± 26
order-weighted average   70        85        62        88          88           62          62
Table 6: Accuracies of the logistic regression models for ASML. Twelve train-and-deploy exercises.
Accuracies are reported in % with ± the standard deviation.
Accuracy of predictions is high and stable over time for all algorithms. Furthermore, we observe that
accuracies of the logistic regression models are stable for all shares – for example, when predicting the
direction of the order Dt , accuracies range between 67% and 70% according to the order-weighted accuracy
for all algorithms; see Appendix C.1. By order-weighted average accuracy, the regressions for Pt that condition
on the direction of the order (R4 and R5) perform better than regression R2. This is not the case for Vt, where
R3, R6, and R7 perform similarly.
We compare the accuracies of the logistic regression models against a benchmark where the predicted
bucket is the bucket most frequently used by the algorithm in the training data. Table 7 reports the
outperformance of the logistic regression models over this benchmark.
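The benchmark and the outperformance measure can be sketched as below: the benchmark always predicts the bucket seen most often in the training data, and outperformance is the difference in out-of-sample accuracy. The labels and model predictions are made up for illustration.

```python
from collections import Counter


def majority_benchmark(train_labels):
    """Bucket used most frequently by the algorithm in the training data."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(predictions, labels):
    """Fraction of predictions that match the observed labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical price buckets for one algorithm.
train_buckets = [2, 2, 3, 2, 1, 3, 2]
deploy_buckets = [2, 3, 2, 2, 1]
model_predictions = [2, 3, 2, 3, 1]  # made-up output of a fitted model

benchmark_bucket = majority_benchmark(train_buckets)
benchmark_accuracy = accuracy([benchmark_bucket] * len(deploy_buckets), deploy_buckets)
model_accuracy = accuracy(model_predictions, deploy_buckets)
outperformance = model_accuracy - benchmark_accuracy
```

For an algorithm that almost always uses the same bucket, the benchmark is already accurate, which leaves little room for any model to outperform it.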
          Dt        Pt       Vt      Pt|Dt=1    Pt|Dt=−1   Vt|Dt=1   Vt|Dt=−1
Top 5     15 ± 7    8 ± 10   5 ± 6   14 ± 14    13 ± 14    6 ± 8     5 ± 7
Top 10    15 ± 7    4 ± 9    2 ± 5   7 ± 12     7 ± 12     3 ± 6     2 ± 6
Top 20    17 ± 10   4 ± 7    3 ± 7   7 ± 10     6 ± 10     4 ± 8     3 ± 7
Top 50    20 ± 13   3 ± 6    3 ± 6   5 ± 8      4 ± 8      3 ± 7     3 ± 7
All       21 ± 17   3 ± 10   0 ± 8   3 ± 10     4 ± 11     1 ± 10    0 ± 11
Table 7: Outperformance over benchmark for ASML. Twelve train-and-deploy exercises.
On average, the logistic regression models outperform the benchmark. The outperformance is largest for
predicting the direction of an order, at 18% on average. When predicting volume, we observe the smallest
outperformance over the benchmark. See Appendix C.1 for comparable results for other shares.
Table 8 reports the outperformances of random forests when compared with the same benchmark as that
used in Table 7, and compared with the logistic regressions in Table 6.
                         outperformance over benchmark    outperformance over logistic
                         Dt        Pt       Vt            Dt       Pt        Vt
Top 5                    21 ± 7    7 ± 8    9 ± 10        6 ± 7    −1 ± 3    4 ± 4
Top 10                   19 ± 7    4 ± 7    5 ± 8         4 ± 6    −1 ± 2    3 ± 3
Top 20                   23 ± 11   3 ± 6    6 ± 8         6 ± 7    −1 ± 2    2 ± 5
Top 50                   24 ± 13   2 ± 5    4 ± 7         4 ± 6    −1 ± 2    1 ± 4
All                      22 ± 17   3 ± 7    2 ± 7         1 ± 7    0 ± 6     3 ± 5
order-weighted average   23        4        5             5        0         2
Table 8: Outperformance of random forests over benchmark and over logistic regressions for ASML. Twelve
train-and-deploy exercises.
Overall, random forests outperform the benchmark by 23% when predicting whether algorithms send a
buy or a sell order – this is 5% higher than the outperformance achieved by the logistic regression models.
The outperformance over the benchmark is stable across shares; in all cases, random forests outperform
logistic regression models. When predicting the price bucket, random forests outperform the benchmark by
4%, on average, which is the same outperformance attained by the multinomial logistic regression models.
Lastly, when predicting the volume bucket, the outperformance of random forests over the benchmark is
5%, which is 2% higher than the outperformance of the logistic regression over the benchmark. On average,
random forests tend to slightly outperform logistic regressions.
Going beyond logistic regressions comes at a cost. Machine learning models are known to have the
potential to leak parts of the training data when one discloses the model (weights or coefficients obtained)
as investigated in Shokri et al. (2017). That is, some models are vulnerable to privacy attacks – see Rigaki
and Garcia (2020) for a comprehensive survey. For many applications of our study this suggests it is more
appropriate to use logistic regressions because of their interpretability, their simplicity, and their privacy
features. We therefore focus on logistic models in the remainder of this study.
4.3 Feature importance

Feature importance encompasses a set of machine learning techniques that order features according to their
contribution in explaining a given output variable. In this section, we assess the level of importance of each
of the features we employ to predict Dt , Pt , and Vt – see Section 3.1 for the complete list of features used
in our study.
We employ the ‘permutation feature importance’ approach in Breiman (2001). For a given feature, this
model-agnostic technique randomly permutes the values of the feature in the training set and computes the
change in model score by taking the difference between (i) the accuracy using the original data and (ii) the
accuracy with the permuted values of the feature. The changes in score indicate the sensitivity of the model
to the feature. The permutation of values is random, so we repeat this process ten times for each feature.
Permutation feature importance attempts to quantify the contribution of an individual feature to the
predictive model. This implies that if a variable is highly correlated with others, it will typically be assigned
a low importance, as the model is able to use other variables to obtain the same information.
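The permutation procedure described above can be sketched by hand as follows: shuffle one feature column, recompute the accuracy, and record the drop, repeating ten times per feature. The toy model, in which direction depends only on the first feature, and all names are assumptions for illustration.

```python
import numpy as np


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Accuracy drop when each feature column is randomly permuted.

    Returns an array of shape (n_features, n_repeats).
    """
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)
    drops = np.zeros((X.shape[1], n_repeats))
    for feature in range(X.shape[1]):
        for repeat in range(n_repeats):
            X_permuted = X.copy()
            X_permuted[:, feature] = rng.permutation(X_permuted[:, feature])
            drops[feature, repeat] = baseline - np.mean(predict(X_permuted) == y)
    return drops


def toy_predict(data):
    """Toy model: the predicted direction is the sign of the first feature."""
    return np.where(data[:, 0] > 0, 1, -1)


rng = np.random.default_rng(1)
X = rng.normal(size=(300, 3))
y = np.where(X[:, 0] > 0, 1, -1)

drops = permutation_importance(toy_predict, X, y)
median_importance = np.median(drops, axis=1)  # one summary value per feature
```

In this toy case only the first feature has non-zero importance, since the model never reads the other two. scikit-learn also offers `sklearn.inspection.permutation_importance` for fitted estimators.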
4.3.1 Logistic regressions

We repeat the train-and-deploy exercises as described in Section 4.1 on the original features described in
Section 3.1 instead of the PCA-transformed features. The use of the original features as opposed to the
PCA-transformed features in the logistic regressions is restricted to the feature importance results. The
results are reported below for each of the three regressions (R1, R2, R3).
First, we look at the most important feature to predict Dt , Pt , Vt for each of the top ten algorithms.
The ranking of the top ten algorithms is based on the order count over all deploy-weeks of the train-and-
deploy exercises. The most important feature is the feature with the highest median importance across the
twelve train-and-deploy exercises, the ten algorithms, and the ten random shuffles per feature (giving us
12 × 10 × 10 = 1,200 data points per feature describing their importance). See Figure 1 for the results.17
[Figure 1: three panels (Dt, Pt, Vt), vertical axes from 0 to 7. Bar labels, left to right across the panels:
imbalance of algo top five levels, inventory of algo, best volumes, spread, num messages 0.1ms, inventory of algo,
cash of algorithm, inventory of member, inventory of algo, cash of member.]
Figure 1: Most important feature for the top ten algorithms in ASML.
Although there are differences in the importance of features across predictions (Dt , Pt , and Vt ), there is
significant overlap in the most important feature within each of the predictions. For example, the imbalance
of the volumes posted by the algorithm in the first five levels of the LOB is the most relevant feature for
seven out of the top ten algorithms when predicting the direction of the order Dt . This is similar to other
empirical results in the literature; see Cartea et al. (2018) where the authors find that the imbalance of
volumes resting in the LOB is a good predictor of the sign of the next liquidity taking order sent to the
exchange, that is, whether the next liquidity taking order will be to buy or to sell.18 We find that, for
most algorithms, when deciding on the direction of the order Dt , the imbalance of the volumes posted by
themselves is more important than the imbalance of the volumes posted by other algorithms.
To predict the price bucket and volume bucket of orders, the bid-ask spread and intraday accumulated
inventory are respectively among the most important variables. Intuitively, as the spread widens (tightens),
the algorithm is more likely to post orders further from (closer to) the midprice. However, as seen earlier,
volumes of orders typically vary little per algorithm. We do not observe ‘directional’ variables, for example
an algorithm’s current exposure in the LOB, contributing to the price model, but we see that these play a
role in the price conditioned on direction model; see Subsection 4.4.6.
Next, Figure 2 shows a box-plot of the distribution of feature importances for the top ten algorithms
computed over twelve train-and-deploy exercises, and ten repetitions of random permutations. We sort
features according to their median importance and plot the top ten – each feature has 1,200 data points
corresponding to 12 sets of four consecutive weeks, 10 algorithms, and 10 repetitions. The results in the
box-plots differ from the results in Figure 1 in that Figure 2 shows the distribution of feature importances
between algorithms instead of showing only the most important feature for a given algorithm.
Figure 2 shows the results for ASML when predicting the direction of the order Dt .
17 In the figures that follow, we abbreviate the feature names for displaying purposes: “excl” is “excluding”, “chg” is “change”,
“m” is “minutes”, “s” is “seconds”, “ms” is “milliseconds”, “quad var” is “quadratic variation”, “num” is “number”, “agg” is
“aggressive”, and “algo” is “algorithm”.
18 Cartea et al. (2020) show that volume imbalance helps to predict the direction of the next (non-aggressive) limit order.

[Figure 2: box-plot of feature importances, horizontal axis from 0 to 0.2. Features, top to bottom: imbalance of algo 0-5,
inventory of algo, best bid volume, inventory of member, best ask volume, cash of member, cash of algorithm,
bid volume of algo 0-5, ask volume of algo 0-5, chg imbalance excl algo 0-5.]
Figure 2: Importance of features to explain the direction of an order for ASML (using permutation
importance and logistic regressions, top features only shown).
Again, the imbalance of volumes posted by the algorithm in the first five levels of the LOB is the most
important variable when predicting whether the algorithm will send a buy order or sell order, followed by
the intraday accumulated inventory of the algorithm and the intraday accumulated inventory of the trading
member, and the best volumes posted in the LOB. This is similar for all shares; the same set of features
appears in the top ten reported for each share. This shows that inventory-related features, volumes posted
on best quotes, and imbalances by the algorithm near the top of the LOB are important features to predict
the direction of the order.
Figure 3 shows the most important features to predict the price bucket Pt .19
[Figure 3: plot of feature importances, horizontal axis from 0 to 0.12. Features, top to bottom: spread, inventory of algo,
cash of algorithm, num messages 0.1ms, num messages 1ms, cash of member, num messages 1s, inventory of member.]
Figure 3: Importance of features to explain the price bucket Pt for ASML (using permutation importance
and logistic regressions, top features only shown).
Spread is the most important feature to predict the price bucket Pt , which is in line with the results in
Figure 1. There is consistency across shares – five of the top ten features in ASML also feature in the top
ten of the three other shares. The order of importance is consistent: spread is the most important predictor
for three of our four shares, and intraday accumulated inventory and cash of algorithm appear in the top
three for ASML, ING and AHOLD. Other important features are the number of messages sent in the last
100 microseconds (i.e., 0.1 millisecond) and the volume posted by the algorithm on the first five levels of the
LOB – both bid and ask side.
19 The figures for Pt are ordered by the 75th percentile instead of the median to increase interpretability.
We draw attention to the differences and similarities between the top features that predict the direction
of the order and the price bucket Pt . The main difference is the importance of spread, which is important
to predict the price of an order, yet does not appear in the top ten features to predict the direction of the
order.20 Another difference is that the number of messages sent by all market participants in the last 100
microseconds is not important for predicting the direction of the order but it is important to predict price
bucket Pt .
Lastly, Figure 4 reports the important features to predict the volume bucket Vt .
[Figure 4: plot of feature importances, horizontal axis from 0 to 0.4. Features, top to bottom: inventory of member,
cash of member, cash of algorithm, inventory of algo, quad var 5m, ask volume excl algo 0-5, bid volume excl algo 0-5, spread.]
Figure 4: Importance of features to explain the volume bucket Vt for ASML (using permutation importance
and logistic regressions, top features only shown).
For all shares, the four inventory variables are the most important features to predict the volume of an
order. Also, the volume quoted by the algorithm on the first five levels, on both the buy side and the sell
side, is an important feature for all shares. We see that the number of messages in the last second is an
important feature in three shares (ING, AHOLD, TOMTOM). In contrast to the other output variables, we
observe that quadratic variation, measured over various intervals, is an important feature in three of the
shares (ASML, ING, and AHOLD).
Inventory-related features are important for predicting the three variables (Dt , Pt , Vt ). Also, among
order book features, those that describe behaviour closer to the top of the LOB have more predictive power
than features that describe behaviour deeper in the LOB. Similarly, the number of messages sent in the
recent past (e.g., in the last 100 microseconds) tends to be more important than the number of messages
sent over longer horizons. This is consistent with the information that trading firms report to the AFM;
see AFM (2023).
We also observe that features describing the algorithm’s presence in the LOB tend to be more predictive
than features describing the presence of other participants. For example, imbalances of the algorithm and
volumes of the algorithm tend to be among the most important features, yet imbalances and volumes of
other agents are not – this is apart from best bid volume and best ask volume.
4.4 Clustering of algorithms

The model coefficients from predicting (i) direction of the order Dt with a logistic regression, (ii) price bucket
Pt with a multinomial logistic regression, and (iii) volume bucket Vt with a multinomial logistic regression,
capture information about the trading behaviour of an algorithm. More precisely, the coefficients show how
20 The spread is the cost of immediacy in the LOB and it is the same cost for buy and sell aggressive orders.
the features affect the probability that an agent selects a given value for the variables (Dt , Pt , Vt ). Here, we
use these coefficients to cluster algorithms.
We revert to using the logistic regression models on the PCA-transformed features because of their higher
predictive power. In particular, we employ the model parameters from the logistic regressions to predict
trade direction Dt . We focus on Dt because results are easy to interpret and because Dt is the output
variable where outperformance is highest. Recall that for Dt we have only one coefficient per feature,
instead of one coefficient per bucket per feature when predicting Pt and Vt .
Arguably, the clusters that arise within the regression coefficients are clusters that identify similarity
in trading behaviour. In Subsection 4.4.2 we compare these clusters of trading behaviour with the dealing
capacity of algorithms in Euronext Amsterdam – recall that the dealing capacities are (i) Liquidity Provider,
(ii) House, and (iii) Client and are determined by the exchange, not the regulator.
We employ the clustering technique described in Section 3.4 with the coefficients of the regressions applied
to the PCA-transformed features and a target of three clusters. We do this over twelve sets of four consecutive
weeks of data each and refer to these as clustering exercises. Each clustering exercise includes only the 96
algorithms present in all of the twelve clustering exercises we perform.
Figure 5 shows the size of the clusters we obtain in each clustering exercise. In this figure, the first cluster
has the most algorithms and the third cluster has the fewest algorithms.
[Figure 5 plot; series: Cluster 1, Cluster 2, Cluster 3; x-axis: week (44 to 52, then 1 to 3); y-axis from 0 to 100.]
Figure 5: Size of clusters for ASML across the twelve clustering exercises. The x-axis notes the last week
of the clustering exercise. For example, “44” refers to the clustering exercise on weeks 41, 42, 43 and 44 of
2021. Weeks 44 to 52 are in 2021 and weeks 1 to 3 are in 2022. The y-axis is the number of algorithms.
We see that the number of algorithms in each of the three clusters varies for each of the twelve clustering
exercises. Given that the clusters aim to capture fundamental trading behaviour, it is crucial that they
exhibit a degree of consistency over time. Below, we study this point in more detail.
4.4.1 Stability through time

We explore the stability of the clusters across time. For this, given two consecutive clustering exercises
A and B where the last week in B is one week after the last week in A, we compute the probability that
two randomly chosen algorithms exhibit one of the following two properties: (i) they belong to the same
cluster in exercise A and belong to the same cluster in exercise B, or (ii) they do not belong to the same
cluster in exercise A and do not belong to the same cluster in exercise B. The blue line in Figure 6 shows
the probability of (i) or (ii) along the twelve clustering exercises we perform across time. That is, for two
consecutive clustering exercises, we count the pairs of algorithms that were clustered together and remained
clustered together, and those that were clustered differently and remained in different clusters, over all pairs
of algorithms. This calculation is a probability, so it lies in [0, 1]. The red dotted line is the theoretical value
of the probability when algorithms are randomly shuffled and reassigned into the clusters while keeping the
number of algorithms in each cluster fixed as those we find for the first four weeks of data.
Intuitively, if the blue line is close to 1, then the clustering does not change over time, so the clusters are
stable. On the other hand, if the blue line is close to the dotted red line, then the clustering resembles that
of random reshuffling of algorithms.
[Figure 6 plot; x-axis: week (last week of clustering exercise B); y-axis from 0.5 to 0.9.]
Figure 6: Stability of clusters for ASML across the twelve clustering exercises. The y-axis shows the
probability that two randomly selected algorithms in two consecutive clustering exercises A and B are (i) members
of the same cluster in both A and B or (ii) members of different clusters in both A and B. The x-axis shows the
last week of clustering exercise B.
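The stability measure described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names and the toy labels are hypothetical, and the baseline formula assumes that the reshuffled clustering is independent of the original one and keeps the same cluster sizes, which is one reading of the description of the red line.

```python
from itertools import combinations
from math import comb


def pair_agreement(labels_a, labels_b):
    """Share of algorithm pairs that are together in both exercises or apart in both."""
    if len(labels_a) != len(labels_b) or len(labels_a) < 2:
        raise ValueError("need two label lists of equal length, with at least two items")
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def random_baseline(cluster_sizes):
    """Expected agreement when one clustering is a random reshuffle with fixed cluster sizes."""
    n = sum(cluster_sizes)
    # Probability that a randomly chosen pair falls in the same cluster.
    p_same = sum(comb(size, 2) for size in cluster_sizes) / comb(n, 2)
    return p_same ** 2 + (1 - p_same) ** 2


# Hypothetical cluster labels for five algorithms in two consecutive exercises.
exercise_a = [1, 1, 2, 2, 3]
exercise_b = [1, 1, 2, 3, 3]
stability = pair_agreement(exercise_a, exercise_b)

# Cluster sizes reported for the first four weeks of ASML data (54, 25 and 17 algorithms).
baseline = random_baseline([54, 25, 17])
```

Under these assumptions the baseline for cluster sizes 54, 25 and 17 is a little above 0.5, so a stability value close to that level would be indistinguishable from random reshuffling.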
The blue line lies above the red dotted line, which indicates stability of algorithm clusters. However,
there is still a fair amount of instability in the clusters over time, as indicated by the gap between the blue
line and 1. Part of the instability might be due to algorithms updating their model parameters frequently,
as shown in the results of the survey (AFM (2023)). Recall that our clustering exercises include only the 96
algorithms present in all clustering exercises. Indeed, if any of these algorithms updates the model parameters
they use to send orders, then we expect that the parameters we obtain will change too, resulting in unstable
clusters.
We also study the link between the level of activity of the algorithms and the stability of their regression
coefficients. We use the logistic regression coefficients of the 53 original features described in Section 3.1,
and assign each algorithm according to the clustering obtained on the coefficients of the logistic regressions
on the PCA-transformed features. The first column of Table 9 lists the most important features to predict
the direction Dt of an order. The second and third columns report the temporal deviation, which for a given
feature is defined as the standard deviation of the regression coefficients over time. We report the average
temporal deviation for the top ten algorithms (second column) and for the bottom ten algorithms (third
column) – the ranking is according to the number of orders sent during the first four weeks of data.
                                      Top 10        Bottom 10
                                      algorithms    algorithms
imbalance of algo top five levels     0.27          0.33
inventory of algo                     0.25          0.84
ask volume of algo top five levels    0.19          0.36
bid volume of algo top five levels    0.15          0.26
best bid volume                       0.04          0.21
best ask volume                       0.04          0.18
chg in imbalance excl algo 0-5        0.03          0.14
spread                                0.03          0.22
net agg buy-sell last 1s              0.02          0.15
number of messages 1ms                0.02          0.21
number of messages 0.1ms              0.02          0.22
Table 9: Average temporal deviation of coefficients.
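The temporal deviation reported in Table 9 can be illustrated with a short sketch. The coefficient values and algorithm names below are hypothetical, and the use of the population standard deviation (rather than the sample version) is an assumption, since the text does not say which is used.

```python
from statistics import mean, pstdev


def temporal_deviation(coefs_over_time):
    """Standard deviation over time of one feature's coefficient for one algorithm."""
    return pstdev(coefs_over_time)


def average_temporal_deviation(coefs_by_algo):
    """Average the per-algorithm temporal deviations over a group of algorithms."""
    return mean(temporal_deviation(series) for series in coefs_by_algo.values())


# Hypothetical coefficients of a single feature across four clustering exercises.
active_algos = {
    "algo_A": [0.50, 0.52, 0.48, 0.50],
    "algo_B": [0.30, 0.30, 0.30, 0.30],
}
quiet_algos = {
    "algo_C": [0.10, 0.90, -0.40, 0.60],
}

dev_active = average_temporal_deviation(active_algos)
dev_quiet = average_temporal_deviation(quiet_algos)
```

In this toy example the less active group has the larger average deviation, which mirrors the pattern in Table 9 without reproducing its numbers.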
Given the large number of data points used to fit the models of the top 10 algorithms, we expect that
the observed temporal deviation is principally due to changes in the algorithms themselves, for example
due to re-calibration of parameters. Conversely, for the bottom 10 algorithms, the number of observed
actions is relatively small, and so the fact that the temporal deviation is systematically higher for these
algorithms may be due to statistical estimation error, rather than systematic differences between the top
and bottom algorithms. Regardless of the reason for the large temporal deviation of the coefficients for the
bottom algorithms, Table 9 lends support to the claim that the instability of our clusters is due primarily
to variation in the estimated coefficients of the smaller algorithms.
4.4.2 Types of market participants

Here, we explore how the clusters relate to the dealing capacity of market participants. Recall that we
study three dealing capacities: (i) Liquidity Providers (LP), (ii) House, and (iii) Client. Figure 7 shows the
confusion matrix between dealing capacity and clusters.
          Cluster 1   Cluster 2   Cluster 3
Client    20          3           1
House     16          7           0
LP        18          15          16
Figure 7: Confusion matrix between the clusters obtained with the parameters from the first four weeks of
data and the dealing capacity for ASML.
Cluster 1 consists of an almost equal number of Liquidity Providers, Houses, and Clients. Furthermore,
the majority of algorithms with type House or Client are in cluster 1. Therefore one would expect this cluster
to show trading behaviour associated with directional trading firms, such as hedge funds or pension funds –
see Subsection 4.4.3.
Clusters 2 and 3 consist mostly of Liquidity Providers. Hence, one would expect these clusters to exhibit
behaviour most often associated with traditional market makers. Furthermore, the L-shaped confusion
matrix in Figure 7 lends support to the claim that Liquidity Providers exhibit the highest variability in
trading behaviour.21
4.4.3 Clustered market behaviour

Next, we study each of the clusters in more detail. Table 10 provides summary statistics about the trading
activity of the algorithms within the clusters.22
                                              Cluster 1    Cluster 2    Cluster 3
number of algorithms                          54           25           17
order count                                   1,639,267    1,901,239    3,712,355
trade count                                   573,417      343,874      461,874
order to trade ratio                          2.86         5.53         8.04
order count %                                 23           26           51
trade count %                                 42           25           33
liquidity taking trade count                  291,545      251,785      120,353
liquidity taking trade count %                51           73           26
average algo volume in ask side of LOB 0-5    1.43         2.07         2.47
average algo volume in bid side of LOB 0-5    1.49         2.04         2.48
average absolute imbalance of algo 0-5        1.93         1.12         0.77
average ask volume of algo 11-20              0.43         1.69         1.07
average bid volume of algo 11-20              0.44         1.68         1.08
Table 10: Statistics for algorithms in clusters 1, 2, and 3 for ASML. First four weeks of data.
Although algorithms in cluster 3 send the most orders (51%), they account for a smaller share of transactions
than cluster 1 (33% versus 42%), though a larger share than cluster 2 (25%). This results in a higher order to
trade ratio for cluster 3 (8.04) than for clusters 1 and 2 (2.86 and 5.53 respectively). Furthermore, algorithms in cluster 3 have a
relatively low fraction of liquidity taking transactions; 26% of their transactions are liquidity taking, versus
51% and 73% for clusters 1 and 2. This suggests that algorithms in cluster 3 behave as traditional market
makers. Algorithms in clusters 2 and 3 post more volume (on average) on the first five levels of the LOB
than algorithms in cluster 1. Similarly, algorithms in clusters 2 and 3 have more balanced volumes posted
in the LOB when compared to algorithms in cluster 1.23
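The derived rows of Table 10 follow from the raw counts. The sketch below recomputes them from the order, trade, and liquidity taking counts in the table; the variable names are ours, and the rounding conventions (two decimals for the ratio, whole percentages) are an assumption that happens to match the reported figures.

```python
# Counts from Table 10 (ASML, first four weeks of data), keyed by cluster number.
orders = {1: 1_639_267, 2: 1_901_239, 3: 3_712_355}
trades = {1: 573_417, 2: 343_874, 3: 461_874}
taking = {1: 291_545, 2: 251_785, 3: 120_353}

total_orders = sum(orders.values())
total_trades = sum(trades.values())

summary = {}
for cluster in orders:
    summary[cluster] = {
        # Orders sent per trade executed.
        "order_to_trade": round(orders[cluster] / trades[cluster], 2),
        # Share of all orders and of all trades attributable to the cluster.
        "order_share_pct": round(100 * orders[cluster] / total_orders),
        "trade_share_pct": round(100 * trades[cluster] / total_trades),
        # Share of the cluster's own trades that take liquidity.
        "taking_pct": round(100 * taking[cluster] / trades[cluster]),
    }
```

Note that the liquidity taking percentage is relative to each cluster's own trade count, whereas the order and trade percentages are relative to totals across clusters.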
In the next sections, we investigate how the various features we consider affect the average behaviour of
algorithms in each of these clusters. We focus on the impact of features on direction and price of orders,
as the model for volume does not display much benefit over a naive model (as seen in Table 7). More
precisely, in Subsection 4.4.5 we study the average coefficients associated with the regressions for predicting
the direction Dt , then, Subsection 4.4.6 discusses the average coefficients for predicting the price bucket Pt
conditional on direction, and Subsection 4.4.7 explores liquidity taking activity and liquidity provision in
more detail.
4.4.4 Summary of results about trading behaviour

Consistently, across exercises in the aforementioned subsections, algorithms in cluster 3 show behaviour
associated with inventory-averse market makers. We describe the algorithms in this cluster as “market
makers”. These algorithms are keen to maintain a balanced presence in the LOB (e.g., they are more likely
to post an order on the bid side of the LOB if their volumes on the ask side of the LOB are higher than
those of the bid side of the LOB), they revert their inventories to zero, improve the spread, post liquidity
at-the-touch, and do not exhibit aggressive behaviour unless their inventory is large.
In contrast, algorithms in cluster 2 do not show inventory aversion. They send orders in the direction of
the imbalance of the volumes posted at the best quotes (e.g., they send sell orders if the volume in the bid is
much smaller than the volume in the ask), have the lowest percentage (across clusters) of eager-to-trade orders
21 Here, L-shaped refers to the higher values in Figure 7 being arranged along the left column and bottom row.
22 Recall that a single aggressive order may result in multiple trades (depending on the number of limit orders it executes
against).
23 Recall that if the imbalance of the volumes posted by the algorithm is close to zero then the volume posted by the algorithm
on the bid side is close to the volume posted by the algorithm on the ask side.
that provide liquidity inside the spread and have the highest percentage of eager-to-trade orders that traded
aggressively. We describe the algorithms in cluster 2 as “opportunistic traders”.
Lastly, algorithms in cluster 1 have position-building behaviour in terms of their own imbalances and the
way they choose direction as a function of their inventory and their imbalance in the LOB. These algorithms
have a propensity to send eager-to-trade orders and at-the-touch orders. Around 82% of their eager-to-trade
orders trade aggressively. We describe the algorithms in cluster 1 as “directional traders”.
4.4.5 Direction: average behaviour

We show the average regression coefficients per cluster for the most important features to explain the direction
of the order Dt . Table 11 reports the average coefficients per feature computed on the first four weeks of data.
For each of the clusters, the average is taken over all algorithms in the cluster. To obtain the coefficients of
the 53 features, for each of the 96 algorithms, we perform a matrix multiplication of the coefficients of the
logistic regressions on the PCA-transformed features (96 × 30 matrix) and the principal components matrix
(30 × 53 matrix), both computed in Section 4.1. We represent the effects of the volume in the book with the
total volume and the imbalance (rather than the separate volumes on the bid and ask side).
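The matrix multiplication that maps coefficients on principal components back to coefficients on the original features can be sketched as follows. This is a toy illustration with hypothetical numbers and small dimensions; it assumes the features are centred (as stated in the text), so that the PCA scores are a linear map of the features with no offset. In practice one would likely use a numerical library rather than plain lists.

```python
def matmul(a, b):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Toy sizes: 2 algorithms, 2 principal components, 3 original features.
# In the paper the corresponding shapes are 96 x 30 and 30 x 53.
pc_coefs = [[0.5, -1.0],
            [2.0, 0.0]]           # one row of PC-space coefficients per algorithm
components = [[1.0, 0.0, 1.0],
              [0.0, 2.0, -1.0]]   # one row of loadings per principal component

# Coefficients on the original features: one row per algorithm, one column per feature.
feature_coefs = matmul(pc_coefs, components)

# Check on one feature vector: both parameterisations give the same linear score.
x = [0.2, -0.4, 1.0]
pca_scores = [sum(w * v for w, v in zip(row, x)) for row in components]
score_pc = sum(b * s for b, s in zip(pc_coefs[0], pca_scores))
score_orig = sum(b * v for b, v in zip(feature_coefs[0], x))
```

The equality of the two scores is what justifies reading the product matrix as per-feature coefficients: the fitted model is unchanged, only its parameterisation differs.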
In the tables which follow, we exclude features where the magnitude of the average coefficients is smaller
than or equal to 0.1 for all clusters. We order features by their average absolute coefficient size.
                               Cluster 1    Cluster 2    Cluster 3
imbalance of algo 0-5          1.48         −1.04        −0.42
imbalance of algo 11-20        0.21         −0.97        0.81
cash of algorithm              0.70         0.13         −0.30
inventory of algo              0.70         0.12         −0.30
best ask volume                −0.26        −0.20        0.23
best bid volume                0.24         0.19         −0.24
volume of algo 11-20           0.16         −0.05        0.40
volume of algo 0-5             −0.07        −0.04        −0.28
chg imbalance excl algo 0-5    0.04         0.16         0.05
net agg buy-sell last 1s       −0.03        0.07         0.10
return 1s                      0.01         0.08         0.10
volume of algo 6-10            0.01         −0.11        0.04
Table 11: Average regression coefficients per cluster on first four weeks of training data for ASML when
predicting direction Dt .
We recall the formulation for the logistic regression in equation (5), where Dt = 1 means that the order
is to buy and Dt = −1 means that the order is to sell. Thus, given that all features are normalised and
centred around zero, all else being equal, if an average coefficient is positive (resp. negative) it means that
positive values of the relevant feature increase (resp. decrease) the probability of a given order being a buy
order; and vice-versa for negative values of the relevant feature.
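The sign interpretation above can be made concrete with a small sketch. It assumes the standard logistic form in which the probability of a buy order is the sigmoid of a linear score; equation (5) is not reproduced here and may be parameterised differently, and the intercept is set to zero purely for illustration. The coefficients are the cluster averages for inventory of algo from Table 11.

```python
from math import exp


def prob_buy(features, coefs, intercept=0.0):
    """P(D_t = 1 | features) under a standard logistic model (assumed form)."""
    score = intercept + sum(b * x for b, x in zip(coefs, features))
    return 1.0 / (1.0 + exp(-score))


# Average cluster coefficients for "inventory of algo" from Table 11.
inventory_coef = {"cluster 1": 0.70, "cluster 2": 0.12, "cluster 3": -0.30}

# Probability of a buy order when normalised inventory equals +1
# and every other feature sits at its mean of zero.
p_buy = {name: prob_buy([1.0], [coef]) for name, coef in inventory_coef.items()}
```

With a positive inventory, the implied buy probability is above one half for clusters 1 and 2 and below one half for cluster 3, which is the pattern discussed in the stylised facts for the direction of the order.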
Table 11 provides a number of stylised facts about how trading behaviour differs among the clusters.
(D-i) Cluster 1 is the only cluster with an average positive value for the coefficient for the imbalance of the
algorithm in the first five levels – recall that when imbalance is positive the algorithm has more volume
posted in the bid than in the ask. Thus, algorithms in cluster 1 are inclined to send a buy order if they
have already posted more volume in the bid than in the ask (and similarly with the roles of bid and
ask, and buy and sell reversed). Conversely, algorithms in clusters 2 and 3 are more likely to revert
their imbalance back to a level where the volume provided in the ask side of the LOB is close to that
provided in the bid side of the LOB. This suggests that clusters 2 and 3 behave in some ways like
traditional market makers, i.e., trading algorithms that provide liquidity in both sides of the LOB in
a balanced way. In contrast, algorithms in cluster 1 tend to post liquidity with a preferred direction,
which is consistent with most algorithms of House and Client type being in cluster 1.
(D-ii) Cluster 3 exhibits mean-reversion to zero in inventories at the algorithm level, that is, when the intraday
accumulated inventory is positive they are more likely to send sell orders, while the inventories of
algorithms in cluster 2 are not mean-reverting to zero.24 Conversely, algorithms in cluster 1 have a
strong preference to send orders in the direction of their inventory (buy orders if inventory is positive
and sell orders if inventory is negative).
(D-iii) The estimation we perform through the PCA coefficients leads to an “averaging” effect over coefficients
for similar features. This is seen in the coefficients associated with the inventory and cash of the
algorithm which are similar.
In summary, algorithms in cluster 3 exhibit trading behaviours that the literature associates with market
makers. For example, algorithms in cluster 3 seem keen to maintain a balanced provision of liquidity and
take actions to revert their inventories to zero. On the other side of the spectrum we have the algorithms in
cluster 1. These algorithms do not balance the liquidity posted in the LOB, and their orders are more likely
to be in the same direction as their inventory (i.e., position building and directional trading). Algorithms
in cluster 2 lie somewhere in the middle of the behaviour exhibited by algorithms in clusters 1 and 3. More
precisely, out of the coefficients shown in Table 11, there are three instances in which the sign of the coefficient
associated with cluster 2 is different from both of the signs of the coefficients for clusters 1 and 3. The three
instances involve volumes deep in the LOB.
4.4.6 Price limit: average behaviour

To gain further insights into the trading behaviour of the algorithms in the above clusters we study how
they choose their price limits as measured through the price bucket Pt .
          Cluster 1    Cluster 2    Cluster 3    Total
Pt = 1    178,333      128,174      391,016      697,523
Pt = 2    386,678      456,828      2,257,863    3,101,369
Pt = 3    1,074,256    1,316,237    1,063,476    3,453,969
Total     1,639,267    1,901,239    3,712,355    7,252,861
Table 12: Order count per price bucket Pt and cluster using first four weeks of training data for ASML.
Recall that by the definition of Pt in (3), the first price bucket contains orders whose price limit is more
generous than the best quotes; that is, the limit price is lower than the best ask price for a sell order or the
limit price is greater than the best bid price for a buy order. The second price bucket contains orders whose
limit price equals that of the best quotes; that is, the limit price is the best ask price if the order is to sell
or the limit price is the best bid price if the order is to buy. The third price bucket contains orders deeper
in the LOB.
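The verbal description of the three price buckets can be written as a small classification function. This is a sketch based only on the description above, not on the formal definition in (3); the function name and the convention of +1 for buy and −1 for sell are assumptions, and prices are taken to be integers (e.g., in ticks) so that equality comparisons are exact.

```python
def price_bucket(direction, limit_price, best_bid, best_ask):
    """Return 1 (eager to trade), 2 (at-the-touch) or 3 (deeper in the LOB).

    direction is +1 for a buy order and -1 for a sell order.
    """
    if direction == 1:
        if limit_price > best_bid:
            return 1  # more generous than the best bid
        return 2 if limit_price == best_bid else 3
    if direction == -1:
        if limit_price < best_ask:
            return 1  # more generous than the best ask
        return 2 if limit_price == best_ask else 3
    raise ValueError("direction must be +1 (buy) or -1 (sell)")


# Example with best bid 100 and best ask 102 (prices in ticks).
buckets_buy = [price_bucket(1, p, 100, 102) for p in (101, 100, 99)]
buckets_sell = [price_bucket(-1, p, 100, 102) for p in (101, 102, 103)]
```

A buy order with limit price at or above the best ask also falls in bucket 1 under this rule, since its price exceeds the best bid.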
Thus, given that orders in the first price bucket either (i) trade with the opposite side of the LOB, (ii)
improve the spread, or (iii) get cancelled upon entry because their time-in-force precludes them from resting
in the LOB and their price limit is better than best quotes, we refer to them as orders that show “eagerness
to trade”. Similarly, we call price bucket one the “eager-to-trade” bucket. Next, we study the average
regression coefficients associated with orders in the “eager-to-trade” bucket (Pt = 1).
24 See Chapter 10 in Cartea et al. (2015) for a mathematical model of the market making problem illustrating how risk aversion
plays a role in how market makers provide liquidity as a function of their inventory.
                           Pt = 1|Dt = 1 (Buy)                  Pt = 1|Dt = −1 (Sell)
                           Cluster 1  Cluster 2  Cluster 3      Cluster 1  Cluster 2  Cluster 3
intercept                  −0.21      −0.46      −0.74          −0.17      −0.48      −0.77
best bid volume            0.35       0.30       1.09           −0.21      −0.17      −0.26
best ask volume            −0.20      −0.16      −0.30          0.35       0.30       1.05
imbalance of algo 0-5      −0.00      0.29       0.68           0.08       −0.32      −0.75
spread                     0.11       0.07       0.56           0.10       0.07       0.54
volume of algo 0-5         −0.25      −0.12      0.40           −0.29      −0.08      0.30
volume of algo 11-20       0.04       −0.32      −0.22          0.07       −0.26      −0.21
num messages 1ms           0.17       0.20       −0.08          0.21       0.15       −0.06
volume of algo 6-10        −0.13      −0.23      0.11           −0.13      −0.18      0.04
imbalance of algo 6-10     −0.07      0.04       −0.20          0.10       −0.08      −0.28
num messages 0.1ms         0.12       0.19       −0.08          0.15       0.13       −0.06
num messages 0.1s          0.13       0.04       0.07           0.15       0.09       0.07
Table 13: Average regression coefficients for price bucket describing eager-to-trade orders, per cluster, on
first four weeks of training data for ASML, conditioning on the direction of order Dt .
We observe the following stylised features of trading algorithms in each cluster:
(P-i) According to the intercept of the regressions, algorithms in cluster 3 are noticeably less likely – than
those in clusters 1 and 2 – to place orders that are eager to trade.
(P-ii) For algorithms in all clusters, the higher the imbalance of the volumes (by all algorithms) posted at the
best bid and ask, the more likely a buy order is to be eager to trade. This effect is most pronounced
for algorithms in cluster 3, where the bid volume has a much greater impact on orders to buy than the
ask volume; conversely for orders to sell.
(P-iii) Algorithms in cluster 1 seem indifferent to their own imbalance in the book when determining eagerness
to trade, whereas imbalance of their own volume indicates a moderate eagerness to trade for cluster
2 algorithms, and a strong eagerness for cluster 3. For cluster 3, eagerness to trade follows the
direction of the imbalance (so if an algorithm has a higher bid than ask volume posted, which means
it has a positive imbalance, its buy orders are typically more eager to trade and its sell orders are
typically less eager to trade).
(P-iv) Regardless of the direction, the wider the bid-ask spread the more likely algorithms are to send orders
that are eager to trade.25 This is particularly pronounced for cluster 3 (we shall see below that this
cluster often refills the book when the spread is above one tick).
(P-v) For algorithms in clusters 1 and 2 we see that the higher their own posted volumes at levels 0 to 5 on
either the bid or ask side, the less likely they are to send an order (either to buy or to sell) that is eager
to trade. Algorithms in cluster 3 exhibit the opposite behaviour; already having volume at the first five
levels of the LOB makes it more likely for them to send eager-to-trade orders. For algorithms in clusters
2 and 3, having volume posted deeper in the book decreases the likelihood of sending eager-to-trade
orders.
(P-vi) For algorithms in clusters 1 and 2, the higher the number of recent messages, the more likely an order
is eager to trade, in either direction.
(P-vii) For all algorithms, there is a reasonably clear symmetry, when the roles of bid and ask, and buying
and selling, are reversed.
We also look at the second price bucket (Pt = 2), which describes orders adding volume to the LOB at
the current best bid or best ask (also known as posting ‘at-the-touch’), and at the third (Pt = 3, whose
coefficients are simply the negative sum of those for Pt = 1, 2), which describes orders adding volume deeper
within the LOB.
25 We refine this result in Subsection 4.4.7. In particular, we find that the larger the bid-ask spread, the more likely all clusters
are to provide liquidity inside the spread, and the less likely they are to cross the spread.
                           Pt = 2|Dt = 1 (Buy)                  Pt = 2|Dt = −1 (Sell)
                           Cluster 1  Cluster 2  Cluster 3      Cluster 1  Cluster 2  Cluster 3
intercept                  −0.15      −0.58      1.25           −0.13      −0.68      0.94
volume of algo 0-5         0.13       0.52       0.16           0.24       0.52       0.21
volume of algo 6-10        −0.03      0.38       0.19           0.00       0.33       0.17
volume of algo 11-20       −0.19      0.06       0.14           −0.27      −0.01      0.06
imbalance of algo 11-20    −0.05      0.09       0.07           0.06       −0.07      0.13
best ask volume            0.08       0.03       0.04           −0.06      0.08       −0.12
num messages 1ms           −0.09      −0.01      −0.04          −0.11      −0.02      −0.05
num messages 0.1s          −0.10      −0.02      −0.01          −0.12      −0.03      −0.03
spread                     −0.00      0.04       −0.10          −0.01      0.04       −0.09
Table 14: Average regression coefficients for price bucket describing posting at-the-touch per cluster on first
four weeks of training data for ASML conditioning on the direction of order Dt .

                           Pt = 3|Dt = 1 (Buy)                  Pt = 3|Dt = −1 (Sell)
                           Cluster 1  Cluster 2  Cluster 3      Cluster 1  Cluster 2  Cluster 3
intercept                  0.36       1.04       −0.51          0.30       1.16       −0.17
imbalance of algo 0-5      0.02       −0.51      −0.76          −0.09      0.52       0.90
best bid volume            −0.31      −0.36      −1.00          0.13       0.15       0.22
best ask volume            0.11       0.13       0.26           −0.29      −0.38      −0.92
volume of algo 0-5         0.11       −0.40      −0.55          0.05       −0.44      −0.51
spread                     −0.11      −0.11      −0.46          −0.09      −0.11      −0.45
volume of algo 11-20       0.14       0.26       0.09           0.19       0.26       0.15
volume of algo 6-10        0.16       −0.15      −0.30          0.13       −0.16      −0.21
imbalance of algo 6-10     0.11       −0.13      0.02           −0.11      0.14       0.31
num messages 0.1ms         −0.07      −0.23      0.11           −0.09      −0.16      0.09
num messages 1ms           −0.09      −0.20      0.12           −0.10      −0.13      0.11
imbalance of algo 11-20    0.07       −0.20      0.08           −0.08      0.16       −0.08
Table 15: Average regression coefficients for price bucket describing posting deeper within the order book,
per cluster on first four weeks of training data for ASML conditioning on the direction of order Dt .
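The statement that the coefficients for Pt = 3 are the negative sum of those for Pt = 1 and Pt = 2 can be checked directly on the reported intercepts. The sketch below uses the intercept rows of Tables 13, 14 and 15; the variable names are ours, and a small tolerance is allowed because the tables are rounded to two decimals.

```python
# Intercepts from Tables 13 (Pt = 1), 14 (Pt = 2) and 15 (Pt = 3),
# ordered as buy clusters 1 to 3 followed by sell clusters 1 to 3.
intercepts = {
    1: [-0.21, -0.46, -0.74, -0.17, -0.48, -0.77],
    2: [-0.15, -0.58, 1.25, -0.13, -0.68, 0.94],
    3: [0.36, 1.04, -0.51, 0.30, 1.16, -0.17],
}

# Bucket-3 intercepts implied by the other two buckets.
implied_bucket_3 = [-(a + b) for a, b in zip(intercepts[1], intercepts[2])]

# Largest gap between implied and reported values.
max_gap = max(abs(i - r) for i, r in zip(implied_bucket_3, intercepts[3]))
```

For the intercepts the implied and reported values coincide up to floating-point error, so Table 15 carries no information beyond Tables 13 and 14 for any feature that appears in all three.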
By looking at Table 14 and Table 15, we observe the following stylised facts for the typical behaviour of
each cluster. It should be emphasised that these observations are for a given direction of order, and so the
conclusions should be read in the light of Table 11.
(P-viii) Cluster 3 algorithms are much more likely to place their orders at-the-touch than either of the other
clusters (see the intercept of Table 14). Cluster 2 is less likely to place its orders at-the-touch than
cluster 1 (all other things being equal).
(P-ix) Algorithms in cluster 2 have the highest propensity to place orders deeper in the book.
(P-x) For all algorithms, the higher the volume at the best ask price and the lower the volume at the best
bid (i.e., the lower the imbalance of the best volumes in the LOB) the more likely that a buy order will
be placed deeper in the LOB; conversely for sell orders. This indicates that algorithms anticipate that,
when a large volume builds at either of the best quotes, prices will likely move in the direction of
the opposite side of the LOB, which is why they choose to place their orders deeper in the LOB of the
opposite side.
(P-xi) The likelihood that an algorithm in cluster 2 posts at-the-touch is affected by its own imbalance near
the touch. The likelihood is impacted in the direction of the imbalance (so if an algorithm has a higher
bid than ask volume posted, which corresponds to a positive imbalance, it is more likely that the
algorithm posts buy orders at the best bid price). A weaker effect is observed for cluster 3, but little
effect for cluster 1.
(P-xii) For all algorithms, an increase in the recent number of messages reduces the probability of posting
at-the-touch in either direction.
In summary, for algorithms in the first two clusters, the higher the number of recent messages (sent
to the exchange by all market participants) the more likely they are to send an eager-to-trade order (in
either direction). Similarly, for algorithms in all clusters, the higher the spread the more likely an algorithm
is to place eager-to-trade orders and the less likely it is to place orders deeper in the LOB (see the spread
coefficients in Tables 13 and 15). Again, algorithms in cluster 1 seem indifferent to their
own imbalance in the book when deciding to send an eager-to-trade order. Algorithms in cluster 2 have
a moderate eagerness to trade according to their imbalance in the LOB (similar for posting at-the-touch).
Algorithms in cluster 3 have a strong eagerness to trade according to their imbalance in the LOB, and they
also have the highest baseline probability of posting at-the-touch. These results are consistent with our
previous findings about cluster 3 exhibiting behaviour associated with market makers.
The low value of the coefficients for the inventory of the algorithm (this predictor is absent from Tables
13, 14 and 15, indicating that all clusters had an average coefficient below 0.10 in magnitude) may suggest
that this has limited effect on the eagerness to trade. However, this may be due to the limitation of our
statistical model to linear-logistic relationships. The plots in Figure 8 show the percentage of orders which
were eager to trade, as a function of (transformed, intraday accumulated) inventory. We see that in the
rare cases where transformed inventory exceeds 6 (in absolute value), there is a noticeable increase in the
eagerness to trade for all algorithms. For algorithms in cluster 3, this effect is smoother as inventory changes,
and larger than in clusters 1 or 2.

[Figure 8 plot; three panels, each with x-axis from 0 to 8; y-axis from 0% to 35%.]
Figure 8: Percentage of eager-to-trade orders as a function of the absolute value of the transformed intraday
accumulated inventory of the algorithm. See Appendix B for the formula of the log-inventory of the algorithm.
4.4.7 Liquidity taking activity

We now focus on orders in price bucket Pt = 1, that is, orders with prices better than the current best bid
or ask price. When such an order is sent, there are three possible scenarios, which we label by a variable At
describing whether it takes liquidity, provides liquidity, or is a missed attempt.
(i) The case At = 0 encompasses aggressive trading, i.e., orders that cross the spread. These are buy
orders with limit price greater than or equal to the best ask price, or sell orders with limit price less
than or equal to the best bid price. Almost all orders within this category are executed upon entry
(or partially executed) against liquidity on the other side of the LOB depending on the time-in-force,
the limit price of the order, the volume of the order, and the liquidity available in the LOB.26 From
Table 3 we know that orders with FoK time-in-force are negligible, so for all practical purposes At = 0
is aggressive behaviour.
(ii) The case At = 1 covers generous provision of liquidity. These are orders that provide liquidity inside
the spread. The time-in-force of the orders is DAY and their limit price is higher than the best bid
price and less than the best ask price if the order is to buy; conversely for sell orders.
26 We provide an example. Assume the best ask price in the LOB is 100 and the volume available is 50 units. An IoC order
arrives willing to buy with price limit of 100 and volume of 100, then, 50 units will get executed against the available liquidity
and 50 will be cancelled. Alternatively, if the order had FoK time-in-force then it gets cancelled upon arrival.
(iii) The case At = 2 describes missed attempts. These are orders that are cancelled upon entry because
their price limit is not generous enough to trade with the available liquidity in the LOB and their
time-in-force precludes them from resting in the LOB. This category covers (i) orders to buy with limit
price less than the best ask price and IoC time-in-force; it also covers (ii) sell orders with limit price
greater than the best bid price and IoC time-in-force. Note that missed attempts were likely aimed at
(a) liquidity that disappeared over the latency period, (b) hidden liquidity, or (c) new liquidity being
added over the latency period; see Cartea and Sánchez-Betancourt (2023).
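The three cases above can be summarised in a short classification routine. The following is a minimal sketch, not the authors' code: the function name, the string labels for side and time-in-force, and the return value None for orders that fall outside the three cases are assumptions introduced for illustration. The rare FoK case is not treated separately.

```python
from typing import Optional


def classify_eager_order(side: str, limit_price: float, best_bid: float,
                         best_ask: float, time_in_force: str) -> Optional[int]:
    """Return A_t in {0, 1, 2} for an eager-to-trade order, or None otherwise.

    Hypothetical helper: `side` is "buy" or "sell", `time_in_force` is "DAY" or "IOC".
    """
    if side == "buy":
        crosses = limit_price >= best_ask
    elif side == "sell":
        crosses = limit_price <= best_bid
    else:
        raise ValueError("side must be 'buy' or 'sell'")

    if crosses:
        return 0  # aggressive trading: the order crosses the spread
    if time_in_force == "DAY" and best_bid < limit_price < best_ask:
        return 1  # generous provision: resting order strictly inside the spread
    if time_in_force == "IOC":
        return 2  # missed attempt: cannot trade and cannot rest in the LOB
    return None   # not covered by the three cases in this sketch


# Footnote example: IoC buy with limit 100 when the best ask is 100.
print(classify_eager_order("buy", 100.0, 99.0, 100.0, "IOC"))
```

In the footnote example the order is classified as aggressive regardless of how much of its volume is eventually executed, because the classification in this sketch depends only on price and time-in-force.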
For example, during the first four weeks of data for ASML, algorithms in cluster 1 (resp. cluster
2 and cluster 3) sent 178,333 orders (resp. 128,174 and 391,016 orders) to buy or to sell that were eager
to trade. Around 82% of these orders (resp. 89% and 18%) traded against the opposite side of the LOB,
i.e., showed aggressive behaviour; 17% of the orders (resp. 10% and 81%) provided liquidity inside
the spread, i.e., generous liquidity provision; and 1% of the orders (resp. 1% and 1%) were cancelled by
the exchange upon entry because of their lack of generosity and their time-in-force, i.e., missed attempts. These
results are summarised in Table 16 together with the usage of IoC associated with each of the clusters.
                                Cluster 1   Cluster 2   Cluster 3     Total
Number of orders with Pt = 1      178,333     128,174     391,016   697,523
IoC orders                            68%         84%          4%   245,988
At = 0 (aggressive trading)           82%         89%         18%   333,073
At = 1 (generous liquidity)           17%         10%         81%   364,650
At = 2 (missed attempt)                1%          1%          1%     2,800
Table 16: Percentage of orders sent with At ∈ {0, 1, 2} out of all orders with Pt = 1.
These percentages support our findings that algorithms in cluster 3 make markets for ASML. Algorithms
in cluster 3 have the lowest percentage of aggressive behaviour, and they have the highest proportion of
provision inside the spread. On the other hand, algorithms in clusters 1 and 2 render similar proportions
for aggressive behaviour and missed aggressive behaviour, but they differ in their proportions for providing
liquidity inside the spread. Here, algorithms in cluster 1 provide liquidity inside the spread more often than
algorithms in cluster 2. For both clusters 1 and 2, the vast majority of IoC orders successfully trade, and
only a small fraction are cancelled.
Next, we perform a regression similar to those above to predict if the order will take liquidity, conditioned
on direction and conditioned on the order being in the first price bucket. We recall that if an order sent at
time t consumes liquidity we write At = 0, if the order provides liquidity inside the spread we write At = 1,
and we write At = 2 otherwise. Table 17 shows the average coefficients for the regressions for At = 0, Table
18 shows the average coefficients for the regression associated with At = 1, and Table 19 shows the average
coefficients for the regressions associated with At = 2.
At = 0 | Pt = 1, Dt = 1 (Buy) At = 0 | Pt = 1, Dt = −1 (Sell)
Cluster 1 Cluster 2 Cluster 3 Cluster 1 Cluster 2 Cluster 3
intercept 0.73 0.97 −0.00 0.82 1.08 0.10
spread −1.09 −0.95 −1.37 −1.29 −0.88 −1.35
best bid volume −0.36 −0.35 −0.50 −0.93 −0.65 −0.78
best ask volume −0.77 −0.65 −0.77 −0.44 −0.26 −0.47
volume of algo 0-5 −0.49 −0.84 0.08 −0.43 −0.77 −0.02
volume excl algo 0-5 0.30 0.31 0.45 0.30 0.29 0.42
volume of algo 6-10 −0.39 −0.58 0.10 −0.30 −0.58 0.03
quad var 5m −0.20 −0.13 −0.34 −0.27 −0.16 −0.36
quad var 1m −0.17 −0.12 −0.28 −0.21 −0.12 −0.30
quad var 60m 0.14 0.25 0.16 0.16 0.19 0.17
quad var 15m −0.14 −0.04 −0.25 −0.19 −0.09 −0.26
last volume transacted −0.08 −0.12 −0.15 −0.10 −0.12 −0.12
imbalance of algo 0-5 0.12 −0.02 0.21 −0.12 −0.00 −0.20
num messages 0.1ms −0.21 −0.12 0.01 −0.16 −0.14 0.02
volume of algo 11-20 −0.18 −0.09 −0.00 −0.14 −0.19 −0.01
net agg buy-sell last 1s −0.09 −0.12 −0.05 0.13 0.10 0.10
imbalance excl algo 0-5 −0.09 −0.11 −0.01 0.11 0.17 0.08
num messages 0.1s 0.07 0.11 0.08 0.11 0.09 0.08
agg sell last 360s 0.10 0.09 0.10 0.08 0.05 0.06
agg sell last 60s 0.09 0.06 0.13 0.07 0.05 0.09
agg sell last 1s 0.03 0.06 0.02 −0.10 −0.10 −0.09
num messages 1ms −0.15 −0.03 0.01 −0.10 −0.05 0.03
agg buy last 60s 0.04 0.04 0.06 0.10 0.07 0.06
return 300s 0.04 0.04 0.12 0.02 0.01 0.08
volume excl algo 6-10 −0.04 −0.01 −0.04 −0.06 0.02 −0.13
cash of algorithm 0.03 0.07 −0.01 −0.07 −0.10 0.01
inventory of algo 0.02 0.06 −0.00 −0.07 −0.10 0.01
imbalance of algo 6-10 0.05 −0.00 −0.10 −0.06 −0.04 −0.01
Table 17: Average regression coefficients. Orders take liquidity upon arrival in the exchange conditioned on
orders being eager to trade and conditioned on the direction Dt of orders.
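At = 1 | Pt = 1, Dt = 1 (Buy) At = 1 | Pt = 1, Dt = −1 (Sell)
Cluster 1 Cluster 2 Cluster 3 Cluster 1 Cluster 2 Cluster 3
intercept 0.05 −0.25 0.94 0.10 −0.24 0.83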
spread 0.85 0.57 1.25 0.95 0.55 1.25
best bid volume 0.33 0.28 0.50 0.71 0.39 0.71
best ask volume 0.62 0.37 0.70 0.36 0.23 0.49
volume of algo 0-5 0.21 0.68 −0.17 0.17 0.55 −0.15
volume of algo 6-10 0.09 0.47 −0.16 0.02 0.40 −0.13
volume excl algo 0-5 −0.12 −0.15 −0.36 −0.13 −0.15 −0.34
quad var 5m 0.17 0.08 0.32 0.15 0.04 0.29
quad var 60m −0.14 −0.16 −0.14 −0.20 −0.19 −0.18
quad var 1m 0.13 0.07 0.25 0.11 0.05 0.25
imbalance excl algo 0-5 −0.14 −0.11 −0.08 0.20 0.12 0.11
quad var 15m 0.11 0.02 0.23 0.07 −0.02 0.20
num messages 1ms −0.17 −0.09 0.02 −0.22 −0.07 0.05
num messages 0.1ms −0.17 −0.04 0.01 −0.22 −0.05 0.04
inventory of algo 0.12 0.11 −0.03 0.05 0.11 0.05
cash of algorithm 0.12 0.11 −0.03 0.05 0.11 0.05
agg sell last 60s −0.07 −0.03 −0.08 −0.11 −0.04 −0.10
imbalance of algo 0-5 −0.06 −0.08 −0.12 0.02 0.00 0.12
net agg buy-sell last 1s −0.10 −0.07 −0.03 0.08 0.07 −0.03
volume excl algo 6-10 0.08 0.02 0.06 0.05 −0.02 0.13
volume of algo 11-20 0.02 0.11 −0.01 −0.04 0.12 0.04
net agg buy-sell last 360s −0.06 −0.03 0.03 0.11 0.07 0.06
return 300s −0.03 0.03 −0.10 −0.03 −0.02 −0.12
imbalance of algo 6-10 −0.04 −0.07 0.10 0.03 0.06 −0.01
net agg buy-sell last 60s −0.00 −0.01 0.02 0.10 0.04 0.10
volume excl algo 11-20 −0.02 −0.03 0.04 −0.02 0.01 0.10
Table 18: Average regression coefficients. Orders provide liquidity inside the spread upon arrival in the
exchange conditioned on orders being eager to trade and conditioned on the direction Dt of orders.
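At = 2 | Pt = 1, Dt = 1 (Buy) At = 2 | Pt = 1, Dt = −1 (Sell)
Cluster 1 Cluster 2 Cluster 3 Cluster 1 Cluster 2 Cluster 3
intercept −0.78 −0.72 −0.94 −0.92 −0.83 −0.93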
spread 0.24 0.39 0.12 0.35 0.33 0.11
imbalance excl algo 0-5 0.23 0.23 0.09 −0.31 −0.29 −0.19
volume of algo 0-5 0.28 0.16 0.09 0.26 0.22 0.17
num messages 0.1ms 0.38 0.17 −0.01 0.38 0.18 −0.06
volume of algo 6-10 0.30 0.11 0.06 0.28 0.17 0.10
num messages 1ms 0.32 0.12 −0.03 0.32 0.12 −0.08
net agg buy-sell last 1s 0.19 0.20 0.07 −0.20 −0.17 −0.07
volume excl algo 0-5 −0.18 −0.16 −0.10 −0.17 −0.13 −0.08
last volume transacted 0.12 0.11 0.09 0.16 0.13 0.06
best bid volume 0.03 0.07 −0.00 0.21 0.25 0.07
best ask volume 0.15 0.28 0.07 0.08 0.03 −0.02
agg sell last 1s −0.10 −0.13 −0.07 0.15 0.12 0.03
num messages 0.1s −0.09 −0.12 −0.07 −0.10 −0.12 −0.05
agg buy last 1s 0.14 0.11 0.01 −0.10 −0.10 −0.06
volume of algo 11-20 0.16 −0.02 0.01 0.18 0.07 −0.03
cash of algorithm −0.15 −0.18 0.04 0.02 −0.01 −0.07
inventory of algo −0.14 −0.17 0.03 0.02 −0.00 −0.06
imbalance of algo 0-5 −0.05 0.10 −0.09 0.10 −0.00 0.08
quad var 5m 0.03 0.05 0.03 0.12 0.11 0.07
net agg buy-sell last 360s 0.08 0.04 −0.01 −0.12 −0.09 −0.05
quad var 15m 0.03 0.02 0.02 0.12 0.11 0.06
quad var 1m 0.05 0.06 0.02 0.10 0.08 0.05
cash of member −0.12 −0.04 −0.05 −0.06 0.03 −0.04
inventory of member −0.12 −0.04 −0.05 −0.06 0.03 −0.04
agg buy last 360s −0.04 −0.05 −0.04 −0.10 −0.07 −0.02
agg sell last 5s −0.01 −0.02 −0.05 0.13 0.08 0.03
agg buy last 5s 0.13 0.06 0.01 −0.03 −0.03 −0.05
agg sell last 360s −0.10 −0.08 −0.04 −0.02 −0.01 0.01
volume excl algo 11-20 0.06 0.04 0.01 0.02 −0.10 −0.02
Table 19: Average regression coefficients. Orders that are immediately cancelled by the exchange upon
arrival conditioned on orders being eager to trade and conditioned on the direction Dt of orders.
We observe the following stylised features of trading algorithms in each cluster:

(LT-i) According to the intercept of the above tables, algorithms in clusters 1 and 2 are more likely to send
aggressive orders that consume liquidity from the opposite side of the LOB. Algorithms in cluster 3
have the highest propensity to send orders that provide liquidity inside the spread whereas algorithms
in cluster 2 are unlikely to provide liquidity inside the spread. This is consistent with the proportions
reported at the beginning of this subsection, where algorithms in cluster 2 exhibit the lowest percentage
of liquidity provision inside the spread (10%).
(LT-ii) Once again, algorithms across clusters agree on their behaviour according to spread. Given they post
eager-to-trade orders, algorithms are more likely to provide liquidity inside the spread when the bid-
ask spread is wider. Similarly, algorithms are more likely to have missed attempts when the spread is
wider. On the other hand, the wider the spread is, the less likely algorithms are to consume liquidity
from the other side of the LOB. This observation is unsurprising, as for ASML the spread is usually
equal to one tick, in which case it is not possible to provide liquidity inside the spread nor to miss a
trade attempt, making crossing the spread relatively more likely. This should also be compared with
observation (P-iv); the sums of the spread coefficients in Tables 13 and 19 show that the likelihood
of an order crossing the spread is reduced when the spread is larger, unconditionally on whether the
order is eager-to-trade or not.
(LT-iii) The results we discuss for spread also hold for some of the variables describing recent activity (as one
would expect). For example, for most horizons of the quadratic variation, the larger the quadratic
variation, the less likely algorithms will show aggressive behaviour, and the more likely it is that they
will post liquidity inside the spread. For both cases, the quadratic variation over the last trading hour
has an inverse effect.
(LT-iv) The best available volumes in the LOB (best bid volume and best ask volume) also have a similar
consistent effect across algorithms in all three clusters. The higher the available volumes in the market,
the less likely algorithms are to consume liquidity from the opposite side of the LOB, the more likely
they are to provide liquidity inside the spread, and the less likely they are to miss aggressive attempts.
For liquidity taking, we observe that the magnitudes of the coefficients for the best ask volume are roughly
twice those of the coefficients for the best bid volume when describing the aggressiveness of orders
to buy; conversely for orders to sell. This is consistent with the findings in (P-x). Indeed, if the market
builds volume at the best ask price, for example, then, everything else being equal, one expects prices
to move down. This in turn lowers the probability of an aggressive order to buy as a consequence of the
expected price move; conversely for the case of sell orders.
(LT-v) For liquidity provision inside the spread, we observe that the coefficients for best bid volume and best
ask volume swap their importance when considering buys or sells. In particular, the best bid volume
is more important in determining whether a buy order will be sent inside the spread; conversely for the best
ask volume and sell orders. These results are consistent with what one expects from the usual models
of trading activity.
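Observations (LT-i), (LT-ii), and (LT-iv) can be illustrated numerically with the buy-side coefficients of Table 17. The sketch below rests on assumptions that this section does not state: that the regressions use a logistic link, that features are standardised so that a value of 1 means one standard deviation above the mean, that the three coefficient columns correspond to clusters 1, 2, and 3 in that order, and that all other features are held at zero. The function and variable names are hypothetical.

```python
import math

# Buy-side coefficients copied from Table 17 (assumed column order: clusters 1, 2, 3).
INTERCEPT = {1: 0.73, 2: 0.97, 3: -0.00}
SPREAD = {1: -1.09, 2: -0.95, 3: -1.37}
BEST_ASK_VOLUME = {1: -0.77, 2: -0.65, 3: -0.77}


def sigmoid(x: float) -> float:
    """Logistic function; the logistic link itself is an assumption of this sketch."""
    return 1.0 / (1.0 + math.exp(-x))


def prob_aggressive_buy(cluster: int, spread_z: float = 0.0,
                        best_ask_volume_z: float = 0.0) -> float:
    """Illustrative P(A_t = 0 | P_t = 1, D_t = 1) with all other features at zero."""
    score = (INTERCEPT[cluster]
             + SPREAD[cluster] * spread_z
             + BEST_ASK_VOLUME[cluster] * best_ask_volume_z)
    return sigmoid(score)


for cluster in (1, 2, 3):
    base = prob_aggressive_buy(cluster)
    wide = prob_aggressive_buy(cluster, spread_z=1.0)
    deep = prob_aggressive_buy(cluster, best_ask_volume_z=1.0)
    print(f"cluster {cluster}: base={base:.2f} wide spread={wide:.2f} deep ask={deep:.2f}")
```

Under these assumptions, a wider spread or a deeper best ask lowers the illustrative probability of an aggressive buy in every cluster, and cluster 3 starts from the lowest baseline.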
Our work has focused on ASML, which is a liquid asset. Similar clustering results hold for ING, see
Appendix C.3. For shares with lower volumes traded (here AHOLD and TOMTOM), the results are less
clear, possibly due to lower levels of trading in these shares. See Tables 32–40 for the clustered coefficients
for other shares.
5 Implications for regulation

The results above provide a starting point to formalise and to quantify the links between a number of
market features and the behaviour of trading algorithms. Our findings provide statistical evidence for many
microstructural stylised facts discussed in the literature, and our study is the first to confirm these facts
with a unique data set that contains both member and algorithm identification. One key contribution is to
show how algorithms make trading decisions as a function of 53 market features, some of which are visible
to all market participants, while the others are idiosyncratic to the algorithms and to each market
participant.
We believe the findings in the above sections are of special relevance to regulators and supervisors.
5.1 Surveillance
Given the vast number of messages that are processed every day by exchanges in electronic markets, it is
difficult to spot behaviour that intentionally or inadvertently may harm the integrity of markets. Either
way, the regulator must develop sophisticated tools to monitor the market and understand the impact of
individual algorithms on metrics of market quality.
Currently, financial regulators and supervisors use algorithms on transaction and order data to detect
practices that may harm the integrity of the market. For example, in electronic trading, the Dutch Authority
for the Financial Markets (AFM) relies on a combination of Suspicious Transaction and Order Reports from
market participants and custom made algorithms on transaction and order data to prevent and detect market
manipulation, see AFM (2021b). Similarly, the Financial Industry Regulatory Authority (FINRA) in the
US uses the SONAR system to monitor suspicious trading activity such as well-timed trades occurring just
before public announcements. These surveillance algorithms mine data to search for unusual patterns, some
of which could be designed to manipulate markets (e.g., pump and dump), to mislead the information in the
LOB (e.g., spoofing), or to trade with privileged information (e.g., front running, insider trading).
Most detection algorithms tend to look for “misleading” signals in the market.27 To learn what is
misleading, one should first know which features agents take into account when making a trading decision.
As a substitute for misleading signals, supervisors often look for outliers or unusual trading behaviour. But
this ignores the fundamental issue at stake: is the unusual trading behaviour misleading market participants?
One should not mistake features that predict algorithmic trading behaviour for features that algorithms
use when making trading decisions. Anecdotally, trading firms often use a variety of non-market information
sources when making decisions, while our data is restricted to quantities which are based in the market (in
27 See Article 12 “Market Manipulation” in EU (2014).
particular, which are observable to a regulator). This suggests that some of our conclusions will be affected
by confounding factors. For example, a firm with an off-market signal which encourages them to take a
directional position may often trade in a consistent direction – in this case, their intraday accumulated
inventory may be a good predictor of their trading behaviour in the near future, as it acts as a proxy for
their unobserved signal. This distinction is important when drawing causal conclusions from our results.
Our work has implications for both (i) data-driven detection of market manipulation, and (ii) supporting
claims about misleading behaviour in cases of market manipulation. With regard to (i), our findings lend
support to the claim that supervisors should look at “unusual” behaviour in the most important features
as listed in Section 4.3. Regarding (ii), supervisors have difficulty substantiating why some trading behaviour
could be considered misleading. It is often difficult to show that some behaviour (for example, spoofing)
causes a reaction in other agents. We believe our findings show how changes in the most important features
affect trading decisions, hence could be informative in cases of market manipulation. More precisely, our
models can be used to simulate the sequence of events that follow a case of potential market manipulation.
By comparing simulations including and excluding the potential manipulation, one could estimate if and how
market participants reacted to the potential manipulation, thereby quantifying the effect of the potential
manipulation.
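The comparison of simulations with and without a potential manipulation can be sketched as follows. This is a toy illustration under strong assumptions, not the authors' simulator: the response model prob_buy, its coefficients, the feature path, and the size of the perturbation are all hypothetical, and the feature path is held fixed rather than reacting to the simulated orders. Both runs share the same random draws, so the difference isolates the effect of the perturbation.

```python
import math
import random


def prob_buy(imbalance_z: float, intercept: float = 0.0, coef: float = 0.5) -> float:
    """Hypothetical logistic response of an agent to a standardised imbalance feature."""
    return 1.0 / (1.0 + math.exp(-(intercept + coef * imbalance_z)))


def simulate_buys(imbalance_path, seed: int = 0) -> int:
    """Count simulated buy decisions along a fixed feature path."""
    rng = random.Random(seed)
    return sum(rng.random() < prob_buy(x) for x in imbalance_path)


def manipulation_effect(baseline_path, spoof_bump: float, seed: int = 0) -> int:
    """Extra buys attributable to a perturbation that shifts the feature by spoof_bump."""
    perturbed_path = [x + spoof_bump for x in baseline_path]
    # Same seed in both runs: common random numbers.
    return simulate_buys(perturbed_path, seed) - simulate_buys(baseline_path, seed)


baseline = [0.0, -0.5, 0.3, 1.0] * 250  # hypothetical standardised imbalance path
print(manipulation_effect(baseline, spoof_bump=1.0))
```

A realistic counterfactual would let the order book, and hence the features, evolve in response to each simulated message; the fixed path here is only meant to show the structure of the comparison.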
5.2 Testing of trading algorithms

Our results provide a unique starting point to build more sophisticated trading models than those used in the
extant literature. These models will be useful for market participants to build simulators to test strategies
and will help financial regulators to understand market dynamics, individual behaviour, and the impact of
trading algorithms and strategies on the integrity of markets.
Trading firms are required to test their trading algorithms to make sure they do not behave in an
unintended manner or contribute to disorderly trading conditions, see EU (2016b).28 To do so, firms can use
their own testing environment or one provided by the trading venue. Trading venues are required to provide
members with simulation facilities which reproduce as realistically as possible the production environment.
The simulations should allow members to test a range of scenarios that they consider relevant to their
activity, and realistically reproduce disorderly trading conditions, see EU (2016a).29
AFM (2021a) notes that there is room for improvement for both trading firms and trading venues in
testing trading algorithms. Current testing environments tend to range from one-off order books without
new orders entering the market over time, generated by the trading venue at the request of a trading firm,
to markets generating orders with a predetermined frequency and set parameters in real-time. Overall,
AFM (2021a) notes that trading venues have difficulty creating sufficiently realistic simulation facilities.
Furthermore, this report stresses that testing against disorderly trading conditions should be designed with
a view to addressing the reaction of the algorithm or strategy to conditions that may create a disorderly
market. Neither agent-based modelling nor agents trained on real data are currently widely used in
simulation facilities. AFM (2021a) encourages trading venues to explore innovative ways to make their simulation
environments more realistic so that they are in line with the requirements.
28 Article 5(4): The methodologies referred to in paragraph 1 shall ensure that the algorithmic trading system, trading
algorithm or algorithmic trading strategy:

(a) does not behave in an unintended manner;
(b) complies with the investment firm’s obligations under this Regulation;
(c) complies with the rules and systems of the trading venues accessed by the investment firm;
(d) does not contribute to disorderly trading conditions, continues to work effectively in stressed market conditions and, where
necessary under those conditions, allows for the switching off of the algorithmic trading system or trading algorithm.
29 Article 10(2)(a) Trading venues shall provide their members with access to a testing environment which shall consist of any
of the following:
(a) simulation facilities which reproduce as realistically as possible the production environment, including disorderly trading
conditions, and which provide the functionalities, protocols and structure that allow members to test a range of scenarios
that they consider relevant to their activity;
(b) testing symbols as defined and maintained by the trading venue.
This study is a starting point to build a simulator where market dynamics can be replicated at a granular
level to understand how each message to the order book may affect individual, and therefore collective,
dynamics. In particular, our statistical framework is devised to capture the fine microstructural facts that
drive the trading decisions of individual algorithms, which is key to understanding how trading algorithms
react to new market information and to messages to the LOB. With such a market simulator, firms will be
able to test their trading algorithms and regulators will be able to study the effect of new trading algorithms
on the quality of the market. Similarly, financial authorities will have a tool to build counterfactual trading
tapes to gain insights into market behaviour in the absence of certain trading algorithms or with potential
new entrants in the market.
5.3 Clustering
In terms of the clustering exercise, there are a number of relevant insights for supervisors. Irrespective of
the trading venue, a supervisor could impose its own criteria on what behaviour would be expected of any
Liquidity Provider, House or Client. Then, our clustering methodology can be used to highlight trading
firms whose algorithms might show behaviour contrary to what one would reasonably expect based on their
dealing capacity.
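A supervisory screen of this kind can be sketched in a few lines. The example below is a deliberate simplification and not the clustering methodology of the paper, which clusters fitted regression coefficients: here each algorithm is summarised by its shares of orders with At = 0, 1, 2, the cluster percentages of Table 16 serve as hypothetical centroids, and the expectation that a Liquidity Provider should resemble cluster 3 is an illustrative supervisory criterion. All names are assumptions.

```python
import math

# Shares of (A_t = 0, A_t = 1, A_t = 2) per cluster, taken from Table 16.
CENTROIDS = {
    1: (0.82, 0.17, 0.01),
    2: (0.89, 0.10, 0.01),
    3: (0.18, 0.81, 0.01),
}


def nearest_cluster(mix) -> int:
    """Assign an order mix to the closest centroid in Euclidean distance."""
    return min(CENTROIDS, key=lambda c: math.dist(mix, CENTROIDS[c]))


def flag_liquidity_providers(algos, expected_cluster: int = 3):
    """Return names of Liquidity Providers whose mix is not closest to expected_cluster."""
    return [name for name, (capacity, mix) in algos.items()
            if capacity == "Liquidity Provider"
            and nearest_cluster(mix) != expected_cluster]


# Hypothetical algorithms: name -> (dealing capacity, order mix).
algos = {
    "algo_A": ("Liquidity Provider", (0.20, 0.79, 0.01)),
    "algo_B": ("Liquidity Provider", (0.88, 0.11, 0.01)),
    "algo_C": ("House", (0.85, 0.14, 0.01)),
}
print(flag_liquidity_providers(algos))
```

A flag produced this way is only a prompt for closer inspection; it says nothing about whether the behaviour is problematic.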
Acknowledgements
We thank Steef Akerboom, Felix Flinterman, Ronald Verhoeven, Blanka Horvath, and Lukasz Szpruch for
helping to bring about the collaboration between the Alan Turing Institute and the Autoriteit Financiële
Markten. We are grateful to participants at the Knowledge Share Day held at the Alan Turing Institute and
participants at King’s College London financial mathematics internal seminar. We thank Patrick Chang and
Jose Penalva for comments.
Contributions
The authors contributed to the paper as follows: SC and RG initiated the collaboration; ÁC, SC, RG, and
LSB developed the modelling framework, with input from SL and LV; RG and LSB implemented the primary
modelling code, with input from SC, SL and LV; ÁC, SC, RG and LSB wrote the paper.
References
Abergel, F., Anane, M., Chakraborti, A., Jedidi, A., and Toke, I. M. (2016). Limit order books. Cambridge
University Press.
AFM (2021a). Algorithmic trading – governance and controls. https://www.afm.nl/en/sector/actueel/2021/april/beheersing-controles-handelsalgoritmes. Accessed: 1/May/2023.
AFM (2021b). Prevention and detection of market abuse. https://www.afm.nl/en/sector/themas/beurzen-en-effecten/afm-market-watch. Accessed: 18/April/2023.
AFM (2023). Machine learning in trading algorithms: application by Dutch proprietary trading firms and possible risks. https://www.afm.nl/en/sector/actueel/2023/maart/her-machine-learning. Accessed:
7/March/2023.
Aı̈t-Sahalia, Y., Fan, J., Xue, L., and Zhou, Y. (2022). How and when are high-frequency stock returns
predictable? Available at SSRN 4095405.
Amihud, Y. and Mendelson, H. (1980). Dealership market: Market-making with inventory. Journal of
financial economics, 8(1):31–53.
Assefa, S. A., Dervovic, D., Mahfouz, M., Tillman, R. E., Reddy, P., and Veloso, M. (2020). Generating
synthetic data in finance: opportunities, challenges and pitfalls. In Proceedings of the First ACM International
Conference on AI in Finance, pages 1–8.
Avellaneda, M. and Stoikov, S. (2008). High-frequency trading in a limit order book. Quantitative Finance,
8(3):217–224.
Bouchaud, J.-P., Gefen, Y., Potters, M., and Wyart, M. (2003). Fluctuations and response in financial
markets: the subtle nature of ‘random’ price changes. Quantitative finance, 4(2):176.
Bouchaud, J.-P., Mézard, M., and Potters, M. (2002). Statistical properties of stock order books: empirical
results and models. Quantitative finance, 2(4):251.
Breiman, L. (2001). Random forests. Machine learning, 45(1):5–32.
Brogaard, J., Hendershott, T., and Riordan, R. (2014). High-frequency trading and price discovery. The
Review of Financial Studies, 27(8):2267–2306.
Byrd, D., Hybinette, M., and Balch, T. H. (2019). ABIDES: Towards high-fidelity market simulation for AI
research. arXiv preprint arXiv:1904.12066.
Cartea, Á., Donnelly, R., and Jaimungal, S. (2018). Enhancing trading strategies with order book signals.
Applied Mathematical Finance, 25(1):1–35.
Cartea, Á. and Jaimungal, S. (2015). Optimal execution with limit and market orders. Quantitative Finance,
15(8):1279–1291.
Cartea, Á., Jaimungal, S., and Penalva, J. (2015). Algorithmic and high-frequency trading. Cambridge
University Press.
Cartea, Á., Jaimungal, S., and Wang, Y. (2020). Spoofing and price manipulation in order-driven markets.
Applied Mathematical Finance, 27(1-2):67–98.
Cartea, Á., Payne, R., Penalva, J., and Tapia, M. (2019). Ultra-fast activity and intraday market quality.
Journal of Banking & Finance, 99:157–181.
Cartea, Á. and Penalva, J. (2012). Where is the value in high frequency trading? The Quarterly Journal of
Finance, 2(03):1250014.
Cartea, Á. and Sánchez-Betancourt, L. (2021). The shadow price of latency: Improving intraday fill ratios
in foreign exchange markets. SIAM Journal on Financial Mathematics, 12(1):254–294.
Cartea, Á. and Sánchez-Betancourt, L. (2022). Brokers and informed traders: dealing with toxic flow and
extracting trading signals. Available at SSRN.
Cartea, Á. and Sánchez-Betancourt, L. (2023). Optimal execution with stochastic delay. Finance and
Stochastics, 27(1):1–47.
Cohen, S. N., Snow, D., and Szpruch, L. (2021). Black-box model risk in finance. arXiv preprint
arXiv:2102.04757.
Cont, R., Cucuringu, M., Glukhov, V., and Prenzel, F. (2023). Analysis and modeling of client order flow
in limit order markets. Quantitative Finance, pages 1–19.
Cont, R., Stoikov, S., and Talreja, R. (2010). A stochastic model for order book dynamics. Operations
research, 58(3):549–563.
Dutta, C., Karpman, K., Basu, S., and Ravishanker, N. (2022). Review of statistical approaches for modeling
high-frequency trading data. Sankhya B, pages 1–48.
EU (2014). Market abuse regulation. Official Journal of the European Union. https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32014R0596&from=EN. Accessed: 9/March/2023.
EU (2016a). Commission delegated regulation (EU) 2017/584. Official Journal of the European Union. https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32017R0584&from=EN. Accessed: 1/May/2023.
EU (2016b). Commission delegated regulation (EU) 2017/589. Official Journal of the European Union. https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32017R0589&from=EN. Accessed: 1/May/2023.
Euronext (2023a). AHOLD asset description. https://live.euronext.com/en/product/equities/NL0011794037-XAMS. Accessed: 7/March/2023.
Euronext (2023b). ASML asset description. https://live.euronext.com/en/product/equities/
Euronext (2023c). ING asset description. https://live.euronext.com/en/product/equities/
Euronext (2023d). Rule book. https://www.euronext.com/en/media/1905. Accessed: 7/March/2023.
Euronext (2023e). TOMTOM asset description. https://live.euronext.com/en/product/equities/

Farmer, J. D. and Foley, D. (2009). The economy needs agent-based modelling. Nature, 460(7256):685–686.
Farmer, J. D., Patelli, P., and Zovko, I. I. (2005). The predictive power of zero intelligence in financial
markets. Proceedings of the National Academy of Sciences, 102(6):2254–2259.
Glosten, L. R. and Milgrom, P. R. (1985). Bid, ask and transaction prices in a specialist market with
heterogeneously informed traders. Journal of financial economics, 14(1):71–100.
Gould, M. D., Porter, M. A., Williams, S., McDonald, M., Fenn, D. J., and Howison, S. D. (2013). Limit
order books. Quantitative Finance, 13(11):1709–1742.
Grossman, S. J. and Miller, M. H. (1988). Liquidity and market structure. the Journal of Finance, 43(3):617–
633.
Guéant, O. (2016). The Financial Mathematics of Market Liquidity: From optimal execution to market
making, volume 33. CRC Press.
Guéant, O., Lehalle, C.-A., and Fernandez-Tapia, J. (2013). Dealing with the inventory risk: a solution to
the market making problem. Mathematics and financial economics, 7:477–507.
Hagströmer, B. and Nordén, L. (2013). The diversity of high-frequency traders. Journal of Financial Markets,
16(4):741–770.
Hambly, B., Kalsi, J., and Newbury, J. (2020). Limit order books, diffusion approximations and reflected
spdes: from microscopic to macroscopic models. Applied Mathematical Finance, 27(1-2):132–170.
Hendershott, T., Jones, C. M., and Menkveld, A. J. (2011). Does algorithmic trading improve liquidity?
The Journal of finance, 66(1):1–33.
Ho, T. and Stoll, H. R. (1981). Optimal dealer pricing under transactions and return uncertainty. Journal
of Financial economics, 9(1):47–73.
Kyle, A. S. (1985). Continuous auctions and insider trading. Econometrica: Journal of the Econometric
Society, pages 1315–1335.
Lehalle, C.-A., Guéant, O., and Razafinimanana, J. (2011). High-frequency simulations of an order book: a
two-scale approach. Econophysics of Order-driven Markets: Proceedings of Econophys-Kolkata V, pages
73–92.
Libman, D., Haber, S., and Schaps, M. (2021). Forecasting quoted depth with the limit order book. Frontiers
in Artificial Intelligence, 4:667780.
Mankad, S., Michailidis, G., and Kirilenko, A. (2013). Discovering the ecosystem of an electronic financial
market with a dynamic machine-learning method. Algorithmic Finance, 2(2):151–165.
Megarbane, N., Saliba, P., Lehalle, C.-A., and Rosenbaum, M. (2017). The behavior of high-frequency
traders under different market stress scenarios. Market Microstructure and Liquidity, 3(03n04):1850005.
Murphy, K. P. (2022). Probabilistic machine learning: an introduction. MIT press.
O’Hara, M. (1998). Market Microstructure Theory. John Wiley & Sons.
Penalva, J. S. and Tapia, M. (2021). Heterogeneity and competition in fragmented markets: Fees vs speed.
Applied Mathematical Finance, 28(2):143–177.
Rigaki, M. and Garcia, S. (2020). A survey of privacy attacks in machine learning. arXiv preprint
arXiv:2007.07646.
Roşu, I. (2009). A dynamic model of the limit order book. The Review of Financial Studies, 22(11):4601–
4641.
Shokri, R., Stronati, M., Song, C., and Shmatikov, V. (2017). Membership inference attacks against machine
learning models. In 2017 IEEE symposium on security and privacy (SP), pages 3–18. IEEE.
Sirignano, J. and Cont, R. (2019). Universal features of price formation in financial markets: perspectives
from deep learning. Quantitative Finance, 19(9):1449–1459.
Tao, X., Day, A., Ling, L., and Drapeau, S. (2022). On detecting spoofing strategies in high-frequency
trading. Quantitative Finance, 22(8):1405–1425.
Verousis, T., Perotti, P., and Sermpinis, G. (2018). One size fits all? high frequency trading, tick size
changes and the implications for exchanges: market quality and market structure considerations. Review
of Quantitative Finance and Accounting, 50:353–392.
Vyetrenko, S., Byrd, D., Petosa, N., Mahfouz, M., Dervovic, D., Veloso, M., and Balch, T. (2020). Get
real: Realism metrics for robust limit order book market simulations. In Proceedings of the First ACM
International Conference on AI in Finance, pages 1–8.
Wang, X., Hoang, C., Vorobeychik, Y., and Wellman, M. P. (2021). Spoofing the limit order book: A
strategic agent-based analysis. Games, 12(2).
Williams, B. and Skrzypacz, A. (2020). Spoofing in equilibrium. Stanford University Graduate School of
Business Research Paper.
Wright, I. D., Reimherr, M., and Liechty, J. (2022). A machine learning approach to classification for traders
in financial markets. Stat, 11(1):e465.
A Description of shares
Similar to Table 5, we compute the order count, trade count, and volume traded of groups of algorithms
during the first four weeks of training data for ING, AHOLD, and TOMTOM. Table 20 displays the results
for ING, Table 21 for AHOLD, and Table 22 for TOMTOM.
          Order count    %   Trade count    %   Volume traded    %
Top 5       2,326,236   42       190,042   24     970,242,121   19
Top 10      3,542,047   64       268,601   35   1,338,267,032   27
Top 20      4,495,680   81       328,265   42   1,675,327,497   34
Top 50      5,388,559   97       600,684   77   3,631,692,876   73
All         5,555,240  100       777,719  100   4,986,039,098  100
Table 20: Descriptive statistics for algorithms trading in ING during first four weeks of training data.
          Order count    %   Trade count    %   Volume traded    %
Top 5         925,564   49        68,711   24     294,230,009   25
Top 10      1,250,004   66        97,494   34     378,566,543   33
Top 20      1,590,981   84       121,131   42     429,239,323   37
Top 50      1,837,989   97       228,241   80     918,281,638   79
All         1,891,709  100       285,439  100   1,163,250,600  100
Table 21: Descriptive statistics for algorithms trading in AHOLD during first four weeks of training data.
          Order count    %   Trade count    %   Volume traded    %
Top 5         151,036   59         6,266   17      12,802,205   14
Top 10        193,241   75         9,173   25      24,460,828   28
Top 20        230,321   89        18,410   49      41,265,509   48
Top 50        253,284   98        32,989   88      75,158,861   87
All           257,902  100        37,283  100      86,580,402  100
Table 22: Descriptive statistics for algorithms trading in TOMTOM during first four weeks of training data.
B Feature specifications
In this section, we provide a detailed description of the variables we use in the study; these were introduced
in Section 3.1.
We denote time by $t$ and let the trading horizon be $[0, T]$ on any given day. At any time $t \in [0, T]$, let $\tilde{S}_t$ be the price of the last transaction that took place before time $t$; if the trade involved multiple orders, $\tilde{S}_t$ is the volume-weighted average price of the transaction. Let $S^a_t$ be the best ask price available in the LOB at time $t$ and $V^a_t$ the aggregated volume at price level $S^a_t$. Similarly, let $S^b_t$ be the best bid price available in the LOB at time $t$ and $V^b_t$ the aggregated volume at price level $S^b_t$. Thus, the spread $\mathcal{S}_t$ and the midprice $S_t$ at time $t$ are given by
$$\mathcal{S}_t = S^a_t - S^b_t, \qquad S_t = \frac{S^a_t + S^b_t}{2}, \tag{10}$$
respectively. For a given number of levels $n \in \mathbb{N}$, denote by $V^{a,n}_t$ the total volume available in the first $n$ levels of the LOB on the ask side; similarly, denote by $V^{b,n}_t$ the total volume available in the first $n$ levels of the LOB on the bid side. Recall that $A = \{a_1, a_2, \dots, a_M\}$ are the identifiers for the algorithms on Euronext, and $B = \{b_1, b_2, \dots, b_N\}$ are the identifiers for the trading members of Euronext, with $N \le M$ because a trading member can have multiple trading algorithms. For $a_i \in A$ (similarly for $b_i \in B$) let $V^{a,n,-a_i}_t$ be the volume available in the first $n$ levels of the LOB on the ask side when we exclude any volume posted by algorithm $a_i$ (with a similar definition for $V^{b,n,-a_i}_t$). Likewise, we use $V^{a,n,a_i}_t$ and $V^{b,n,a_i}_t$ for the volume posted by algorithm $a_i \in A$. We define the imbalance between positive volumes $V^a$ and $V^b$ by
$$I(V^b, V^a) = \log(1 + V^b) - \log(1 + V^a). \tag{11}$$
For a time window $\delta > 0$, we define the volatility of the transaction prices at time $t$ over period $\delta$ by
$$\mathcal{V}^\delta_t = \sqrt{\sum_{\Delta \log \tilde{S}_u \neq 0;\; u \in [t-\delta, t)} \left|\Delta \log \tilde{S}_u\right|^2}, \tag{12}$$
where
$$\Delta \log \tilde{S}_u = \log \tilde{S}_u - \log \tilde{S}_{u^-}, \quad \text{and} \quad \tilde{S}_{u^-} = \lim_{v \nearrow u} \tilde{S}_v. \tag{13}$$
The above definition extends to other stochastic processes in the paper, that is, $Y_{u^-} = \lim_{v \nearrow u} Y_v$ for any càdlàg stochastic process $Y$. The return at time $t$ over period $\delta$ is given by
$$R^\delta_t = \log \frac{S_{t^-}}{S_{t-\delta}}. \tag{14}$$
The number of messages at time $t$ over period $\delta$ is denoted by $U^\delta_t$ and records the number of messages in the LOB during the window $[t - \delta, t)$. Here, a “message” is an order entry, an order deletion, or an order amendment.
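The definitions in (11), (12), and (14) can be illustrated with a minimal Python sketch. The function names and the input layout (a time-sorted list of `(time, price)` pairs for transaction prices) are assumptions made for this illustration and are not taken from the paper.

```python
import math


def imbalance(v_bid, v_ask):
    """I(V^b, V^a) = log(1 + V^b) - log(1 + V^a), as in (11)."""
    return math.log1p(v_bid) - math.log1p(v_ask)


def realized_volatility(trades, t, delta):
    """Square root of the sum of squared non-zero log-price jumps
    whose jump time u lies in [t - delta, t), as in (12).

    `trades` is a hypothetical time-sorted list of (time, price) pairs.
    """
    total = 0.0
    for (_, p_prev), (u, p) in zip(trades, trades[1:]):
        if t - delta <= u < t and p != p_prev:
            total += (math.log(p) - math.log(p_prev)) ** 2
    return math.sqrt(total)


def log_return(mid_just_before_t, mid_at_t_minus_delta):
    """R^delta_t = log(S_{t-} / S_{t-delta}), as in (14)."""
    return math.log(mid_just_before_t / mid_at_t_minus_delta)


# Illustrative data only: four transactions, one of them with no price change.
example_trades = [(0.0, 100.0), (1.0, 101.0), (2.0, 101.0), (3.0, 100.0)]
example_vol = realized_volatility(example_trades, t=4.0, delta=3.0)
```

In the example, only the two price changes at times 1 and 3 contribute to the sum; the unchanged price at time 2 is skipped, in line with the condition $\Delta \log \tilde{S}_u \neq 0$.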
Lastly, for algorithm $a_i \in A$ and the associated trading member $b_j \in B$, we let $Q^{a_i}_t$ and $Q^{b_j}_t$ denote the intraday accumulated inventory up to time $t$, with the assumption that $Q^{a_i}_0 = 0$ and $Q^{b_j}_0 = 0$, and we let $C^{a_i}_t$ and $C^{b_j}_t$ be the accumulated expense (in EUR) of purchases of inventory, with the assumption that $C^{a_i}_0 = 0$ and $C^{b_j}_0 = 0$. Note that for $x \in \{a_i, b_j\}$, the variables $C^x_t$ and $Q^x_t$ change in the same direction, that is, if inventory goes up then cash goes up, and vice versa. Cash and inventory variables are computed for trading in Euronext Amsterdam.
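As a rough illustration of how the intraday inventory and cash variables evolve together, the sketch below accumulates both from a list of fills. It assumes, for illustration only, that each fill is recorded as a signed quantity (positive for a purchase, negative for a sale) with its transaction price; the function name and data layout are hypothetical.

```python
def accumulate_inventory_and_cash(fills):
    """Return the path of (inventory, cash) after each fill.

    `fills` is a hypothetical list of (signed_quantity, price) pairs.
    Both variables start the day at zero, and cash is the running sum of
    price times the change in inventory, so the two move in the same
    direction.
    """
    inventory, cash = 0.0, 0.0
    path = []
    for signed_quantity, price in fills:
        inventory += signed_quantity
        cash += price * signed_quantity
        path.append((inventory, cash))
    return path


# Illustrative data only: buy 100 at 10.00, then sell 40 at 10.50.
example_path = accumulate_inventory_and_cash([(100, 10.0), (-40, 10.5)])
```

After the purchase, inventory and cash both increase; after the sale, both decrease, which matches the remark that the two variables change in the same direction.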
We are now able to describe the variables we use as features to describe the way algorithms make decisions
in the market. The set of variables that an algorithm ai ∈ A (of trading member bj ∈ B) has at its disposal
at time t ∈ [0, T ] is given by:
(i) Available volumes in the LOB excluding trading algorithm $a_i$ (six variables).
In particular, we use $\log(1 + V)$ where $V$ is one of the following variables: $V^{x,5,-a_i}_{t^-}$, $V^{x,10,-a_i}_{t^-} - V^{x,5,-a_i}_{t^-}$, $V^{x,20,-a_i}_{t^-} - V^{x,10,-a_i}_{t^-}$, for $x \in \{a, b\}$.
(ii) Available volumes in the LOB by trading algorithm $a_i$ (six variables).
In particular, we use $\log(1 + V)$ where $V$ is one of the following variables: $V^{x,5,a_i}_{t^-}$, $V^{x,10,a_i}_{t^-} - V^{x,5,a_i}_{t^-}$, $V^{x,20,a_i}_{t^-} - V^{x,10,a_i}_{t^-}$, for $x \in \{a, b\}$.
(iii) Imbalance in the LOB excluding trading algorithm $a_i$ (three variables).
In particular, we use: $I(V^{b,5,-a_i}_{t^-}, V^{a,5,-a_i}_{t^-})$, $I(V^{b,10,-a_i}_{t^-} - V^{b,5,-a_i}_{t^-}, V^{a,10,-a_i}_{t^-} - V^{a,5,-a_i}_{t^-})$, and $I(V^{b,20,-a_i}_{t^-} - V^{b,10,-a_i}_{t^-}, V^{a,20,-a_i}_{t^-} - V^{a,10,-a_i}_{t^-})$.
(iv) Imbalance in the LOB of trading algorithm $a_i$ (three variables).
In particular, we use: $I(V^{b,5,a_i}_{t^-}, V^{a,5,a_i}_{t^-})$, $I(V^{b,10,a_i}_{t^-} - V^{b,5,a_i}_{t^-}, V^{a,10,a_i}_{t^-} - V^{a,5,a_i}_{t^-})$, and $I(V^{b,20,a_i}_{t^-} - V^{b,10,a_i}_{t^-}, V^{a,20,a_i}_{t^-} - V^{a,10,a_i}_{t^-})$.
(v) Change of imbalance in the LOB excluding trading algorithm $a_i$ (three variables).
In particular, we use: $I(V^{b,5,-a_i}_{t^-}, V^{a,5,-a_i}_{t^-}) - I(V^{b,5,-a_i}_{s^-}, V^{a,5,-a_i}_{s^-})$, $I(V^{b,10,-a_i}_{t^-} - V^{b,5,-a_i}_{t^-}, V^{a,10,-a_i}_{t^-} - V^{a,5,-a_i}_{t^-}) - I(V^{b,10,-a_i}_{s^-} - V^{b,5,-a_i}_{s^-}, V^{a,10,-a_i}_{s^-} - V^{a,5,-a_i}_{s^-})$, and $I(V^{b,20,-a_i}_{t^-} - V^{b,10,-a_i}_{t^-}, V^{a,20,-a_i}_{t^-} - V^{a,10,-a_i}_{t^-}) - I(V^{b,20,-a_i}_{s^-} - V^{b,10,-a_i}_{s^-}, V^{a,20,-a_i}_{s^-} - V^{a,10,-a_i}_{s^-})$, where $s < t$ is the time of the last message before $t$ and, as usual, $s^-$ is the time just before $s$.

(vi) Log volume of quantity at best bid and best offer (two variables).
In particular, we use $\log(1 + V)$ where $V$ is one of the following variables: $V^{x,1}_{t^-}$ for $x \in \{a, b\}$.
(vii) Spread in basis points (one variable).
In particular, we use the spread divided by the midprice, $\mathcal{S}_{t^-} / S_{t^-} \times 10{,}000$.
(viii) Returns over a number of periods (four variables).
In particular, we use: $R^{1s}_t$, $R^{5s}_t$, $R^{60s}_t$, and $R^{300s}_t$.
(ix) Volatility over a number of periods (four variables).
In particular, we use: $\mathcal{V}^{1m}_t$, $\mathcal{V}^{5m}_t$, $\mathcal{V}^{15m}_t$, and $\mathcal{V}^{60m}_t$.
(x) Number of messages over number of periods (four variables).

In particular, we count the number of messages sent to Euronext in the last: 1 second, 100 milliseconds,
1 millisecond, 100 microseconds.
(xi) Aggressive buys minus aggressive sells over previous 1, 5, 60 and 360 seconds (four variables).
(xii) Aggressive buys over previous 1, 5, 60 and 360 seconds (four variables).
(xiii) Aggressive sells over previous 1, 5, 60 and 360 seconds (four variables).
(xiv) Volume of last transaction (one variable).
In particular, if the volume of the last transaction is V > 0, we employ log(1 + V ).
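(xv) Inventory of trading algorithm $Q^{a_i}_{t^-}$ (one variable).
In particular, we employ
$$\operatorname{sign}\left(Q^{a_i}_{t^-}\right) \times \log\left(1 + \left|Q^{a_i}_{t^-}\right|\right). \tag{15}$$
(xvi) Inventory of trading member $Q^{b_j}_{t^-}$ (one variable).
$$\operatorname{sign}\left(Q^{b_j}_{t^-}\right) \times \log\left(1 + \left|Q^{b_j}_{t^-}\right|\right). \tag{16}$$
(xvii) Cash of trading algorithm $C^{a_i}_{t^-}$ (one variable).
$$C^{a_i}_t = \int_0^t \tilde{S}_u \, dQ^{a_i}_u, \qquad C^{a_i}_0 = 0. \tag{17}$$
$$\operatorname{sign}\left(C^{a_i}_{t^-}\right) \times \log\left(1 + \left|C^{a_i}_{t^-}\right|\right). \tag{18}$$
(xviii) Cash of trading member $C^{b_j}_{t^-}$ (one variable).
$$C^{b_j}_t = \int_0^t \tilde{S}_u \, dQ^{b_j}_u, \qquad C^{b_j}_0 = 0. \tag{19}$$
$$\operatorname{sign}\left(C^{b_j}_{t^-}\right) \times \log\left(1 + \left|C^{b_j}_{t^-}\right|\right). \tag{20}$$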
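Several of the features listed above are simple transformations of the LOB state. The sketch below shows one possible way to compute the banded log volumes of items (i) and (ii), the banded imbalances of items (iii) and (iv), the spread in basis points of item (vii), and the signed logarithm used for the inventory and cash variables. The function names and the input layout (a list of per-level volumes with the best level first) are assumptions for illustration, not the authors' implementation.

```python
import math


def band_volumes(level_volumes):
    """Volumes in levels 1-5, 6-10, and 11-20.

    `level_volumes` is a hypothetical list of per-level volumes,
    best price level first.
    """
    v5 = sum(level_volumes[:5])
    v10 = sum(level_volumes[:10])
    v20 = sum(level_volumes[:20])
    return [v5, v10 - v5, v20 - v10]


def log_volume_features(level_volumes):
    """log(1 + V) for each of the three bands."""
    return [math.log1p(v) for v in band_volumes(level_volumes)]


def band_imbalances(bid_levels, ask_levels):
    """I(V^b, V^a) = log(1 + V^b) - log(1 + V^a) for each band."""
    pairs = zip(band_volumes(bid_levels), band_volumes(ask_levels))
    return [math.log1p(vb) - math.log1p(va) for vb, va in pairs]


def spread_bps(best_bid, best_ask):
    """Spread divided by midprice, expressed in basis points."""
    mid = 0.5 * (best_bid + best_ask)
    return (best_ask - best_bid) / mid * 10_000


def signed_log(x):
    """sign(x) * log(1 + |x|); maps 0 to 0 and keeps the sign of x."""
    return math.copysign(math.log1p(abs(x)), x)
```

The signed logarithm compresses large positive and negative positions symmetrically, which is why the absolute value appears inside the logarithm.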

C Model fit results for other shares
C.1 Model performance
Similar to Table 6, Table 7, and Table 8, we show (i) the accuracies of the logistic regressions, (ii) the
outperformance of the logistic regressions over the most frequent bucket, and (iii) the outperformance of
random forests over the most frequent bucket and over the logistic regression; we do this for the shares ING,
AHOLD, and TOMTOM.
Table 23 shows the accuracies for ING, Table 24 shows the outperformance of the logistic regressions over
the most frequent bucket, and Table 25 shows the outperformance of random forests over the most frequent
bucket and over logistic regression. Here, Dt has two buckets, the variable Pt has three buckets and Vt has
nine buckets.
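To make the benchmark comparison concrete, the following sketch computes an accuracy and its outperformance over a most-frequent-bucket benchmark. It assumes, for illustration, that the benchmark always predicts the most frequent bucket of the training labels and that outperformance is the difference in accuracy in percentage points; the exact construction used for the tables is described in the main text, and the function names here are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Share of correct predictions, in percent."""
    hits = sum(a == b for a, b in zip(y_true, y_pred))
    return 100.0 * hits / len(y_true)


def outperformance_over_benchmark(y_train, y_test, y_pred):
    """Model accuracy minus the accuracy of always predicting the most
    frequent training bucket (assumed benchmark), in percentage points."""
    most_frequent = Counter(y_train).most_common(1)[0][0]
    benchmark = accuracy(y_test, [most_frequent] * len(y_test))
    return accuracy(y_test, y_pred) - benchmark


# Illustrative labels only: direction D_t takes values in {1, -1}.
example_gain = outperformance_over_benchmark(
    y_train=[1, 1, 1, -1],
    y_test=[1, -1, 1, -1],
    y_pred=[1, -1, 1, 1],
)
```

With these labels the model is correct on three of four test orders and the benchmark on two of four, so the outperformance is 25 percentage points.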
R1 R2 R3 R4 R5 R6 R7
Top 5 64 ± 9 88 ± 9 55 ± 2 91 ± 6 90 ± 7 56 ± 2 56 ± 2
Top 10 65 ± 7 87 ± 12 64 ± 21 90 ± 10 90 ± 10 64 ± 21 64 ± 21
Top 20 71 ± 13 90 ± 12 63 ± 19 92 ± 11 92 ± 10 63 ± 19 63 ± 18
Top 50 75 ± 14 87 ± 15 55 ± 21 88 ± 15 89 ± 14 56 ± 21 55 ± 21
All 78 ± 15 78 ± 22 46 ± 23 75 ± 27 76 ± 28 43 ± 25 42 ± 25
Table 23: Accuracies of the logistic regression models for ING. Twelve train-and-deploy exercises.
          Dt        Pt        Vt        Pt|Dt=1   Pt|Dt=−1  Vt|Dt=1   Vt|Dt=−1
Top 5     13 ± 8    1 ± 1     0 ± 8     3 ± 4     2 ± 4     0 ± 9     1 ± 9
Top 10    14 ± 7    3 ± 7     1 ± 6     6 ± 9     6 ± 9     1 ± 7     1 ± 7
Top 20    20 ± 13   3 ± 5     3 ± 7     5 ± 7     5 ± 7     3 ± 9     2 ± 7
Top 50    23 ± 13   4 ± 8     1 ± 6     5 ± 10    5 ± 9     2 ± 8     2 ± 7
All       25 ± 17   1 ± 12    −3 ± 12   3 ± 11    3 ± 13    −2 ± 13   −1 ± 12
Table 24: Outperformance over benchmark for ING. Twelve train-and-deploy exercises.
                         Outperformance over benchmark   Outperformance over logistic regression
                         Dt        Pt        Vt          Dt        Pt        Vt
Top 5                    22 ± 8    0 ± 1     2 ± 9       9 ± 9     0 ± 1     2 ± 1
Top 10                   20 ± 8    2 ± 4     2 ± 8       6 ± 8     −1 ± 2    2 ± 3
Top 20                   25 ± 12   2 ± 3     3 ± 9       4 ± 6     −1 ± 2    0 ± 4
Top 50                   26 ± 13   2 ± 6     2 ± 7       3 ± 6     −1 ± 2    2 ± 4
All                      25 ± 17   2 ± 7     1 ± 9       1 ± 8     1 ± 7     4 ± 8
order-weighted average   23        2         2           5         −1        2
Table 25: Outperformance of random forests over benchmark and logistic regression for ING. Twelve train-and-deploy exercises.
Table 26 shows the accuracies for AHOLD, Table 27 shows the outperformance of the logistic regressions
over the most frequent bucket, and Table 28 shows the outperformance of random forests over the most
frequent bucket and over logistic regression. Here, Dt has two buckets, the variable Pt has three buckets
and Vt has nine buckets.

R1 R2 R3 R4 R5 R6 R7
Top 5 63 ± 7 84 ± 8 58 ± 14 86 ± 6 87 ± 6 58 ± 13 58 ± 14
Top 10 66 ± 9 87 ± 10 66 ± 21 89 ± 9 89 ± 9 66 ± 21 66 ± 21
Top 20 71 ± 13 89 ± 11 67 ± 23 90 ± 10 91 ± 10 67 ± 23 66 ± 23
Top 50 74 ± 14 84 ± 16 53 ± 23 85 ± 15 84 ± 17 55 ± 22 53 ± 23
All 78 ± 16 77 ± 21 42 ± 24 73 ± 28 73 ± 28 39 ± 27 40 ± 25
Table 26: Accuracies of the logistic regression models for AHOLD. Twelve train-and-deploy exercises.
          Dt        Pt        Vt        Pt|Dt=1   Pt|Dt=−1  Vt|Dt=1   Vt|Dt=−1
Top 5     13 ± 7    7 ± 9     5 ± 6     9 ± 10    9 ± 10    6 ± 7     5 ± 7
Top 10    16 ± 7    5 ± 7     4 ± 8     6 ± 8     7 ± 8     4 ± 9     4 ± 9
Top 20    20 ± 14   3 ± 6     4 ± 7     5 ± 7     5 ± 7     5 ± 8     3 ± 8
Top 50    22 ± 14   3 ± 7     2 ± 6     5 ± 8     3 ± 11    2 ± 8     3 ± 8
All       22 ± 21   0 ± 12    −2 ± 12   3 ± 13    4 ± 12    −1 ± 11   −1 ± 11
Table 27: Outperformance over benchmark for AHOLD. Twelve train-and-deploy exercises.
                         Outperformance over benchmark   Outperformance over logistic regression
                         Dt        Pt        Vt          Dt        Pt        Vt
Top 5                    20 ± 6    5 ± 7     3 ± 9       7 ± 8     −2 ± 3    −2 ± 4
Top 10                   21 ± 7    3 ± 5     2 ± 9       5 ± 6     −2 ± 3    −2 ± 3
Top 20                   25 ± 13   2 ± 4     4 ± 9       5 ± 5     −1 ± 2    0 ± 5
Top 50                   25 ± 14   2 ± 5     3 ± 7       3 ± 5     −1 ± 3    1 ± 3
All                      22 ± 19   2 ± 7     2 ± 10      1 ± 11    2 ± 8     3 ± 7
order-weighted average   23        3         4           5         −1        0
Table 28: Outperformance of random forests over benchmark and logistic regression for AHOLD. Twelve train-and-deploy exercises.
Table 29 shows the accuracies for TOMTOM, Table 30 shows the outperformance of the logistic regressions over the most frequent bucket, and Table 31 shows the outperformance of random forests over the most frequent bucket and over logistic regression. Here, Dt has two buckets, the variable Pt has three buckets and Vt has ten buckets.
R1 R2 R3 R4 R5 R6 R7
Top 5 68 ± 15 88 ± 15 57 ± 27 89 ± 15 89 ± 15 58 ± 27 58 ± 26
Top 10 66 ± 13 90 ± 13 61 ± 23 90 ± 13 91 ± 12 61 ± 23 62 ± 23
Top 20 70 ± 15 88 ± 13 62 ± 26 89 ± 12 89 ± 12 62 ± 25 62 ± 26
Top 50 73 ± 13 78 ± 19 51 ± 24 79 ± 18 77 ± 20 50 ± 24 50 ± 24
All 71 ± 24 70 ± 23 44 ± 28 54 ± 38 54 ± 39 31 ± 30 31 ± 30
Table 29: Accuracies of the logistic regression models for TOMTOM. Twelve train-and-deploy exercises.

          Dt        Pt        Vt        Pt|Dt=1   Pt|Dt=−1  Vt|Dt=1   Vt|Dt=−1
Top 5     16 ± 15   2 ± 4     4 ± 9     3 ± 5     3 ± 4     5 ± 9     5 ± 9
Top 10    14 ± 13   1 ± 3     3 ± 7     2 ± 4     2 ± 3     2 ± 8     2 ± 7
Top 20    17 ± 16   0 ± 3     1 ± 6     1 ± 4     2 ± 3     1 ± 7     1 ± 6
Top 50    19 ± 13   1 ± 5     −2 ± 7    2 ± 7     1 ± 5     −3 ± 8    −2 ± 9
All       13 ± 19   −2 ± 15   −6 ± 15   −2 ± 16   −3 ± 16   −6 ± 17   −6 ± 20
Table 30: Outperformance over benchmark for TOMTOM. Twelve train-and-deploy exercises.
                         Outperformance over benchmark   Outperformance over logistic regression
                         Dt        Pt        Vt          Dt        Pt        Vt
Top 5                    23 ± 16   2 ± 3     7 ± 10      8 ± 9     −1 ± 1    2 ± 2
Top 10                   19 ± 15   1 ± 2     4 ± 8       5 ± 7     0 ± 1     1 ± 2
Top 20                   22 ± 15   0 ± 2     2 ± 6       5 ± 9     0 ± 1     1 ± 3
Top 50                   22 ± 14   1 ± 4     0 ± 6       3 ± 7     0 ± 3     3 ± 3
All                      13 ± 19   0 ± 7     0 ± 8       0 ± 13    2 ± 13    6 ± 13
order-weighted average   24        0         5           9         0         2
Table 31: Outperformance of random forests over benchmark and logistic regression for TOMTOM. Twelve train-and-deploy exercises.
C.2 Feature importance

We present the permutation feature importance computed with the logistic regressions. Figure 2 shows the most important features for predicting Dt; Figures 9, 10, and 11 report the analogous results for ING, AHOLD, and TOMTOM, respectively.
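As a reminder of how permutation feature importance works in general, the sketch below shuffles one feature column at a time and records the average drop in accuracy of a fitted classifier. It is a generic illustration in plain Python under stated assumptions (accuracy as the score, ten repeats, a hypothetical `predict` callable that maps a feature row to a label) and is not the authors' code.

```python
import random


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average drop in accuracy when each feature column is shuffled.

    `predict` is a hypothetical callable mapping one feature row (a list)
    to a predicted label; `X` is a list of rows and `y` the true labels.
    """
    rng = random.Random(seed)

    def acc(rows):
        return sum(predict(r) == t for r, t in zip(rows, y)) / len(y)

    baseline = acc(X)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - acc(shuffled))
        importances.append(sum(drops) / n_repeats)
    return importances


# Illustrative data only: the label depends on feature 0 and ignores feature 1.
X_demo = [[1.0, 5.0], [-1.0, 5.0], [1.0, -3.0], [-1.0, -3.0],
          [2.0, 0.5], [-2.0, 0.5], [3.0, 1.0], [-3.0, 1.0]]
y_demo = [1, -1, 1, -1, 1, -1, 1, -1]
demo_importance = permutation_importance(
    lambda row: 1 if row[0] > 0 else -1, X_demo, y_demo
)
```

In this toy example, shuffling the second feature never changes a prediction, so its importance is zero, whereas shuffling the first feature degrades accuracy.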
[Bar chart. Features shown: inventory of algo, inventory of member, cash of algorithm, cash of member, best ask volume, best bid volume, return 1s. Axis from 0 to 0.2.]
Figure 9: Most important features to explain the direction of the order for ING using permutation importance and logistic regressions.

[Bar chart. Features shown: inventory of algo, cash of member, inventory of member, cash of algorithm, best ask volume, best bid volume, net agg buy-sell last 1s. Axis from 0 to 0.2.]
Figure 10: Most important features to explain the direction of the order for AHOLD using permutation importance and logistic regressions.
[Bar chart. Features shown: cash of member, inventory of member, inventory of algo, cash of algorithm, best bid volume. Axis from 0 to 0.2.]
Figure 11: Most important features to explain the direction of the order for TOMTOM using permutation importance and logistic regressions.
Unlike the case for ASML, inventory- or cash-related features are the most important features when predicting Dt for ING, AHOLD, and TOMTOM.
Figure 3 shows the most important features for predicting the price bucket Pt; Figures 12, 13, and 14 report the analogous results for ING, AHOLD, and TOMTOM, respectively.

[Bar chart. Features shown: spread, cash of algorithm, inventory of algo, num messages 0.1ms, cash of member, inventory of member, num messages 1ms, num messages 1s, agg sell last 1s, agg buy last 1s. Axis from 0 to 0.08.]
Figure 12: Most important features to explain the price bucket Pt for ING using permutation importance and logistic regressions.
[Bar chart. Features shown: inventory of algo, cash of algorithm, spread, num messages 1ms, inventory of member, cash of member, num messages 0.1ms, num messages 1s. Axis from 0 to 0.14.]
Figure 13: Most important features to explain the price bucket Pt for AHOLD using permutation importance and logistic regressions.

[Bar chart. Features shown: spread, agg buy last 1s, num messages 1ms, agg sell last 1s, num messages 1s, num messages 0.1s, best bid volume, return 60s. Axis from 0 to 0.03.]
Figure 14: Most important features to explain the price bucket Pt for TOMTOM using permutation importance and logistic regressions.
Here, spread retains its place as an important feature. Inventory of algorithm and cash of algorithm are also among the most important features for ING and AHOLD.
Figure 4 shows the most important features for predicting the volume bucket Vt; Figures 15, 16, and 17 report the analogous results for ING, AHOLD, and TOMTOM, respectively.
[Bar chart. Features shown: inventory of member, cash of member, inventory of algo, cash of algorithm, quad var 5m, bid volume excl algo 0-5, agg buy last 360s, num messages 1s. Axis from 0 to 0.4.]
Figure 15: Most important features to explain the volume bucket Vt for ING using permutation importance and logistic regressions.

[Bar chart. Features shown: inventory of member, inventory of algo, cash of algorithm, cash of member, quad var 15m, quad var 60m, num messages 1s, spread. Axis from 0 to 0.4.]
Figure 16: Most important features to explain the volume bucket Vt for AHOLD using permutation importance and logistic regressions.
[Bar chart. Features shown: cash of algorithm, inventory of algo, inventory of member, cash of member, spread, num messages 1s, num messages 0.1s, ask volume excl algo 0-5. Axis from 0 to 0.4.]
Figure 17: Most important features to explain the volume bucket Vt for TOMTOM using permutation importance and logistic regressions.
Consistently across ING, AHOLD, and TOMTOM, the four most important features are either inventory variables (inventory of algorithm and inventory of member) or cash variables (cash of algorithm and cash of member).
C.3 Clustering of agents

This section presents the clustering results for the shares ING, AHOLD, and TOMTOM according to the methodology explained in Subsection 4.4. First, we show the size of the clusters we obtain in each of the twelve clustering exercises. As before, for each of the clustering exercises, the first cluster has the most algorithms, and the third cluster has the fewest.

[Chart of the sizes of Cluster 1, Cluster 2, and Cluster 3 by week. Horizontal axis: week (44 to 52, then 1 to 3). Vertical axis from 0 to 100.]
Figure 18: Size of clusters for ING across the twelve clustering exercises.
[Chart of the sizes of Cluster 1, Cluster 2, and Cluster 3 by week. Horizontal axis: week (44 to 52, then 1 to 3). Vertical axis from 0 to 80.]
Figure 19: Size of clusters for AHOLD across the twelve clustering exercises.

[Chart of the sizes of Cluster 1, Cluster 2, and Cluster 3 by week. Horizontal axis: week (44 to 52, then 1 to 3). Vertical axis from 0 to 60.]
Figure 20: Size of clusters for TOMTOM across the twelve clustering exercises.
Second, we show the stability of clusters through time using the technique described in Section 4.4.1 to
create Figure 6.
[Plot of cluster stability by week. Horizontal axis: week. Vertical axis from 0.5 to 0.9.]
Figure 21: Stability of clusters for ING across the twelve train exercises.

[Plot of cluster stability by week. Horizontal axis: week. Vertical axis from 0.5 to 1.]
Figure 22: Stability of clusters for AHOLD across the twelve train exercises.
[Plot of cluster stability by week. Horizontal axis: week. Vertical axis from 0.5 to 0.9.]
Figure 23: Stability of clusters for TOMTOM across the twelve train exercises.
Figures 22 and 23 show that the clusters we obtain for AHOLD and TOMTOM are not stable. In particular, a random reshuffling of algorithms produces a similar stability score. Therefore, we do not interpret the clustering results for these two shares. We believe the poor stability might be due to the clusters on the first four weeks of data having substantially different sizes from the clusters obtained on other months of data for the same share (see Figures 19 and 20). If we were to take a clustering on the second month of data, for example, with clusters whose sizes are more representative of other months, then the stability metric would improve.

          Cluster 1   Cluster 2   Cluster 3
Client        14           4           5
House         13           6           2
LP            27          14          11
Figure 24: Confusion matrix between dealing capacity and clusters obtained on the first four weeks of data for ING.
Figure 24 shows an “L shape” similar to that in Figure 7. That is: the majority of algorithms are in
cluster 1, with types Liquidity Provider, House and Client all well represented. In cluster 2 we again observe
mostly algorithms with type Liquidity Provider, and the same goes for cluster 3.
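The composition of each cluster can be checked with a short calculation on the counts reported in Figure 24. The sketch below assumes that the three numbers in each row correspond to clusters 1, 2, and 3, in that order; the function names are hypothetical.

```python
# Counts from Figure 24 (ING); column order assumed to be clusters 1, 2, 3.
ing_counts = {
    "Client": [14, 4, 5],
    "House": [13, 6, 2],
    "LP": [27, 14, 11],
}


def cluster_totals(counts):
    """Number of algorithms in each cluster (column sums)."""
    n_clusters = len(next(iter(counts.values())))
    return [sum(row[k] for row in counts.values()) for k in range(n_clusters)]


def column_shares(counts):
    """Share of each dealing capacity within each cluster."""
    totals = cluster_totals(counts)
    return {
        name: [row[k] / totals[k] for k in range(len(totals))]
        for name, row in counts.items()
    }


ing_totals = cluster_totals(ing_counts)
ing_shares = column_shares(ing_counts)
```

Under this reading, cluster 1 contains 54 algorithms, cluster 2 contains 24, and cluster 3 contains 18, and Liquidity Providers account for 14 of 24 algorithms in cluster 2 and 11 of 18 in cluster 3.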
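          Cluster 1   Cluster 2   Cluster 3
Client         9           9           5
House          7           8           4
LP            19          12          18
Figure 25: Confusion matrix between dealing capacity and clusters obtained on the first four weeks of data for AHOLD.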

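          Cluster 1   Cluster 2   Cluster 3
Client        23           3           3
House         14           0           1
LP            20           9           3
Figure 26: Confusion matrix between dealing capacity and clusters obtained on the first four weeks of data for TOMTOM.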
C.4 Direction: average behaviour

Here we report the average regression coefficients for ING, AHOLD, and TOMTOM using the same techniques as those described in Subsection 4.4.5.

Feature   Cluster 1   Cluster 2   Cluster 3
volume of algo 11-20 0.30 −0.11 0.44
best bid volume 0.41 −0.16 0.23
best ask volume −0.35 0.17 −0.24
volume of algo 6-10 0.08 −0.14 0.24
imbalance excl algo 0-5 −0.10 −0.00 −0.22
inventory of member 0.09 −0.09 0.15
cash of member 0.09 −0.08 0.15
num messages 0.1ms −0.02 −0.12 −0.10
num messages 1ms −0.02 −0.10 −0.08
return 1s 0.07 0.00 0.12
volume excl algo 0-5 0.10 0.01 −0.08
net agg buy-sell last 1s 0.03 −0.04 0.11
quad var 60m −0.11 −0.01 0.02
Table 32: Average regression coefficients per cluster on first four weeks of training data for ING when
predicting direction Dt . All excluded features have average coefficients with a magnitude smaller than 0.1
for all clusters.
There are similarities between ASML and ING. For imbalance of the algorithm on the first five levels
we observe a strong positive coefficient for cluster 1, a strong negative coefficient for cluster 2, and we see
cluster 3 lie between the two. These results are consistent with what we observe in ASML. For inventory
of algorithm we observe a strong positive coefficient for cluster 1, a strong negative coefficient for cluster 3,
and cluster 2 is somewhere in between – again consistent with the clustering obtained for ASML, including
the signs of the coefficients. Lastly, we see that imbalances of the algorithm near the top of the LOB have
the largest positive (resp. negative) coefficients for both ASML and ING.
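The per-cluster averages in these tables, and the rule that drops features whose average coefficient has magnitude below 0.1 in every cluster, can be sketched as follows. The data layout (one dictionary of coefficients per algorithm and a cluster label per algorithm) and the function names are assumptions for illustration; the sketch also assumes every algorithm has a coefficient for every feature.

```python
def average_coefficients(coefs, labels):
    """Average coefficient per feature and cluster.

    `coefs` maps algorithm -> {feature: coefficient};
    `labels` maps algorithm -> cluster label. Both are hypothetical layouts.
    """
    sums, counts = {}, {}
    for algo, by_feature in coefs.items():
        cluster = labels[algo]
        counts[cluster] = counts.get(cluster, 0) + 1
        for feature, value in by_feature.items():
            per_cluster = sums.setdefault(feature, {})
            per_cluster[cluster] = per_cluster.get(cluster, 0.0) + value
    return {
        feature: {c: s / counts[c] for c, s in per_cluster.items()}
        for feature, per_cluster in sums.items()
    }


def keep_reported(averages, threshold=0.1):
    """Keep features whose average magnitude reaches the threshold
    in at least one cluster."""
    return {
        feature: per_cluster
        for feature, per_cluster in averages.items()
        if any(abs(v) >= threshold for v in per_cluster.values())
    }


# Illustrative coefficients only; not taken from the paper's tables.
demo_coefs = {
    "a1": {"spread": 0.4, "return 1s": 0.02},
    "a2": {"spread": 0.2, "return 1s": 0.04},
    "a3": {"spread": -0.5, "return 1s": -0.01},
}
demo_labels = {"a1": 1, "a2": 1, "a3": 2}
demo_avg = average_coefficients(demo_coefs, demo_labels)
demo_reported = keep_reported(demo_avg)
```

In the toy data, "spread" is kept because its average exceeds the threshold in both clusters, while "return 1s" is dropped because its average is below 0.1 in magnitude everywhere.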


Feature   Cluster 1   Cluster 2   Cluster 3
cash of algorithm 0.50 0.43 0.56
inventory of algo 0.50 0.43 0.55
imbalance of algo 11-20 −0.04 0.02 −0.93
volume of algo 11-20 −0.33 0.26 −0.18
volume of algo 0-5 0.30 −0.08 0.24
cash of member 0.03 0.31 −0.02
inventory of member 0.03 0.31 −0.02
quad var 60m −0.08 0.18 −0.01
imbalance excl algo 11-20 0.09 0.12 −0.06
chg imbalance excl algo 0-5 −0.01 −0.08 0.17
return 1s 0.10 0.01 0.15
num messages 0.1ms 0.03 −0.19 −0.03
num messages 1ms −0.03 −0.17 −0.02
imbalance excl algo 6-10 −0.00 0.06 −0.13
return 5s 0.05 0.03 0.11
volume excl algo 11-20 −0.02 0.14 −0.02
return 300s 0.02 0.03 −0.10
net agg buy-sell last 1s 0.01 0.03 0.10
imbalance excl algo 0-5 0.03 −0.00 −0.11
agg sell last 1s −0.01 −0.02 −0.10
quad var 15m 0.01 0.10 −0.00
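Table 33: Average regression coefficients per cluster on first four weeks of training data for AHOLD when predicting direction Dt. All excluded features have average coefficients with a magnitude smaller than 0.1 for all clusters.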

Feature   Cluster 1   Cluster 2   Cluster 3
imbalance of algo 6-10 −0.05 −0.76 −0.13
volume of algo 6-10 0.05 −0.26 −0.16
num messages 1ms −0.02 0.04 −0.36
imbalance of algo 11-20 −0.18 −0.06 −0.18
num messages 0.1ms −0.03 0.05 −0.30
return 1s 0.15 0.08 0.14
return 5s 0.13 0.09 0.12
agg buy last 5s 0.02 0.07 0.25
return 300s 0.10 0.05 −0.16
volume of algo 0-5 0.08 −0.21 −0.02
num messages 0.1s 0.03 −0.02 −0.26
quad var 5m 0.05 −0.04 −0.20
chg imbalance excl algo 6-10 −0.02 0.15 −0.11
agg buy last 1s 0.04 0.06 0.16
agg sell last 5s −0.06 −0.02 0.17
inventory of member 0.04 −0.03 0.18
spread −0.09 0.03 0.12
cash of member 0.04 −0.03 0.18
volume excl algo 6-10 0.06 0.03 −0.13
volume of algo 11-20 −0.05 0.08 0.10
volume excl algo 11-20 0.15 0.00 −0.05
num messages 1s 0.01 −0.03 0.13
imbalance excl algo 0-5 −0.01 0.02 −0.12
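Table 34: Average regression coefficients per cluster on first four weeks of training data for TOMTOM when predicting direction Dt. All excluded features have average coefficients with a magnitude smaller than 0.1 for all clusters.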

C.5 Price limit: average behaviour
Here we report the results of Subsection 4.4.6 for ING, AHOLD, and TOMTOM. We do not comment on
the coefficients of the tables below; the insights are similar to those discussed for ASML.

intercept −0.09 −0.64 −0.20 −0.04 −0.64 −0.28

best bid volume 0.42 0.69 0.32 −0.22 −0.12 −0.16
best ask volume −0.24 −0.12 −0.16 0.38 0.66 0.30
volume of algo 0-5 −0.26 0.20 −0.28 −0.26 0.28 −0.33
volume of algo 6-10 −0.20 0.19 −0.33 −0.22 0.25 −0.35
spread 0.11 0.39 0.12 0.13 0.38 0.15
imbalance of algo 0-5 0.05 0.43 0.01 −0.03 −0.58 0.09
num messages 1ms 0.19 0.12 0.09 0.16 0.18 0.12
volume of algo 11-20 −0.00 0.07 −0.28 −0.06 0.05 −0.27
num messages 0.1ms 0.16 0.10 0.06 0.13 0.17 0.09
num messages 0.1s 0.05 0.12 0.07 0.06 0.12 0.01
imbalance of algo 6-10 −0.06 0.05 −0.02 0.05 −0.17 0.04
imbalance excl algo 6-10 −0.03 0.01 0.02 0.07 0.10 0.05
inventory of member 0.03 0.01 0.10 −0.05 0.01 0.08
cash of member 0.03 0.01 0.10 −0.05 0.01 0.08
Table 35: Average regression coefficient for price bucket describing eagerness to trade per cluster on first
four weeks of training data for ING conditioning on the direction of order Dt .

intercept −0.30 −0.59 0.11 −0.41 −0.47 0.12

best bid volume 0.45 0.76 0.40 −0.19 −0.08 −0.34
best ask volume −0.20 −0.10 −0.30 0.42 0.62 0.46
spread 0.20 0.51 0.17 0.16 0.43 0.18
volume of algo 11-20 −0.11 −0.08 −0.38 −0.14 −0.07 −0.39
volume of algo 6-10 −0.01 −0.08 −0.50 0.00 0.01 −0.49
volume of algo 0-5 0.07 −0.02 −0.38 0.10 0.07 −0.39
num messages 1ms 0.14 0.15 0.03 0.08 0.06 0.06
volume excl algo 0-5 −0.10 −0.08 −0.07 −0.02 −0.07 −0.06
num messages 0.1ms 0.12 0.12 0.01 0.08 0.06 0.04
imbalance excl algo 11-20 −0.11 −0.06 −0.08 −0.03 0.05 0.06
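Table 36: Average regression coefficient for price bucket describing eagerness to trade per cluster on first four weeks of training data for AHOLD conditioning on the direction of order Dt.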

intercept −0.02 −0.71 −0.28 −0.04 −0.76 −0.31

volume of algo 0-5 −0.40 0.01 −0.55 −0.43 −0.15 −0.44
best bid volume 0.18 0.63 0.06 −0.15 −0.19 −0.07
best ask volume −0.16 −0.15 −0.09 0.11 0.52 0.03
num messages 1ms 0.14 0.19 0.15 0.12 0.04 0.14
imbalance of algo 0-5 −0.01 0.21 0.04 0.08 −0.30 0.01
volume excl algo 0-5 0.10 0.08 0.20 0.06 0.06 0.16
num messages 0.1ms 0.11 0.20 0.11 0.09 0.02 0.12
spread −0.03 0.22 0.01 0.02 0.18 −0.00
num messages 0.1s 0.12 0.07 0.06 0.05 0.11 0.01
imbalance excl algo 6-10 −0.02 −0.02 0.18 0.04 0.05 0.11
volume of algo 6-10 −0.11 −0.02 −0.06 −0.07 0.09 −0.04
imbalance excl algo 11-20 −0.00 −0.04 −0.11 0.01 0.06 −0.12
imbalance of algo 11-20 0.04 0.12 −0.01 0.03 −0.10 0.03
last volume transacted 0.05 −0.01 −0.11 0.04 0.01 −0.07
volume of algo 11-20 −0.04 −0.03 0.07 0.00 0.10 0.05
num messages 1s 0.02 −0.00 −0.04 −0.01 0.10 −0.07
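Table 37: Average regression coefficient for price bucket describing eagerness to trade per cluster on first four weeks of training data for TOMTOM conditioning on the direction of order Dt.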

intercept −0.07 0.08 0.09 −0.08 0.15 −0.02

volume of algo 0-5 0.17 0.26 0.27 0.25 0.27 0.34
volume of algo 11-20 −0.14 −0.23 −0.13 −0.21 −0.30 −0.10
imbalance of algo 0-5 0.01 0.53 0.02 0.02 −0.32 −0.03
volume of algo 6-10 0.05 0.10 0.12 0.09 0.06 0.19
spread 0.03 −0.16 0.08 0.05 −0.17 0.06
best ask volume 0.06 −0.02 0.11 0.04 −0.19 0.10
num messages 0.1s −0.07 −0.08 −0.08 −0.07 −0.10 −0.08
imbalance of algo 6-10 0.00 0.24 0.00 0.02 −0.11 −0.00
best bid volume −0.01 −0.17 0.03 0.05 0.02 0.07
num messages 1ms −0.10 −0.04 −0.04 −0.10 −0.02 −0.03

intercept −0.22 0.10 0.11 −0.24 0.29 0.08

volume of algo 0-5 −0.03 0.18 0.33 −0.06 0.12 0.31
volume of algo 11-20 0.02 −0.09 −0.24 0.07 −0.11 −0.18
best bid volume −0.03 −0.26 −0.02 0.08 0.03 0.11
num messages 0.1s −0.10 −0.19 0.01 −0.07 −0.15 0.01
spread 0.00 −0.23 −0.01 0.05 −0.20 −0.03
best ask volume 0.07 0.04 0.12 0.02 −0.23 −0.04
volume excl algo 0-5 0.10 0.15 0.02 0.09 0.09 0.03
volume of algo 6-10 −0.03 0.08 0.13 −0.03 0.02 0.14
num messages 1ms −0.13 −0.14 0.03 −0.06 −0.06 0.01
imbalance of algo 0-5 0.05 0.11 −0.07 −0.04 −0.09 0.02
num messages 0.1ms −0.11 −0.07 0.05 −0.04 −0.00 0.03

intercept 0.08 −0.36 −0.06 −0.02 −0.25 −0.03

volume of algo 0-5 0.13 −0.01 0.13 0.22 0.04 0.18
best bid volume −0.06 −0.26 −0.02 0.09 0.13 0.06
volume of algo 11-20 −0.14 0.09 −0.13 −0.10 −0.01 −0.14
best ask volume 0.11 0.12 0.03 0.01 −0.24 0.04
num messages 0.1s −0.05 −0.08 0.06 −0.06 −0.14 0.12
imbalance excl algo 0-5 0.04 0.06 −0.10 0.00 −0.08 −0.12
volume of algo 6-10 −0.04 −0.08 −0.07 −0.02 −0.13 0.06
imbalance excl algo 11-20 0.02 0.05 0.23 −0.02 −0.05 −0.01
volume excl algo 0-5 −0.00 −0.03 −0.15 0.01 0.08 −0.09
num messages 1s −0.01 −0.06 0.04 −0.02 −0.13 0.10
volume excl algo 11-20 −0.05 0.07 0.15 −0.04 0.01 −0.04
last volume transacted −0.05 0.08 0.11 0.01 0.07 0.02
quad var 15m −0.06 −0.04 0.04 −0.02 −0.13 −0.02
imbalance of algo 0-5 0.01 0.09 −0.00 −0.02 −0.15 −0.04
num messages 1ms −0.08 −0.10 −0.00 −0.05 0.01 −0.06
quad var 60m −0.05 −0.04 0.03 −0.02 −0.14 0.00
volume excl algo 6-10 0.01 0.05 0.10 0.05 −0.00 −0.01
cash of member −0.00 −0.00 −0.03 −0.02 0.00 0.12
inventory of member 0.00 −0.00 −0.03 −0.02 0.00 0.12

intercept 0.16 0.56 0.11 0.12 0.49 0.30

imbalance of algo 0-5 −0.06 −0.96 −0.03 0.01 0.90 −0.06
best ask volume 0.18 0.14 0.05 −0.41 −0.47 −0.40
best bid volume −0.41 −0.52 −0.35 0.17 0.09 0.09
volume of algo 11-20 0.14 0.15 0.41 0.27 0.24 0.38
volume of algo 6-10 0.14 −0.29 0.21 0.13 −0.31 0.16
spread −0.14 −0.23 −0.20 −0.17 −0.21 −0.22
volume of algo 0-5 0.09 −0.46 0.02 0.00 −0.55 −0.01
imbalance of algo 6-10 0.06 −0.28 0.02 −0.07 0.28 −0.03
num messages 0.1ms −0.08 −0.12 −0.07 −0.08 −0.21 −0.11
num messages 1ms −0.08 −0.08 −0.06 −0.07 −0.16 −0.08
imbalance of algo 11-20 0.06 −0.03 −0.04 −0.01 0.14 −0.04

intercept 0.52 0.50 −0.21 0.65 0.19 −0.20

volume of algo 11-20 0.08 0.17 0.62 0.08 0.19 0.56
best bid volume −0.42 −0.49 −0.37 0.11 0.05 0.22
best ask volume 0.13 0.06 0.18 −0.43 −0.40 −0.42
spread −0.21 −0.28 −0.16 −0.21 −0.23 −0.15
volume of algo 6-10 0.04 −0.00 0.38 0.03 −0.04 0.35
volume of algo 0-5 −0.04 −0.16 0.05 −0.03 −0.19 0.08
imbalance excl algo 11-20 0.14 0.08 0.11 0.01 −0.05 −0.06
num messages 0.1s 0.01 0.12 −0.01 0.01 0.11 −0.02

intercept −0.06 1.07 0.34 0.06 1.02 0.33

volume of algo 0-5 0.26 0.00 0.41 0.21 0.11 0.25
best bid volume −0.11 −0.37 −0.04 0.06 0.06 0.01
best ask volume 0.05 0.03 0.05 −0.12 −0.28 −0.07
volume of algo 11-20 0.18 −0.06 0.06 0.10 −0.09 0.09
volume of algo 6-10 0.15 0.10 0.13 0.09 0.04 −0.02
num messages 1ms −0.07 −0.09 −0.15 −0.07 −0.04 −0.08
volume excl algo 0-5 −0.09 −0.05 −0.05 −0.07 −0.14 −0.07
spread −0.04 −0.15 −0.03 −0.07 −0.13 −0.04
imbalance excl algo 6-10 0.00 0.02 −0.23 −0.01 −0.10 −0.09
num messages 0.1ms −0.04 −0.11 −0.10 −0.07 −0.05 −0.03
imbalance of algo 11-20 −0.08 −0.13 −0.01 0.01 0.12 −0.03
num messages 0.1s −0.07 0.01 −0.12 0.02 0.02 −0.13
cash of algorithm 0.00 0.05 −0.02 0.05 −0.07 0.14
inventory of algo 0.00 0.05 −0.02 0.05 −0.07 0.14
volume excl algo 11-20 −0.04 −0.04 −0.11 −0.04 −0.02 0.06
imbalance excl algo 0-5 −0.08 0.02 0.04 0.02 0.11 0.03
imbalance excl algo 11-20 −0.02 −0.01 −0.12 0.01 −0.01 0.13
quad var 15m −0.01 0.12 0.03 0.00 0.10 0.03
quad var 60m −0.01 0.12 0.04 0.00 0.08 0.01
agg sell last 360s −0.04 0.01 −0.11 −0.04 −0.00 −0.06