
Performing a Distributed Replay with Multiple Clients using SQL Server 2012 Distributed Replay

By: Jonathan Kehayias

https://www.sqlskills.com/blogs/jonathan/performing-a-distributed-replay-with-multiple-clients-using-sql-server-2012-distributed-replay/

Posted on: November 19, 2011 1:52 am

In the first post in this blog series on using SQL Server 2012 Distributed Replay, Installing and
Configuring SQL Server 2012 Distributed Replay, we looked at how to configure a Distributed
Replay environment using multiple clients and a dedicated replay controller. In this post we’ll
actually make use of the previously configured servers to perform a distributed replay using a
random workload that has been generated against the AdventureWorks2008R2 database installed
on our Replay SQL Server.

Collecting the Replay Trace Data


For the purposes of generating a random workload against AdventureWorks2008R2, I created a
workload generator that can be found in my blog post The AdventureWorks2008R2 Books
Online Random Workload Generator. I used this with two different PowerShell windows from
SQL2012-DRU1 and SQL2012-DRU2 to run a random workload across multiple sessions
against the SQL2012-DB1 server. To capture the trace data required for performing the replay,
SQL Server Profiler was used along with the TSQL_Replay template to create the capture.

For production systems, the best way to capture a replay trace is to script the trace
definition to a file, and then create the trace as a server-side trace that writes to a trace file on
local disks on the server. This has a significantly lower impact than tracing directly from
Profiler, which uses the rowset provider for Trace. With the replay trace running and the
workload generating events, I waited for the trace to collect around 80,000 rows of data and then
shut down the trace so that I could copy the trace file from the SQL2012-DB1 server
to the SQL2012-DRU server, where the Distributed Replay Controller is installed.

Preprocessing the Trace File(s)


When I went to preprocess the trace file for replay, I realized there was a difference between
the multi-server environment I built for this blog series and my original single-server setup
for learning how to use Distributed Replay. In order to preprocess the trace file for replay,
you have to have the Management Tools Basic feature installed on the server that
will be used for preprocessing the trace data. If you have been following this blog series to learn
how to use Distributed Replay, you will need to run Setup on the SQL2012-DRU server to add
this feature before it can be used for preprocessing the trace file. This is necessary to administer
Distributed Replay.

Once the Management Tools Basic feature has been installed, the server will have to be restarted,
and then it is possible to use the DReplay.Exe executable to administer the Distributed
Replay components on the controller server. The DReplay executable has multiple options that
can be discovered by passing -? on the command line as follows:

C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Binn>dreplay -?


Info DReplay Usage:
DReplay.exe {preprocess|replay|status|cancel} [options] [-?]}

Verbs:
preprocess Apply filters and prepare trace data for intermediate file on controller.
replay Transfer the dispatch files to the clients, launch and synchronize replay.
status Query and display the current status of the controller.
cancel Cancel the current operation on the controller.
-? Display the command syntax summary.

Options:
dreplay preprocess [-m controller] -i input_trace_file -d controller_working_dir [-c config_file] [-f status_interval]
dreplay replay [-m controller] -d controller_working_dir [-o] [-s target_server] -w clients [-c config_file] [-f status_interval]
dreplay status [-m controller] [-f status_interval]
dreplay cancel [-m controller] [-q]
Run dreplay <verb> -? for detailed help on each verb.

Preprocessing takes a couple of steps. The first is to set any options that you want for the
preprocessing by editing the DReplay.Exe.Preprocess.config file in the C:\Program Files (x86)\Microsoft SQL
Server\110\Tools\Binn path on the server. There are two configuration files for DReplay.Exe, one
for preprocessing and one for replay, as highlighted below. At this time, make sure that you are
only editing the Preprocess.config file.

The DReplay.Exe.Preprocess.config file contains a schema-defined XML document that
controls the configuration of the preprocessing. In general, the options set for preprocessing
should not need to be changed, but if you want to include system sessions as a part of the replay,
you can change the options in the XML, which is listed below.

<?xml version="1.0" encoding="utf-8"?>
<Options>
    <PreprocessModifiers>
        <IncSystemSession>No</IncSystemSession>
        <MaxIdleTime>-1</MaxIdleTime>
    </PreprocessModifiers>
</Options>
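
As a rough illustration of the change described above, the following Python sketch flips the
IncSystemSession value in the XML exactly as it is listed in this post. The function name
include_system_sessions is hypothetical, the XML is held in a string rather than read from the
Binn folder, and the file on a real installation may carry extra attributes that this sketch
does not account for.

```python
import xml.etree.ElementTree as ET

# XML as listed in the post (assumption: the file on disk may differ slightly).
PREPROCESS_CONFIG = """<?xml version="1.0" encoding="utf-8"?>
<Options>
    <PreprocessModifiers>
        <IncSystemSession>No</IncSystemSession>
        <MaxIdleTime>-1</MaxIdleTime>
    </PreprocessModifiers>
</Options>"""


def include_system_sessions(xml_text: str, include: bool) -> str:
    """Return the config XML with IncSystemSession set to Yes or No."""
    # Parse from bytes so the encoding declaration in the prolog is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    node = root.find("./PreprocessModifiers/IncSystemSession")
    if node is None:
        raise ValueError("IncSystemSession element not found")
    node.text = "Yes" if include else "No"
    # The returned text has no XML declaration; add one back if you save it.
    return ET.tostring(root, encoding="unicode")


updated = include_system_sessions(PREPROCESS_CONFIG, True)
print(updated)
```

Working on a copy of the text first makes it easy to inspect the result before overwriting the
real configuration file.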

To preprocess the trace data, open a new command prompt window and change directories to the
C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Binn path. The trace file has been
copied onto the SQL2012-DRU server as C:\DReplay\SQL2012_ReplayTrace.trc. To
preprocess this file, first start the “SQL Server Distributed Replay Controller” service
using NET START:

NET START "SQL Server Distributed Replay Controller"

Then execute the following command from within the Binn path to preprocess the trace
file and write the output to the working directory:

dreplay preprocess -i "C:\DReplay\SQL2012_ReplayTrace.trc" -d "C:\DReplay"

This will process the trace file and output the working files for performing the Distributed
Replay to the C:\DReplay path. Below is a screenshot of the full window for preprocessing the
trace file.

Note: The dreplay executable can be called from any path on the server because the Binn
path is a part of the PATH environment variable. However, the executable has to be called
from within the Binn folder to access the necessary .config files and .xsd schema files for the
configuration. If you want to be able to run this executable from another location on the server,
you will need to copy the .config and .xsd files out of the Binn folder into the folder that you
want to run dreplay from.
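
If you script the preprocessing step, one way to respect the note above is to launch dreplay
with the Binn folder as the working directory. The Python sketch below is only an illustration
under that assumption: the helper names build_preprocess_command and run_from_binn are
hypothetical, the paths are the ones used in this post, and nothing is executed unless you call
run_from_binn yourself on the controller server.

```python
import subprocess
from pathlib import PureWindowsPath
from typing import List

# Binn path as used in this post; adjust for your own installation.
BINN = PureWindowsPath(r"C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Binn")


def build_preprocess_command(trace_file: str, working_dir: str) -> List[str]:
    """Build the argument list for: dreplay preprocess -i <trace> -d <dir>."""
    return ["dreplay", "preprocess", "-i", trace_file, "-d", working_dir]


def run_from_binn(command: List[str]) -> int:
    """Run a dreplay command with Binn as the current directory (Windows only)."""
    completed = subprocess.run(command, cwd=str(BINN), check=False)
    return completed.returncode


cmd = build_preprocess_command(r"C:\DReplay\SQL2012_ReplayTrace.trc", r"C:\DReplay")
# Show the command line that would be run, without executing it.
print(subprocess.list2cmdline(cmd))
```

Passing the arguments as a list avoids having to quote paths that contain spaces by hand.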

Performing the Replay


The first step in performing the replay is to start the “SQL Server Distributed Replay Client”
service on each of the replay clients using NET START.

NET START "SQL Server Distributed Replay Client"

You will want to verify in the logs that each of the clients was able to successfully connect to
the controller, as shown in the previous post in this series. Once this has been done, your
environment is almost ready for replay. For the purposes of this blog series, a SELECT-only
workload has been generated for replay against AdventureWorks2008R2. However, in most
environments you won’t have a SELECT-only workload, so you will have to plan for and
prepare your replay environment using a BACKUP/RESTORE of the production database from a
point within the captured workload, so that the workload can be replayed against the database
without problems such as primary key constraint violations during the replay.

If you want to change any of the parameters associated with the replay operation, you can edit
the DReplay.Exe.Replay.config file in the C:\Program Files (x86)\Microsoft SQL
Server\110\Tools\Binn path. The default contents of the configuration file are shown below:

<?xml version="1.0" encoding="utf-8"?>
<Options>
    <ReplayOptions>
        <Server></Server>
        <SequencingMode>stress</SequencingMode>
        <ConnectTimeScale>100</ConnectTimeScale>
        <ThinkTimeScale>100</ThinkTimeScale>
        <HealthmonInterval>60</HealthmonInterval>
        <QueryTimeout>3600</QueryTimeout>
        <ThreadsPerClient>255</ThreadsPerClient>
        <EnableConnectionPooling>No</EnableConnectionPooling>
        <StressScaleGranularity>SPID</StressScaleGranularity>
    </ReplayOptions>
    <OutputOptions>
        <ResultTrace>
            <RecordRowCount>Yes</RecordRowCount>
            <RecordResultSet>No</RecordResultSet>
        </ResultTrace>
    </OutputOptions>
</Options>
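
If you prefer to generate a modified copy of this file rather than edit it by hand, a small
Python sketch along these lines may help. It works on the default XML exactly as listed above,
the function name override_replay_options is hypothetical, and the override values of 50 are
arbitrary examples rather than recommendations. Unknown option names are rejected so that a
typo does not silently produce an unchanged file.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Default replay configuration as listed in the post.
REPLAY_CONFIG = """<?xml version="1.0" encoding="utf-8"?>
<Options>
    <ReplayOptions>
        <Server></Server>
        <SequencingMode>stress</SequencingMode>
        <ConnectTimeScale>100</ConnectTimeScale>
        <ThinkTimeScale>100</ThinkTimeScale>
        <HealthmonInterval>60</HealthmonInterval>
        <QueryTimeout>3600</QueryTimeout>
        <ThreadsPerClient>255</ThreadsPerClient>
        <EnableConnectionPooling>No</EnableConnectionPooling>
        <StressScaleGranularity>SPID</StressScaleGranularity>
    </ReplayOptions>
    <OutputOptions>
        <ResultTrace>
            <RecordRowCount>Yes</RecordRowCount>
            <RecordResultSet>No</RecordResultSet>
        </ResultTrace>
    </OutputOptions>
</Options>"""


def override_replay_options(xml_text: str, overrides: Dict[str, object]) -> str:
    """Return the config XML with the given ReplayOptions elements replaced."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    options = root.find("./ReplayOptions")
    if options is None:
        raise ValueError("ReplayOptions element not found")
    for name, value in overrides.items():
        node = options.find(name)
        if node is None:
            # Refuse names that are not already present in the file.
            raise KeyError("unknown replay option: " + name)
        node.text = str(value)
    # Keep empty elements such as <Server></Server> in their long form.
    return ET.tostring(root, encoding="unicode", short_empty_elements=False)


# Example values only (assumption): halve the connect and think time scaling.
custom = override_replay_options(
    REPLAY_CONFIG, {"ConnectTimeScale": 50, "ThinkTimeScale": 50}
)
print(custom)
```

The resulting text can be saved as a separate configuration file, leaving the default file in
the Binn folder untouched.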

Before performing the actual replay, make sure that the account used to run the SQL
Server Distributed Replay Client service has been granted appropriate access to the target SQL
Server and database to perform the replay operations. Once this has been done, the replay
can be performed using the command line options for DReplay.Exe by providing the appropriate
switches, or you can alternatively provide the -c command line switch to specify the configuration
file that should be used for performing the replay. If you change any of the default values listed
above in the DReplay.Exe.Replay.config file, you will need to specify the -c command line
switch for those to take effect. To perform a replay with the defaults, the following command
line can be run:

dreplay replay -s "SQL2012-DB1" -d "C:\DReplay" -w "SQL2012-DRU1, SQL2012-DRU2"

Once this is executed, the Distributed Replay Controller will read in the preprocessed replay
file, and then synchronize the replay across all of the clients specified with the -w command line
parameter. While the replay operation occurs, the command window for the controller will
output periodic updates about the current status of the replay process.

The frequency of the status updates can be controlled using the -f command line switch to
specify the number of seconds between each of the updates. Each of the status updates will
provide information about each of the clients, including the total number of events that have been
replayed, the success rate of the replay operations per client, and an estimate of the total
amount of time remaining to complete the replay operation. When the replay completes, the total
elapsed time and pass rate for the events are output.
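
To tie the replay switches together, here is a hedged Python sketch that assembles a dreplay
replay argument list from the -s, -d, -w, -c and -f switches shown in the usage output earlier
in this post. The helper name build_replay_command is hypothetical, and joining the client
names with commas and no spaces is my reading of the documented -w format (the command in this
post includes a space after the comma). The sketch only builds the list; it does not run
anything.

```python
from typing import List, Optional, Sequence


def build_replay_command(
    target_server: str,
    working_dir: str,
    clients: Sequence[str],
    config_file: Optional[str] = None,
    status_interval: Optional[int] = None,
) -> List[str]:
    """Build the argument list for a dreplay replay invocation."""
    if not clients:
        raise ValueError("at least one replay client is required for -w")
    cmd = [
        "dreplay", "replay",
        "-s", target_server,
        "-d", working_dir,
        "-w", ",".join(clients),
    ]
    if config_file is not None:
        # Only needed when a non-default replay configuration should be used.
        cmd += ["-c", config_file]
    if status_interval is not None:
        # Seconds between status updates in the controller window.
        cmd += ["-f", str(status_interval)]
    return cmd


cmd = build_replay_command(
    "SQL2012-DB1",
    r"C:\DReplay",
    ["SQL2012-DRU1", "SQL2012-DRU2"],
    status_interval=30,  # example value (assumption)
)
print(cmd)
```

Keeping the optional switches as keyword arguments makes it obvious at the call site whether a
run uses the defaults or a custom configuration.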

In the next and final post in this series, we’ll look at some of the common problems with using
Distributed Replay and how to resolve them, including manually configuring the Controller and
adding additional Client Service accounts to the environment after Setup has been completed.

Related Posts

• The AdventureWorks2008R2 Books Online Random Workload Generator
• Installing and Configuring SQL Server 2012 Distributed Replay
• SQL Server 2016 Distributed Replay Errors
• New Article on SQLPerformance.com comparing “Observer Overhead” of Trace vs Extended Events
• Tracking Problematic Pages Splits in SQL Server 2012 Extended Events – No Really This Time!

Posted in: Benchmarking, Database Administration, Distributed Replay, SQL Server 2012, SQL
Server Denali

19 Responses to Performing a Distributed Replay with Multiple Clients using SQL Server 2012 Distributed Replay

1. Jay Dunk says:

September 9, 2013 at 6:08 am

Hi Jonathan,

I’m about to try this to see if it is possible, but thought I would see if you have any ideas.
I am currently stress testing a bespoke application using DReplay. I have captured a trace
file from an end-to-end system that had 5 users running random UI activity. However,
when I fire this off from 5 VMs, each of those 5 users’ activities is going to fire at
the same time across 5 dreplay clients (the same queries execute synchronously across 5
VMs). Is it in any way possible to fire up 5 command line windows, each controlling one of the
VMs’ replay activities, and offset the start time with the config file by 5 seconds on each
one? This would mean there is some offset between each dreplay client, to avoid inducing
locking on certain resources caused by executing the same session queries 5 times in
parallel.


o Jonathan Kehayias says:

September 16, 2013 at 10:34 am

Hey Jay,

If you do a synchronized replay, then it will fire off exactly like it did for the
capture so you wouldn’t need to offset each of the machines during the replay,
you would just need to offset them during your initial workload generation period.


2. Iyon Lion says:

October 4, 2013 at 2:44 pm

2013-10-04 11:36:03:185 Error DReplay The client ‘xxx_yyy’ is not a registered distributed replay
client. Make sure that the SQL Server Distributed Replay Client services is running on ‘xxx_yyy’,
and that the client is registered with the controller ‘localhost’.

Hi Jonathan, I followed your steps but got the error message above. I went into
Component Services and added permissions for DReplay for launch and activate, access
and configuration permissions. I was able to preprocess my trace file, and I can see from
the log file here, C:\Program Files (x86)\Microsoft SQL
Server\110\Tools\DReplayClient\Log, that the client is registered with the controller. Can
you think of anything I can try?


3. Iyon Lion says:


October 4, 2013 at 6:10 pm

Never mind, it looks like DReplay does not like the FQDN. That is why I got that error.


4. Srdjan says:

November 18, 2013 at 1:40 pm

Can you leverage the TSQL or a custom Profiler template instead of the TSQL_Replay
for Distributed Replay?

Thank you.


o Jonathan Kehayias says:

December 9, 2013 at 10:35 am

You can, but if the events are not included like they are in the TSQL_Replay
profiler template, you won’t be able to perform a replay operation. Why do you
need to customize the TSQL_Replay template that ships with Profiler, except for
filtering on a specific database or user, which you can do in either a scripted
copy of the template or when setting up the trace to begin with?


▪ Srdjan says:

December 25, 2013 at 11:02 am

Thank you Jonathan. Our existing server traces are not based on the
TSQL_Replay profile. Instead of matching them, we developed a custom
Replay application.


5. Jason Ingram says:


May 1, 2014 at 11:13 pm

Hey Johnathon,
Great article as always!!! Excellent detail and easy to follow. My one question is I’d like
to simulate what the response times would look like if I doubled or tripled the workload on
the server. Is that possible through Distributed Replay? Is that what the Connect Time
Scale and Think Time Scale are for? Or should I be upping the threads per workstation?
I’ve tried a couple things with no luck. However my problem could also be that I only
have 3 workstation clients. If you could point me in the right direction, I’d really
appreciate it.


o Jonathan Kehayias says:

May 29, 2014 at 10:55 am

Hey Jason,

You can drive higher volumes of load with stress mode and scaling the connect
and think time, but it’s not necessarily going to be linear or predictable like you
might hope. You may just have to play with the values to see where you get the
double/triple TPS from the captured workload.


6. Akshay says:

January 5, 2016 at 7:02 am

Hi Jonathan,

I want to read an Extended Events (.xel) file through Distributed Replay. I have used the
ReadTrace RML utility to convert the .xel file to a .trc file, using the standard
sample Extended Events template which is part of the RML utility.
The preprocess phase completes successfully, but when I run the replay process I
get some errors. I checked the controller log and the client log:
In Controller Log:
2016-01-05 16:52:52:129 OPERATIONAL [Controller Service] Event replay in progress. Detailed options:
2016-01-05 16:52:52:129 OPERATIONAL [Controller Service] Target DB Server: [DBTestServer].
2016-01-05 16:52:52:129 OPERATIONAL [Controller Service] Controller Working Directory: [D:\DRDemo].
2016-01-05 16:52:52:129 OPERATIONAL [Controller Service] Generate Result Trace: [No].
2016-01-05 16:52:52:145 OPERATIONAL [Controller Service] Sequencing Mode: [STRESS].
2016-01-05 16:52:52:145 OPERATIONAL [Controller Service] Connect Time Scale: [100].
2016-01-05 16:52:52:145 OPERATIONAL [Controller Service] Think Time Scale: [100].
2016-01-05 16:52:52:145 OPERATIONAL [Controller Service] Healthmon Polling Interval: [60].
2016-01-05 16:52:52:145 OPERATIONAL [Controller Service] Query Timeout: [3600].
2016-01-05 16:52:52:145 OPERATIONAL [Controller Service] Data Provider Type: [ODBC].
2016-01-05 16:52:52:145 OPERATIONAL [Controller Service] Threads Per Client: [255].
2016-01-05 16:52:52:145 OPERATIONAL [Controller Service] Record Row Count: [Yes].
2016-01-05 16:52:52:145 OPERATIONAL [Controller Service] Record Result Set: [Yes].
2016-01-05 16:52:52:145 OPERATIONAL [Controller Service] Connection Pooling Enabled: [No].
2016-01-05 16:52:52:145 OPERATIONAL [Controller Service] Stress Scale Granularity: [SPID].
2016-01-05 16:52:52:145 OPERATIONAL [Controller Service] Replay Clients: [DBTestServer].
2016-01-05 16:52:52:145 CRITICAL [Controller Service] [0xC821001D] Failed to assign an event with connection ID [917561] to any client.
2016-01-05 16:52:52:145 OPERATIONAL [Controller Service] Event dispatch in progress.
2016-01-05 16:52:52:145 CRITICAL [Controller Service] **** Critical Error ****
2016-01-05 16:52:52:145 CRITICAL [Controller Service] Machine Name: DBTestServer
2016-01-05 16:52:52:161 CRITICAL [Controller Service] Error Code: 0xC8502100
2016-01-05 16:52:52:176 OPERATIONAL [Controller Service] Event replay completed.
2016-01-05 16:52:52:176 OPERATIONAL [Controller Service] Elapsed time: 0 day(s), 0 hour(s), 0 minute(s), 0 second(s).
2016-01-05 16:52:52:223 CRITICAL [Controller Service] [0xC8210003] Event manager is not running.
2016-01-05 16:52:52:223 CRITICAL [Controller Service] **** Critical Error ****
2016-01-05 16:52:52:223 CRITICAL [Controller Service] Machine Name: DBTestServer
2016-01-05 16:52:52:239 CRITICAL [Controller Service] Error Code: 0xC8502100
2016-01-05 16:52:52:239 CRITICAL [Controller Service] Unadvise callback interface – failed, invalid arguments.

In Client Log:
2016-01-05 16:52:52:223 CRITICAL [Client Replay] [0xC8120004] Failed to receive trace start time from controller with return code 0xC8502100.
2016-01-05 16:52:52:239 OPERATIONAL [Client Replay] Event replay completed.
2016-01-05 16:52:52:239 OPERATIONAL [Client Replay] 0 events replayed in total.
2016-01-05 16:52:52:254 CRITICAL [Client Service] Critical Error: code=[c8503101], msg=Failed to receive replay data from controller, confirm network connectivity and restart the Distributed Replay Utility services on both client and controller computers.

Can you please help on this.


thanks


7. SQLDerp111 says:

May 4, 2016 at 12:00 pm

Hi Jonathan,

I was wondering if you always pre-generate the .trc file through profiler or trace, or if you
have a way to tie this into Extended Events (i.e. is the above poster’s problem seen
often?)? You assume a .trc file to pre-process — could you link to your most recent
articles on how to efficiently create these?

Also, as far as I can see, there is no additional post on common problems… While I know
this is an older series, I was wondering if you could update it for SQL 2014/2016; maybe
just add a note if the basics still apply? Or point us in a new direction if there are newer
features/replacements for this functionality?

Thanks!


o Jonathan Kehayias says:

May 27, 2016 at 1:34 pm

The basics still apply the same in 2014 and 2016. Nothing has changed with
Distributed Replay as far as I know since 2012 introduced it. You have to use a
.trc file for input into Distributed Replay and the Trace Replay template in Profiler
has the correct events, but you should use a server-side trace to file and not
Profiler to collect the trace data. READTRACE from RML Utilities can convert an
XEvent file to .trc format for use with Distributed Replay, but there is a current
issue where it doesn’t always work for replay, so Trace is still your best bet.


8. Dan says:

August 16, 2017 at 6:32 am

Hi Everyone,
My English is bad, but I hope you can understand.

I have a problem with Distributed Replay, specifically when I run: dreplay replay…
Here is my error:

Error DReplay Could not find any resources appropriate for the specified culture or the
neutral culture. Make sure
“Microsoft.SqlServer.Management.DistributedReplay.ExceptionTemplates.resources”
was correctly embedded or linked into assembly
“Microsoft.SqlServer.Management.DistributedReplay” at compile time, or that all the
satellite assemblies required are loadable and fully signed.

I have tried many times but I can’t resolve it.

If you can help me, I would really appreciate it.

Thanks


o Jonathan Kehayias says:

August 18, 2017 at 6:39 am

Do you have multiple versions of SQL Server installed side-by-side on the same
machine? That is the only thing that I can think of that would lead to this problem.


▪ Dan says:

August 22, 2017 at 8:41 am


Yes indeed, I have two versions of sql server. Thank you very much, I
solved my problem.


9. SGouin says:

February 22, 2018 at 9:49 am

Hi Jonathan,

Thanks for this great article.

It works great for a single client, but when I try to register a second machine with the
controller, it never shows up in the status.

My setup looks like:

SQLLab02 -> Controller, Client
SQLLab03 -> Second client

Executing dreplay STATUS on SQLLab02:

2018-02-22 09:06:56:567 Info DReplay Registered clients:
2018-02-22 09:06:56:572 Info DReplay SQLLAB02(status = READY)
2018-02-22 09:06:56:574 Info DReplay The controller “localhost” is in a ready state

All firewalls are off.

Here are my XML Config files:


SQLLab02.DReplayClient.xml
…sqllab02…
SQLLab03.DReplayClient.xml
…sqllab02…

All services use the same domain account, which has local admin privileges on SQLLab02
and SQLLab03.
DCOM security has also been set to Everyone (Allow Local & Remote) on both servers
(computer access and application access security).

Here is the client log file of SQLLab03 when the service starts:
2018-02-22 09:31:27:667 OPERATIONAL [Client Service] Microsoft SQL Server Distributed Replay Client – 13.0.1601.5.
2018-02-22 09:31:27:667 OPERATIONAL [Client Service] © Microsoft Corporation.
2018-02-22 09:31:27:668 OPERATIONAL [Client Service] All rights reserved.
2018-02-22 09:31:27:671 OPERATIONAL [Client Service] Current edition is: [Developer Edition].
2018-02-22 09:31:27:671 OPERATIONAL [Common] Initializing dump support.
2018-02-22 09:31:27:672 OPERATIONAL [Common] Failed to get DmpClient. [HRESULT=0x8007007F]
2018-02-22 09:31:27:673 OPERATIONAL [Client Service] Windows service “Microsoft SQL Server Distributed Replay Client” has started under service account “XXXXXXXXXXXX”. Process ID is 7720.
2018-02-22 09:31:27:674 OPERATIONAL [Client Service] Time Zone: Eastern Standard Time.
2018-02-22 09:31:27:676 OPERATIONAL [Client Service] Controller name is “SQLLab02”.
2018-02-22 09:31:27:677 OPERATIONAL [Client Service] Working directory is “C:\Program Files (x86)\Microsoft SQL Server\130\Tools\DReplayClient\WorkingDir”.
2018-02-22 09:31:27:677 OPERATIONAL [Client Service] Result directory is “C:\Program Files (x86)\Microsoft SQL Server\130\Tools\DReplayClient\ResultDir”.
2018-02-22 09:31:27:678 OPERATIONAL [Client Service] Heartbeat Frequency(ms): 3000
2018-02-22 09:31:27:679 OPERATIONAL [Client Service] Heartbeats Before Timeout: 3

No errors are shown anywhere (the real account has been replaced here by XXXXXXXXXXX).

Thanks for highlighting anything that I could have missed.


o Jonathan Kehayias says:

February 22, 2018 at 10:07 am

Developer Edition only allows one replay client. The controller has to be a full
Enterprise Edition licensed install for multiple replay clients.


▪ SGouin says:

February 22, 2018 at 10:40 am

Thanks Jonathan, I can also see that restriction in the SQL Server edition
comparison list.

I guess the main use case for that feature (with multiple clients) is to test
your production infrastructure in a maintenance window.
Have you blogged (or do you intend to blog) about how to scale the load with
Distributed Replay? It seems that there is not a lot of documentation on how
to play with those configuration parameters. The default values already seem
to be set to their maximum of 100% in “DReplay.Exe.Replay.xml”:

<SequencingMode>stress</SequencingMode>
<ConnectTimeScale>100</ConnectTimeScale>
<ThinkTimeScale>100</ThinkTimeScale>

Thanks

