A Digital Design Flow For Secure Integrated Circuits
Abstract: Small embedded integrated circuits (ICs) such as smart cards are vulnerable to the so-called side-channel attacks (SCAs). The attacker can gain information by monitoring the power consumption, execution time, electromagnetic radiation, and other information leaked by the switching behavior of digital complementary metal–oxide–semiconductor (CMOS) gates. This paper presents a digital very large scale integrated (VLSI) design flow to create secure power-analysis-attack-resistant ICs. The design flow starts from a normal design in a hardware description language such as very-high-speed integrated circuit (VHSIC) hardware description language (VHDL) or Verilog and provides a direct path to an SCA-resistant layout. Instead of a full custom layout or an iterative design process with extensive simulations, a few key modifications are incorporated in a regular synchronous CMOS standard cell design flow. The basis for power analysis attack resistance is discussed. This paper describes how to adjust the library databases such that the regular single-ended static CMOS standard cells implement a dynamic and differential logic style and such that 20 000+ differential nets can be routed in parallel. This paper also explains how to modify the constraints and rules files for the synthesis, place, and differential route procedures. Measurement-based experimental results have demonstrated that the secure digital design flow is a functional technique to thwart side-channel power analysis. It successfully protects a prototype Advanced Encryption Standard (AES) IC fabricated in an 0.18-µm CMOS.

Index Terms: Circuit synthesis, CMOS digital integrated circuits, cryptography, design automation, routing, security, side-channel power analysis.

I. INTRODUCTION

ranging from time delay [2] and power consumption [3] to electromagnetic radiation [4] and often apply advanced statistical techniques to reveal the secret information. In general, side-channel attacks (SCAs) do not require expensive equipment and are rather quick to set up. Even if measures are included to make the devices tamperproof, side-channel information can leak out. SCAs are a real threat for any device of which the security IC is easily observable such as smart cards and embedded devices [5], [6].

At first, SCAs have been fought with ad hoc countermeasures. For instance, the addition of random power-consuming operations obscured the data-dependent variations in the power consumption [7]. The attacks, however, have evolved and become more and more effective. Subsequently, countermeasures have been conceived at the different abstraction levels of the security application. It started at the algorithmic level. One illustration is masking [8]. In this technique, a random “mask” is added to the data prior to the encryption and removed afterwards without changing the result. Algorithmic countermeasures, however, need to be reformulated for each algorithm, and, often, proposed solutions actually appear insecure and/or inefficient afterwards [9], [10]. Only recently, dedicated hardware techniques have been presented [11]–[15]. Instead of concealing or decorrelating the side-channel information, these techniques pursue the effect of not creating any side-channel information. The goal of these countermeasures is to balance the power consumption of the logic gates. The major advantages are that this approach is correct by construction,
Authorized licensed use limited to: University of Management & Technology Lahore. Downloaded on May 08,2020 at 23:31:48 UTC from IEEE Xplore. Restrictions apply.
1198 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 25, NO. 7, JULY 2006
research projects have been set up in an attempt to develop a secure digital design flow [19], [20]. To our knowledge, this publication is the first to present a comprehensive top-down automated synchronous very large scale integrated (VLSI) design flow that pursues a constant power dissipation of the security IC.

The modifications and additions are inserted in the back end of the regular automated design flow and have been implemented in a “push-button” approach. They only have a minimal influence on the design flow and a negligible overhead in design time. The additional steps required only a total of 6 min of central processing unit (CPU) time for our prototype IC implementing a high-throughput Advanced Encryption Standard (AES), controller, and fingerprint processor [37].

The remainder of this paper is organized as follows. In Section II, we discuss constant power-consuming logic styles. In Section III, a place & route technique that controls the parasitic effects on differential interconnect wires is described and analyzed. Next, in Section IV, we present the secure digital design flow. Section V compares the secure digital design flow with a regular digital design flow. With the prototype IC, two functionally identical coprocessors have been fabricated in an 0.18-µm CMOS on the same die. The first, “secure,” coprocessor is implemented using the secure design flow. The second, “insecure,” coprocessor is implemented using a regular design flow. Area and power numbers are given, and the results of a differential power analysis (DPA) are provided. Finally, a conclusion is formulated.

II. CONSTANT POWER-CONSUMING LOGIC STYLES

The power consumption of traditional standard cells and logic is dependent on the signal activity. When the output of the logic gate makes a 0 to 1 transition, a current comes from the power supply and charges the output capacitance. On the other hand, when the output sees a 1 to 0, a 0 to 0, or a 1 to 1 transition, no or only a limited amount of energy (due to short circuit or leakage) is consumed from the power supply. This is the fundamental reason why information is leaked through the power supply and why power attacks are possible. The basis of a secure digital design flow is a logic style with constant power consumption.

Current mode logic (CML), e.g., current steering logic [21], seems the ideal solution. This type of logic continuously draws a current from the supply and measures its state through the path that the current takes. A gate has constant power consumption if it draws a perfectly constant current from the power supply independently of the input and output signals. To build a current source capable of generating a constant current, special circuit techniques that minimize channel length modulation have to be used [22]. The decisive drawback of CML, however, is its static power consumption. When the logic gate is not processing any data, it burns the current, which makes this logic style unacceptable for embedded battery-operated devices.

Voltage mode logic (VML), e.g., static CMOS logic, only draws a current from the supply to change state and measures its state by the amount of charge it stores on a capacitance. Static CMOS is the preferred logic style because of its low power consumption and high noise margins. Yet, two conditions must be satisfied for VML to have constant power consumption, namely: 1) a logic gate must have exactly one switching event per signal transition and 2) the logic gate must charge a constant capacitance in that switching event.

Dynamic differential logic, sometimes also referred to as dual rail with precharge logic, fulfills the first condition [23]. A differential logic family uses the true and the false representation of the input and output signals and a dynamic logic family alternates precharge and evaluation phases. As a result, since both outputs (true and false) are precharged to 1, exactly one of the two output nodes evaluates to 0 to have a differential output signal in the evaluation phase. The discharged output node is charged to 1 in the following precharge phase to precharge both outputs to 1. In other words, every signal transition, including the events in which the input signals remain constant, is represented with an actual switching event, in which the logic gate charges a capacitance. All the logic families that have been introduced to thwart the DPA [asynchronous logic [12]–[14], sense amplifier based logic (SABL) [11], [23], and WDDL [15]], employ some form of dynamic differential logic.

In self-timed asynchronous logic [12]–[14], the terminology refers to dual rail encoded data, in which codewords are interleaved with spacers. The codewords can be seen as differential data in the evaluation phase, while the spacers as the precharge values in the precharge phase. The major disadvantage of the asynchronous approach is that it is extremely difficult to make reasonably sized designs. The methodology for the design of large asynchronous logic systems lags substantially behind that of synchronous circuits. Compared to electronic design automation (EDA) support for synchronous designs, which is very mature, there is still a shortage of computer-aided design (CAD) tools to support asynchronous circuit designs as is acknowledged by the asynchronous research community.

SABL [11], [23] has been conceived to thwart the DPA. It uses advanced circuit techniques to guarantee that the load capacitance has a constant value. SABL completely controls the portion of the load capacitance that is due to the logic gate. The intrinsic capacitances at the differential input and output signals are symmetric, and, additionally, it discharges and charges the sum of all the internal node capacitances. A major disadvantage is the nonrecurring engineering costs of a custom-designed standard cell library development. SABL also suffers from a large clock load. The clock signal is distributed to all standard cells, as is common to all clocked dynamic logic styles.

In this paper, we propose to use the WDDL [15], because it can be implemented with static CMOS logic. Static CMOS standard cells are combined to form secure compound standard cells, which have a reduced power signature. WDDL has many advantages. It can be readily implemented from an existing standard cell library. The design flow is fully supported with accurate EDA library files that come directly from the vendor. WDDL also results in a dynamic differential logic with only a small load capacitance on the precharge control signal and with the low power consumption and the high noise margins of static CMOS. Furthermore, since the gates do not precharge in parallel, it also benefits from a low supply current derivative di/dt and peak supply current.
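The first condition for constant power consumption (exactly one switching event per signal transition) can be illustrated with a small counting model. The sketch below is not taken from the paper: the function names and the abstraction of a gate as a counter of supply-charging events are assumptions made for illustration, and capacitance values and timing are ignored. It compares a single-ended static CMOS output with a dual rail output that is precharged to 1 and evaluated every cycle, as described in this section.

```python
from itertools import product


def static_cmos_charging_events(prev_out: int, new_out: int) -> int:
    """Single-ended static CMOS: the supply charges the output
    capacitance only on a 0 -> 1 output transition."""
    return 1 if (prev_out, new_out) == (0, 1) else 0


def dual_rail_charging_events(value: int) -> int:
    """Dual rail with precharge: both rails start at 1, evaluation
    discharges exactly one rail, and the following precharge phase
    recharges that rail."""
    true_rail, false_rail = value, 1 - value  # rail levels after evaluation
    discharged = [rail for rail in (true_rail, false_rail) if rail == 0]
    return len(discharged)  # rails that must be recharged


# Charging events for every (previous output, new output) pair.
static_profile = {
    (prev, new): static_cmos_charging_events(prev, new)
    for prev, new in product((0, 1), repeat=2)
}
# Charging events per cycle for each logical output value.
dual_rail_profile = {value: dual_rail_charging_events(value) for value in (0, 1)}

print(static_profile)     # data dependent: only the (0, 1) case charges the output
print(dual_rail_profile)  # the same count for either output value
```

In this model the single-ended gate leaks through the presence or absence of a charging event, while the dual rail gate always produces one. The second condition, a constant capacitance per event, is what the matched routing of the following sections addresses.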
TIRI AND VERBAUWHEDE: A DIGITAL DESIGN FLOW FOR SECURE INTEGRATED CIRCUITS 1199
Fig. 2. Measurement of output transient: Single-ended design (top); and WDDL implementation (bottom).
Fig. 3. AOI32 gate with drive strength 2 in: Static CMOS (left); WDDL (middle); and negative differential logic (right).
of the evaluation phase to the 1-spacer, the other output is charged. Note that it is perfectly possible to differentiate between the precharge and the evaluation phase in a measured supply current trace. Thus, it is sufficient for the attacker to only look at the transition from the 0-spacer to the evaluation phase. In order for the logic gate not to have a different power signature for each output event that is possible during this transition, the two output capacitances must be matched and routing differences must not exist between the two differential nets.

III. MATCHING INTERCONNECT CAPACITANCES OF DUAL RAIL LOGIC

Matched interconnect capacitances can be obtained by routing the true and false output signals with parallel routes that are, at all times, in adjacent tracks of the routing grid, on the same layers, and of the same length. Then, independent of the placement, the two routes have the same first-order parasitic effects.

The parasitic effects of the interconnects are caused by the distributed resistance and by the distributed capacitance to the substrate and to neighboring wires in other metal layers. Aside from process variations, these effects are equal for both nets. The resistance is the same, since both interconnects have the same number of vias and have the same length in each metal layer. The capacitance to the other layers is ideally the same, since, in general, the length of the differential route is orders of magnitude larger than the pitch between the two differential routes and one can, therefore, argue that both nets travel in the same environment. Making every other metal layer a ground plane would completely control the capacitance to other layers. This reduces the solution space and increases the total capacitance.

The pair of interconnects, however, also needs to be routed with control over any crosstalk. Crosstalk, which is the phenomenon of noise induced on one wire by a signal switching on a neighboring wire, has an effect on the power consumption. Crosstalk effects are caused by the distributed capacitance to adjacent
Fig. 4. Placed & routed design: Fat design (left); and differential design (right).
wires in the same metal layer. Routing the two output nets in parallel already removes the uncertainty of one neighbor: During a switching event, only one output line switches, the other output line remains quiet. All uncertainty can be removed by shielding the differential routes on either side with a VDD or VSS line. Reserving one grid line out of three upfront for a power line reduces the problem to routing two differential lines. Note that the approach of alternating signal lines and quiet power lines has been shown to produce predictable interconnect parasitics [28]. Alternatively, the crosstalk effects can be controlled by increasing the distance between different differential routes. As for any security application, there is a tradeoff between increased security and implementation costs, which are loss of routing tracks.

Differential pair routing has been available through gridless routers. However, their goal is to route a few critical signals, such as the clock or general reset signal. They are not built for crypto applications where all signals need a differential route, and, thus, router performance and completion rate degrade rapidly with increasing number of differential pairs. These tools are unable to route 20 000+ differential pairs as an encryption algorithm requires. An experiment with a mere 221 differential pairs required 7 h 56 min and 33 s in CPU time on a Sun ULTRA 5 for Cadence Chip Assembly Router version 11.0.06 [29] to perform 100 iterations without generating a completely routed result. It still had 972 conflicts and 125 unconnected nets. High-capacity gridded routers, on the other hand, have no or only limited capability to route differential pairs and often even avoid running wires in parallel to prevent crosstalk effects. We have recently presented a way to work around tool limitations [27]. The same experiment only required 3.85 s in CPU time to route the 221 differential pairs without any violations.

A. Differential Pair Routing

The technique is built on top of a commercial place & route tool and forces the tool to route the two output signals at all times in adjacent tracks. In the technique, each differential output pair is abstracted as a single fat wire. The differential design is routed with the fat wire, and at the end, the fat wire is decomposed into the differential wire. Fig. 4 demonstrates the place & route approach. The figure shows a placed & routed design consisting of six differential gates. On the left, the result is shown of the fat routing. On the right, the result after decomposition is shown. Two normal wires replace each fat wire.

The place & route tool cannot handle differential standard cells and fat interconnects at the same time. It is not possible to connect one single fat interconnect wire to two differential pins. The tool needs a fat gate level netlist and a fat gate library database. The fat gate level netlist is obtained from the differential gate level netlist by substituting each differential input and output signal by one single signal. The fat library database contains the routing rules that are applicable for the fat wires and the macro cell definition of the fat gates. A macro cell is a simplified representation of the standard cell [30]. It contains information such as height, width, and pin placement. The macros of the fat cells are obtained from the differential cells by abstracting the pins of the differential signals as one single pin.

In a postprocessing procedure, the fat wire is decomposed into the differential wires. This procedure, depicted in Fig. 5, consists of two translations of the fat wire and a width reduction to the normal width. The translation of the fat wires to the differential wires is done by editing the netlist that comes out of the router. The width reduction of the translated wires is accomplished by importing the edited netlist and a library database that contains the real macros of the differential gates and the routing rules for the normal wires into the router.

B. Matching Precision

The matching precision and optimization of the interconnect capacitances has to be in line with the quality and optimizations of the logic style. The intrinsic capacitances of the logic gates and the interconnect capacitances must have similar matching. There is no need for concentrating on balancing the intrinsic capacitances of the logic gates if the interconnect capacitances are not balanced and vice versa.

Fig. 6 plots the capacitances of the true signal nets versus the capacitances of the false signal nets for three cases, namely:
1) the input capacitances of our WDDL 0.18-µm library;
2) the interconnect capacitances of a DES substitution box
Fig. 5. Fat routes (left); translation operation (middle); and differential routes (right).
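The decomposition of a fat route into a differential pair, described in Section III-A as two translations followed by a width reduction, can be sketched geometrically. The code below is a hypothetical illustration, not the authors' script: a route is modeled as a list of centerline segments between two points, and the choice of a diagonal offset of `+offset` and `-offset` in both x and y is an assumption. The width reduction is not modeled, since the text assigns it to the router once the library with normal-width routing rules is loaded.

```python
from typing import List, Tuple

Point = Tuple[int, int]
Segment = Tuple[Point, Point]  # a wire centerline between two points


def translate(route: List[Segment], dx: int, dy: int) -> List[Segment]:
    """Shift every point of a route by the same vector (dx, dy)."""
    return [((x1 + dx, y1 + dy), (x2 + dx, y2 + dy))
            for (x1, y1), (x2, y2) in route]


def decompose_fat_route(route: List[Segment],
                        offset: int) -> Tuple[List[Segment], List[Segment]]:
    """Replace one fat route by two routes for the true and false nets.

    Assumption: both copies are shifted diagonally by the same amount in
    opposite directions, so horizontal and vertical segments both end up
    in neighboring tracks and corners stay connected.
    """
    true_route = translate(route, +offset, +offset)
    false_route = translate(route, -offset, -offset)
    return true_route, false_route


def route_length(route: List[Segment]) -> int:
    """Total Manhattan length of a route."""
    return sum(abs(x2 - x1) + abs(y2 - y1) for (x1, y1), (x2, y2) in route)


# An L-shaped fat route: a horizontal segment followed by a vertical one.
fat_route = [((0, 0), (100, 0)), ((100, 0), (100, 60))]
true_route, false_route = decompose_fat_route(fat_route, offset=2)

print(true_route)
print(false_route)
print(route_length(true_route), route_length(false_route))
```

Because each copy is a pure translation of the fat route, both nets keep the same length and the same number of segments on each layer, which is the property the matched routing relies on.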
depend on the WDDL cells that have been assembled. The minimum set consists of a register, an inverter, an AND gate, and an OR gate. Our WDDL library contains 128 distinct macro cells and implements 37 logic functions. The library file, however, is the original static CMOS standard cell library file.

The functionality and a preliminary timing of the gate level netlist (rtl.v) are verified with a gate-level simulation and a static timing analysis. This also requires the library file (lib.v) and is done in the verification step.

The cell substitution procedure modifies the gate level description. A script, e.g., in practical extraction report language (PERL) or in Awk, transforms the gate level netlist (rtl.v). Two files are generated, namely: 1) a fat gate level netlist (fat.v), which will be used to route the design and 2) a differential gate level netlist (diff.v), which will be used in the verification steps. The differential netlist is obtained by replacing each gate by its WDDL counterpart. This means that each net is duplicated, made differential, and connected to the differential pin. The inverters are also removed; the inversions are implemented by switching the nets. The fat netlist is equivalent to the differential netlist except that the differential signals have been abstracted as one single signal. This kind of parse procedure is not present in the regular design flow. The run time overhead, however, is negligible. The parser required a little less than 4 min to generate both files for the prototype IC containing 39 000 effective gates on a SunFire v100 [550 MHz CPU, 2 GB random access memory (RAM)].

In order to validate the result of the parser, a logic equivalence checker, such as Formality [32] or Verplex LEC [33], is used to verify the equivalence between the fat gate level netlist and the original netlist. The differential gate level netlist (diff.v) is used together with the differential (diff_lib.v) and the original (lib.v) library to verify that the design goals are met. Since the WDDL gates are compound gates, we have an accurate representation in function of the original gates. The verification step gives an estimate on the critical path delay and the area requirements. The verification step also includes a gate level simulation.

In the place & route step, the fat gate level netlist (fat.v) is placed and routed. The place & route tool requires the fat gate library database (fat_lib.lef), containing cell macros and routing rules, and a functional description of that library (fat_lib.v). The tool executes the commands file (script). This file contains the instructions for, among other things, floor planning, power planning, routing, etc. Note that the information from the original library files is used in procedures such as clock-routing- and timing-driven placement. The resulting design file (fat.def) specifies the location of the cells in the core and of the wires connecting the cells.

Clock routing changes the fat gate level netlist (fat.v). The new netlist contains the buffers from the clock tree and the original fat gate level netlist. The differential gate level netlist (diff.v) must also be updated with this information. The fat gate level netlist can be generated by the place & route tool. Parsing this file will result in the new differential gate level netlist. A logic equivalence test between the original and the new differential gate level netlist ensures correctness.

The wires in a .def design file are described as lines between two points and vias are assigned as points. The wire width and via characteristics are defined in the .lef library database. The fat to differential routing transformation consists of two
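The cell substitution procedure described earlier in this section can be sketched on an in-memory netlist. The code below is a minimal illustration under several assumptions that are not from the paper: the netlist is a list of `Gate` records listed in topological order rather than parsed Verilog, the cell prefixes `WDDL_` and `FAT_` and the rail suffixes `_t` and `_f` are hypothetical naming conventions, and the bookkeeping of rail polarity for the fat nets is left out.

```python
from typing import Dict, List, NamedTuple, Tuple


class Gate(NamedTuple):
    cell: str                 # library cell, e.g. "AND2", "OR2", "INV"
    name: str                 # instance name
    inputs: Tuple[str, ...]   # input net names
    output: str               # output net name


class DiffGate(NamedTuple):
    cell: str
    name: str
    inputs: Tuple[Tuple[str, str], ...]  # (true rail, false rail) per input
    output: Tuple[str, str]              # (true rail, false rail)


def substitute_cells(netlist: List[Gate]) -> Tuple[List[DiffGate], List[Gate]]:
    """Build a differential netlist and a fat netlist from a gate list."""
    rails: Dict[str, Tuple[str, str]] = {}  # net -> (true rail, false rail)
    fat: Dict[str, str] = {}                # net -> fat net carrying both rails

    def lookup(net: str) -> Tuple[str, str]:
        if net not in rails:                # primary input or gate output
            rails[net] = (f"{net}_t", f"{net}_f")
            fat[net] = net
        return rails[net]

    diff_netlist: List[DiffGate] = []
    fat_netlist: List[Gate] = []
    for gate in netlist:
        if gate.cell == "INV":
            # The inverter is removed: its output is the input with rails swapped.
            true_rail, false_rail = lookup(gate.inputs[0])
            rails[gate.output] = (false_rail, true_rail)
            fat[gate.output] = fat[gate.inputs[0]]
            continue
        diff_inputs = tuple(lookup(net) for net in gate.inputs)
        diff_output = lookup(gate.output)
        diff_netlist.append(
            DiffGate(f"WDDL_{gate.cell}", gate.name, diff_inputs, diff_output))
        fat_netlist.append(
            Gate(f"FAT_{gate.cell}", gate.name,
                 tuple(fat[net] for net in gate.inputs), fat[gate.output]))
    return diff_netlist, fat_netlist


rtl = [
    Gate("AND2", "u1", ("a", "b"), "n1"),
    Gate("INV", "u2", ("n1",), "n2"),
    Gate("OR2", "u3", ("n2", "c"), "y"),
]
diff_netlist, fat_netlist = substitute_cells(rtl)
for gate in diff_netlist + fat_netlist:
    print(gate)
```

In the example, the inverter `u2` disappears from both outputs, and gate `u3` receives the rails of `n1` in swapped order, which mirrors the statement that inversions are implemented by switching the nets.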
Fig. 9. Number of measurements to disclosure (left); and peak-to-peak value of the differential traces at 2000 measurements (right).
Fig. 10. IC micrograph: secure coprocessor using WDDL and differential routing (left); and insecure coprocessor using standard cells and regular routing (right).
B. Prototype IC

The secure digital design flow described in this paper is applicable to large realistic designs. It is part of a domain specific codesign methodology for secure embedded systems [38]. It implements the secure portion of a system-on-a-chip (SOC) design. The prototype IC, depicted in Fig. 10, consists of two functionally identical coprocessors and is fabricated on the same die using a TSMC 6M 0.18-µm process [37]. An insecure coprocessor serving as benchmark is implemented using standard cells and regular routing techniques. A secure coprocessor is implemented through the secure digital design flow using WDDL and differential routing. Both coprocessors have been implemented starting from the same synthesized gate level netlist. The WDDL compound gates have been derived from the Artisan SAGE-X 0.18-µm 1.8-V static CMOS standard cell library [24] that has been used in the regular insecure design.

The cryptographic engine is an AES core. The data path is based on a single round of the AES-128 algorithm. A full encryption of 128-bit data using a 128-bit key takes precisely 11 cycles. Fig. 11 shows transient measurements of the encryption start signal and the supply current of the coprocessors with the AES cryptographic engine in OFB mode. The supply current of the insecure coprocessor exhibits large variations. It broadcasts the 11 encryption rounds. The high-power peak at the starting point of each new encryption can be used as a synchronization signal. The power consumption profile of the secure implementation, on the other hand, is invariant and does not reveal any information in a simple power analysis. In each clock cycle, the same total load capacitance is charged. To facilitate the synchronization of the measurements, however, we have access to the encryption start signal.

We performed a correlation DPA attack [36] on each coprocessor as it executed AES to find the secret key byte per byte. For the insecure implementation, the correct key bytes are found very easily. On average, 2000 measurements are required to disclose a key byte. In one case, a mere 320 samples were sufficient to mount a successful attack. The secure coprocessor, on the other hand, substantially improves the DPA resistance. Our measurements show that out of 16 key bytes, WDDL effectively protects five key bytes. One and a half million measurements, which is larger than the lifetime of the secret key in most practical systems, are not sufficient to disclose the correct key bytes. The 11 key bytes that are found require, on average, 255 000 measurements, an increase of more than two orders of magnitude when compared with the insecure coprocessor.

Table I summarizes the results. WDDL and differential routing is a functional technique to thwart power attacks. The tradeoff is a three times increase in area and a four times increase in power consumption and minimum clock period.
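The principle behind a correlation-based power attack on an unprotected implementation can be shown with a toy simulation. The sketch below does not reproduce the measurement setup or the attack of [36]. It assumes a Hamming weight leakage model, simulated traces with Gaussian noise, and a single 4-bit key nibble, and it uses the 4-bit S-box of the PRESENT cipher as a small stand-in for the AES S-box. All function names are hypothetical.

```python
import math
import random

# 4-bit S-box of the PRESENT cipher, a small stand-in for the AES S-box.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hamming_weight(value: int) -> int:
    return bin(value).count("1")


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def simulate_traces(key, plaintexts, noise_sigma, rng):
    """Assumed leakage: Hamming weight of the S-box output plus noise."""
    return [hamming_weight(SBOX[p ^ key]) + rng.gauss(0.0, noise_sigma)
            for p in plaintexts]


def recover_key(plaintexts, traces):
    """Correlate the traces with the leakage predicted for each key guess."""
    scores = {}
    for guess in range(16):
        model = [hamming_weight(SBOX[p ^ guess]) for p in plaintexts]
        scores[guess] = pearson(model, traces)
    best_guess = max(scores, key=scores.get)
    return best_guess, scores


rng = random.Random(1)
plaintexts = [rng.randrange(16) for _ in range(500)]
traces = simulate_traces(0x9, plaintexts, noise_sigma=1.0, rng=rng)
best_guess, scores = recover_key(plaintexts, traces)
print(hex(best_guess), round(scores[best_guess], 3))
```

The key guess whose predicted leakage correlates best with the traces is taken as the key. A logic style with constant power consumption aims to make the leakage term independent of the data, so that no guess stands out.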
Fig. 11. Transient measurement (2 encryptions and 22 clock cycles) of encryption start signal (top) and core supply current (bottom): Insecure coprocessor (left);
and secure coprocessor (right).
TABLE I
AES RESULTS SUMMARY
To our knowledge, the secure digital design flow is the first to deliver a working practical DPA countermeasure implemented and tested in actual silicon. All other published techniques have never been implemented in silicon, have never been measured and attacked, or did not offer any significant DPA resistance.

A dual rail asynchronous chip has been presented previously [12]. The implementation, however, did not provide a significant increase in DPA resistance. This failure has been attributed to unbalanced signal paths caused by routing differences. Note that if asynchronous logic is used to increase the DPA resistance, dual rail encoded asynchronous logic must be used. Because of the dual rail logic, there is also a factor three area increase compared with a single-ended synchronous benchmark [13].

Algorithmic countermeasures are mathematically DPA resistant. In practice, however, proposed solutions actually have been insecure [9], [10]. We are aware of one silicon implementation of an algorithmic countermeasure [39]. Measurements and assessment of the DPA resistance, however, have not yet been performed.

VI. CONCLUSION

We have presented a secure digital design flow. The design flow provides an accessible means to fabricate a security IC that is SCA resistant regardless of the implementation details. The approach is independent of the cryptographic algorithm implemented. It relies on a logic style that has constant power consumption and a place & route approach that controls the parasitic effects: WDDL has exactly one charging event per cycle and differential pair routing matches the interconnect capacitances of the true and false output signals. The design flow is completely supported by mainstream EDA tools and uses a commercially available static CMOS standard cell library. The differences with a regular synchronous CMOS standard cell design flow are minor. The secure digital design flow starts
from a normal design in a hardware description language and [14] L. Plana, P. Riocreux, W. Bainbridge, A. Bardsley, S. Temple,
only a few key modifications with a minimal influence on the J. Garside, and Z. Yu, “SPA—A secure amulet core for smartcard
applications,” J. Microprocess. Microsyst., vol. 27, no. 9, pp. 431–446,
design flow are incorporated in the back end of the design flow. Oct. 2003.
The additional steps required only a total of 6 min of CPU time [15] K. Tiri and I. Verbauwhede, “A logic level design methodology for
for the prototype IC. A cell substitution phase and an intercon- a secure DPA resistant ASIC or FPGA implementation,” in Proc.
Design, Automation and Test Eur. Conf. (DATE), Paris, France, 2004,
nect decomposition phase parse intermediate design files. The pp. 246–251.
former procedure modifies the gate level description, the latter [16] ——, “A VLSI design flow for secure side-channel attack resistant
duplicates and translates the interconnect wires. Measurement- ICs,” in Proc. Design, Automation and Test Eur. Conf. (DATE), Munich,
Germany, 2005, pp. 58–63.
based experimental results have demonstrated that it is a work- [17] M. Renaudin, F. Bouesse, P. Proust, J. P. Tual, L. Sourgen, and F. Germain,
ing practical technique to thwart power analysis attacks. It “High security smartcards,” in Proc. Design, Automation and Test Eur.
successfully protects AES on a prototype IC fabricated in an Conf. (DATE), Paris, France, 2004, pp. 228–232.
[18] J. Yoshida. (2004). “Smart card designers need security tools,”
0.18-µm CMOS. A DPA attack does not disclose the entire EEtimes. [Online]. Available: http://www.eedesign.com/showArticle.
secret key at 1 500 000 measurements, which is larger than the jhtml?articleID=17701143
lifetime of the secret key in most practical systems. [19] SCA Resistant Design (SCARD), 6th Framework Program of the Euro-

ACKNOWLEDGMENT

The authors acknowledge D. Ching, A. Hodjat, D. Hwang,
B.-C. Lai, Y. Matsuoka, P. Schaumont, and S. Yang for their
effort in the design of the ThumbPodII chip. This work was
performed while the authors were at UCLA.

R EFERENCES Circuits Conf. (ESSCIRC), Florence, Italy, 2002, pp. 403–406.
[1] E. Hess, N. Janssen, B. Meyer, and T. Schuetze, “Information leakage [24] Artisan SAGE-X Standard Cell Library. [Online]. Available: http://
attacks against smart card implementations of cryptographic algorithms www.artisan.com
and countermeasures—A survey,” in Proc. Eurosmart Security Conf., [25] D. Sokolov, J. Murphy, A. Bystrov, and A Yakovlev. (2004). Improving
Marseille, France, 2000, pp. 55–64. the Security of Dual-Rail Circuits. [Online]. Available: http://www.
[2] P. Kocher, “Timing attacks on implementations of Diffie-Hellman, RSA, staff.ncl.ac.uk/i.g.clark/async/tech-reports/NCL - EECE - MSD -TR-2004-
DSS, and other systems,” in Proc. Advances Cryptology (CRYPTO), Santa 101.pdf 2004
Barbara, CA, 1996, vol. 1109, pp. 104–113. [26] ITRS. (2003). “Interconnect,” The International Technology Road-
[3] P. Kocher, J. Jaffe, and B. Jun, “Differential power analysis,” in Proc. map for Semiconductors. [Online]. Available: http://public.itrs.net/Files/
Advances Cryptology (CRYPTO), Santa Barbara, CA, 1999, vol. 1666, 2003ITRS/Interconnect2003.pdf
pp. 388–397. [27] K. Tiri and I. Verbauwhede, “Place and route for secure standard cell
[4] J. Quisquater and D. Samyde, “ElectroMagnetic analysis (EMA): Mea- design,” in Proc. Smart Card Research and Advanced Application IFIP
sures and counter-measures for smart cards,” in Proc. Smart Card Pro- Conf. (CARDIS), Toulouse, France, 2004, pp. 143–158.
gramming and Security (E-smart), Cannes, France, 2001, vol. 2140, [28] S. Khatri, A. Mehrotra, R. Brayton, A. Sangiovanni-Vincentelli, and
pp. 200–210. R. Otten, “A novel VLSI layout fabric for deep sub-micron applica-
[5] P. Kocher, R. Lee, G. McGraw, A. Raghunathan, and S. Ravi, “Security tions,” in Proc. Design Automation Conf. (DAC), New Orleans, LA, 1999,
as a new dimension in embedded system design,” in Proc. 41st Design pp. 491–496.
Automation Conf. (DAC), San Diego, CA, 2004, pp. 753–760. [29] Cadence Chip Assembly Router. [Online]. Available: http://www.cadence.
[6] S. Ravi, A. Raghunathan, and S. Chakradhar, “Tamper resistance mecha- com/products/custom_ic/chip_assembly
nisms for secure, embedded systems,” in Proc. 17th Int. Conf. Very Large [30] LEF/DEF Language Reference 5.5. (2003, Jan.). [Online]. Available:
Scale Integration Design (VLSID), Mumbai, India, 2004, pp. 605–610. http://www.openeda.org
[7] J. Daemen and V. Rijmen, “Resistance against implementation attacks: [31] Silicon Ensemble. [Online]. Available: http://www.cadence.com/products/
A comparative study of the AES proposals,” in Proc. 2nd Advanced digital_ic/sepks
Encryption Standard (AES) Candidate Conf., Rome, Italy, 1999, [32] Formality. [Online]. Available: http://www.synopsys.com/products/
pp. 122–132. [Online]. Available: http://csrc.nist.gov/encryption/aes/ verification/
round1/conf2/aes2conf.htm [33] Verplex LEC. [Online]. Available: http://www.cadence.com/products/
[8] S. Chari, C. S. Jutla, J. R. Rao, and P. Rohatgi, “Towards sound functional_ver/index.aspx
approaches to counteract power-analysis attacks,” in Proc. Advances [34] Design Analyzer. [Online]. Available: http://www.synopsys.com/products/
Cryptology (CRYPTO), Santa Barbara, CA, 1999, vol. 1666, pp. 398–412. logic/deanalyzer_ds.html
[9] E. Oswald, S. Mangard, and N. Pramstaller, “Secure and efficient mask- [35] Virtuoso. [Online]. Available: http://www.cadence.com/products/custom_
ing of AES—A mission impossible?” IACR Cryptology ePrint Archive, ic/index.aspx?lid=custom_ic_design
Santa Barbara, CA, Rep. 2004/134, Jun. 2004 [36] J. Coron, P. Kocher, and D. Naccache, “Statistics and secret leakage,” in
[10] S. Mangard, T. Popp, and B. Gammel, “Side-channel leakage of masked Financial Cryptography (FC), Anguilla, British West Indies, Feb. 2000,
CMOS gates,” in Cryptographers’ Track—RSA Conf. (CT-RSA), San vol. 1962, Lecture Notes in Computer Science, pp. 157–173.
Francisco, CA, Feb. 2005, pp. 351–365. [37] K. Tiri, D. Hwang, A. Hodjat, B. Lai, S. Yang, P. Schaumont, and
[11] K. Tiri and I. Verbauwhede, “Securing encryption algorithms against I. Verbauwhede, “AES-based cryptographic and biometric security co-
DPA at the logic level: Next generation smart card technology,” in processor IC in 0.18-µm CMOS resistant to side-channel power analysis
Proc. Cryptographic Hardware and Embedded Systems (CHES), Cologne, attacks,” in Symp. Very Large System Integration (VLSI) Technology and
Germany, 2003, vol. 2779, pp. 125–136. Circuits, Kyoto, Japan, Jun. 2005, pp. 216–219.
[12] J. Fournier, S. Moore, H. Li, R. Mullins, and G. Taylor, “Security [38] P. Schaumont and I. Verbauwhede, “Domain-specific codesign for
evaluation of asynchronous circuits,” in Proc. Cryptographic Hardware embedded security,” IEEE Computer, vol. 36, no. 4, pp. 68–74,
and Embedded Systems (CHES), Cologne, Germany, 2003, vol. 2779, Apr. 2003.
pp. 137–151. [39] N. Pramstaller, F. Gürkaynak, S. Häne, H. Kaeslin, N. Felber, and
[13] S. Moore, R. Anderson, R. Mullins, and G. Taylor, “Balanced self- W. Fichtner, “Towards an AES crypto-chip resistant to differential power
checking asynchronous logic for smart card applications,” J. Micro- analysis,” in 30th Eur. Solid-State Circuits Conf. (ESSCIRC), Leuven,
process. Microsyst., vol. 27, no. 9, pp. 421–430, Oct. 2003. Belgium, Sep. 2004, pp. 307–310.
1208 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 25, NO. 7, JULY 2006
Kris Tiri (S’99–M’06) was born in Bree, Belgium, in 1976. He received the M.S. degree in electrical engineering from the Katholieke Universiteit Leuven, Leuven, Belgium, in 1999, and the Ph.D. degree in electrical engineering from the University of California, Los Angeles, in 2005. His doctoral research focused on the design for side-channel attack resistant security integrated circuits (ICs).
He is currently with the Trusted Platform Laboratory of Intel Corporation, Hillsboro, OR. From 1999 to 2005, he was a Research Assistant with the Electrical Engineering Department of the University of California, Los Angeles. During the spring of 1999, he was with the COMELEC of Ecole Nationale Supérieure des Télécommunications, Paris, France. During 2001 and 2002, he was with IMEC, Heverlee, Belgium.
Dr. Tiri was awarded a Francqui Foundation Fellowship by the Belgian American Educational Foundation, in 1999, and he received the 2005 EDAA Outstanding Dissertation Award.

Ingrid Verbauwhede (M’92–SM’00) received the M.S. degree in electrical engineering, in 1984, and the Ph.D. degree in applied sciences, in 1991, both from the Katholieke Universiteit Leuven, Leuven, Belgium.
She was a Lecturer and Visiting Research Engineer at the University of California, Berkeley, from 1992 to 1994. From 1994 to 1998, she was a Principal Engineer first with TCSI and then with Atmel, Berkeley, CA. She joined UCLA in 1998 as an Associate Professor and the Katholieke Universiteit Leuven, in 2003. Her interests include circuits, processor architectures, and design methodologies for real-time embedded systems in application domains such as cryptography, security, digital signal processing, wireless, and high-speed communications.
Dr. Verbauwhede is or was a member of several program committees, including DAC, ISSCC, DATE, CHES, ICASSP. She is the design community chair on the 42nd and 43rd DAC executive community. More details of her embedded security research group can be found at www.emsec.ee.ucla.edu and www.esat.kuleuven.be/cosic.