
HPE 3PAR StoreServ

Workshop
15-16 Nov 2018

Tomasz Piasecki Technical Consultant Hewlett Packard Enterprise


[email protected]
(this presentation is partially based on Peter Mattei's technical docs)
Table of Contents

3PAR OS Virtualization Concepts and space management
• Understand space utilization
• Volume space
• Logical disk space
• CPG space
• Overprovisioning
• Distributed sparing
• Adaptive sparing
• Adaptive data reduction
• Thin deduplication
• Space reclamation
• Mapping space

3PAR OS Performance
• Understanding performance bottlenecks
• Servicing I/O
• Express layout
• Front end vs back end
• CPU
• Cache
• Adaptive Flash Cache
• SSD Performance
• Unbalanced systems
• Express writes
• I/O Queueing
• Troubleshooting discussion

System events & alerts
• View, interpret, and manage system events and alerts
• Spare Part Notification
• Single Click Locate
• Use the checkhealth command for troubleshooting
• Alert notifications for System Reporter
• Alerts threshold criteria editing
• SR Alert space metrics
• SSMC email notifications
• Event log monitoring and management
• Syslog support

Remote Copy & Peer Persistence
• Remote Copy Overview
• Supported Topologies
• Transport Layers
• Restrictions & Limitations
• Remote Copy Setup – CLI
• Remote Copy – SSMC
• Failure Scenarios
• Peer Persistence – Overview
Virtualization Concepts &
Space Utilization
• Understand space utilization
• Volume space
• Logical disk space
• CPG space
• Overprovisioning
• Distributed sparing
• Adaptive sparing
• Adaptive data reduction
• Thin deduplication
• Space reclamation
• Mapping space
The traditional way of setting up a Storage Array
High chance of fragmentation and hot-spots
Server admin: For my five servers I need five 2TB LUNs with average performance
Storage admin: OK, let's build a RAID5 pool consisting of five RAID5 4+1 groups. Ah, we also need some spares of course. Now I am carving five 2TB LUNs out of the pool and present them to you.
Server admin: Listen, I have two more servers. They also need 2TB LUNs but with very high write performance
Storage admin: Okie-dokie, I am going to have more drives installed, will build a RAID1 pool and create new LUNs which will have higher write performance
Server admin: I also need to create snapshots of my LUNs
Storage admin: No problem, I'll have another set of drives installed to create a snapshot pool

(Figure: traditional controllers with dedicated RAID5 groups, RAID1 groups, spare drives and a separate snapshot pool.)
The 3PAR way of setting up a Storage Array
Optimal use of resources
Server admin: For my five servers I need five 2TB LUNs with average performance
Storage admin: OK, easy! I just create 5 VVs in a RAID5 CPG striped across all available drives and present the VLUNs to you. Thanks to distributed sparing I don't have to waste drives for sparing.
Server admin: Listen, I have two more servers. They also need 2TB LUNs but with very high write performance
Storage admin: No problem, I just create two more VVs in a RAID1 CPG striped across the very same drives. These new LUNs have higher write performance.
Server admin: I also need to create snapshots of my LUNs
Storage admin: Easy, I still have enough space on my set of physical drives to hold your snapshots.

(Figure: 3PAR controller nodes with all RAID levels sharing the same physical drives.)
Terminology 1/2
Chunklets, LD, CPG

• Chunklets - Physical disks (PD) are divided into 1 GiB chunklets. Each chunklet occupies
contiguous space on a physical disk. Chunklets are automatically created by the HPE 3PAR Operating
System and they are used to create Logical Disks (LD). A chunklet is assigned to only one LD.
• Logical Disks (LD) - collection of chunklets arranged as rows of RAID sets. Each RAID set is made
up of chunklets from different physical disks. LDs are pooled together in Common Provisioning Groups
(CPG), which allocate space to virtual volumes.
• Common Provisioning Groups (CPG) - defines the LD creation characteristics, such as RAID type,
set size, disk type plus total space warning, and limit. A CPG is a virtual pool of LDs that allocates
space to virtual volumes (VV) on demand.
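These layers can be inspected directly from the 3PAR CLI; a minimal sketch (output omitted):

cli% showpd     # physical disks and their free/used chunklet counts
cli% showld     # logical disks, their RAID type, set size and owning node
cli% showcpg    # CPGs and the LD space pooled in them
cli% showvv     # virtual volumes drawing space from the CPGs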
Logical disk types
LDs used by system

The system creates LDs for logging, preserved data, and system administration.
These LDs are multilevel LDs with 3-way mirrors.

• Logging LDs - RAID 10 LDs - temporarily hold data during disk failures and disk replacement.
Created by the system during the initial installation. Depending on the system model, each controller node
in the system has a 20 GiB or 60 GiB logging LD.
• Preserved data LDs - RAID 10 LDs used to hold preserved data.
Created by the system during the initial installation. The size of the preserved data LD is based on the
amount of data cache in the system

• Administration volume LDs - provide storage for the admin volume, a single volume (not an LD!) created on each
system during installation. Used to store system administrative data such as the system event log.
Terminology 2/2
Virtual Volumes FPVV & TPVV and CPG space allocation

• Virtual Volumes (VV) - draw their resources from CPG and volumes are exported as LUN to hosts.
Virtual volumes are the only data layers visible to the hosts.

• Fully Provisioned Virtual Volume (FPVV) - has a fixed size and dedicated LDs.
FPVV reserves the entire amount of space required, whether or not the space is actually used.

• Thinly Provisioned Virtual Volume (TPVV) - uses Logical Disks (LD) that belong to a CPG.
TPVVs associated with the same CPG draw space from the LD pool as needed, allocating space on
demand in 16 KiB increments.
When the volumes require additional storage, the HPE 3PAR OS automatically creates additional LDs.
The CPG can grow until it reaches the user-defined growth limit, which restricts the CPG max. size.
These allocations are adaptive, because subsequent allocations are based on the rate of consumption
for previously allocated space. For example, if a TPVV is initially allocated 1 GiB for each node, but
consumes that space in less than 60 seconds, the next allocation becomes 2 GiB for each node
Virtual Volume Types

There are 3 types of virtual volumes:

• User Volumes (FPVV or TPVV):
  • Fully Provisioned Virtual Volume (FPVV)
  • Thinly Provisioned Virtual Volume (TPVV)
• Administrative Volume admin (created by the system and for system usage only)

FPVVs and TPVVs have 3 separate data components:

• User space - contains the user data and is exported as a LUN to the host.
• Snapshot space - contains copies of user data that changed since the previous snapshot.
• Admin space - contains pointers to copies of user data in the snapshot space.
Virtual Volume Types
TPVV – Warnings & Limits
Allocation warning threshold - The user-defined threshold at which the system generates an alert.
This threshold is a percentage of the virtual size of the volume, the size that the volume presents to the
host.
Allocation limit threshold - The user-defined threshold at which writes fail, preventing the volume from
consuming additional resources. This threshold is a percentage of the virtual size of the volume, the size
that the volume presents to the host.

Applicable only to Thin Provisioned Volumes!


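For illustration, these thresholds can be set when a TPVV is created; a minimal CLI sketch, assuming the -usr_aw/-usr_al percentage options of createvv and an existing CPG named FC_r5 (names and values are examples only):

cli% createvv -tpvv -usr_aw 75 -usr_al 90 FC_r5 tpvv_demo 2T   # alert at 75%, stop new allocations at 90% of the 2 TiB virtual size
cli% showvv -s tpvv_demo                                       # the Wrn/Lim columns show the configured thresholds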
RAID Types

The HPE 3PAR storage system supports the following RAID types:
• RAID 0
• RAID 10 (RAID 1)
• RAID 50 (RAID 5)
• RAID MP (Multi-Parity) or RAID 6 (default)
RAID Types – Raid 0
Set Size, Step Size, Raw Size

RAID 0
• Set size is always = 1
• The number of sets in a row is the row size
• The system accesses data from a RAID 0 LD in step sizes
• The step size is the number of contiguous bytes that the system accesses before moving on to the next chunklet

(Figure: an LD with a set size of 1 and a row size of 3.)

RAID Types – Raid 1 & 10
Set Size, Step Size, Raw Size

• On a RAID 10 LD, data is striped across RAID 1 (or mirrored) sets
• A RAID 1 set is made up of 2 or more chunklets that contain the same data
• The number of chunklets in a RAID 1 set is the set size, or mirror depth. The number of sets in each row is the row size. The maximum row size is 40
• The system accesses data from a RAID 10 LD in step sizes
• A RAID 1 set can function with the loss of all but one of the chunklets in the set

(Figure: an LD with a set size of 2 and a row size of 3.)


RAID Types – Raid 5 & 50
Set Size, Step Size, Raw Size

• On a RAID 50 LD, data is striped across rows of RAID 5 sets
• A RAID 5 set must contain at least 3 chunklets. A RAID 5 set with 3 chunklets has a total of 2 chunklets of space for data and 1 chunklet of space for parity. Set sizes with between 3 and 9 chunklets are supported
• The data and parity steps are striped across each chunklet in the set
• The number of sets in a row is the row size
• The system accesses the data from a RAID 50 LD in step sizes

(Figure: an LD with a set size of 3 and a row size of 2.)


RAID Types – Raid 6
Set Size, Step Size, Raw Size

• A RAID MP set with 8 chunklets has a total of 6 chunklets of space for data and 2 chunklets of space for parity
• RAID MP set sizes of 8 and 16 chunklets are supported
• The data and parity steps are striped across each chunklet in the set
• The number of sets in a row is the row size. The system accesses the data from a RAID 6 LD in step sizes. The step size varies and is dependent on the size of the RAID 6 set

(Figure: a RAID MP LD with a set size of 8 and 2 sets in 1 row; the 2nd set is shown below the first set. In the first RAID 6 set, p0 is the parity step for data steps F, L, M, Q, T, V, and X.)
Common Provisioning Groups (CPG)

CPGs are Policies that define Service and Availability level by:
• Drive type (SSD, FC – Fast Class, NL – Nearline)
• Number of Drives (default = all; striping width)
• RAID type and set size (R10 / R50 2:1 to 8:1 / R60 4:2*; 6:2; 8:2*; 10:2; 14:2*)
• High availability level (HA magazine or Cage)

Multiple CPGs can be configured and optionally overlap the same drives
• i.e. a System with 200 drives can have one CPG containing all 200 drives and other CPGs with
overlapping subsets of these 200 drives.

CPGs have many functions:


• They are the policies by which free Chunklets are assembled into logical disks
• They are a container for existing volumes and used for reporting
• They are the basis for service levels and our optimization products.
Create CPG with SSMC
Easy and straight forward

On the Mega Menu select Common Provisioning Group > Create CPG
Define and select:
• CPG name
• 3PAR System
• Device Type
7k, 10k, 15k, SSD100, SSD150
• RAID Type
RAID 1, 5, 6

Hit Create
In the Advanced options you can define
and select
• Set Size
• Cage Availability
• Growth Size
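The same CPG can be created from the CLI; a minimal sketch with illustrative names and values:

cli% createcpg -t r6 -ssz 8 -ha cage -p -devtype SSD CPG_SSD_r6   # RAID 6 (6+2), HA cage, SSD chunklets only
cli% showcpg CPG_SSD_r6                                           # verify the new CPG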
CPG & space management
Growth increments, warnings, and limits

• Within CPG we can create Fully Provisioned Virtual Volumes (FPVV) and Thinly Provisioned Virtual
Volumes (TPVV) that draw space from the LD pool on the CPG

• CPG is configured to grow new LDs automatically when the amount of available LD space falls below a
configured threshold.

• When creating a CPG, set a growth increment, a growth warning, and a growth limit to restrict the
CPG growth and maximum size. By default, the growth warning and growth limit are set to none
CPG & space management
Growth increments
As volumes that draw from a CPG require additional storage, the system automatically creates additional
LDs according to the CPG growth increment. The default and minimum growth increments vary according
to the number of controller nodes in the system.

The optimal growth increment depends on several factors:


• Total available space
• Nature of the data
• Number of CPGs
• Number of volumes associated with those CPGs
• Anticipated growth rate of the volumes associated with the CPGs
CPG & space management
Growth warning & limit
• When the size of the volumes that draw from a CPG reaches the CPG growth warning, the system
generates an alert.
• The storage system does not prevent you from setting growth warnings that exceed the total capacity
of the system.
• If the volumes that draw from a CPG are allowed to reach the CPG growth limit, the system prevents
them from allocating additional space
• This mechanism stops an application or volume from exhausting all free space available to the CPG
and causing invalid (stale) snapshot volumes or new application write failures for volumes associated
with that CPG. However, the storage system does not prevent you from setting growth limits that
exceed the total capacity of the system
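A hedged CLI sketch of these growth settings, assuming the -sdgs/-sdgw/-sdgl options of createcpg and the setcpg command (CPG name and sizes are examples only):

cli% createcpg -t r5 -sdgs 64g -sdgw 2t -sdgl 3t CPG_FC_r5   # grow in 64 GiB LD increments, warn at 2 TiB, stop growing at 3 TiB
cli% setcpg -sdgw 2500g CPG_FC_r5                            # raise the growth warning on the existing CPG later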
CPG & Volume space management

3par cli% showcpg -s


-Private(MiB)- -----------(MiB)----------- ---------- Efficiency -----------
Id Name Warn% Base Snp Shared Free Total Compact Dedup Compress DataReduce Overprov
4 CPG_Linux_01 - 66560 0 17408 88064 172032 >25 1.08 1.87 1.92 0.01
0 CPG_Linux_XXXXXXXX - 73138816 0 49465344 556416 123160576 3.20 1.11 - 1.11 1.18
2 CPG_Linux_standby - 1024 1024 1024 82944 86016 - - - - 0.00
5 CPG_Windows_01 - 11136512 0 3725056 2943744 17805312 14.56 1.54 1.80 2.44 0.43
1 CPG_Windows_XXXXXXXX - 28284288 0 38205952 13891712 80381952 3.74 1.11 - 1.11 0.73
3 CPG_Windows_standby - 1024 1024 1024 82944 86016 - - - - 0.00
------------------------------------------------------------------------------------------------------------------
6 total 112628224 2048 91415808 17645824 221691904 4.06 1.15 1.80 1.00 0.53

3par cli% showvv -s


---------Snp---------- ---------------Usr--------------- -----------------Total------------------
--(MiB)-- -(% VSize)-- -------(MiB)------- --(% VSize)-- -----------------(MiB)------------------ ---Efficiency---
Id Name Prov Compr Dedup Type Rsvd Used Used Wrn Lim Rsvd Used Used Wrn Lim Rsvd Used HostWr VSize Compact Compress
136 .shared.CPG_Linux_01 dds NA No base 0 0 0.0 -- -- 17408 8174 0.0 -- -- 17408 8174 -- 67108864 -- --
21 mdm_tst_01 tdvv Yes Yes base 0 0 0.0 -- -- 66560 61858 2.9 0 0 66560 61858 133943 2097152 >25 1.87
128 .shared.CPG_Windows_01 dds NA No base 0 0 0.0 -- -- 2875520 2858643 4.3 -- -- 2875520 2858643 -- 67108864 -- --
8 dmz_tst_01 tdvv Yes Yes base 0 0 0.0 -- -- 424960 336593 6.4 0 0 424960 336593 948660 5242880 15.58 1.91
24 sqle_tst_02 tdvv Yes Yes base 0 0 0.0 -- -- 467840 467840 4.5 0 0 467840 467840 2679447 10485760 22.41 3.52
38 win _tst_01 tdvv Yes Yes base 0 0 0.0 -- -- 1840128 1820600 17.4 0 0 1840128 1820600 5226134 10485760 5.76 1.62
39 win_tst_02 tdvv Yes Yes base 0 0 0.0 -- -- 992256 942890 9.0 0 0 992256 942890 3027552 10485760 11.12 1.63
CPG, LD & Volume space management
Lab Demonstration – Differences between Full and Thin Space Management

Thin volume (TPVV):                Full volume (FPVV):

createvv -tpvv FC_r1 wolume1 1G    createvv FC_r1 wolume1 1G
showvvmap wolume1                  showvvmap wolume1
showld                             showld
showcpg -s                         showcpg -s
removevv wolume1                   removevv wolume1
showcpg -s                         showcpg -s

createvv -tpvv FC_r1 wolume2 1G    createvv FC_r1 wolume2 1G
showvvmap wolume2                  showvvmap wolume2
showcpg -s                         showcpg -s
showvvmap wolume1                  showvvmap wolume1
HPE 3PAR – Virtual Volume vs. LDs

Full VVs have their own LDs; Thin VVs share the same LDs in a CPG

• Virtual Volumes (full or thin) draw their space from CPGs
• A CPG contains the LDs; 1GB chunklets are assembled into LDs to form the CPG
• RAID is done at the LD level
3PAR Virtualization – the Logical View
With one drive type
3PAR autonomy: Physical Drives formatted in 1GB Chunklets → Logical Disks (LD) autonomically created → CPG(s)
User initiated: Virtual Volumes → Exported LUNs

Example CPGs:
• Device Type FC, RAID Type 10, Set Size 2
• Device Type FC, RAID Type 50, Set Size 7+1
3PAR Virtualization – the Logical View
With three drive types
3PAR autonomy: Physical Drives formatted in 1GB Chunklets → Logical Disks (LD) autonomically created → CPG(s)
User initiated: Virtual Volumes → Exported LUNs

• SSD drives form SSD LDs
• Fast Class drives form FC LDs
• Nearline drives form NL LDs

AO = Adaptive Optimization (moves data between the SSD, FC and NL CPGs)
Why are Chunklets so Important?
Think of many Virtual Drives on a single Physical Drive
Ease of use and Drive Utilization
• Array managed by policies, not by administrative planning
• The same drives can service all RAID types at the same time
– RAID10
– RAID50 – 2:1 to 8:1
– RAID60 – 4:2*; 6:2; 8:2*; 10:2; 14:2*
• Transparent mobility between drives and RAID types
thanks to Dynamic and Adaptive Optimization
Performance
• Enables wide-striping across hundreds of drives
• Avoids hot-spots
• Autonomic data restriping after disk installations
High Availability – selectable by CPG
• HA Drive/Magazine - Protect against drive/magazine failure (Industry standard)
• HA Cage - Protect against a cage failure (complete drive Enclosure)
* Preferred, performance-enhanced R6 set sizes
3PAR Virtualization Concept
Drive ownership – Example 1: 2-Node System with 2 drive enclosures

• Nodes are installed in pairs for redundancy and write cache mirroring
• 2-node configuration with 2 drive enclosures
• A Physical Drive (PD) is owned by one node: Node 0 owns the odd drives, Node 1 owns the even drives

Note: if one node fails the partner node takes ownership of all PDs
3PAR Virtualization Concept
Drive ownership – Example 2: 4-Node system with 8 drive enclosures

• Nodes are installed in pairs for redundancy and write cache mirroring
• 3PAR StoreServ arrays with 4 or more nodes support "Cache Persistency"
• 4-node configuration with 8 drive enclosures: a Drive Enclosure belongs to a node pair (Nodes 0/1 or Nodes 2/3)
• A Physical Drive (PD) is owned by one node: Nodes 0 and 2 own the odd drives, Nodes 1 and 3 own the even drives

Note: if one node fails the partner node takes ownership of the failed node's PDs
3PAR Virtualization Concept
End-to-end on a 4-node system

1. Physical drives (PD) are automatically formatted in 1GB Chunklets (disk initialization)
2. Chunklets are bound together to form Logical Disks (LD) in the format defined in the CPG policies – RAID level (e.g. RAID5 3+1), step size, set size and redundancy
3. Virtual Volumes are built striped across all LDs of all nodes, from all drives defined in a particular CPG – autonomic wide-striping across all Logical Disks (LD)
4. Virtual Volumes can now be exported as LUNs to servers – presented and accessed across multiple active-active paths (HBAs, Fabrics, Nodes) with active-active multipathing on the server
Set size, Step size, Regions

Set size (ssz) The size of a RAID set. For example RAID5 with set size 4 = 3+1
On RAID6, set size 8 = 6+2

Step size Stripe size = how often do we jump from one chunklet to another
Default RAID1 = 256KB for FC/NL, 32KB for SSD
Default RAID5 = 128KB for FC/NL, 64KB for SSD
Default RAID6 = 64KB for 6+2, 32KB for 14+2

Region minimum protected data allocated (reserved) to a VV


LDs are split in regions of 128 MiB
Region cannot be shared by multiple VVs
Regions are the unit of data that Adaptive Optimization moves between CPGs

3PAR Virtualization Concept
Chunklets, Regions, Virtual Volumes and Steps
Default step sizes:

         Fast Class   Nearline    SSD
RAID1    256KB        256KB       32KB
RAID50   128KB        128KB       64KB
RAID60   32 - 64KB    32 - 64KB   32 - 64KB

(Figure: host I/O steps land on Virtual Volume regions, which map to 128MB LD regions; the LDs – e.g. RAID 50 (7+1, set size 8) or RAID 10 (set size 2) – are built from chunklets on the physical disks.)
3PAR Virtualization Concept
Pages

• Pages are allocated sequentially in regions, regardless of where they are written on the TPVV
• 1 page = 16 KiB of used data
• When writing less than 16 KiB of consecutive data, the system will fill up the remaining data with zeroes (default policy on a TPVV)


Reclaiming unmapped LD space from CPG

• The HPE 3PAR OS space consolidation features allow to change the way that virtual volumes
(VV) are mapped to logical disks (LD) in a CPG.
• Moving VV regions from one LD to another enables you to compact the LDs, and frees up disk
space to be reclaimed for use by the system.
• What causes CPG space to be in a nonoptimal state:
– Volumes are deleted,
– Volume copy space grows and then shrinks,

• Compact CPG - allows you to reclaim space from a CPG that has become less efficient in space usage due to
creating, deleting, and relocating volumes. Compacting consolidates LD space in CPGs onto as few LDs as
possible
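A possible CLI sequence for the compaction described above (CPG name is an example):

cli% showcpg -s CPG_FC_r5    # note the Free and Total LD space before compaction
cli% compactcpg CPG_FC_r5    # consolidate VV regions onto as few LDs as possible and free the rest
cli% showcpg -s CPG_FC_r5    # the freed LD space returns to the system once its chunklets are re-initialized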

Logical disks and chunklet initialization

• After deleting logical disks, the underlying chunklets must be initialized before their space is
available to build logical disks. The initialization process for chunklets takes about
1 minute per 1 GiB chunklet.

• Deleting 1 TB volume means about 16 hours of processing to fully reclaim the space

• Use showpd –c command to see the progress. Chunklets that are uninitialized are listed in
the Uninit column

3PAR Thin Technologies

Adaptive Data Reduction (ADR)

Adaptive Data Reduction – collection of technologies designed to reduce your data footprint

There are four stages of ADR that progressively improve the efficiency of data stored on SSDs:
• Zero Detect
• Deduplication
• Compression
• Data Packing
Each feature uses a combination of a 5th generation hardware ASIC, in-memory metadata, and efficient
processor utilization to deliver optimal physical space utilization for hybrid and all-flash systems.
3PAR Adaptive Data Reduction Overview
Zero Detection
Examines all incoming write streams, identifies extended strings of zeros, and removes them to prevent unnecessary
data from being written to storage. It is performed within the HPE 3PAR ASIC, not only do all operations take place
inline and at wire-speed, but they consume no CPU cycles so they do not impact system operations

Start Thin with Thin Provisioning – buy less storage capacity
Get Thin with Thin Conversion (Zero Detect) – reduce tech refresh costs
Stay Thin with Thin Persistence – integrated space reclamation

(Figure: 24TB presented to a Linux host; after zero-detect conversion only ~3TB plus a buffer is consumed.)
3PAR Thin Persistence – Manual thin reclaim
Initial state:
• LUN1 and LUN2 are ThP VVs
• Data 1 and Data 2 is actually written data

After a while:
• Files deleted by the servers/file system still occupy space on storage

Zero-out unused space:
• Windows: sdelete (free Microsoft utility)
• Unix/Linux: dd script
• on ext4: mount -o discard

Run Thin Reclamation:
• Compact CPG and Logical Disks
• Freed-up space is returned to the free Chunklets
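For reference, typical host-side commands for the zero-out step (drive letters, devices and paths are examples only):

Windows:       sdelete -z D:                                                    # overwrite the free space of drive D: with zeroes
Linux:         dd if=/dev/zero of=/mnt/fs/zerofile bs=1M; rm /mnt/fs/zerofile   # fill the free space with a zero file, then delete it
Linux (ext4):  mount -o discard /dev/vg01/lv01 /mnt/fs                          # alternatively let the file system send UNMAP on delete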
3PAR Thin Persistence in VMware vSphere Environments
Introduced with vSphere 4.x
• VMware VMFS supports three formats for VM disk images
− Thin
− Thick - ZeroedThick (ZT)
− EagerZeroedThick (EZT)

• 3PAR Adaptive Data Reduction Technologies work with and optimize all three formats
• VMware recommends EZT for highest performance; see VMware Whitepaper
• vStorage API for Array Integration (VAAI)
− Thin VMotion
− Active Thin Reclamation
Introduced with vSphere 5.0
• VM Space Reclamation
− Using vmkfstools to manually reclaim VMFS deleted blocks; see KB article
− Leverages industry standard T10 UNMAP*
− Supported with VMware vSphere 5.0 and 3PAR OS 3.1.x
Introduced with vSphere 5.5
− esxcli storage vmfs unmap datastore_id
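An illustrative invocation of the vSphere 5.5+ command above (datastore label and block count are examples only):

esxcli storage vmfs unmap -l Datastore01 -n 200    # reclaim dead space on the datastore, 200 VMFS blocks per iteration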
Automatically reclaim space with UNMAP in Windows

Active Thin Reclamation with T10 SCSI UNMAP


– HPE implemented support for UNMAP on 3PAR in
accordance to the T10 SCSI industry standards
– Detects/identifies thinly provisioned virtual disks
– Supported on Windows Server 2012 and 2016
– UNMAP - returns storage when no longer needed
– Scheduled UNMAP runs at times of low server IO/CPU
utilization and at scheduled times
– Also runs at time of permanent file deletion; i.e. when
NOT put into the recycle bin
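Reclamation can also be triggered on demand from the host; an illustrative PowerShell call on Windows Server 2012/2016 (drive letter is an example):

Optimize-Volume -DriveLetter D -ReTrim -Verbose    # sends UNMAP/TRIM for the free space of volume D: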
Linux space reclamation
• Automatic if the mounted file system supports periodic (online) discard i.e. it is mounted with “discard”
option.
(e.g. Red Hat 6.x ext4, SLES 11 SP2 Btrfs)
For example in /etc/fstab:
– /dev/vgappP01/lvappP01 /app ext4 defaults,discard 2 2

• Use batch discard via command fstrim(8).


– This approach is implemented via FITRIM ioctl
– The file system itself has to support this ioctl
– At this point ext4, ext3, xfs, btrfs and ocfs2 support it
For example:
# fstrim -v /myfs
3PAR Deduplication and Compression
A simpler way to understand the differences and target use-cases

Deduplication:
• Per CPG
• Removes and references duplicate 16kB pages

Compression:
• Per Virtual Volume
• Based on the LZ4 lossless compression algorithm (focused on compression and decompression speed)

(Figure: duplicate 16kB pages from the original VVs 1..n are stored only once in the CPG; a compressed VV consumes less space on SSD than the original VV.)
3PAR Deduplication
A simpler way to understand the differences and target use-cases
• Deduplication eliminates duplicated data on your SSDs and
reduces the amount of capacity required to store data.

• Deduplication uses the HPE 3PAR ASIC.


- assigns a unique "fingerprint" to write requests.
- saves these "fingerprints" for future reference.
- cross-references the "fingerprints" of the new data against
previously captured data.

• Instead of writing the duplicative data to storage, the system


records a "pointer" to the original data.

• Deduplication is an inline process


HPE 3PAR StoreServ Deduplication
Enhanced with 3PAR OS 3.3.1

Pre 3.3.1 with TDVV1 and TDVV2:
– All new writes go to the DDS
– Only collision data is stored in the DDC
– Data overwrites are written to a new location in the DDS – the old data becomes garbage
– Garbage collection potentially produces lots of backend IO

New with 3.3.1 and TDVV3:
– New writes go to the DDC
– A second write with the same data is written to the DDS and the first write owner is informed to update its map
– Results in less garbage collection and higher overall performance

Deduplication: the write data hash is calculated by the 3PAR ASIC while the data is in cache; only unique data is written.
DDC – Client Volume – one per Virtual Volume
DDS – Dedup Store – one per CPG
HPE 3PAR StoreServ Deduplication
Requirements & Limitations

– No additional license for using dedupe
– Available to all models with the Gen 4/5 ASIC: 7000, 8000, 10,000 and 20,000 series
– Virtual Volumes must be on flash drives, so this negates the use of AO with dedupe

Deduplication: the write data hash is calculated by the 3PAR ASIC while the data is in cache; only unique data is written.
DDC – Client Volume – one per Virtual Volume
DDS – Dedup Store – one per CPG
Allocation units and alignment
Important things to consider when implementing ADR
• The granularity of 3PAR deduplication is 16 KiB and therefore the dedup efficiency is best when
the blocks written are aligned to this granularity
• For hosts that use file systems with tunable allocation units consider setting the allocation unit to
16 KiB or a multiple of 16 KiB
• For Microsoft Windows the NTFS default allocation unit size is 4 KiB for volumes up to 16TB.
MS Article ID: 140365 https://support.microsoft.com/kb/140365?wa=wsignin1.0
Set the value to 16 KiB or higher when formatting the NTFS volume (see the example after this list); find the procedure here:
https://support.microsoft.com/en-us/kb/2272294
• Linux hosts using Btrfs, EXT4 and XFS file systems require no tuning
EXT3 file systems align to 4 KiB boundaries and cannot be tuned; for maximum dedup efficiency migrate to EXT4
• For applications that have tunable block sizes consider to setting the block size to 16 KiB or a multiple of 16 KiB

For more info also see the 3PAR Adaptive Data Reduction whitepaper
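An illustrative way to format an NTFS volume with a 16 KiB allocation unit (drive letter is an example):

format E: /FS:NTFS /A:16K /Q    # quick format with 16 KiB clusters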
Evolution of 3PAR Dedup Volumes (TDVV)

• 3.2.1 MU1 – TDVV1 – First release of Dedup for 7000 and 10000
• 3.2.2 GA – TDVV2 – Introduced on 8000 and 20000 providing defrag of the Dedup Store
• 3.3.1 GA – TDVV3 – Enhanced design with reduced overhead for 7k, 8k, 9k, 10k and 20k
showcpg –d (on a 3.3.1 system)
----Volumes---- -Usage- ---------(MiB)---------- --LD--- -RC_Usage- -Shared-
Id Name Warn% VVs TPVVs TDVVs Usr Snp Base Snp Free Total Usr Snp Usr Snp Version
4 NL_r6 - 0 0 0 0 0 0 0 0 0 0 0 0 0 -
1 SSD_r1 - 0 0 0 0 0 0 0 0 0 0 0 0 0 -
2 SSD_r5 - 9 4 4 5 4 205952 2048 17280 225280 0 2 0 0 2
3 SSD_r6 - 9 4 4 5 4 66432 2048 17536 86016 0 2 0 0 3
0 system_cpg - 1 0 0 1 1 102400 2560 13824 118784 2 2 0 0 -

• TDVV format is tied to the CPG it is provisioned from


• Existing customers' TDVV format will stay as it is
• New TDVVs created from CPGs that already contain TDVV1 or TDVV2 volumes will inherit that format
• New TDVVs on a new CPG, or on a CPG without any TDVV1 or TDVV2, will be created as TDVV3
• A system can have a mix of TDVV1/TDVV2 and TDVV3
• Online tune your existing TDVV1 or TDVV2 to TDVV3 (tunevv, see the sketch below)
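A hedged sketch of that online conversion, assuming the tunevv conversion options available in 3PAR OS 3.3.1 (volume and CPG names are examples only):

cli% tunevv usr_cpg SSD_r6 -tdvv myvol    # convert myvol to a TDVV3 drawing from CPG SSD_r6
cli% showvv -s myvol                      # verify the provisioning type and dedup ratio afterwards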
HPE 3PAR StoreServ Compression and Packing
New with 3PAR OS 3.3.1

– Available for Gen 5 systems (8k, 9k, 20k)
– No license required
– Data compression performed by the Intel CPU (LZ4 algorithm)
– Virtual Volumes must be on flash drives; compression can't be used with AO
– The minimum size of compressed VVs is 16 GiB
– Compression is on a per VV basis for thin and dedup volumes
– Compressed data pages are packed into 16kB 3PAR pages

(Flow: write data in cache → Express Scan → compression by the Intel CPU → compressed and packed write data)
HPE 3PAR StoreServ Adaptive Data Reduction technologies
Combined Dedup and Compression

(Flow: write data in cache → Express Scan → compression and deduplication by the Intel CPU → data packing → resulting data written to SSD)

When used together, duplicate pages are removed first and unique pages are then compressed
Compressed data presents a unique problem
After compression, blocks will have odd sizes

Uncompressed      Compressed
16KB              2.3KB
16KB              8.3KB
16KB              3.2KB
16KB              5.2KB
16KB              10.7KB
16KB              1.1KB
Total: 96KB       Total: 30.8KB

Theoretical saving: 65.2KB – 3.12:1
How should these odd-sized compressed blocks be stored on the backend?
Storing compressed data in variable block sizes
Some systems use variable block sizes (4K, 8K, 16K)

Uncompressed      Compressed      Backend
16KB              2.3KB           4KB
16KB              8.3KB           16KB
16KB              5.2KB           8KB
16KB              3.2KB           4KB
16KB              10.7KB          16KB
16KB              1.1KB           4KB
Total: 96KB       Total: 30.8KB   Total consumed: 52KB

Theoretical saving: 65.2KB – 3.12:1        Effective saving: 44KB – 1.85:1

Padding per-backend page means lots of wasted space, resulting in lower total system efficiency
HPE 3PAR Data Packing
16K optimized

Uncompressed      Compressed
16KB              2.3KB
16KB              8.3KB
16KB              5.2KB
16KB              3.2KB
16KB              10.7KB
16KB              1.1KB
Total: 96KB       Total: 30.8KB

3PAR Backend: the compressed pages are packed into two 16KB pages – Total consumed: 32KB

Theoretical saving: 65.2KB – 3.12:1        Effective saving: 64KB – 3:1
3PAR Compression Overview

• Compression occurs when pages are flushed from cache to the backend SSDs
• Only pages belonging to a single VV are compressed together
• The pages do not need to belong to contiguous addresses
• Pages belonging to different VVs will not be compressed together
• Pages belonging to different snapshots of same base VV also not compressed together
• When data is re-written to a compressed Virtual page, we try to recompress the data into the existing
compressed page (a "refit"). If the new compressed virtual page does not fit into the original page
then the virtual page is written to a new compressed page
• Up to eight 16 KiB cache pages can be compressed into a single 16 KiB compressed page
• The number will depend upon how compressible the data is
3PAR Compression Estimation
Dry run using the CLI – checkvv -compr_dryrun "VV_name"

The -compr_dryrun option relies on HPE 3PAR Thin Compression technology to emulate the benefit of converting VVs to compressed volumes. The information provided is applicable to HPE 3PAR Thin Compression only and it should be considered for estimation purposes only; results of a conversion may differ slightly.

The command can be run against live production volumes and will generate a non-intrusive background task. The task may take some time to run, depending on the size and number of volumes, and can be monitored via the showtask commands.

cs-3par5 cli% checkvv -compr_dryrun Test_VV
Do you want to continue with this operation?
select q=quit y=yes n=no: y
Task 5646 has been started to validate administration information for VVs: Test_VV
cs-3par5 cli%

cs-3par5 cli% showtask -d 5646
Id   Type          Name    Status Phase Step -------StartTime-------- -------FinishTime------- -Priority- -User--
5646 compr_dryrun  checkvv done   ---   ---  2017-04-19 17:25:26 CEST 2017-04-19 17:25:28 CEST n/a        pmattei

Detailed status:
2017-04-19 17:25:26 CEST Created  task.
2017-04-19 17:25:26 CEST Started  checkvv space estimation started with option -compr_dryrun
2017-04-19 17:25:28 CEST Finished checkvv space estimation process finished

             User Data (Compressed)
Id Name      Size(MB)  Size(MB) Ratio
39 Test_VV     105250     23183  4.54
-----------------------------------------
 1 total       105250     23183  4.54
cs-3par5 cli%
3PAR Combined Dedup and Compression Estimation
Dry run using the CLI – checkvv -dedup_compr_dryrun "VV_name"

The -dedup_compr_dryrun option relies on HPE 3PAR Thin Compression and Deduplication technology to emulate the total amount of space savings from converting one or more input VVs to compressed TDVVs. Please note that this is an estimation and results of a conversion may differ from these results.

The command can be run against live production volumes and will generate a non-intrusive background task. The task may take some time to run, depending on the size and number of volumes, and can be monitored via the showtask commands.

cs-3par5 cli% checkvv -dedup_compr_dryrun Test_VV
Do you want to continue with this operation?
select q=quit y=yes n=no: y
Task 5648 has been started to validate administration information for VVs: Test_VV
cs-3par5 cli%

cs-3par3 cli% showtask -d 5648
Id   Type               Name    Status Phase Step -------StartTime-------- -------FinishTime------- -Priority- -User--
5648 dedup_compr_dryrun checkvv done   ---   ---  2017-04-19 17:29:29 CEST 2017-04-19 17:32:17 CEST n/a        pmattei

Detailed status:
2017-04-19 17:29:29 CEST Created  task.
2017-04-19 17:29:29 CEST Started  checkvv space estimation started with option -dedup_compr_dryrun
2017-04-19 17:32:17 CEST Finished checkvv space estimation process finished

             -(User Data)- -(Compression)- -------(Dedup)------- -(DataReduce)-
Id Name           Size(MB) Size(MB) Ratio  Size(MB) Ratio        Size(MB) Ratio
39 Test_VV          105250    32089  3.28        --    --           32089  3.28
--------------------------------------------------------------------------------------
 1 total            105250    32089  3.28      5293 19.89            2978 35.34
cs-3par5 cli%
Space Savings & Host tuning
Windows
• Host File System block size and alignment can play a significant role in the space efficiency
• On Windows, it is highly recommended to adjust the block allocation unit for NTFS to 16K (or 32/64K)

The following example shows the effect of allocation unit on dedup ratios. Five TDVVs were created and then
formatted with NTFS file systems using allocation units of 4 KiB–64 KiB. Four copies of the Linux kernel source
tree, which contains many small files, were unpacked into each file system.

cli% showvv -space ntfs*


----Adm---- ---------Snp---------- ----------Usr-----------
---(MB)---- --(MB)--- -(% VSize)-- ---(MB)---- -(% VSize)-- ------(MB)------ -Capacity Efficiency-
Id Name Prov Type Rsvd Used Rsvd Used Used Wrn Lim Rsvd Used Used Wrn Lim Tot_Rsvd VSize Compaction Dedup
149 ntfs_4k tdvv base 6781 6233 0 0 0.0 -- -- 11831 8660 1.7 0 0 18612 512000 34.4 1.1
151 ntfs_8k tdvv base 6200 5679 0 0 0.0 -- -- 11006 7889 1.5 0 0 17206 512000 37.7 1.4
152 ntfs_16k tdvv base 3908 3498 0 0 0.0 -- -- 7748 4853 0.9 0 0 11656 512000 61.3 3.1
153 ntfs_32k tdvv base 3907 3491 0 0 0.0 -- -- 7748 4846 0.9 0 0 11655 512000 61.4 3.1
154 ntfs_64k tdvv base 3907 3494 0 0 0.0 -- -- 7748 4847 0.9 0 0 11655 512000 61.4 3.1
Space Savings & Host tuning
Linux

• On Linux, EXT4 and XFS will yield better dedupe efficiency than EXT3


Example: 3 TDVVs formatted with ext3, ext4 and XFS file systems.
Copied a 10GB non-dedupable file 4 times into each file system.

cli% showvv -s -cpg SSD_r1


----Adm---- ---------Snp---------- ----------Usr-----------
---(MB)---- --(MB)--- -(% VSize)-- ---(MB)---- -(% VSize)-- -----(MB)------ -Capacity Efficiency-
Id Name Prov Type Rsvd Used Rsvd Used Used Wrn Lim Rsvd Used Used Wrn Lim Tot_Rsvd VSize Compaction Dedup
296 ext3 tdvv base 10325 9595 0 0 0.0 -- -- 19230 14400 14.1 0 0 29555 102400 4.3 2.0
297 ext4 tdvv base 5270 4787 0 0 0.0 -- -- 11890 7171 7.0 0 0 17160 102400 8.6 4.0
298 xfs tdvv base 5269 4784 0 0 0.0 -- -- 9840 7181 7.0 0 0 15109 102400 8.6 4.0
---------------------------------------------------------------------------------------------------------------------
3 total 20864 19166 0 0 40960 28752 61824 307200 6.4 3.0
End of Adaptive Data Reduction

3PAR Autonomic Sets
Simplify Provisioning Autonomic 3PAR Storage
Traditional Storage – cluster of VMware vSphere servers, individual volumes V1–V10:
– Initial provisioning of the cluster requires 50 provisioning actions (1 per host–volume relationship)
– Add another host/server: requires 10 provisioning actions (1 per volume)
– Add another volume: requires 5 provisioning actions (1 per host)

Autonomic 3PAR Storage – Autonomic Host Set and Autonomic Volume Set:
– Initial provisioning of the cluster: add hosts to the Host Set, add volumes to the Volume Set, export the Volume Set to the Host Set
– Add another host/server: just add the host to the Host Set
– Add another volume: just add the volume to the Volume Set

(A CLI sketch of this workflow follows below.)
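A minimal CLI sketch of the autonomic workflow, assuming the set: notation of createvlun (set, host and volume names are examples only):

cli% createhostset ESX_Cluster                           # create an empty host set
cli% createhostset -add ESX_Cluster esx01                # add a cluster node to the set
cli% createvvset ESX_Datastores V1 V2 V3                 # create a volume set with three volumes
cli% createvlun set:ESX_Datastores auto set:ESX_Cluster  # one export covers every host/volume pairing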
3PAR Distributed and Adaptive
Sparing
HPE 3PAR High Availability - Sparing
Spare Disk Drives vs. Distributed Sparing

Traditional Arrays: dedicated spare drives; few-to-one rebuild – hotspots & long rebuild exposure
3PAR StoreServ: spare chunklets distributed across the drives; many-to-many rebuild – parallel rebuilds in less time
Spare space – Spare chunklets

Each physical disk in a 3PAR array is initialized with data and spare chunklets of 1GB each.

DC = 1GB Data Chunklet    SC = 1GB Spare Chunklet
(Figure: data and spare chunklets spread across every physical disk.)


Spare space – Spare chunklets
How spare chunklets work
• When a connection is lost to a physical disk all future writes to the disk are automatically written to a
logging LD until the physical disk comes back online, or until the time limit for logging is reached.
Logging disk space is allocated when the system is set up. This does not apply to RAID 0 chunklets,
which have no fault-tolerance.
• If the time limit for logging is reached, or if the logging LD becomes full, the relocation of chunklets on
the physical disk to free chunklets designated as spares starts automatically.
• For automatic relocations, the system uses up to a maximum of one disk worth of chunklets per system
node.
• When selecting a target chunklet for relocation, the system attempts to identify:
  – a local spare chunklet,
  – a local free chunklet,
  – a remote spare chunklet,
  – a remote free chunklet
HPE 3PAR High Availability – Rebuild Process
The system chooses a target spare chunklet based on criteria that are prioritized to maintain the same level
of performance and availability as the source chunklet, if possible:

1. Locate a chunklet on the same type of drive (e.g. NL, FC).


2. Maintain the same HA charact. as the failed chunklet (e.g. HA cage, HA mag.).
3. Keep the chunklet on the same node as the failed drive.

The best case: spare chunklets are on the same node as the failed drive with the same availability
characteristics

When spare chunklets with this criteria are not available, free chunklets with the same
characteristics are considered. During the sparing process if the number of free chunklets used
exceeds a threshold set in the HPE 3PAR StoreServ OS, consideration will be given to spare
chunklets on another node. This will help keep the array balanced
HPE 3PAR High Availability – Sparing policy
• Default: ~2.5 percent with minimums. Small configurations (e.g. 8000s with fewer than 48 disks) have a
spare space target equal to the size of 2 of the largest drives of the type being spared.
(setsys SparingAlgorithm Default)
• Minimal: ~2.5 percent without minimums. Minimal is the same as Default except for small configurations,
where it only sets aside spare space equal to one of the largest drives (compared to two for Default) of the
type being sized.
• Maximal: one disk's worth in every cage. Spare space is calculated as the space of the largest drive of the
drive type being sized for each disk cage.
• Custom: Implemented manually by the administrator using the createspare, showspare,
and removespare commands.
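For reference, the sparing policy and current spare allocation can be checked and changed from the CLI:

cli% showspare                          # list spare chunklets and how much of the spare space is used
cli% setsys SparingAlgorithm Minimal    # switch to the Minimal policy described above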
3PAR Adaptive Sparing
Optimizing SSD capacity and wear-out
1. Traditional SSD implementation
2. Adaptive Sparing - Sparing combining the Fixed OP space
with HPE 3PAR spare space. Adaptive sparing allows
what would usually be a 1.6 TB SSD, for example, to have
1.92 TB of usable capacity, (20 % increase in usable
capacity)
3. Adaptive Sparing 2.0 combining the Fixed OP space with
HPE 3PAR spare space and unused (free) space. Unused
free space on the drive allows the SSD to operate with
even greater OP space which further increases
endurance. When the space is needed by 3PAR to store
data, a normal write operation will return the space from
SSD OP space to 3PAR without performance penalty.
3PAR High-Availability and
Persistency Features

HA Enclosure (Cage) vs. HA Drive (Magazine)
Tier 1 Availability feature in a modular class array
(Figure: traditional RAID groups confined to single enclosures vs. 3PAR Raidlet sets striped across enclosures; each node owns its vertically protected drives.)

• Most modular arrays stripe Data and Parity across and within Enclosures; if you lose access to an Enclosure you lose access to ALL Data in the array
• Enterprise Arrays, including all HPE 3PAR arrays, allow HA configurations that stripe Data and Parity across Enclosures without compromising data availability
• In the event of an entire Enclosure outage, Data remains Online and available to Hosts
HA Enclosure (Cage) on a 2-node system
Max RAID-set sizes supporting HA Enclosure

Enclosures 1)   RAID1   RAID5   RAID6 2)
2               OK      NA      NA
3               OK      2+1     4+2
4               OK      3+1     6+2
5               OK      4+1     8+2
6               OK      5+1     10+2
7               OK      6+1     10+2
8               OK      7+1     14+2
9               OK      8+1     14+2

1) Including the controller enclosure
2) The preferred, performance enhanced RAID6 set sizes are: 4:2, 8:2, 14:2
3PAR StoreServ 8000 Hardware Architecture
Cost-effective, scalable, resilient, meshed, active-active

(Figure: node pairs with host ports, cache, disk ports and the 3PAR ASIC – 8200 with 2 nodes, 84x0 with 2 or 4 nodes.)
HA Enclosure (Cage) on a 4-node system
Max RAID-set sizes supporting HA Enclosure

Enclosures 1)   RAID1   RAID5   RAID6 2)
4 (2 x 2)       OK      NA      NA
6 (2 x 3)       OK      2+1     4+2
8 (2 x 4)       OK      3+1     6+2
10 (2 x 5)      OK      4+1     8+2
12 (2 x 6)      OK      5+1     10+2
14 (2 x 7)      OK      6+1     10+2
16 (2 x 8)      OK      7+1     14+2
18 (2 x 9)      OK      8+1     14+2

(Figure: the enclosures are split between node pair 0/1 and node pair 2/3.)

1) Including the controller enclosure
2) The preferred, performance enhanced RAID6 set sizes are: 4:2, 8:2, 14:2
3PAR Persistency features
• HA Cage / Enclosure – "Raidlet" groups built across cages (any RAID level)
• Persistent Cache – write cache re-mirroring on path or controller loss
• Persistent Ports – transparent handling of path or controller loss
• Persistent Checksum – T10-PI end-to-end data protection from the host HBA through the SAN switch, 3PAR front-end and back-end to the drives
3PAR High Availability
Guaranteed Drive Enclosure (Drive Cage) Availability if desired

Traditional Arrays:
• Enclosure-dependent RAID – RAID groups are built inside a single enclosure
• An enclosure (cage) failure might mean no access to data

3PAR StoreServ:
• Enclosure-independent RAID – "Raidlet" groups for any RAID level (e.g. RAID 50, RAID 10) are built across enclosures
• Data access is preserved with HA Enclosure (Cage)
• User selectable per CPG
3PAR High Availability
Write Cache Re-Mirroring
Traditional Mid-range Arrays:
• Traditional Write-Cache Mirroring – losing one controller either results in poor performance due to write-thru mode or risk of write data loss

3PAR StoreServ:
• Persistent Write-Cache Mirroring – the write cache stays on thanks to re-mirroring
• No write-thru mode, so performance stays consistent
• Standard behavior on all 4-, 6- and 8-node systems
Local Write
Local Write Definition: A write I/O request from a host that comes into a node
that owns the LD of the request.

Step 1: Write request from server to node 2 for an LD owned by node 2; the write hits node 2's write cache
Step 2: Write I/O mirrored to the write cache of node 3
Step 3: Once the write cache is mirrored, the write I/O is acknowledged to the host and the I/O is completed
Remote Write
Remote Write Definition: A write I/O request from a host that comes into a node
that differs from the LD owner of the request.

Step 1: Write request from server to node 2 for an LD owned by node 0
Step 2: Write I/O mirrored from node 2 to the write cache of node 0
Step 3: Log (16k page) placed into the cache of node 1 (backup owner of the LD)
Step 4: Once mirrored, the write I/O is acknowledged to the host and the I/O is completed
Local Read data in Cache
Local Read Definition: A read I/O request from a host that comes into a
node that owns the LD of the request.

Step 1: Read request from server to node 2 for an LD owned by node 2; the data is found in cache (the system never reads from the backup owner of the LD, node 3)
Step 2: Data returned to host, IO complete

Note: If the data is not in cache it is first read from disk into cache
Remote Read
Remote Read Definition: A read I/O request from a host that comes into a node
that does not own the LD of the request

Step 1: Read request arrives on node 2 for data that is not local to that node
Step 2: Node 2 requests the data from the LD owner (node 0)
Step 3: Data is read from cache on the owning node and copied to the requesting node; the cache page is marked invalid and not used for further reads
Step 4: Node 2 replies to the host with the data requested

Note: If the data is not in cache it is first read from disk into cache
Write with the failed Node
A Write I/O request from a host that comes into a failed node

(Figure: 4-node system – each node with Intel multi-core processors, multifunction controller, control and data cache, 3PAR Gen4 ASIC and PCIe switches; a virtual volume is presented to the server.)

– Node 3 fails and its LDs become owned by node 2
– Write request from server: the server sends the data to a surviving node
– The processor determines which node's write cache the data is for
– The data is DMAed to node 2 and placed in node 2's cache
– Xfer Ready / IO Complete are returned to the host
Multipathing of traditional arrays
Path loss and controller maintenance or loss behavior

• Path and controller loss have to be handled by MPIO
• Depending on the settings a failover can take up to 60 seconds
• Requires regular maintenance/patching of MPIO

– A path loss requires an MPIO path failover, which can be lengthy or even fail
– A controller maintenance or loss requires an MPIO path failover of all paths
3PAR Persistent Ports
Path loss, controller maintenance or loss behavior of 3PAR arrays

• No user intervention required
• In FC SAN environments all paths stay online in case of loss of signal of an FC path, during node maintenance and in case of a node failure
• For FC, iSCSI and FCoE deployments all paths stay online during node maintenance and in case of a node failure
• Each 3PAR port has a native port ID and a guest port ID on the partner node; the server will not "see" the swap of the 3PAR port ID, so no MPIO path failover is required

– An FC path loss is handled by 3PAR Persistent Ports – all server paths stay online
– A controller maintenance or loss is handled by 3PAR Persistent Ports for all protocols – all server paths stay online

Read more in the Persistent Ports whitepaper


3PAR Persistent Ports
SAN requirements

If a device attempts to log in with the same port WWN (PWWN) as another device on the same switch, you can configure whether the new login or the existing login takes precedence:
• 0 - First login takes precedence over second login (default behavior)
• 1 - Second login overrides first login
• 2 - The port type determines whether the first or second login takes precedence

Changing the setting on the switch:
• configshow → switch.login.enforce_login: 0
• switchdisable
• configure → F-Port Login Parameters → Enforce FLOGI/FDISC login: (0..2) [0] 2
• switchenable
• configshow → switch.login.enforce_login: 2
3PAR Licensing

3PAR 8000 and 20000 Software Details
Legacy licensing before February 2017

StoreServ      License Cap *
8200           48
8400, 8450     168
8440           320
20450          168
20850          320
20800          480
20840          640

Replication SW Suite *
• Virtual Copy (VC) • Remote Copy (RC) • Peer Persistence (PP) • Cluster Extension Windows (CLX)

Data Optimization SW Suite *
• Dynamic Optimization • Adaptive Optimization • Peer Motion • Priority Optimization

Security SW Suite *
• Virtual Domains • Virtual Lock

File Persona SW Suite

Policy Server
• Policy Manager Software

Data Encryption

Recovery Manager Central Suite
• vSphere • MS SQL • Oracle • SAP HANA • 3PAR File Persona

Application SW Suite for MS Exchange
Application SW Suite for MS Hyper-V
Smart SAN for 3PAR

3PAR Operating System SW Suite *
• System Reporter • 3PARInfo • Online Import license (1 year) • System Tuner • Host Explorer • Multi Path IO SW • VSS Provider • Scheduler • Host Personas
• Thin Provisioning • Rapid Provisioning • Full Copy • Thin Copy Reclamation • Thin Persistence • Thin Conversion • Thin Deduplication for SSD
• Adaptive Flash Cache • Persistent Cache • Persistent Ports • Autonomic Groups • Autonomic Replication Groups • Autonomic Rebalance
• LDAP Support • Access Guard • Management Console • Web Services API • SMI-S • SNMP • Real Time Performance Monitor • 3PAR OS Administration Tools • CLI client
3PAR 8000, 9000 and 20000 Software Details
New All-inclusive licensing after February 2017

All-inclusive Multi-System Software (optional):
• Remote Copy (RC) • Peer Persistence (PP) • Cluster Extension Windows (CLX) • Federation (Peer Motion)

Optional: Policy Server (Policy Manager Software), Data Encryption

All-inclusive Single-System Software (included with every system):
• Dynamic Optimization • Adaptive Optimization • Priority Optimization
• Virtual Domains • Virtual Lock • Virtual Copy
• System Reporter • 3PARInfo • Online Import license (1 year) • System Tuner • Host Explorer • Multi Path IO SW • VSS Provider • Scheduler
• File Persona SW Suite • Smart SAN for 3PAR • Application SW Suite for MS Exchange • Application SW Suite for MS Hyper-V
• Recovery Manager Central Suite (vSphere, MS SQL, Oracle, SAP HANA, 3PAR File Persona)
• 3PAR Operating System SW Suite: Thin Provisioning, Rapid Provisioning, Full Copy, Thin Copy Reclamation, Thin Persistence, Thin Conversion, Thin Deduplication for SSD, Adaptive Flash Cache, Persistent Cache, Persistent Ports, Autonomic Groups, Autonomic Replication Groups, Autonomic Rebalance, LDAP Support, Access Guard, Host Personas, Management Console, Web Services API, SMI-S, SNMP, Real Time Performance Monitor, 3PAR OS Administration Tools, CLI client
3PAR 8000, 9000 and 20000 Software Details
All-inclusive licensing model
All-inclusive Multi-System Software – optional:
• Remote Copy (RC) • Federation (Peer Motion) • Peer Persistence • Cluster Extension Windows (CLX)

Policy Server – optional:
• Policy Manager Software

Data Encryption – optional:
• Requires encrypted drives

All-inclusive Single-System Software – included with every system:
• Dynamic Optimization • Adaptive Optimization • Priority Optimization
• Virtual Domains • Virtual Lock • Virtual Copy
• System Reporter • 3PARInfo • Online Import license (1 year) • System Tuner • Host Explorer • Multi Path IO SW • VSS Provider • Scheduler
• File Persona • Smart SAN for 3PAR • Application Suite for MS Hyper-V • Application Suite for MS Exchange
• Recovery Manager Central Suite (vSphere, MS SQL, Oracle, SAP HANA, 3PAR File Persona)
• Rapid Provisioning • Autonomic Groups • Autonomic Replication Groups • Autonomic Rebalance • LDAP Support • Access Guard • Host Personas
• Adaptive Flash Cache • Persistent Cache • Persistent Ports • Management Console • Web Services API • SMI-S • Real Time Performance Monitor
• Full Copy • Thin Provisioning • Thin Copy Reclamation • Thin Persistence • Thin Conversion • Thin Deduplication for SSD • 3PAR OS Administration Tools
New All-inclusive 3PAR Software Suites

• Dramatically simplifies 3PAR software licensing


• No more drive or capacity licensing and capping
• For all new 8000, 9000 and 20000 systems
• Required 3PAR OS releases
• Any 3.2.2 (All single-system SW except Online Import and RMC App Suite)
• 3.2.2 MU3 (All single-system SW)
• 3.3.1 (All single-system SW)
• Transition licenses for installed systems available
3PAR Full and Virtual Copy

3PAR Full Copy V1– restorable copy (offline copy)
Part of the base 3PAR OS
(Figure: Base Volume → Intermediate Snapshot → Full Copy)

• Full physical point-in-time copy
• Provisionable after copy ends
• Independent of base volume's RAID and physical layout properties
• Fast resynchronization capability (opt.)
• Thin Provisioning-aware
• Full copies can consume same physical capacity as thinly provisioned base volume
3PAR Full Copy V1– restorable copy (offline copy)
CLI cmds
(Figure: Base Volume → Intermediate Snapshot → Full Offline Copy, restorable)

• createvvcopy -p hmax -pri high hmax.phy_copy
• Resync an existing copy: createvvcopy -p hmax -s -pri high hmax.phy_copy
3PAR Full Copy V2 – instantly accessible copy (online copy)
Part of the base 3PAR OS
(Figure: Base Volume → Intermediate Snapshot → Full Copy)

• Share data quickly and easily
• Full physical point-in-time copy
• Immediately provisionable to hosts
• Independent of base volume's RAID and physical layout properties
• No resynchronization capability
• Thin Provisioning-aware
• Full copies can consume same physical capacity as thinly provisioned base volume
3PAR Full Copy V2 – instantly accessible copy (online copy)
CLI cmds
(Figure: Base Volume → Intermediate Snapshot → Full Online Copy)

• createvvcopy -p hmax -online -snp_cpg CPG_SSD_R5 CPG_SSD_R5 hmax.phy_copyonline
3PAR Virtual Copy – Snapshot at its best
Smart
• Individually erasable and promotable
• Scheduled creation/deletion
• Consistency groups
• Copy on Write (CoW) for full and thinly provisioned VV and Snapshots
• Redirect on Write (RoW) for deduped VV on Flash
• Up to 128'000 Virtual Volumes

Thin
• 100s of Snaps per Base Volume … but only one CoW required
• No reservation, non-duplicative
• Variable QoS

Ready
• Instantaneously readable and/or writeable
• Snapshots of snapshots of …
• Virtual Lock for retention of read-only snaps
• Automated erase option

Integrated
• HPE Recovery Managers for MS Hyper-V, MS SQL & Exchange, Oracle, vSphere
• Backup Apps from HPE, Symantec, VEEAM, CommVault
• Through GUI, CLI, WEB API, SMI-S etc.

(Sidebar: Top Arrays WW as of February 2016 – the busiest systems held from roughly 6,700 up to 14,435 Virtual Copies, on 7200/7200c, 7400/7400c, 7440c, 10400 and 10800 models.)

Also see the Virtual Copy Whitepaper
3PAR Virtual Copy creation
Copy on Write (CoW) for fully or thinly provisioned Virtual Volumes (FPVV and TPVV)

(Figure: Base Volume, RO Snapshot and RW Snapshot drawing from the same CPGs.)

• Create Snapshot
  1. RO Snapshot
  2. RW Snapshot
3PAR Virtual Copy writes
Copy on Write (CoW) for fully or thinly provisioned Virtual Volumes (FPVV and TPVV)

(Figure: writes to the Base Volume and to the RW Snapshot, with the RO Snapshot preserved.)

• Write to Base VV
  1. Copy block to Snapshot CPG
  2. Redirect Snapshots
  3. Update block

• Write to RW Snapshot
  1. Copy block to Snapshot CPG
  2. Redirect Snapshots
Creating a Virtual Copy in the SSMC

– On the Virtual Volume screen


• Select your VV
• Hit Actions > Create snapshot
• Select snapshot name
• Hit Create
– By default one R/W snapshot will be
taken

– Optionally you can define


• Retention time (Edit snapshot options)
• Expiration time (Edit snapshot options)
• Snapshot schedule
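The same snapshots can be taken from the CLI; a minimal sketch with illustrative names and times:

cli% createsv -ro -exp 7d snp_ro_dbvol dbvol    # read-only snapshot of dbvol that expires after 7 days
cli% createsv -retain 1d snp_rw_dbvol dbvol     # read/write snapshot protected from deletion for 1 day
cli% showvv snp_ro_dbvol snp_rw_dbvol           # verify both virtual copies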
3PAR view of Virtual Copies in SSMC
The GUI gives a very easy to read graphical view of virtual copies

Base Volume with (1) snapshot


Snapshot presented to 1 host
3PAR Virtual Copy for Backup use case
One week based on hourly snaps and an average daily change rate of ~10%

Base Volume of 2 TB, with ~200 GB of changed data per day:

Monday 24 copies, Tuesday 48, Wednesday 72, Thursday 96, Friday 120, Saturday 144, Sunday 168

Results in 168 Virtual Copies and only ~1.4 TB Snapshot Space needed
3PAR System Reporter

Data types

Historical data: charts built from data stored in the System Reporter database for the selected object.
• Histogram
• Performance
• Capacity
– Available for most objects

Real time data: charts built from data collected on an object within the last 5 seconds.
• Performance
– Available for exported volumes, physical drives, port data and port control
Categories - Histogram

4 histograms
available to report
• Enclosure port
• Exported volumes
• Host port
• Physical drive
Categories - Performance
Aggregated historical data collected for the performance object selected

• Node cache
• Node CPU
• Cumulative IO Density - CPG
• Cumulative IO Density - AOCPG
• Exported Volumes
• Ports
• Physical Drive
• Remote Copy Links
Categories - Capacity
System Reporter reports on the following capacity objects:
• Virtual Volume
• System
• CPG space
• Physical drive
Templates 1/2 - a key feature within System Reporter
Templates 2/2 - a key feature within System Reporter
Real Time Data Example
System Reporter Data Base .srdata
The .srdata VV is mounted on the non-master node. The size of the .srdata is

• 60 GB for 2-node systems


• 80 GB for 4-node systems
• 100 GB for 8-node systems.
Show Size of .srdata
CLI command showsr

SS8400_3_Local cli% showsr


Node Total(MB) Used(MB) Used%
-----------------------------
3 84552 43006 54

Filetype info:
FileType Count TotalUsage(MB) TypeUsed% EarliestDate EndEstimate
--------------------------------------------------------------------------------------------------
aomoves 2 0.0 --- --- ---
baddb 0 0.0 --- --- ---
daily 2 120.6 1.34 2016-07-12 02:00:00 CEST 55 year(s), 142 day(s) from now
hires 7 13231.9 98.14 2017-01-17 17:05:00 CET Full or almost full
hourly 3 2836.1 25.24 2016-07-11 15:00:00 CEST 2 year(s), 71 day(s) from now
ldrg 2709 24577.4 102.08 2017-03-02 06:30:00 CET Full or almost full
perfsample 3 1.5 --- --- ---
srmain 1 0.0 --- --- ---
system 0 0.0 --- --- ---
unused --- 37251.0 46.41 --- ---
SS8400_3_Local cli%
Increase the size of .srdata, backup, export
CLI command to increase 10% of size
cli% controlsr grow -pct 10

CLI command to create a snapshot of .srdata


cli% setvv -snp_cpg FC_R1 .srdata
cli% createsv -ro srdata_backup .srdata

CLI command to export data from .srdata


cli% controlsr export -all -btsecs -1d -csv -save -file c:\test

Thank you

