
Linux Virtual Server Tutorial

Horms (Simon Horman)
VA Linux Systems Japan, K.K. www.valinux.co.jp
July 2003. Revised March 2004

With assistance from http://www.ultramonkey.org/

Abstract:
The Linux Virtual Server Project (LVS) allows load balancing of networked services such as web and
mail servers using Layer 4 Switching. It is extremely fast and allows such services to be scaled to
service tens or hundreds of thousands of simultaneous connections. The purpose of this tutorial is to
demonstrate how to use various features of LVS to load balance Internet services, and how this can be
made highly available using tools such as heartbeat and keepalived. It will also cover more
advanced topics which have been the subject of recent development, including maintaining active
connections in a highly available environment and using active feedback to better distribute load.

Introduction
The Linux Virtual Server Project (LVS) implements layer 4 switching in the Linux Kernel. This allows
TCP and UDP sessions to be load balanced between multiple real servers. Thus it provides a way to
scale Internet services beyond a single host. HTTP and HTTPS traffic for the World Wide Web is
probably the most common use, though LVS can be used for more or less any service, from email to
the X Window System.
LVS itself runs on Linux. However, it is able to load balance connections from end users running any
operating system to real servers running any operating system. As long as the connections use TCP or
UDP, LVS can be used.
LVS offers very high performance. It is able to handle upwards of 100,000 simultaneous connections. It
is easily able to load balance a saturated 100Mbit ethernet link using inexpensive commodity hardware. It
is also able to load balance a saturated 1Gbit link and beyond using higher-end commodity hardware.

LVS Basics
This section covers the basics of how LVS works, how to obtain and install LVS, and how to
configure it for its main modes of operation. In short, it covers how to set up LVS to load balance TCP
and UDP services.

Terminology
Linux Director: Host with Linux and LVS installed which receives packets from end users and
forwards them to real servers.
End User: Host that originates a connection.
Real Server: Host that terminates a connection. This will be running some sort of daemon such as
Apache.
A single host may act in more than one of the above roles at the same time.
Virtual IP Address (VIP): The IP address assigned to a service that a Linux Director will handle.
Real IP Address (RIP): The IP address of a Real Server.

Layer 4 Switching

Figure 1: LVS NAT
Layer 4 Switching works by multiplexing incoming TCP/IP connections and UDP/IP datagrams to real
servers. Packets are received by a Linux Director and a decision is made as to which real server to
forward the packet to. Once this decision is made, subsequent packets for the same connection will be
sent to the same real server. Thus, the integrity of the connection is maintained.

Forwarding Packets
The Linux Virtual Server has three different ways of forwarding packets: network address translation
(NAT), IP-IP encapsulation (tunnelling) and direct routing.

Network Address Translation (NAT): A method of manipulating the source and/or destination
port and/or address of a packet. The most common use of this is IP masquerading, which is often
used to enable RFC 1918 [2] private networks to access the Internet. In the context of layer 4
switching, packets are received from end users and the destination port and IP address are
changed to that of the chosen real server. Return packets pass through the Linux Director, at
which time the mapping is undone so the end user sees replies from the expected source.

Direct Routing: Packets from end users are forwarded directly to the real server. The IP packet
is not modified, so the real servers must be configured to accept traffic for the virtual server's IP
address. This can be done using a dummy interface or packet filtering to redirect traffic
addressed to the virtual server's IP address to a local port. The real server may send replies
directly back to the end user. Thus, the Linux Director does not need to be in the return path.

IP-IP Encapsulation (Tunnelling): Allows packets addressed to an IP address to be redirected to
another address, possibly on a different network. In the context of layer 4 switching the
behaviour is very similar to that of direct routing, except that when packets are forwarded they
are encapsulated in an IP packet, rather than just having their ethernet frame manipulated. The main
advantage of using tunnelling is that real servers can be on a different network.
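
The NAT case described above can be modelled in a few lines of Python. This is only a sketch of the bookkeeping, not the kernel implementation: the class name NatDirector and its methods are hypothetical, the real server is picked round robin purely for simplicity, and the addresses are borrowed from the LVS NAT example later in this tutorial.

```python
import itertools

VIP = ("172.17.60.201", 80)
REAL_SERVERS = [("192.168.6.4", 80), ("192.168.6.5", 80)]


class NatDirector:
    """Toy model of LVS NAT forwarding (hypothetical, for illustration)."""

    def __init__(self, vip, real_servers):
        self.vip = vip
        self._next = itertools.cycle(real_servers)  # round robin choice
        self.connections = {}  # end user (ip, port) -> chosen real server

    def forward(self, src, dst):
        """Rewrite the destination of a packet sent by an end user."""
        if dst != self.vip:
            raise ValueError("packet is not addressed to the virtual service")
        if src not in self.connections:
            # First packet of a connection: make the scheduling decision.
            self.connections[src] = next(self._next)
        # Later packets reuse the decision; the source is left untouched.
        return src, self.connections[src]

    def reply(self, src, dst):
        """Undo the mapping for a packet returning from a real server."""
        if self.connections.get(dst) != src:
            raise ValueError("no matching connection")
        # The end user sees the reply as coming from the VIP.
        return self.vip, dst
```

Calling forward() twice with the same end user address and port returns the same real server both times, which is the property that keeps a connection intact.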

Figure 2: LVS Direct Routing

Virtual Services
On the Linux Director a virtual service is defined by either an IP address, port and protocol, or a
firewall mark. A virtual service may optionally have a persistence timeout associated with it. If this is
set and a connection is received from the same IP address before the timeout has expired, then the
connection will be forwarded to the same real server as the original connection.

IP Address, Port and Protocol: A virtual server may be specified by:
An IP address: The IP address that end users will use to access the service.
A port: The port that end users will connect to.
A protocol: Either UDP or TCP.

Firewall Mark: Packets may be marked with a 32-bit unsigned value using ipchains or iptables.
The Linux Virtual Server is able to use this mark to designate packets destined for a virtual
service and route them accordingly. This is particularly useful if a large number of contiguous
IP-based virtual services are required with the same real servers. It is also useful for grouping
persistence between different ports, for instance to ensure that a given end user is sent to the
same real server for both HTTP and HTTPS.
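
The persistence timeout can be sketched as a small lookup table. This is an illustrative model only: the class PersistenceTable is hypothetical, and refreshing the timeout on every new connection is an assumption of the sketch rather than a statement about the kernel code. The group argument stands in for whatever identifies the virtual service, so passing the same firewall mark for HTTP and HTTPS models persistence grouped across ports.

```python
class PersistenceTable:
    """Toy model of persistence keyed on the end user's IP address."""

    def __init__(self, timeout, choose):
        self.timeout = timeout   # persistence timeout in seconds
        self.choose = choose     # callable returning a real server
        self.templates = {}      # (client_ip, group) -> (server, expiry)

    def lookup(self, client_ip, group, now):
        """Return the real server for a new connection arriving at `now`.

        `group` is an (address, port, protocol) tuple or a firewall mark.
        """
        key = (client_ip, group)
        entry = self.templates.get(key)
        if entry is not None and now < entry[1]:
            server = entry[0]        # still within the timeout
        else:
            server = self.choose()   # expired or never seen: reschedule
        # Assumption: each new connection restarts the timeout.
        self.templates[key] = (server, now + self.timeout)
        return server
```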

Scheduling
The virtual service is assigned a scheduling algorithm that is used to allocate incoming connections to
the real servers. In LVS the schedulers are implemented as separate kernel modules. Thus new
schedulers can be implemented without modifying the core LVS code.

There are many different scheduling algorithms available to suit a variety of needs. The simplest are
round robin and least connected. These work by allocating connections to each real server in turn,
and by allocating connections to the real server with the least number of connections,
respectively. Weighted variants of these schedulers allow connections to be allocated in proportion to
the weighting of the real server: more powerful real servers can be set with a higher weight and thus
will be allocated more connections.
More complex scheduling algorithms have been designed for specialised purposes, for instance to
ensure that requests for the same IP address are sent to the same real server. This is useful when using
LVS to load balance transparent proxies.
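
The simple schedulers described above can be illustrated with the following Python sketch. The function names are hypothetical and the code only shows the selection strategies; in particular, the weighted round robin here is a naive expansion and does not reproduce how the in-kernel scheduler orders its choices.

```python
import itertools


def round_robin(servers):
    """Yield real servers in turn, forever."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Yield servers in proportion to their integer weights.

    Naive expansion: a server of weight 3 appears three times per cycle.
    """
    expanded = [s for s, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def least_connected(active):
    """Pick the server with the fewest active connections."""
    return min(active, key=active.get)


def weighted_least_connected(active, weights):
    """Pick the server with the lowest connections-to-weight ratio."""
    candidates = [s for s in active if weights[s] > 0]
    return min(candidates, key=lambda s: active[s] / weights[s])
```

With weights of 3 and 1, the first server receives three of every four connections, which is what "allocated in proportion to the weighting" means in practice.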

Installing LVS
Some distributions, such as SuSE, ship with kernels that have LVS compiled in. In these cases
installation should be as easy as installing the supplied ipvsadm package. At the time of writing
Ultra Monkey provides packages built against Debian Sid (Unstable) and Woody (Stable/3.0) and
Red Hat 7.3 and 8.0. Detailed information on how to obtain and install these packages can be found on
www.ultramonkey.org. The rest of this section will discuss how to install LVS from source, as it is
useful to understand how this process works.
Early versions of LVS worked with Linux 2.2 series kernels. This implementation involved extensive
patching of the Kernel sources. Thus, each version of LVS was closely tied to a version of the Kernel.
The netfilter packet filtering architecture [4], which is part of the 2.4 kernels, has allowed LVS to be
implemented almost exclusively as a set of kernel modules. The result is that LVS is no longer tied
closely to an individual kernel release. LVS may also be compiled directly into the kernel. However,
this discussion will focus on using LVS as a module, as this approach is easier and more flexible.
1. Obtain and Unpack Kernel
It is always easiest to start with a fresh kernel. You can obtain this from www.kernel.org. This
example will use the 2.4.20 kernel. It can be unpacked using the following command, which
should unpack the kernel into the linux-2.4.20 directory.
tar -jxvf linux-2.4.20.tar.bz2

2. Obtain and Unpack LVS
LVS can be obtained from www.linuxvirtualserver.org. This example will use 1.0.9. It can be
unpacked using the following command, which should unpack LVS into the ipvs-1.0.9
directory.
tar -zxvf ipvs-1.0.9.tar.gz

3. Apply LVS Patches to Kernel
Two minor kernel patches are required in order for the LVS modules to compile. To apply these
patches use the following:
cd linux-2.4.20/
patch -p1 < ../ipvs-1.0.9/linux_kernel_ksyms_c.diff
patch -p1 < ../ipvs-1.0.9/linux_net_netsyms_c.diff

A third patch is applied to allow interfaces to be hidden. Hidden interfaces do not respond to
ARP requests and are used on real servers with LVS direct routing.
patch -p1 < ../ipvs-1.0.9/contrib/patches/hidden-2.4.20pre10-1.diff

4. Configure the kernel
First ensure that the tree is clean:
make mrproper

Now configure the kernel. There are a variety of ways of doing this, including
make menuconfig, make xconfig and make config. Regardless of the method that
you use, be sure to compile in netfilter support, with at least the following options. It is
suggested that where possible these options are built as modules.
Networking options --->
  Network packet filtering (replaces ipchains)
  <m> IP: tunnelling
  IP: Netfilter Configuration --->
    <m> Connection tracking (required for masq/NAT)
    <m> FTP protocol support
    <m> IP tables support (required for filtering/masq/NAT)
    <m> Packet filtering
    <m> REJECT target support
    <m> Full NAT
    <m> MASQUERADE target support
    <m> REDIRECT target support
    <m> NAT of local connections (READ HELP) (NEW)
    <m> Packet mangling
    <m> MARK target support
    <m> LOG target support

5. Build and Install the Kernel
As the kernel has been reconfigured, the build dependencies need to be reconstructed.
make dep

The kernel and modules may now be built using:
make bzImage modules

To install the newly built modules and kernel run the following command. This should install
the modules under /lib/modules/2.4.20/ and the kernel in /boot/vmlinuz-2.4.20
make install modules_install

6. Update boot loader
If grub is used as the boot loader then a new entry should be added to
/etc/grub.conf. This example assumes that /boot is the first partition of the first disk, (hd0,0),
and that the root partition is /dev/hda3. Existing
entries in /etc/grub.conf should be used as a guide.
title 2.4.20 LVS
root (hd0,0)
kernel /vmlinuz-2.4.20 ro root=/dev/hda3

If the boot loader is lilo then a new entry should be added to /etc/lilo.conf. This
example assumes that the / partition is /dev/hda2. Existing entries in /etc/lilo.conf
should be used as a guide.
image=/boot/vmlinuz-2.4.20
label=2.4.20-lvs
read-only
root=/dev/hda2

Once /etc/lilo.conf has been updated run lilo.
lilo
Added Linux-LVS *
Added Linux
Added LinuxOLD

7. Reboot the system.
At your boot loader's prompt be sure to boot the newly created kernel.
8. Build and Install LVS
The commands to build LVS should be run from the ipvs-1.0.9/ipvs/ directory. To build
and install use the following commands. /kernel/source/linux-2.4.20 should be the
root directory that the kernel was just built in.
make KERNELSOURCE=/kernel/source/linux-2.4.20 all
make KERNELSOURCE=/kernel/source/linux-2.4.20 modules_install

9. Build and Install Ipvsadm
Ipvsadm is the user space tool that is used to configure LVS. The source can be found in the
ipvs-1.0.9/ipvs/ipvsadm/ directory. To build and install use the following commands.
make all
make install

LVS NAT
LVS NAT is arguably the simplest way to configure LVS. Packets from end users are received by the
Linux Director and the destination IP address is rewritten to be that of one of the real servers. The return
packets from the real server have their source IP address changed from that of the real server to the
VIP.

Figure 3: LVS NAT Example

Linux Director

Enable IP forwarding. This can be done by adding the following to /etc/sysctl.conf and
then running sysctl -p.
net.ipv4.ip_forward=1

Bring up 172.17.60.201 on eth0:0. This is best done as part of the networking configuration of
your system, but it can also be done manually.
ifconfig eth0:0 172.17.60.201 netmask 255.255.0.0 broadcast 172.17.255.255

Configure LVS
ipvsadm -A -t 172.17.60.201:80
ipvsadm -a -t 172.17.60.201:80 -r 192.168.6.4:80 -m
ipvsadm -a -t 172.17.60.201:80 -r 192.168.6.5:80 -m

Real Servers

Make sure return packets are routed through the Linux Director. Typically this is done by setting the
VIP on the server network as the default gateway.

Make sure that the desired daemon is listening on port 80 to handle connections from end users.

Testing and Debugging
Testing can be done by connecting to 172.17.60.201:80 from outside the server network.
Running a packet tracing tool on the Linux Directors and real servers is very useful for debugging
purposes. Many setup problems can be resolved by tracing the path of a connection and observing at
which step packets fail to appear. Tcpdump will be discussed here as an example, though there are a
variety of tools available for various operating systems.
The following trace shows a connection being opened by an end user 10.2.3.4 to the VIP 172.17.60.201
which is forwarded to the real server 192.168.6.5. It shows packets being received by the Linux Director
and then forwarded to the real server and vice versa. Note that the packets forwarded to the real server
still have the end user's IP address as the source address. The Linux Director only changes the
destination IP address of the packet. Similarly, replies from the real servers have the destination address
set to that of the end user. The Linux Director only rewrites the source IP address of reply packets so that
it is the VIP.
tcpdump -n -i any port 80
12:40:40.965499 10.2.3.4.34802 > 172.17.60.201.80:
    S 2555236140:2555236140(0) win 5840
    <mss 1460,sackOK,timestamp 16690997 0,nop,wscale 0>
12:40:40.967645 10.2.3.4.34802 > 192.168.6.5.80:
    S 2555236140:2555236140(0) win 5840
    <mss 1460,sackOK,timestamp 16690997 0,nop,wscale 0>
12:40:40.966976 192.168.6.5.80 > 10.2.3.4.34802:
    S 2733565972:2733565972(0) ack 2555236141 win 5792
    <mss 1460,sackOK,timestamp 128711091 16690997,nop,wscale 0> (DF)
12:40:40.968653 172.17.60.201.80 > 10.2.3.4.34802:
    S 2733565972:2733565972(0) ack 2555236141 win 5792
    <mss 1460,sackOK,timestamp 128711091 16690997,nop,wscale 0> (DF)
12:40:40.971241 10.2.3.4.34802 > 172.17.60.201.80:
    . ack 1 win 5840 <nop,nop,timestamp 16690998 128711091>
12:40:40.971387 10.2.3.4.34802 > 192.168.6.5.80:
    . ack 1 win 5840 <nop,nop,timestamp 16690998 128711091>
ctrl-c

ipvsadm -L -n can be used to show the number of active connections.
ipvsadm -L -n
IP Virtual Server version 1.0.9 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  172.17.60.201:80 rr
  -> 192.168.6.5:80               Masq    1      7          3
  -> 192.168.6.4:80               Masq    1      8          4

ipvsadm -L --stats will show the total number of connections, packets and bytes sent and received.
ipvsadm -L -n --stats
IP Virtual Server version 1.0.9 (size=4096)
Prot LocalAddress:Port               Conns   InPkts  OutPkts  InBytes OutBytes
  -> RemoteAddress:Port
TCP  172.17.60.201:80                  114     1716     1153   193740   112940
  -> 192.168.6.5:80                     57      821      567    94642    55842
  -> 192.168.6.4:80                     57      895      586    99098    57098

ipvsadm -L --rate will show the number of connections, packets and bytes sent and received per second.
ipvsadm -L -n --rate
IP Virtual Server version 1.0.9 (size=4096)
Prot LocalAddress:Port                 CPS    InPPS   OutPPS    InBPS   OutBPS
  -> RemoteAddress:Port
TCP  172.17.60.201:80                   56      275      275    18739    41283
  -> 192.168.6.5:80                     28      137      137     9344    20634
  -> 192.168.6.4:80                     28      138      137     9395    20649

ipvsadm -Z (long form --zero) will zero all the statistics counters.

LVS Direct Routing

Figure 4: LVS Direct Routing Example
LVS Direct Routing works by forwarding packets, unchanged, to the MAC addresses of real servers.
As the packet is unmodified, the real servers need to be configured to accept traffic addressed to the
VIP. This is most commonly done by using a hidden interface.
As the incoming packets are not modified by the Linux Director, the return packets do not need to pass
through the Linux Director. Thus, higher throughput can be obtained. It is also easier to load balance
services for end users on the same local network, as the return packets can be sent directly to the end
user rather than being forced to go through the Linux Director.

Linux Director

Enable IP forwarding. This can be done by adding the following to /etc/sysctl.conf and
then running sysctl -p.
net.ipv4.ip_forward=1

Bring up 172.17.60.201 on eth0:0. This is best done as part of the networking configuration of
your system, but it can also be done manually.
ifconfig eth0:0 172.17.60.201 netmask 255.255.0.0 broadcast 172.17.255.255

Configure LVS
ipvsadm -A -t 172.17.60.201:80
ipvsadm -a -t 172.17.60.201:80 -r 172.17.60.199:80 -g
ipvsadm -a -t 172.17.60.201:80 -r 172.17.60.200:80 -g

The real servers can send reply packets directly to the end users without them needing to be
altered by the Linux Director. Thus, the Linux Director does not need to be the gateway for the
real servers.
However, in some situations, for instance because the Linux Director really is the gateway to the
real servers' network, it is desirable to route return packets from the real servers via the Linux
Director. The source address of these packets will be the VIP. However, the VIP belongs to an
interface on the Linux Director. Thus, it will drop the packets as being bogus.
There are several approaches to this problem. Probably the best is to apply a kernel patch
supplied by Julian Anastasov which adds proc entries that allow this packet dropping behaviour
to be disabled on a per-interface basis. This patch can be obtained from
http://www.ssi.bg/~ja/#lvsgw

Real Servers

Make sure return packets are not routed through the Linux Director unless you have patched the
kernel as described above.

Make sure that the desired daemon is listening on port 80 to handle connections from end users.

Bring up 172.17.60.201 on the loopback interface. This is best done as part of the networking
configuration of your system, but it can also be done manually. On Linux this can be done
using the following command.
ifconfig lo:0 172.17.60.201 netmask 255.255.255.255

Note that the netmask should be 255.255.255.255, regardless of the actual netmask of the
network that 172.17.60.201 belongs to. This is because on the loopback interface all
addresses covered by the netmask are bound to the interface. The typical case is 127.0.0.1 with
a netmask of 255.0.0.0, which sets up the loopback interface to accept all of 127.0.0.0/8. Thus,
as we only want lo:0 to accept packets for 172.17.60.201, the netmask must be 255.255.255.255.

Hide loopback. On Linux real servers it is necessary to hide the loopback interface to prevent
them from responding to ARP requests for the VIP. This can be done by applying the hidden
interface patch discussed in the Installing LVS section. To activate the patch, add the following
lines to /etc/sysctl.conf and then run sysctl -p.
# Enable configuration of hidden devices
net.ipv4.conf.all.hidden=1
# Make the loopback interface hidden
net.ipv4.conf.lo.hidden=1
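
The note on the loopback netmask in the steps above can be checked with Python's standard ipaddress module. This is only an arithmetic illustration of which addresses a netmask covers; the helper name covered is hypothetical and nothing here touches a real interface.

```python
import ipaddress


def covered(address, netmask):
    """Return the network that an address and netmask pair covers."""
    return ipaddress.ip_network(f"{address}/{netmask}", strict=False)


# The usual loopback configuration covers all of 127.0.0.0/8.
loopback = covered("127.0.0.1", "255.0.0.0")

# A host netmask binds only the VIP itself to lo:0.
vip_host = covered("172.17.60.201", "255.255.255.255")

# Reusing the network's real netmask would bind 65536 addresses to lo:0,
# including those of the other real servers.
vip_wide = covered("172.17.60.201", "255.255.0.0")
```

With the wider netmask, 172.17.60.199 falls inside the range bound to lo:0, so a real server would treat traffic for its neighbour as local.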

Testing and Debugging
Testing can be done by connecting to 172.17.60.201:80 from any network.
Debugging can be done using ipvsadm and packet tracing as per LVS NAT. However, note that when
the packets are forwarded no address translation takes place. Also note that, as the return packets are
sent directly to the end user by the real server and are not handled by LVS, the outgoing packet and byte
statistics will be zero.

LVS Tunnel

Figure 5: LVS Tunnel Example. Same Topology as the LVS Direct Routing Example
LVS tunnelling works in a very similar manner to direct routing. The main difference is that packets
are forwarded to the real servers using IP encapsulated in IP, rather than just sending a new ethernet
frame. The main advantage of this is that real servers may be on a different network to the Linux
Director.

Linux Director

Enable IP forwarding. This can be done by adding the following to /etc/sysctl.conf and
then running sysctl -p.
net.ipv4.ip_forward=1

Bring up 172.17.60.201 on eth0:0. Again, this is best done as part of the networking
configuration of your system, but it can also be done manually.
ifconfig eth0:0 172.17.60.201 netmask 255.255.0.0 broadcast 172.17.255.255

Configure LVS
ipvsadm -A -t 172.17.60.201:80
ipvsadm -a -t 172.17.60.201:80 -r 172.17.60.199:80 -i
ipvsadm -a -t 172.17.60.201:80 -r 172.17.60.200:80 -i

If you wish to use the Linux Director as a gateway router for the real servers, which is not
necessary, please see the information on how to patch the kernel to do this in the direct routing
section.

Real Servers

Make sure return packets are not routed through the Linux Director unless you have patched the
kernel as described in the direct routing section.

Make sure that the desired daemon is running on port 80 to accept connections from the end
users.

Bring up 172.17.60.201 on tunl0. Again, this is best done as part of the networking
configuration of your system, but it can also be done manually.
ifconfig tunl0 172.17.60.201 netmask 255.255.255.255

Enable forwarding and hide the tunl0 interface. This can be done by adding lines to
/etc/sysctl.conf and then running sysctl -p.
net.ipv4.ip_forward=1
# Enable configuration of hidden devices
net.ipv4.conf.all.hidden=1
# Make the tunl0 interface hidden
net.ipv4.conf.tunl0.hidden=1

Testing and Debugging
Testing can be done by connecting to 172.17.60.201:80 from any network. Debugging is as per LVS
direct routing.

High Availability
LVS is an effective way to load balance networked services. Typically this means that several servers
will act, as far as end users are concerned, as if they were a single server. Unfortunately, the more
servers that are in the system, the greater the chance that a single server will fail. Thus, it is important
to make use of high availability techniques to ensure that the virtual service is maintained even if
individual servers fail.

Heartbeat
Heartbeat can be used to monitor a pair of Linux Directors and ensure that one of them owns the VIP at any
given time. It works by each host periodically sending a heartbeat message. If no heartbeat message is
received for a predetermined period of time then the host is considered to have failed. When this occurs
resources can be taken over. Heartbeat has a modular design that allows arbitrary resources to be
defined.
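
The failure detection rule described above amounts to a timeout on the last message seen from each peer. The following Python sketch models just that rule; the class PeerMonitor is hypothetical and is not taken from heartbeat's source, and the node names are those used in the sample configuration later in this section.

```python
class PeerMonitor:
    """Toy failure detector based on a dead time (illustration only)."""

    def __init__(self, deadtime):
        self.deadtime = deadtime  # seconds of silence before a peer is dead
        self.last_seen = {}       # node name -> time of last heartbeat

    def heartbeat(self, node, now):
        """Record that a heartbeat message arrived from `node` at `now`."""
        self.last_seen[node] = now

    def is_alive(self, node, now):
        """A node is alive if it was heard from within the dead time."""
        seen = self.last_seen.get(node)
        return seen is not None and now - seen < self.deadtime

    def failed_nodes(self, now):
        """Nodes whose resources are candidates for takeover."""
        return sorted(n for n in self.last_seen if not self.is_alive(n, now))
```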
For the sake of this discussion we will be using an IP address as a resource. When failover occurs the
IP address is obtained using a method known as IP address takeover. This works by the newly
activated Linux Director sending gratuitous ARP packets for the VIP. All hosts on the network should
receive these ARP packets and thus send subsequent packets for the VIP to the new Linux Director.
Heartbeat can be obtained from www.linux-ha.org. It can also be installed by using the packages
provided or built from source using the following commands.
./ConfigureMe build
make
make install

Sample Configuration

Figure 6: Heartbeat Example
Configuration is done using three files that can be found in /etc/ha.d.

ha.cf: This configures the base parameters for heartbeat, such as which interfaces to use for
communication, how often to send messages and where to write logs to. Note that the node
names used must match the output of uname -n on the member nodes.
logfacility local0
keepalive 2
deadtime 10
warntime 10
initdead 10
nice_failback on
mcast eth0 225.0.0.7 694 1 1
node walter
node wendy

haresources: Sets the resources that are managed by heartbeat.
walter 172.17.60.201/24/eth0

authkeys: Sets the security mechanism for inter-heartbeat communication. This file must be
mode 600.
auth 2
2 sha1 ultramonkey

LVS should be configured the same way on both Linux Directors. For this example the LVS tunnel
configuration discussed earlier will be used. Direct Routing and NAT may also be used.
As the VIP, 172.17.60.201, is managed by heartbeat it should not be brought up on the Linux Directors
by other means.
Heartbeat should be started on both Linux Directors. After a few moments the VIP should be brought up
on whichever Linux Director is the master.

Testing and Debugging
Reboot the active Linux Director and observe that the other Linux Director takes over. You can examine
the progress of the takeover by examining the logs sent to syslog, typically found in
/var/log/messages. As nice_failback is on, the currently active Linux Director will now act
as the master, and when the failed Linux Director comes back online it will act as a standby.

Ipfail
The design of heartbeat is such that if any communication channel is available to a host, then it will be
considered to be available. This is not always the desired behaviour. For example, if a pair of hosts have
links on the internal and external network, it may be desirable for failover to occur if either link fails
on one host. After all, it can no longer route traffic between end users and the real servers.

Figure 7: Heartbeat without IPfail
The ipfail plugin for heartbeat makes this possible by monitoring one or more external hosts known as
ping nodes. Typically a ping node would be a router or the switch itself. The ping node is treated as a quorum
device. That is, if a host cannot access a ping node, it is not eligible to hold any resources. Thus, if an
interface fails on the active Linux Director, then one of the ping nodes should become unavailable and
failover will occur.
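
The quorum rule described above can be sketched as follows. The function names are hypothetical, and the rule that the current owner keeps its resources unless a peer can reach strictly more ping nodes is an assumption made for this sketch, not a description of ipfail's actual algorithm.

```python
def eligible(reachable_ping_nodes):
    """A host that reaches no ping node may not hold resources."""
    return len(reachable_ping_nodes) > 0


def choose_owner(current_owner, reachability):
    """Decide which host should hold the resources.

    `reachability` maps host name -> set of ping nodes it can reach.
    Assumption: the current owner keeps the resources unless a peer
    reaches strictly more ping nodes.
    """
    best = current_owner
    for host in sorted(reachability):
        if len(reachability[host]) > len(reachability[best]):
            best = host
    return best if eligible(reachability[best]) else None
```

If the active Linux Director loses the interface that leads to its ping node, its set becomes empty and the standby, which can still reach the ping node, becomes the owner.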

Figure 8: Heartbeat with IPfail
The ipfail module is shipped as part of heartbeat. Additional information is available from
http://pheared.net/devel/c/ipfail/. In the long term this will be integrated into the
heartbeat documentation.

Sample Configuration
To use ipfail with the heartbeat setup discussed previously, the following should be added to heartbeat's
ha.cf file.
ping 172.17.0.254
respawn hacluster /usr/lib/heartbeat/ipfail

A ping directive should be added for each ping node. I have only defined one for the external network,
as there are no suitable quorum devices on the internal network in this demonstration.
The respawn directive tells heartbeat to run /usr/lib/heartbeat/ipfail as user hacluster, to rerun it if it
exits with a status other than 100, and to kill it when heartbeat exits.
After adding these options heartbeat needs to be restarted.
/etc/init.d/heartbeat restart

Testing and Debugging
Testing and debugging can be done as per Heartbeat itself.

Ldirectord
Heartbeat is used to monitor the health of Linux Directors. Ldirectord can be used to monitor the health
of real servers and manipulate the LVS kernel table accordingly. Ldirectord and heartbeat are often
used in tandem to create a high availability LVS cluster.
Ldirectord checks services on the real servers by connecting to them, making a known request and
checking the result for a known string. Checks are provided for HTTP, HTTPS, FTP, IMAP, POP,
SMTP, LDAP and NNTP. Additional checks can be added by modifying the code, which is usually
quite straightforward. In fact, many of the checks incorporated by ldirectord have been supplied as
patches by users.
The check semantics above are known as a negotiate check. Another type of check, the connect check,
simply checks to make sure a connection can be opened to the service on the real server. This is useful
if there is not a check for the protocol supplied by ldirectord.
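
The difference between the two check types can be shown with a short Python sketch. The function names are hypothetical and the code is not taken from ldirectord, which is a separate program; the fetch callable is injected so that the negotiate logic can be exercised without a network.

```python
import socket
import urllib.request


def connect_check(host, port, timeout=10.0):
    """Connect check: succeed if a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def negotiate_check(fetch, request, receive):
    """Negotiate check: make a known request, look for a known string.

    `fetch` takes the request and returns the response body as text.
    """
    try:
        return receive in fetch(request)
    except OSError:
        return False


def http_fetch(host, port, timeout=10.0):
    """Build a fetch callable for HTTP using the standard library."""
    def fetch(request):
        url = f"http://{host}:{port}/{request}"
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    return fetch
```

A negotiate check catches a daemon that accepts connections but serves an error page, which a connect check alone would report as healthy.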

Sample Configuration
Ldirectord is configured using the ldirectord.cf file. It has global directives which either set global
options, such as where to log errors to, or defaults for the virtual services. Each virtual section
describes a virtual service provided by LVS and contains the real servers which are
checked.
# Global Directives
checktimeout=10
checkinterval=2
autoreload=no
logfile="local0"
quiescent=yes
# Virtual Server for HTTP
virtual=172.17.60.201:80
        fallback=127.0.0.1:80
        real=192.168.6.4:80 masq
        real=192.168.6.5:80 masq
        service=http
        request="index.html"
        receive="Test Page"
        scheduler=rr
        protocol=tcp
        checktype=negotiate

Ldirectord may be started by running the ldirectord command, the ldirectord init script or by adding it
as a resource to heartbeat. There is no particular advantage to the latter, as ldirectord can happily run on
the master and standby Linux Directors at the same time.
Once ldirectord has started, the LVS kernel table will be populated.
ipvsadm -L -n
IP Virtual Server version 1.0.7 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  172.17.60.201:80 rr
  -> 192.168.6.5:80               Masq    1      0          0
  -> 192.168.6.4:80               Masq    1      0          0
  -> 127.0.0.1:80                 Local   0      0          0

By default ldirectord uses the quiescent feature of LVS to add and remove real servers. That is, when a
real server is to be removed its weight is set to zero and it remains part of the virtual service. This has
the effect that existing connections to the real server may continue, but no new connections will be
allocated. This is particularly useful for gracefully taking real servers offline. This behaviour can be
changed to remove the real server from the virtual service by setting the global configuration option
quiescent=no.
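
The two behaviours can be contrasted with a small Python model of a weight table. The function names are hypothetical and the dictionary merely stands in for the LVS kernel table.

```python
def mark_failed(table, server, quiescent=True):
    """Return an updated {server: weight} table after a failed check.

    With quiescent=True the server stays in the table with weight 0,
    so existing connections continue but no new ones are allocated.
    With quiescent=False the server is removed outright.
    """
    updated = dict(table)
    if quiescent:
        updated[server] = 0
    else:
        updated.pop(server, None)
    return updated


def schedulable(table):
    """Servers that may receive new connections."""
    return [s for s, w in table.items() if w > 0]
```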

Testing and Debugging
Testing can be done by bringing the real servers up and down: by changing the contents of the known
URL that is being requested such that it does not contain the expected string, by killing the daemon
that serves end users' requests, or by powering down the host altogether.
In each case ldirectord should update the LVS kernel table accordingly, which can be examined using
ipvsadm -L -n. Ldirectord also logs its activities. The configuration above sets these logs to be
written to syslog; typically they will show up in /var/log/syslog.
For extra debugging information ldirectord can be run in debugging mode, in which case it will log
verbosely to the terminal and will not detach from the terminal. This is done by using the -d command
line option. This example starts ldirectord in debugging mode with the configuration file
ldirectord.cf, which should be in /etc/ha.d/. Debugging can be terminated using ctrl-c.
ldirectord -d ldirectord.cf start

Keepalived
Keepalived provides an implementation of the VRRPv2 protocol which is specified in RFC 2338 [1]. It
is an alternative method of managing a VIP on a network so that it is owned by only one host at any
given time. This can be used to switch between active and standby Linux Directors.
VRRPv2 works on a simple state engine. Hosts advertise their availability. The highest priority host
wins the resource and advertises this fact. All other nodes then go into the backup state.
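
The election rule can be sketched in Python as follows. The function names and the priority values used in any example are hypothetical; ties are broken by host name here only to keep the sketch deterministic, whereas VRRPv2 itself compares IP addresses when priorities are equal.

```python
def elect_master(priorities, available):
    """Return the available host with the highest priority, or None."""
    candidates = [h for h in priorities if h in available]
    if not candidates:
        return None
    return max(candidates, key=lambda h: (priorities[h], h))


def states(priorities, available):
    """Map every available host to MASTER or BACKUP."""
    master = elect_master(priorities, available)
    return {h: ("MASTER" if h == master else "BACKUP")
            for h in priorities if h in available}
```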
There is another implementation of VRRPv2 for Linux from http://off.net/jme/vrrpd/. However, at the
time of writing the keepalived implementation appears to be much more complete.
Keepalived also features service level monitoring of real servers and manipulates the LVS kernel table
accordingly. The service tests that are implemented are:

TCP_CHECK: Check to make sure a connection can be opened to the service on the real server.
HTTP_GET: Fetch a known URL from the real server and compare the checksum of the page to
the expected checksum.
SSL_GET: SSL version of HTTP_GET.
MISC_CHECK: Check using an external script.

It also provides an API to implement new checks.
The VRRPD and LVS/Health Check features can be used individually or in combination.
Keepalived is available from keepalived.sourceforge.net. Its compilation is quite straightforward using
./configure and make. A .spec file for Red Hat is also provided. Packages for Debian are
available in the main Debian tree.
To configure keepalived, /etc/keepalived/keepalived.conf should be modified. This file is
divided up into sections.

global_defs: Global definitions such as where to send email alerts, if at all, and the name of the
cluster.
vrrp_instance: Encapsulates a set of virtual IP addresses associated with a particular interface.
Each instance should have a unique id.
vrrp_sync_group: Groups together vrrp_instances such that all the instances will be owned by a
single host at any given time. This can be used to ensure that virtual IP addresses on different
interfaces always end up on the same machine.
virtual_server: A virtual service handled by LVS.
real_server: A real server to check. Contained within a virtual_server.
Note that the VRRP implementation works on a master/slave system. So each vrrp_instance should be
marked as a "MASTER" on one node and a "SLAVE" on the other nodes. During testing, it did not
appear possible to configure keepalived to have behaviour analogous to heartbeat's nice_failback, in which
a node holds a resource until it fails, at which point another node takes it over and holds it until it in turn
fails. It was also found that the slave nodes should be given a lower priority than the master to avoid
spurious failovers.

Sample Configuration
For the sake of brevity, the example configuration files are in Appendix A.
To create the checksums for the configuration file, the genhash programme can be used. Genhash will
connect to the server and request the URL. It will then produce a lot of output, showing the
data that is being used to construct the checksum. The final line is the checksum which should be
included in keepalived.conf. For example, to generate the hash for the URL http://192.168.6.5:80/ the
following command is used.
genhash -s 192.168.6.5 -p 80 -u /
[lots of output omitted]
90bfbce6bc089a41f1fddca9aeaba452

To start keepalived run the keepalived daemon or init script. Messages are logged to syslog and
typically can be found in /var/log/messages. After a few moments the LVS kernel table should be
populated on both machines. This can be inspected using ipvsadm.
ipvsadm -L -n
IP Virtual Server version 1.0.7 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  172.17.60.201:80 lc
  -> 192.168.6.5:80               Masq    1      0          0
  -> 192.168.6.4:80               Masq    1      0          0

On the master machine the virtual IP addresses should have been added. This can be checked using the
ip command.
ip addr sh
[lo: omitted]
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:50:56:4f:30:19 brd ff:ff:ff:ff:ff:ff
    inet 172.17.60.207/16 brd 172.17.255.255 scope global eth0
    inet 172.17.60.201/32 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:50:56:4f:30:1a brd ff:ff:ff:ff:ff:ff
    inet 192.168.6.3/24 brd 192.168.6.255 scope global eth1
    inet 192.168.6.1/32 scope global eth1

If a failover occurs the same addresses should appear on the slave, and then back on the master once it
is restored.

New Developments
Active Feedback
Ldirectord, keepalived and other tools monitor the health of real servers. The weight parameter allows
the relative capacity of real servers to be taken into account. However, these tools do not monitor the
real-time serving capacity of the real servers and do not allocate connections in proportion to this.
This can be particularly problematic in situations where some connections require significantly more
resources on a real server than others. For instance, some connections may be for a plain HTML file fetched
from disk, or more likely memory, while other connections involve processing of information, such as
scaling an image or retrieving part of the page from a database.
Feedbackd implements a framework that allows real-time information from the real servers to
determine how many connections they should be allocated relative to each other. As such, feedbackd
implements an active feedback system. Feedbackd is available from
http://www.redfishsoftware.com.au/projects/feedbackd/
Feedbackd has two key components. The first, feedbackd-agent, runs on the real servers and monitors their
serving capacity. The monitoring is modular so arbitrary checks can be defined. The default check
supplied simply monitors CPU load using /proc/stat. The second component, feedbackd-master, runs on
the Linux Directors. It collates information from the feedbackd-agents that connect to it and manipulates
the weights of the real servers in the LVS kernel table accordingly.
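
The idea of turning measured load into a weight can be illustrated with the Python sketch below. Both functions are hypothetical: the linear mapping and the maximum weight of 100 are assumptions chosen for illustration, not the formula used by feedbackd, and the samples are assumed to be the first four fields of the cpu line of /proc/stat.

```python
def cpu_busy_fraction(before, after):
    """Utilisation between two samples of the 'cpu' line of /proc/stat.

    Each sample is a (user, nice, system, idle) tuple of jiffies.
    """
    delta = [b - a for a, b in zip(before, after)]
    total = sum(delta)
    return 0.0 if total == 0 else 1.0 - delta[3] / total


def weight_from_load(busy_fraction, max_weight=100):
    """Map CPU utilisation (0.0 to 1.0) to an LVS weight.

    An idle server gets max_weight; a saturated one gets 1 so that
    load alone never takes it out of service (an assumption of this
    sketch).
    """
    clamped = min(max(busy_fraction, 0.0), 1.0)
    return max(1, round(max_weight * (1.0 - clamped)))
```

A server that spent half of the sampling interval idle would be given a weight of 50 under this mapping, and so would receive roughly half as many new connections as an idle peer.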

It was found that a little bit of massaging was required to get it to compile. Minor
enhancements were also made to allow feedbackd-master to be restarted without giving "address in use"
errors and to allow feedbackd-agent to time out the master. The latter is a workaround to allow
feedbackd to work with Active/Stand-By Linux Directors. Both of these changes have been forwarded
to the author and will hopefully show up in the next version.
The only configuration required for feedbackd-master is to establish the LVS virtual services that will
be used. This is done using ipvsadm. There is no need to add the real servers, as this will be done by
feedbackd-master by matching the protocol and port information sent by the feedbackd-agents running
on real servers. As such, feedbackd can be used to add and remove real servers on the fly without any
configuration of the Linux Director. For example:
ipvsadm -A -t 172.17.60.201:80

To start feedbackd-master simply run the daemon on the command line. No init script is supplied with
the current distribution.
Feedbackd-agent is configured by modifying /etc/feedbackd-agent.conf. In this file the
Linux Director running feedbackd-master is specified, as are the services that the real server should
join.
director=192.168.6.1
service=http
protocol=TCP
port=80
module=cpuload.so
forwarding=NAT

Again, to run feedbackd-agent simply run the command on the command line.

Testing
As a primitive test, one of the real servers can be loaded manually and the effects of this on the LVS
table on the linux director can be observed using ipvsadm. An in-depth analysis of the effects of using
feedbackd can be found in Jeremy Kerr's paper on feedbackd [3].

Connection Synchronisation: Existing Solution
Configuring two linux directors in an active/stand-by configuration is a useful way to provide high
availability. If the active linux director fails, the stand-by can automatically take over the IP address of
the virtual services and the cluster can continue to function. However, when such a failover occurs
connections that are currently in progress are terminated.
This is because the stand-by linux director does not know anything about these connections. By
synchronising connection information between the active and stand-by linux directors this problem can
be averted. Thus, when a stand-by linux director becomes the active linux director, it will have
information about the currently active connections and will be able to continue to forward their packets.
The critical piece of information required is which real server to forward packets for a given connection
to. This information is quite small and thus can be synchronised with little overhead.
There is an implementation of connection synchronisation within the current LVS code. It works on a
master/slave system whereby the linux director configured as the master sends synchronisation
information for connections. The linux directors configured as slaves receive this information and
update their LVS connection table accordingly.
A connection is synchronised once the number of packets passes a threshold (3) and then every
frequency (50) packets. The synchronisation information for the connections is added to a queue and
periodically flushed. The synchronisation information for up to 50 connections can be packed into a
single packet. The packets are sent to the slaves using multicast.
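The threshold and frequency rule above can be sketched as follows. This is one reading of the description and not the kernel code itself: it assumes a connection is queued when its packet count reaches the threshold and again every 50 packets after that. The function and constant names are invented for the example.

```python
SYNC_THRESHOLD = 3         # packets seen before the first synchronisation
SYNC_FREQUENCY = 50        # packets between later synchronisations
MAX_CONNS_PER_PACKET = 50  # connection records per synchronisation packet


def should_sync(packet_count):
    """True if a connection with this many packets should be queued for sync."""
    if packet_count < SYNC_THRESHOLD:
        return False
    return (packet_count - SYNC_THRESHOLD) % SYNC_FREQUENCY == 0


def batch(queue):
    """Split queued connection records into packet-sized groups."""
    return [queue[i:i + MAX_CONNS_PER_PACKET]
            for i in range(0, len(queue), MAX_CONNS_PER_PACKET)]


# Packet counts at which one connection would be queued.
print([n for n in range(1, 110) if should_sync(n)])
# Sizes of the packets needed to flush 120 queued records.
print([len(group) for group in batch(list(range(120)))])
```

Under this reading a short-lived connection of one or two packets is never synchronised, which keeps the overhead low for traffic that would not survive a failover anyway.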
Sending and receiving synchronisation information by the master and slaves respectively is done by a
kernel thread. The kernel synchronisation thread is started on the master and slaves using the following
commands.
ipvsadm --start-daemon master     # Run on the Master Linux Director
ipvsadm --start-daemon backup     # Run on the Slave Linux Director

Testing and Debugging
The synchronisation of connections can be monitored using ipvsadm -L -c -n, which lists the LVS
connection table. Connections should first appear on the master linux director. Then after a few
moments, when synchronisation has occurred, they should also appear on the slaves.
ipvsadm -L -c -n     # On the Master Linux Director
IPVS connection entries
pro expire state       source             virtual          destination
TCP 01:00  TIME_WAIT   172.16.4.222:34939 172.17.60.201:80 192.168.6.5:80
TCP 01:01  TIME_WAIT   172.16.4.222:34940 172.17.60.201:80 192.168.6.4:80
TCP 15:00  ESTABLISHED 172.16.4.222:34941 172.17.60.201:80 192.168.6.5:80
ipvsadm -L -c -n     # On the Slave Linux Director
IPVS connection entries
pro expire state       source             virtual          destination
TCP 01.20  ESTABLISHED 172.16.4.222:34939 172.17.60.201:80 192.168.6.5:80
TCP 01.23  ESTABLISHED 172.16.4.222:34940 172.17.60.201:80 192.168.6.4:80
TCP 08.99  ESTABLISHED 172.16.4.222:34941 172.17.60.201:80 192.168.6.5:80

The output shows two connections on the master linux director that are in the TIME_WAIT state, that
is they have been closed by the end user. It also shows one connection in the ESTABLISHED state,
that is the end user and the real server still have an open connection to each other.
Each of these connections has been synchronised to the slave. Note that on the slave, all the
connections are in the ESTABLISHED state. This is due to an optimisation in the LVS code whereby
connections are only synchronised when they are in the ESTABLISHED state. This cuts down
unnecessary synchronisation overhead as the state of the connections on the slave is not critical.
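Comparing the two connection tables by eye becomes tedious with more than a handful of connections. The following is a small sketch of how the comparison could be automated. It assumes the six-column listing produced by ipvsadm -L -c -n has been captured as text on each linux director; the function names are invented for the example. Since the state on the slave is not critical, only the real server chosen for each connection is compared.

```python
MASTER_OUTPUT = """\
IPVS connection entries
pro expire state       source             virtual          destination
TCP 01:00  TIME_WAIT   172.16.4.222:34939 172.17.60.201:80 192.168.6.5:80
TCP 01:01  TIME_WAIT   172.16.4.222:34940 172.17.60.201:80 192.168.6.4:80
TCP 15:00  ESTABLISHED 172.16.4.222:34941 172.17.60.201:80 192.168.6.5:80
"""

SLAVE_OUTPUT = """\
IPVS connection entries
pro expire state       source             virtual          destination
TCP 01.20  ESTABLISHED 172.16.4.222:34939 172.17.60.201:80 192.168.6.5:80
TCP 01.23  ESTABLISHED 172.16.4.222:34940 172.17.60.201:80 192.168.6.4:80
TCP 08.99  ESTABLISHED 172.16.4.222:34941 172.17.60.201:80 192.168.6.5:80
"""


def parse_connections(text):
    """Map (protocol, source, virtual) to (state, destination) per entry."""
    conns = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the banner and header lines; entries start with the protocol.
        if len(fields) != 6 or fields[0] not in ("TCP", "UDP"):
            continue
        pro, _expire, state, source, virtual, destination = fields
        conns[(pro, source, virtual)] = (state, destination)
    return conns


def unsynchronised(master_text, slave_text):
    """ESTABLISHED master connections missing or mismatched on the slave."""
    master = parse_connections(master_text)
    slave = parse_connections(slave_text)
    missing = []
    for key, (state, destination) in master.items():
        if state != "ESTABLISHED":
            continue
        if key not in slave or slave[key][1] != destination:
            missing.append(key)
    return missing


print(unsynchronised(MASTER_OUTPUT, SLAVE_OUTPUT))
```

An empty list means every open connection on the master has a matching entry on the slave.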
You can further monitor which linux director is handling connections by adding the following iptables
rule to each linux director.
iptables -A INPUT -d 172.17.60.201 -j ACCEPT

This can be monitored using iptables -L INPUT -v -n.
iptables -L INPUT -v -n     # On the Active Linux Director
Chain INPUT (policy ACCEPT 1553 packets, 211K bytes)
 pkts bytes target prot opt in out source    destination
    5  1551 ACCEPT all  --  *  *   0.0.0.0/0 172.17.60.201
iptables -L INPUT -v -n     # On the Stand-By Linux Director
Chain INPUT (policy ACCEPT 2233 packets, 328K bytes)
 pkts bytes target prot opt in out source    destination
    0     0 ACCEPT all  --  *  *   0.0.0.0/0 172.17.60.201

To test that connection synchronisation is working correctly, open a connection to the virtual service
while the master linux director is active. Then cause a failover to occur. This can be done by a variety of
means, including powering down the master linux director. At this point the connection should stall.
Once the VIP has failed over to the slave linux director the connection should continue.
Streaming is a useful way to test this, as streaming connections by their nature are open for a long time.
It also provides intuitive feedback as the video and/or music pause and then continue. It is of note that
by increasing the buffer size of the streaming client software the pause can be eliminated.

Problems
The main problem with this implementation is the master/slave relationship. If the master Linux
Director fails and then comes back online, connections made to the slave will not be synchronised to the
master. The next time that a failover occurs, this will cause connections to be terminated. This
could be avoided by starting and stopping the master and backup daemons as failovers occur, but a
peer-to-peer relationship between the synchronisation daemons would be a cleaner approach.

Connection Synchronisation: New Solution
To improve this situation I have written a new synchronisation daemon for LVS. It works on a peer-to-
peer basis where any node may send or receive synchronisation information.
The new synchronisation daemon runs in user space rather than the kernel. Information is received
from LVS in the kernel via a netlink socket. It is then sent to other nodes using multicast UDP. When a
daemon receives information over multicast it reverses this process by sending the information into
LVS in the kernel via a netlink socket.
The idea of moving the code to user space was to allow more sophisticated synchronisation
processing to take place. This is easier to implement and in many ways more appropriately done in user
space than the kernel. Given that synchronisation is not a particularly intensive task, there is no
particular advantage to keeping it in the kernel.
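As a rough illustration of how little information needs to travel between peers, the following sketch packs one connection into a fixed-size record and sends it as a multicast UDP datagram. The record layout, the function names and the default group and port are all assumptions made for this example; they are not the wire format used by ip_vs_user_sync_simple, and the netlink side is left out entirely.

```python
import socket
import struct

# Hypothetical record: protocol number, then the client, virtual service and
# real server endpoints, each a 4-byte IPv4 address and a 2-byte port.
RECORD = struct.Struct("!B4sH4sH4sH")


def pack_conn(protocol, client, virtual, real):
    """Pack one connection into a fixed-size record in network byte order."""
    (c_ip, c_port), (v_ip, v_port), (r_ip, r_port) = client, virtual, real
    return RECORD.pack(protocol,
                       socket.inet_aton(c_ip), c_port,
                       socket.inet_aton(v_ip), v_port,
                       socket.inet_aton(r_ip), r_port)


def unpack_conn(data):
    """Reverse pack_conn, as a receiving peer would."""
    proto, c_ip, c_port, v_ip, v_port, r_ip, r_port = RECORD.unpack(data)
    return (proto,
            (socket.inet_ntoa(c_ip), c_port),
            (socket.inet_ntoa(v_ip), v_port),
            (socket.inet_ntoa(r_ip), r_port))


def send_to_peers(payload, group="224.0.0.81", port=8848):
    """Send one datagram to a multicast group (placeholder group and port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # A TTL of 1 keeps the datagram on the local network segment.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        sock.sendto(payload, (group, port))


record = pack_conn(socket.IPPROTO_TCP,
                   ("172.16.4.222", 34941),
                   ("172.17.60.201", 80),
                   ("192.168.6.5", 80))
print(len(record), unpack_conn(record))
```

Because every peer can both send and receive such records, no node needs a fixed master or slave role.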
The code comprises:

Modified LVS kernel modules to allow the synchronisation daemon to get information about
connections. This has been done by allowing LVS to have arbitrary synchronisation methods
defined and inserted as modules. The default behaviour is the existing master/slave in-kernel
daemons.

Kernel patch to register the new netlink socket.

libip_vs_user_sync: Convenience library for communicating using the netlink socket.

ip_vs_user_sync_simple: Simple synchronisation daemon implemented using this framework.

Available from www.ultramonkey.org

Running
Installing and compiling is a bit tricky as this is new code and there are a number of support libraries
required. Once built, make sure that the LVS kernel synchronisation daemons are not running using
ipvsadm --stop-daemon and start the user space daemon from the command line or using the
ip_vs_user_sync_simple init script.

Testing and Debugging
Debugging messages for ip_vs_user_sync_simple are sent to syslog by default and are typically written
to /var/log/messages. If the daemon is not functioning correctly, it is recommended to run it
with the debug option enabled and have messages logged to the terminal. This can be done by
modifying ip_vs_user_sync_simple.conf or on the command line.
ip_vs_user_sync_simple --debug --log_facility -

Testing is as for the existing connection synchronisation code described previously. However, as there
is no master/backup relationship, connections can be maintained through multiple failovers.

Active-Active
Active/Stand-By Linux Directors offer a good way to provide high availability. However, if one assumes
that a linux director does not fail or get taken down for maintenance very often, then most of the time
one linux director will be idle. Arguably this is a waste of resources. It also means that the maximum
throughput of the network is limited to that of one linux director.
Having Active-Active linux directors addresses this problem by allowing more than one linux director
to load balance connections, for the same virtual services, at the same time.

Figure 9: Active-Active Block Diagram
I have made an implementation of this which works as follows:

Each linux director is given the same hardware (MAC) address and IP address.
  This means that all the linux directors will receive packets for connections for the virtual
  service.
  It also means that there is no longer any need for IP address failover or VRRPv2.

A heartbeat helper, Saru, runs with heartbeat on each linux director.
  Heartbeat doesn't allocate any resources, it just provides a mechanism to determine which
  linux directors are available.
  Saru uses this information to divide the space of all possible incoming connections
  between the linux directors.
  This is done by electing a master which will make the allocations.
  The allocations are done by dividing up blocks of source or destination ports or
  addresses.

A netfilter kernel module is used to only accept packets as dictated by Saru.
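The division of the connection space can be sketched as follows. This is only an illustration of the idea and not Saru's actual algorithm: the number of blocks, the round-robin assignment and the function names are assumptions made for the example, and it divides on source port only.

```python
NUM_BLOCKS = 512  # hypothetical number of blocks the port space is cut into


def allocate_blocks(directors, num_blocks=NUM_BLOCKS):
    """Assign each block to one available linux director, round-robin.

    In the scheme described above this is the job of the elected master.
    """
    if not directors:
        raise ValueError("no linux directors available")
    ordered = sorted(directors)
    return [ordered[i % len(ordered)] for i in range(num_blocks)]


def block_for_port(src_port, num_blocks=NUM_BLOCKS):
    """Map a source port to the block it belongs to."""
    return src_port % num_blocks


def accepts(director, src_port, allocation):
    """True if this director owns the block that the source port falls in."""
    return allocation[block_for_port(src_port, len(allocation))] == director


allocation = allocate_blocks(["A", "B"])
print([d for d in ("A", "B") if accepts(d, 34939, allocation)])

# If linux director B becomes unavailable, its blocks are reallocated.
allocation = allocate_blocks(["A"])
print(accepts("A", 34939, allocation))
```

The property that matters is that every possible connection is accepted by exactly one linux director at any time, so no packet is handled twice and none is dropped by all of them.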

Running
Saru can be run directly from the command line or using its own init script. Messages are logged to
syslog after startup and these typically appear in /var/log/messages. Saru can log more
verbosely by setting the debug option, either in saru.conf or directly on the command line. For
debugging purposes this option is recommended in conjunction with having saru log to the terminal.
saru --debug --log_facility -

By default saru waits 30 seconds after startup before joining the cluster. This is to allow time for
connection synchronisation to occur when a Linux Director boots up. This can be configured at run
time, again either in saru.conf or directly on the command line.
The MAC and IP address of an interface can be set using the ip command.
ip link set eth0 down
ip link set eth0 address 00:50:56:14:03:40
ip link set eth0 up
ip route add default via 172.16.0.254
ip addr add dev eth0 192.168.20.40/24 broadcast 255.255.255.0

Rules to filter out all traffic to the VIP that is not accepted by Saru are inserted using the iptables
command. These rules assume that connection synchronisation will be used. If this is not the case then
netfilter's connection tracking should be used to ensure that a given connection will always be handled
by the same linux director.
iptables -F
iptables -A INPUT -d 172.17.60.201 -p tcp -m saru --id 1 -j ACCEPT
iptables -A INPUT -d 172.17.60.201 -p udp -m saru --id 1 -j ACCEPT
iptables -A INPUT -d 172.17.60.201 -p icmp -m icmp --icmp-type echo-request \
    -m saru --id 1 --sense src-addr -j ACCEPT
iptables -A INPUT -d 172.17.60.201 -p icmp -m icmp --icmp-type ! echo-request \
    -j ACCEPT
iptables -A INPUT -d 172.17.60.201 -j DROP

If LVS-NAT is being used then the following rules are also required to prevent all the Linux Directors
sending replies on behalf of the real servers.
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -d 192.168.6.0/24 -j ACCEPT
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -m state --state INVALID \
    -j DROP
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -m state --state ESTABLISHED \
    -j ACCEPT
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -m state --state RELATED \
    -j ACCEPT
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -p tcp -m state --state NEW \
    --tcp-flags SYN,ACK,FIN,RST SYN -m saru --id 1 -j MASQUERADE
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -p udp -m state --state NEW \
    -m saru --id 1 -j MASQUERADE
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -p icmp \
    -m icmp --icmp-type echo-request -m state --state NEW \
    -m saru --id 1 --sense dst-addr -j MASQUERADE
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -p icmp \
    -m icmp --icmp-type ! echo-request -m state --state NEW \
    -j MASQUERADE
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -j DROP

Testing and Debugging
Which linux director is accepting packets for an individual connection can be monitored using
iptables -L INPUT -n -v. The output below shows a connection that was load balanced by
Linux Director A: its packets are counted against the tcp ACCEPT rule on Linux Director A and
against the final DROP rule on Linux Director B.
iptables -L INPUT -n -v     # On Linux Director A
Chain INPUT (policy ACCEPT 92541 packets, 14M bytes)
 pkts bytes target prot opt in out source    destination
    5  1551 ACCEPT tcp  --  *  *   0.0.0.0/0 172.17.60.201 saru id 1 sense src-port
    0     0 ACCEPT udp  --  *  *   0.0.0.0/0 172.17.60.201 saru id 1 sense src-port
    0     0 ACCEPT icmp --  *  *   0.0.0.0/0 172.17.60.201 icmp type 8 saru id 1 sense src-addr
    0     0 ACCEPT icmp --  *  *   0.0.0.0/0 172.17.60.201 icmp !type 8
    0     0 DROP   all  --  *  *   0.0.0.0/0 172.17.60.201
iptables -L INPUT -n -v     # On Linux Director B
Chain INPUT (policy ACCEPT 92700 packets, 15M bytes)
 pkts bytes target prot opt in out source    destination
    0     0 ACCEPT tcp  --  *  *   0.0.0.0/0 172.17.60.201 saru id 1 sense src-port
    0     0 ACCEPT udp  --  *  *   0.0.0.0/0 172.17.60.201 saru id 1 sense src-port
    0     0 ACCEPT icmp --  *  *   0.0.0.0/0 172.17.60.201 icmp type 8 saru id 1 sense src-addr
    0     0 ACCEPT icmp --  *  *   0.0.0.0/0 172.17.60.201 icmp !type 8
    5  1551 DROP   all  --  *  *   0.0.0.0/0 172.17.60.201

Conclusion
LVS is an effective way to implement clustering of Internet services. Tools such as heartbeat,
ldirectord and keepalived can be used to give the cluster high availability. There are a number of other
techniques that can be used to further enhance LVS clusters, including using active feedback to
determine the proportion of connections allocated to each of the real servers, and using connection
synchronisation and active-active techniques to allow multiple linux directors to work together better.
LVS itself is a very powerful tool and has many features that were not within the scope of this
presentation. These include firewall marks to group virtual services, specialised scheduling algorithms
and various tuning parameters. Beyond that there is much scope for further expanding the functionality
of LVS to meet the new needs of users and to reflect the ever-increasing complexity of the Internet.

Sample Configuration files for keepalived
Sample configuration file for keepalived master.
global_defs {
    notification_email {
        [email protected]
    }
    notification_email_from [email protected]
    smtp_server 210.128.90.2
    smtp_connect_timeout 30
    lvs_id LVS_DEVEL
}
vrrp_sync_group VG1 {
    group {
        VI_1
        VI_2
    }
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.17.60.201
    }
}
vrrp_instance VI_2 {
    state MASTER
    interface eth1
    virtual_router_id 52
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.6.1
    }
}
virtual_server 172.17.60.201 80 {
    delay_loop 6
    lb_algo lc
    lb_kind NAT
    nat_mask 255.255.255.0
    ! persistence_timeout 50
    protocol TCP
    real_server 192.168.6.4 80 {
        weight 1
        HTTP_GET {
            url {
                path /
                digest 55fd843c4e99e96c1ef28e7dbb10c51b
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 192.168.6.5 80 {
        weight 1
        HTTP_GET {
            url {
                path /
                digest 90bfbce6bc089a41f1fddca9aeaba452
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    sorry_server 127.0.0.1 80
}
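The digest values in the HTTP_GET blocks deserve a note. The following sketch assumes that the digest is the MD5 sum of the body returned for the given path, which is what keepalived's genhash utility is understood to report. The function names are invented for the example, and no HTTP request is made: the body is passed in as bytes.

```python
import hashlib


def page_digest(body):
    """Return the MD5 sum of a response body as a lowercase hex string."""
    return hashlib.md5(body).hexdigest()


def is_healthy(body, expected_digest):
    """Compare a fetched body against the digest from the configuration."""
    return page_digest(body) == expected_digest.strip().lower()


# A stand-in page body; a real check would use the bytes fetched from "/".
body = b"hello"
print(page_digest(body))
print(is_healthy(body, page_digest(body)))
```

One consequence of this style of check is that the digest in the configuration has to be regenerated whenever the content of the checked page changes, so a small static page is the easiest thing to check.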

Sample Configuration file for keepalived (Slave)
global_defs {
    notification_email {
        [email protected]
    }
    notification_email_from [email protected]
    smtp_server 210.128.90.2
    smtp_connect_timeout 30
    lvs_id LVS_DEVEL
}
vrrp_sync_group VG1 {
    group {
        VI_1
        VI_2
    }
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.17.60.201
    }
}
vrrp_instance VI_2 {
    state BACKUP
    interface eth1
    virtual_router_id 52
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.6.1
    }
}
virtual_server 172.17.60.201 80 {
    delay_loop 6
    lb_algo lc
    lb_kind NAT
    nat_mask 255.255.255.0
    ! persistence_timeout 50
    protocol TCP
    real_server 192.168.6.4 80 {
        weight 1
        HTTP_GET {
            url {
                path /
                digest 55fd843c4e99e96c1ef28e7dbb10c51b
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 192.168.6.5 80 {
        weight 1
        HTTP_GET {
            url {
                path /
                digest 90bfbce6bc089a41f1fddca9aeaba452
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    sorry_server 127.0.0.1 80
}

Bibliography
1
S. Knight et al.
RFC 2338: Virtual router redundancy protocol.
http://www.ietf.org/, April 1998.
2
Y. Rekhter et al.
RFC 1918: Address allocation for private internets.
http://www.ietf.org/, February 1996.
3
Jeremy Kerr.
Using dynamic feedback to optimise load balancing decisions.
http://www.redfishsoftware.com.au/projects/feedbackd/lcapaper.pdf, January 2003.
4
Netfilter Core Team.
Netfilter: firewalling, NAT and packet mangling for Linux 2.4.
http://www.netfilter.org/, 2003.

Horms 2004-06-23
