
Design and Analysis of Algorithms
CHAPTER SIX: PROBABILISTIC ANALYSIS AND RANDOMIZED ALGORITHMS
Definition
 A probabilistic analysis is an approach to estimate the computational complexity of
an algorithm based on an assumption about a probabilistic distribution of the set of
all possible inputs.
 This assumption is then used to design an efficient algorithm or to derive the
complexity of a known algorithm. This approach is not the same as that of
probabilistic algorithms, but the two may be combined.
 Probabilistic algorithms, also known as randomized algorithms, are algorithms that employ a degree
of randomness as part of their logic or procedure.
 The algorithm typically uses uniformly random bits as an auxiliary input to guide its
behavior, in the hope of achieving good performance in the "average case" over all
possible choices.
Cont...
[Figure: side-by-side diagrams. A deterministic algorithm maps Input → Algorithm → Output; a non-deterministic (randomized) algorithm additionally feeds the output of a random number generator into the algorithm.]
 A randomized algorithm is one that makes random choices during its execution. The behavior of
such an algorithm may thus be random even on a fixed input.
 The design and analysis of a randomized algorithm focuses on establishing that it is likely to
behave “well” on every input; the likelihood in such a statement depends only on the probabilistic
choices made by the algorithm during execution and not on any assumptions about the input.
 Two benefits of randomized algorithms have made them popular: simplicity and efficiency. For
many applications, a randomized algorithm is the simplest algorithm available, or the fastest, or
both.
Hiring Problem
 Suppose that you need to hire a new office assistant. Your previous attempts at
hiring have been unsuccessful, and you decide to use an employment agency.
 The agency sends you one candidate per day: interview and decide.
 Cost to interview is ci per candidate (fee to agency).
 Cost to hire is ch per candidate (includes firing prior assistant and fee to agency).
 ch > ci
 You always hire the best candidate seen so far.
Cont…
HIRE-ASSISTANT(n)
1. best = 0 // candidate 0 is a least-qualified dummy candidate
2. for i = 1 to n
3. interview candidate i
4. if candidate i is better than candidate best
5. best = i
6. hire candidate i
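
A minimal Python sketch of this pseudocode (assuming, for illustration only, that each candidate is summarized by a numeric quality score, higher being better; the function name and cost parameters are invented for this example):

def hire_assistant(scores, c_i, c_h):
    # Interview candidates in the given order; hire whenever the current
    # candidate is better than the best candidate seen so far.
    best = float("-inf")       # plays the role of the dummy candidate 0
    hires = 0
    for score in scores:       # interview candidate i
        if score > best:       # candidate i is better than candidate best
            best = score
            hires += 1         # hire candidate i
    return c_i * len(scores) + c_h * hires, hires

For example, hire_assistant([3, 1, 4, 5, 2], c_i=1, c_h=10) interviews five candidates, hires three of them (scores 3, 4, and 5), and returns a total cost of 35.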
 What is the cost of this strategy?
 If we interview n candidates and hire m of them, the cost is O(ci·n + ch·m).
 We always interview all n candidates and ci is small, so we focus on ch·m.
 ch·m varies with each run and depends on the interview order.
 This is a common paradigm: finding the maximum or minimum in a sequence by examining
each element, and changing the winner m times.
Cont…
 Best Case
 If each candidate is worse than all who came before, we hire one candidate:
 O(cin + ch) = O(cin)
 Worst Case
 If each candidate is better than all who came before, we hire all n (m = n):
 O(cin + chn) = O(chn) since ch > ci
 But this is pessimistic. What happens in the average case? That is where probabilistic analysis
comes in.
 For probabilistic analysis
 We must know or make assumptions about the distribution of inputs.
 The expected cost is over this distribution.
 The analysis will give us average case running time.
Cont…
 We don't have this information for the Hiring Problem, but suppose we could assume that candidates
come in random order. Then the analysis can be done by counting permutations:
 Each ordering of candidates (relative to some reference ordering such as a ranking of the candidates) is
equally likely to be any of the n! permutations of the candidates.
 In how many do we hire once? twice? three times? ... n−1 times? n times?
 It depends on how many permutations have zero, one, two, ..., n−2, or n−1 candidates that come before a
better candidate.
 This is complicated!
 Instead, we can do this analysis with indicator variables…
Randomized Hiring
 We might not know the distribution of inputs or be able to model it. Instead we randomize within the algorithm
to impose a distribution on the inputs.
 An algorithm is randomized if its behavior is determined in parts by values provided by a random number
generator.
 This requires a change in the hiring problem scenario: The employment agency sends us a list of n candidates in
advance and lets us choose the interview order. We choose to interview them in random order. Thus we take
control of the question of whether the input is randomly ordered: we enforce random order, so the average case
becomes the expected value.
HIRE-ASSISTANT(Randomize(n)) //How?
1. best = 0 // candidate 0 is a least-qualified dummy candidate
2. for i = 1 to n
3. interview candidate i
4. if candidate i is better than candidate best
5. best = i
6. hire candidate i
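
One standard way to implement the Randomize step asked about above is the Fisher-Yates shuffle, sketched below in Python (the standard library's random.shuffle does the same job in place):

import random

def randomize(candidates):
    # Fisher-Yates shuffle: return a uniformly random permutation of the input.
    a = list(candidates)
    for i in range(len(a) - 1, 0, -1):
        j = random.randint(0, i)   # j drawn uniformly from 0..i
        a[i], a[j] = a[j], a[i]
    return a

Combined with the earlier hire_assistant sketch, randomized hiring is then hire_assistant(randomize(scores), c_i, c_h): the expected hiring cost no longer depends on the order in which the agency lists the candidates.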
Probabilistic Analysis with Indicator Random Variables
 Here we introduce technique for computing the expected value of a random variable, even when
there is dependence between variables. Two informal definitions will get us started:
 A random variable (e.g., X) is a variable that takes on any of a range of values according to a
probability distribution.
 The expected value of a random variable (e.g., E[X]) is the average value we would observe if we
sampled the random variable repeatedly.
 Indicator Random Variables
 Given sample space S and event A in S, define the indicator random variable
I{A} = 1 if A occurs, and 0 if A does not occur.
 We will see that indicator random variables simplify analysis by letting us work with the probability of the
values of a random variable separately.
Cont…
 Lemma 1
 For an event A, let XA = I{A}. Then the expected value E[XA] = Pr{A} (the probability of event A).
Proof: Let ¬A be the complement of A. Then
E[XA] = E[I{A}] (by definition)
= 1*Pr{A} + 0*Pr{¬A} (definition of expected value)
= Pr{A}.
Example: Consider the scenario of flipping a coin.
 What is the expected number of heads when flipping a fair coin once?
 Sample space S = {H, T}
 Pr{H} = Pr{T} = 1/2
 Define indicator random variable XH= I{H}, which counts the number of heads in one flip.
 Since Pr{H} = 1/2, Lemma 1 says that E[XH] = 1/2.
Cont…
 What is the expected number of heads when we flip a fair coin n times?
 Let X be a random variable for the number of heads in n flips.
 We could compute E[X] = Σ_{k=0}^{n} k · Pr{X = k}, that is, compute and add the probability of there being 0 heads total, 1 head total,
2 heads total, ..., n heads total.
 Instead, use indicator random variables to count the number of heads:
 For i = 1, 2, ..., n define Xi = I{the ith flip results in event H}.
 Then X = X1 + X2 + ... + Xn, and Lemma 1 says that E[Xi] = Pr{H} = 1/2 for i = 1, 2, ..., n.
 Expected number of heads is E[X] = E[X1 + X2 + ... + Xn].
 Problem: We don't have E[X1 + X2 + ... + Xn]; we only have E[X1], E[X2], ..., E[Xn].
 Solution:
 Linearity of expectation: the expectation of a sum equals the sum of the expectations. So
E[X] = E[X1 + X2 + ... + Xn] = E[X1] + E[X2] + ... + E[Xn] = n · (1/2) = n/2.
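
A quick Monte Carlo check of this conclusion, using only the standard library; the average of the summed indicator variables should come out near n/2 (here 5.0):

import random

n, trials = 10, 100_000
total = 0
for _ in range(trials):
    # X = X1 + X2 + ... + Xn, where Xi = 1 if flip i comes up heads, else 0
    total += sum(1 if random.random() < 0.5 else 0 for _ in range(n))
print(total / trials)   # typically prints a value close to n / 2 = 5.0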
For hiring…
 Assume that the candidates arrive in random order.
 Let X be the random variable for the number of times we hire a new office assistant.
 Define indicator random variables X1, X2, ... Xn where Xi = I{candidate i is hired}.
 We will rely on these properties:
 X = X1 + X2 + ... + Xn (The total number of hires is the sum of whether we did each individual hire (1) or
not (0).)
 Lemma 1 implies that E[Xi] = Pr{candidate i is hired}.
 We need to compute Pr{candidate i is hired}:
 Candidate i is hired iff candidate i is better than candidates 1, 2, ..., i−1
 Assumption of random order of arrival means each of the first i candidates is equally likely to be the best
one so far.
 Thus, Pr{candidate i is the best so far} = 1/i.
For hiring…
 By Lemma 1, E[Xi] = 1/i, a fact that lets us compute E[X]:
E[X] = E[X1 + X2 + ... + Xn] = E[X1] + E[X2] + ... + E[Xn] = 1/1 + 1/2 + ... + 1/n
 The sum 1/1 + 1/2 + ... + 1/n is a harmonic series, so E[X] = ln n + O(1).
 Therefore, the expected hiring cost is O(ch ln n), much better than the worst case O(chn)!
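
A small simulation sketch of this bound (the values of n and the trial count are arbitrary choices for illustration); the average number of hires should track the harmonic number Hn = 1/1 + 1/2 + ... + 1/n ≈ ln n:

import math
import random

n, trials = 100, 10_000
total_hires = 0
for _ in range(trials):
    order = random.sample(range(n), n)    # candidates arrive in random order
    best, hires = -1, 0
    for rank in order:
        if rank > best:                   # better than every candidate seen so far
            best, hires = rank, hires + 1
    total_hires += hires
h_n = sum(1 / i for i in range(1, n + 1))
# the average is close to H_n (about 5.19 for n = 100); ln n is smaller by
# roughly 0.58 (the Euler-Mascheroni constant)
print(total_hires / trials, h_n, math.log(n))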
Categories of RA…
 As we have seen, a randomized algorithm is an algorithm that employs a degree of randomness as part
of its logic or procedure in the hope of achieving good performance in the "average case" over all
possible choices.
 There are two categories of randomized algorithms:
1. Las-Vegas RA: A randomized algorithm is called a Las Vegas algorithm if it always returns the correct
answer, but its runtime bounds hold only in expectation. In Las Vegas algorithms, runtime is at the mercy of
randomness, but the algorithm always succeeds in giving a correct answer.
Example: Randomized Quicksort
2. Monte-Carlo RA: A randomized algorithm is called a Monte Carlo algorithm if it may fail or return an
incorrect answer, but its runtime bounds are deterministic (they do not depend on the random choices).
Example: Randomized Primality Testing
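
As an illustration of the Monte Carlo category, the sketch below is a minimal Fermat primality test in Python (a simpler relative of the Miller-Rabin test that is usually used in practice; the round count k is an arbitrary choice):

import random

def fermat_is_probably_prime(n, k=20):
    # Monte Carlo: the loop runs exactly k rounds regardless of the random
    # draws, so the runtime does not depend on the randomness, but the answer
    # "probably prime" can occasionally be wrong (notably for Carmichael numbers).
    if n < 4:
        return n in (2, 3)
    for _ in range(k):
        a = random.randrange(2, n - 1)
        if pow(a, n - 1, n) != 1:      # a is a Fermat witness: n is composite
            return False
    return True                         # no witness found: probably prime

By contrast, a Las Vegas algorithm such as randomized quicksort always returns the correct (sorted) output; only its running time varies with the random pivot choices.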
