Open Mind Tree: Lecture Notes on Skip Lists

Introduction to Algorithms	October 25, 2005
Massachusetts Institute of Technology	6.046J/18.410J
Professors Erik D. Demaine and Charles E. Leiserson	Handout 17

Lecture Notes on Skip Lists

Lecture 12 — October 26, 2005

Erik Demaine

•Balanced tree structures we know at this point: red-black trees, B-trees, treaps.

•Could you implement them right now? Probably, with time. . . but without looking up any details?

•Skip lists are a simple randomized structure you’ll never forget.

Starting from scratch

•Initial goal: just searches — ignore updates (Insert/Delete) for now

•Simplest data structure: linked list

•Sorted linked list: (n) time

•2 sorted linked lists:

–Each element can appear in 1 or both lists

–How to speed up search?

–Idea: Express and local subway lines

–Example: 14 , 23, 34 , 42 , 50, 59, 66, 72 , 79, 86, 96 , 103, 110, 116, 125

(What is this sequence?)

–Boxed values are “express” stops; others are normal stops

–Can quickly jump from express stop to next express stop, or from any stop to next normal stop

–Represented as two linked lists, one for express stops and one for all stops:

103

110

116

125

–Every element is in bottom linked list (L2); some elements also in top linked list (L1)

–Link equal elements between the two levels

–To search, ﬁrst search in L1 until about to go too far, then go down and search in L2

2					Handout 17: Lecture Notes on Skip Lists
– Cost:			+ \|L2\| = L					n
		L	+ \|L2\| = L			+		n
– Minimized when	\|		1\|	\|L1\|	\| 1\|		\|L1\|
– Minimized when
					n
				\|L1\| = \|L1\|
		=		\|L1\|2 = n
		=		\|L1\| =	8
		=		\|L1\| =	n			8
		= search cost = 2						n
– Resulting 2-level structure:
sqrt n
sqrt n	sqrt n				sqrt n			sqrt n
• 3 linked lists: 3 · 83 n
• k linked lists: k · 8k n
• lg n linked lists: lg n · lg8n n = lg n · n1/ lg n = (lg n)
			=2

– Becomes like a binary tree:

14								79
14				50				79				110
14		34		50		66		79		96		110		125
14	23	34	42	50	59	66	72	79	86	96	103	110	116	125

–(In fact, a level-linked B+-tree; see Problem Set 5.)

–Example: Search for 72

Level 1: 14 too small, 79 too big; go down 14

Level 2: 14 too small, 50 too small, 79 too big; go down 50Level 3: 50 too small, 66 too small, 79 too big; go down 66Level 4: 66 too small, 72 spot on

Handout 17: Lecture Notes on Skip Lists

Insert

•New element should certainly be added to bottommost level (Invariant: Bottommost list contains all elements)

•Which other lists should it be added to?

(Is this the entire balance issue all over again?)

•Idea: Flip a coin

–With what probability should it go to the next level?

–To mimic a balanced binary tree, we’d like half of the elements to advance to the next- to-bottommost level

–So, when you insert an element, ﬂip a fair coin

–If heads: add element to next level up, and ﬂip another coin (repeat)

•Thus, on average:

–1/2 the elements go up 1 level

–1/4 the elements go up 2 levels

–1/8 the elements go up 3 levels

–Etc.

•Thus, “approximately even”

Example

•Get out a real coin and try an example

•You should put a special value -* at the beginning of each list, and always promote this special value to the highest level of promotion

•This forces the leftmost element to be present in every list, which is necessary for searching

.. . many coins are ﬂipped . . .

(Isn’t this easy?)

•The result is a skip list.

•It probably isn’t as balanced as the ideal conﬁgurations drawn above.

•It’s clearly good on average.

•Claim it’s really really good, almost always.

4	Handout 17: Lecture Notes on Skip Lists

Analysis: Claim of With High Probability

•Theorem: With high probability, every search costs O(lg n) in a skip list with n elements

•What do we need to do to prove this? [Calculate the probability, and show that it’s high!]

•We need to deﬁne the notion of “with high probability”; this is a powerful technical notion, used throughout randomized algorithms

•Informal deﬁnition: An event occurs with high probability if, for any → 1, there is an appropriate choice of constants for which E occurs with probability at least 1 - O(1/n )

•In reality, the constant hidden within O(lg n) in the theorem statement actually depends on c.

•Precise deﬁnition: A (parameterized) event E occurs with high probability if, for any→1, E occurs with probability at least 1 - c /n , where c is a “constant” depending only on .

•The term O(1/n ) or more precisely c /n is called the error probability

•The idea is that the error probability can be made very very very small by setting to something big, e.g., 100

Analysis: Warmup

•Lemma: With high probability, skip list with n elements has O(lg n) levels

•(In fact, the number of levels is (log n), but we only need an upper bound.)

•Proof:

– Pr{element x is in more than c lg n levels} = 1/2c lg n = 1/nc

– Recall Boole’s inequality / union bound:

Pr{E1 = E2 = · · · = Ek } Pr{E1} + Pr{E2} + · · · + Pr{Ek}

–Applying this inequality:

Pr{any element is in more than c lg n levels} n · 1/nc = 1/nc-1

–Thus, error probability is polynomially small and exponent ( = c - 1) can be made arbitrarily large by appropriate choice of constant in level bound of O(lg n)

Handout 17: Lecture Notes on Skip Lists

Analysis: Proof of Theorem

•Cool idea: Analyze search backwards—from leaf to root

–Search starts at leaf (element in bottommost level)

–At each node visited:

If node wasn’t promoted higher (got TAILS here), then we go [came from] leftIf node was promoted higher (got HEADS here), then we go [came from] up

–Search stops at root of tree

•Know height is O(lg n) with high probability; say it’s c lg n

•Thus, the number of “up” moves is at most c lg n with high probability

•Thus, search cost is at most the following quantity:

How many times do we need to ﬂip a coin to get c lg n heads?

• Intuitively, (lg n)

Analysis: Coin Flipping

•Claim: Number of ﬂips till c lg n heads is (lg n) with high probability

•Again, constant in (lg n) bound will depend on

•Proof of claim:

–Say we make 10c lg n ﬂips

–When are there at least c lg n heads?

1 c lg n 1 9c lg n

–Pr{exactly c lg n heads} = c lg n · 2 · 2

HHHTTTordersvs. HTHTHT

heads

tails

– Pr{at most c lg n heads} 10cclglgnn ·

9c lg n

overestimateon orders

tails

– Recall bounds on xy :

x xy e xy

6	Handout 17: Lecture Notes on Skip Lists

– Applying this formula to the previous equation:

Pr{at most c lg n heads}		10cclglgnn	21 9c lg n
		e ·c10lgcnlg n c lg n · 21		9c lg n

	=	(10e)c lg n ·	21 9c lg n
	=	2lg(10e)·c lg n · 21 9c lg n
	=	2(lg(10e)-9)c lg n
	=	2- lg n
	=	1/n

–The point here is that, as 10 *, = 9 - lg(10e) *, independent of (for all) c

•End of proof of claim and theorem

Acknowledgments

This lecture is based on discussions with Michael Bender at SUNY Stony Brook.

Open Mind Tree

Lecture Notes on Skip Lists

0 comments:

Popular Posts

Visitors

Archives

Infolinks In Text Ads

Featured Posts

Blogger Tips

14								79
14				50				79				110
14		34		50		66		79		96		110		125
14	23	34	42	50	59	66	72	79	86	96	103	110	116	125

14								79
14				50				79				110
14		34		50		66		79		96		110		125
14	23	34	42	50	59	66	72	79	86	96	103	110	116	125

14								79
14				50				79				110
14		34		50		66		79		96		110		125
14	23	34	42	50	59	66	72	79	86	96	103	110	116	125