Showing posts with label Random Selection. Show all posts
Showing posts with label Random Selection. Show all posts

Sunday, September 9, 2007

Online selection algorithm. Random Selection in Linked Lists with Unknown Length

Question: Given a finite linked list with unknown length(in terms of nodes count), select one item(multiple nodes) with equal possibility for every node in the linked list.

* One item only.
Assuming the input list has k nodes. The possibility for each node to be selected is 1/k. We need to guarantee that after we visiting m nodes, the possibility for each node is 1/m.



Proof of correctness:
When i = 0, node 0 is selected with 100% possibility.
When i = 1, node 1 has 50% chance to replace node 0 as selection result. Both of node 1 and node 0 will have equal chance to be chosen.
When i = n(which means we are testing the (n+1)th node), before line 3 is executed, all nodes from node 0 to node n-1 have possibility of 1/n to be selected. The (n+1)th node has 1/(n+1) possibility to be selected at line 3. For the previous selected node, it has a possibility of n/(n+1) to survive this round. Multiplying with its chance to be selected: 1/n, we get 1/(n+1) as well,

*Extended: Multiple items.
Assuming we have k nodes in the list and we are targeting a result set of size m,
each node will have possibility of m/k to be added to the result set.