Welcome

首页 / 软件开发 / 数据结构与算法 / Geeks 面试题之Optimal Binary Search Tree

Geeks 面试题之Optimal Binary Search Tree2015-09-24

Dynamic Programming | Set 24 (Optimal Binary Search Tree)

Given a sorted array keys[0.. n-1] of search keys and an array freq[0.. n-1] of frequency counts, where freq[i] is the number of searches to keys[i]. Construct a binary search tree of all keys such that the total cost of all the searches is as small as possible.

Let us first define the cost of a BST. The cost of a BST node is level of that node multiplied by its frequency. Level of root is 1.

Example 1Input:keys[] = {10, 12}, freq[] = {34, 50}There can be following two possible BSTs 10 12 /12 10I IIFrequency of searches of 10 and 12 are 34 and 50 respectively.The cost of tree I is 34*1 + 50*2 = 134The cost of tree II is 50*1 + 34*2 = 118 Example 2Input:keys[] = {10, 12, 20}, freq[] = {34, 8, 50}There can be following possible BSTs1012 20 1020  / /  /1210 20 12 20 10 / /  201012 12 I II III IV VAmong all possible BSTs, cost of the fifth BST is minimum.Cost of the fifth BST is 1*50 + 2*34 + 3*8 = 142
1) Optimal Substructure:
The optimal cost for freq[i..j] can be recursively calculated using following formula.

We need to calculate optCost(0, n-1) to find the result.

The idea of above formula is simple, we one by one try all nodes as root (r varies from i to j in second term). When we make rth node as root, we recursively calculate optimal cost from i to r-1 and r+1 to j.
We add sum of frequencies from i to j (see first term in the above formula), this is added because every search will go through root and one comparison will be done for every search.

http://www.geeksforgeeks.org/dynamic-programming-set-24-optimal-binary-search-tree/

关键:

如何计算整个优化BST的查找次数,这里不用按上面说的一层一层地去计算,

而是巧妙地计算:每增加一个根节点的时候,根节点下面的所有节点的访问次数都需要增加1倍,那么只需要把这些节点的总和加起来就可以了,因为根节点的访问次数也是一倍,所以可以连根节点也加进球,就得到最后的总访问频率了。

经验:一些看起来简单的小函数,要用最优的方法实现就不是那么容易了。

原网站的时间效率是O(n*n*n*n),原文也提到可以提高到O(n*n*n)。

下面给出两个程序,时间效率都提高到了O(n*n*n)。

第一个程序只需要修改一下计算sum的位置就可以了,

第二个程序增加计算了一个sumTable,记录了所有sum,实际运行效率会稍微快点。