Optimization of Numeric Data Using a General Tree

Computer systems are used to store large amounts of data from which individual records must be retrieved according to some search criterion. Thus the efficient storage of data to facilitate fast searching is an important issue. The next important issue is fetching the data and arranging it in some format as per the user's requirement, all within a limited or short time. This paper presents a new internal searching algorithm for processing integer data held in Random Access Memory (RAM). The incoming data are arranged in a tree structure, which makes the search simpler by reducing the number of comparisons below that involved in a binary search tree. The arrangement of the data rests on a basic mathematical classification of the number system. The tree arrangement begins as the user starts supplying the data one by one or from a file. It overcomes the limitations of the Binary Search Tree (BST). The data are reduced to their simpler form and divided into levels depending on the place value of each digit in the data. Once the data are arranged, the searching process starts, and the path followed will be in the right order, as in a BST, as the input data is given for search. The maximum number of comparisons is decided by the number of digits in the largest number in the list, and not merely by the total number of data items.

Keywords: internal searching algorithm, binary search tree, number system


Let's examine how long it will take to find an item matching a key in the collections we have discussed so far. We're interested in:

the average time

the worst-case time and

the best possible time.

However, we will generally be most concerned with the worst-case time, as calculations based on worst-case times can lead to guaranteed performance predictions. Conveniently, the worst-case times are generally easier to calculate than average times.

If there are n items in our collection, whether it is stored as an array or as a linked list, then it is obvious that in the worst case, when there is no item in the collection with the desired key, n comparisons of the key with the keys of the items in the collection will have to be made.

To simplify analysis and comparison of algorithms, we look for a dominant operation and count the number of times that dominant operation has to be performed. In the case of searching, the dominant operation is the comparison. Since the search requires n comparisons in the worst case, we say this is an O(n) (pronounced "big-Oh-n" or "Oh-n") algorithm. The best case, in which the first comparison returns a match, requires a single comparison and is O(1). The average time depends on the probability that the key will be found in the collection, which is something that we would not expect to know in the majority of cases. Thus in this case, as in most others, estimation of the average time is of little utility. If the performance of the system is vital, e.g. it's part of a life-critical system, then we must use the worst case in our design calculations as it represents the best guaranteed performance.
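The counting argument above can be illustrated with a minimal C sketch. The function name linear_search and its comparisons out-parameter are assumptions made for this example and do not come from the original text; the sketch simply counts the dominant operation while scanning an unsorted array of int keys.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: linear search over an unsorted int array.
 * Returns the index of key, or -1 if key is absent. If comparisons is
 * not NULL, it receives the number of key comparisons performed. */
int linear_search(const int *items, size_t n, int key, size_t *comparisons)
{
    assert(items != NULL || n == 0);

    size_t count = 0;
    int found = -1;

    for (size_t i = 0; i < n; i++) {
        count++;                      /* one comparison of key with items[i] */
        if (items[i] == key) {
            found = (int)i;
            break;
        }
    }
    if (comparisons != NULL)
        *comparisons = count;
    return found;
}
```

Searching for the first element costs one comparison (the O(1) best case), while searching for a key that is not present costs n comparisons (the O(n) worst case).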

However, if we place our items in an array and sort them in either ascending or descending order on the key first, then we can obtain much better performance with an algorithm called binary search.

In binary search, we first compare the key with the item in the middle position of the array. If there's a match, we can return immediately. If the key is less than the middle key, then the item sought must lie in the lower half of the array; if it's greater, then the item sought must lie in the upper half of the array. So we repeat the procedure on the lower (or upper) half of the array.

Our FindInCollection function can now be implemented:
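A minimal sketch of such an implementation in C follows. It assumes a collection that holds int keys in ascending order; the struct layout and its field names are illustrative assumptions rather than the original listing, while the names bin_search and FindInCollection are taken from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical collection type: a sorted array of int keys. */
struct collection {
    int *items;     /* keys in ascending order */
    int item_cnt;   /* number of keys stored */
};

/* Recursive binary search over items[low..high].
 * Declared static, so it is visible only inside this source file. */
static const int *bin_search(const struct collection *c,
                             int low, int high, int key)
{
    if (low > high)
        return NULL;                       /* empty partition: not found */

    int mid = low + (high - low) / 2;      /* avoids overflow of low + high */

    if (key == c->items[mid])
        return &c->items[mid];             /* match in the middle */
    if (key < c->items[mid])
        return bin_search(c, low, mid - 1, key);   /* lower half */
    return bin_search(c, mid + 1, high, key);      /* upper half */
}

/* Returns a pointer to the matching key, or NULL if it is absent. */
const int *FindInCollection(const struct collection *c, int key)
{
    assert(c != NULL);
    return bin_search(c, 0, c->item_cnt - 1, key);
}
```

For a collection holding 1, 3, 5, 7, 9, a call such as FindInCollection(&c, 7) returns a pointer to the fourth element, and a search for 4 returns NULL.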

Points to note:

1. bin_search is recursive: it determines whether the search key lies in the lower or upper half of the array, then calls itself on the appropriate half.

2. There is a termination condition (two of them in fact!)

i. If low > high then the partition to be searched has no elements in it and

ii. If there is a match with the element in the middle of the current partition, then we can return immediately.

3. AddToCollection will need to be modified to ensure that each item added is placed in its correct place in the array. The procedure is simple:

i. Search the array until the correct place to insert the new item is found,

ii. Move all the following items up one position and

iii. Insert the new item into the empty position thus created.

4. bin_search is declared static. It is a local function and is not used outside this class (source file): if it were not declared static, it would be exported and be available to all parts of the program. The static declaration also allows other classes to use the same name internally.


Each step of the algorithm divides the block of items being searched in half. We can divide a set of n items in half at most log2 n times.

Thus the running time of a binary search is proportional to log n and we say this is an O(log n) algorithm.
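The halving argument can be written out as a short derivation. The symbol k (the number of halvings performed so far) is introduced here for illustration only:

```latex
% After k halvings, at most n / 2^k items remain in the partition.
% Halving can continue only while at least one item remains:
\frac{n}{2^{k}} \ge 1 \iff k \le \log_2 n
% Hence at most \lfloor \log_2 n \rfloor halvings are possible, and the
% worst case makes one comparison per partition examined:
C_{\max}(n) = \lfloor \log_2 n \rfloor + 1
```

For example, with n = 1,000,000 items, log2 n is about 19.93, so a binary search examines at most 20 partitions, against up to 1,000,000 comparisons for a linear search.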

Binary search requires a more complex program than our original search and thus for small n it may run slower than the simple linear search. However, for large n the picture changes.

[Figure: plot of n and log n vs n]

Thus at large n, log n is much smaller than n; consequently an O(log n) algorithm is much faster than an O(n) one.

We will examine this behaviour more formally in a later section. First, let's see what we can do about the insertion (AddToCollection) operation.

In the worst case, insertion may require n operations to insert into a sorted list.

We can find the place in the list where the new item belongs using binary search in O(log n) operations.

However, we have to shuffle all the following items up one place to make way for the new one. In the worst case, the new item is the first in the list, requiring n move operations for the shuffle!
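The two phases of insertion described above, a binary search for the position followed by the shuffle, can be sketched in C as follows. The function name sorted_insert is hypothetical, and the sketch assumes the caller has already reserved room for one more element.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: insert key into items[0..*count-1], which is kept
 * in ascending order. Assumes the array has capacity for one more item. */
void sorted_insert(int *items, int *count, int key)
{
    assert(items != NULL && count != NULL && *count >= 0);

    /* Phase 1: binary search for the insertion point, O(log n) comparisons.
     * Invariant: the insertion point lies in [low, high]. */
    int low = 0, high = *count;
    while (low < high) {
        int mid = low + (high - low) / 2;
        if (items[mid] < key)
            low = mid + 1;
        else
            high = mid;
    }

    /* Phase 2: shuffle the following items up one place, up to n moves. */
    memmove(&items[low + 1], &items[low],
            (size_t)(*count - low) * sizeof items[0]);

    items[low] = key;
    (*count)++;
}
```

Inserting a key smaller than every stored key moves all n existing items, which is why the operation as a whole remains O(n) even though finding the position takes only O(log n) comparisons.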

A similar analysis will show that deletion is also an O(n) operation.

If our collection is static, i.e. it doesn't change very often, if at all, then we may not be concerned with the time required to change its contents: we may be prepared for the initial build of the collection and the occasional insertion and deletion to take some time. In return, we will be able to use a simple data structure (an array) which has little memory overhead.

However, if our collection is large and dynamic, i.e. items are being added and deleted continually, then we can obtain considerably better performance using a data structure called a tree.

The disadvantage of BSTs is that in the worst case their asymptotic running time degrades to linear time. This happens if the items inserted into the BST are inserted in order or in near-order. In such a case, a BST performs no better than an array.

A balanced tree is a tree that maintains some predefined ratio between its height and breadth. Different data structures define their own ratios for balance, but all keep the height close to log2 n. A self-balancing BST, then, exhibits log2 n asymptotic running time. There are numerous self-balancing BST data structures in existence, such as AVL trees, red-black trees, 2-3 trees, 2-3-4 trees, splay trees, B-trees, and others. Two of the best known of these are AVL trees and red-black trees.

Difference between a Linked List and a Binary Search Tree

A binary search tree (BST) is a binary tree data structure which has the following properties:

each node (item in the tree) has a distinct value;

both the left and right subtrees must also be binary search trees;

the left subtree of a node contains only values less than the node's value;

the right subtree of a node contains only values greater than the node's value.
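The properties listed above can be sketched as a small C implementation. The type and function names (bst_node, bst_insert, bst_contains, bst_height, bst_free) are assumptions made for this illustration; duplicates are ignored so that every node keeps a distinct value.

```c
#include <stdlib.h>

struct bst_node {
    int value;
    struct bst_node *left, *right;   /* smaller values left, larger right */
};

/* Insert value and return the subtree root. Duplicates are ignored.
 * If allocation fails, the tree is left unchanged. */
struct bst_node *bst_insert(struct bst_node *root, int value)
{
    if (root == NULL) {
        struct bst_node *n = malloc(sizeof *n);
        if (n != NULL) {
            n->value = value;
            n->left = n->right = NULL;
        }
        return n;
    }
    if (value < root->value)
        root->left = bst_insert(root->left, value);
    else if (value > root->value)
        root->right = bst_insert(root->right, value);
    return root;
}

/* Returns 1 if value is stored in the tree, 0 otherwise. */
int bst_contains(const struct bst_node *root, int value)
{
    while (root != NULL) {
        if (value == root->value)
            return 1;
        root = (value < root->value) ? root->left : root->right;
    }
    return 0;
}

/* Height counted in nodes: an empty tree has height 0. */
int bst_height(const struct bst_node *root)
{
    if (root == NULL)
        return 0;
    int l = bst_height(root->left);
    int r = bst_height(root->right);
    return 1 + (l > r ? l : r);
}

void bst_free(struct bst_node *root)
{
    if (root == NULL)
        return;
    bst_free(root->left);
    bst_free(root->right);
    free(root);
}
```

Inserting 50, 30, 70, 20, 40 in that order gives a tree of height 3, whereas inserting 1, 2, 3, 4, 5 in order gives a chain of height 5, which is the degenerate case in which a BST search becomes linear.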

In computer science, a linked list is one of the fundamental data structures, and can be used to implement other data structures.

So a binary search tree is an abstract concept that may be implemented with a linked list or an array, while the linked list is a fundamental data structure.

A linked list is sequential, linear data with adjacent nodes connected to each other, e.g. A -> B -> C. You can picture it as a straight fence.

A BST is a hierarchical structure, just like a tree with the main trunk connected to branches and those branches in turn connected to other branches and so on. The word "binary" here means each branch is connected to a maximum of two branches.


This algorithm is used only for processing numeric data. It comes under the classification of internal searching techniques and has the following merits:

Data distribution can be in any order.

The total number of comparisons is decided by the number of digits in the largest number.

Since the data are arranged according to a simple classification of the number system, there is no chance of going in a wrong direction when the searching is done.

The time for the operation can be calculated at the initial stage itself.


Whenever a data item arrives, it is first classified as belonging to the positive or the negative domain. The number is then classified as an odd or an even integer within that domain. The first step of classification ends here.

Next, the number undergoes a modulo operation with 10, and the resulting digit is classified as odd or even and placed in a structure that is allocated dynamically at that moment. All the self-referential pointers of the newly created structure are set to NULL. This is counted as the first stage. The number is then divided by 10, and the process repeats until the number reaches zero. At each further stage, the next digit is obtained by the modulo operation, memory is dynamically allocated for it, and the digit is stored and categorized as odd or even. The address of the newly created structure is stored in the odd or even self-referential pointer of the previous stage's structure. At every stage the status flag of the newly created structure is zero; it is set to one only in the last-stage structure, when the number cannot be split further.

This process is continued for all incoming data. If a digit is already present in the list at a given stage or level (odd or even side), it is not added again. If the digit is not present in the list, memory is dynamically allocated for it, it is added to the beginning of the list, and the head node address is updated in the previous self-referential structure. The same process then continues. Once the tree formation is finished, the searching process starts.
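The classification step described above (sign, parity, then digits peeled off with modulo 10 and division by 10) can be sketched in C. The struct classified and the function classify are hypothetical names used only for this illustration. Treating 0 as a single digit 0 is an assumption, since the text does not say how zero is handled, and the digit buffer assumes a 32-bit int.

```c
#include <assert.h>

/* Hypothetical record of how one integer is classified. */
struct classified {
    int negative;     /* 1 if the number lies in the negative domain */
    int odd;          /* 1 if the number is odd, 0 if even */
    int digits[10];   /* decimal digits, least significant first */
    int ndigits;      /* number of stages the number occupies */
};

struct classified classify(int value)
{
    struct classified c = {0};

    /* Work on the magnitude in a wider type so that INT_MIN is safe. */
    long long m = value < 0 ? -(long long)value : value;

    c.negative = value < 0;
    c.odd = (int)(m % 2);

    /* Peel off one digit per stage until the number reaches zero.
     * The do-while makes 0 yield the single digit 0 (an assumption). */
    do {
        assert(c.ndigits < 10);   /* enough for a 32-bit int */
        c.digits[c.ndigits++] = (int)(m % 10);
        m /= 10;
    } while (m != 0);

    return c;
}
```

For the value -963 this gives the negative domain, the odd category, and the digits 3, 6, 9 for stages one to three.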

Figure 1: Diagrammatic representation of the flow.


When the data to be searched is received, it is first categorized into the positive or negative domain and then as odd or even. After setting the path, the number to be searched undergoes a modulo operation by 10, and a check is made whether that digit is present in the categorized list at that stage of the tree. If the digit is not present, the user is informed that the data was not found and the program exits. If the digit is in the list, the data undergoes a further modulo operation, the next digit is checked for the even or odd category, and the corresponding path or address is taken from the present structure member. If the member is NULL, the user is informed that the data was not found; otherwise the process repeats until the given data reaches zero through division by 10.

The sample data for the figure below are:

-63, 572, 872, -963, -93, -47, 6, -363.


Figure 2: Searching for the data 872 in the tree

Here 872 is received from the user for searching. First the number is classified as positive or negative; being positive, the address stored in the corresponding self-referential pointer is taken. After getting the address, the number is checked for the odd or even category; 872 is an even number. So the corresponding address is fetched from the structure member, the number undergoes a modulo operation to obtain the last digit, and the number is divided by 10. That digit is then checked against the list present in stage 1, shown in Figure 1 in red squares. The digit 2 is present in the list, so the remaining number undergoes modulo division again, yielding the next digit, which is checked for the odd or even category, and the corresponding address is fetched from the structure. Along the path travelled, if the even or odd pointer for any digit, in either the positive or negative category, is NULL, the user is informed that the data was not found. If it is not NULL, the same steps are repeated as above until the number cannot be divided further.

If the last digit is found on the path travelled, then the status flag in that node is checked. If it is one, the user is informed that the data was found; otherwise the user is informed that the data was not found.
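The build and search procedures described above can be sketched together in C. This is one possible reading of the description, not the authors' implementation: the type names (node, tree), the function names (tree_insert, tree_search, tree_free), and the layout with one sibling list per parity are all assumptions. Each node holds one digit, a status flag, a link to the next node in the same list, and the heads of the even and odd lists of the next stage.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int digit;               /* digit stored at this stage */
    int flag;                /* 1 if a stored number ends at this node */
    struct node *next;       /* next node in the same list */
    struct node *child[2];   /* next-stage lists: [0] even, [1] odd digits */
};

/* root[sign][parity]: sign 0 = positive, 1 = negative; parity as above. */
struct tree { struct node *root[2][2]; };

/* Find digit d in the list at *head, adding it at the front if absent. */
static struct node *find_or_add(struct node **head, int d)
{
    struct node *n = *head;
    while (n != NULL && n->digit != d)
        n = n->next;
    if (n == NULL) {
        n = calloc(1, sizeof *n);   /* flag 0, all pointers NULL */
        if (n == NULL)
            return NULL;
        n->digit = d;
        n->next = *head;            /* new node becomes the list head */
        *head = n;
    }
    return n;
}

/* Returns 1 on success, 0 if memory allocation fails. */
int tree_insert(struct tree *t, int value)
{
    assert(t != NULL);
    long long m = value < 0 ? -(long long)value : value;
    struct node **head = &t->root[value < 0][m % 2];
    struct node *n;

    for (;;) {
        int d = (int)(m % 10);      /* digit for this stage */
        m /= 10;
        n = find_or_add(head, d);
        if (n == NULL)
            return 0;
        if (m == 0)
            break;                  /* number cannot be split further */
        head = &n->child[m % 2];    /* parity of the next digit */
    }
    n->flag = 1;                    /* mark the last-stage node */
    return 1;
}

/* Returns 1 if value was inserted earlier, 0 otherwise. */
int tree_search(const struct tree *t, int value)
{
    assert(t != NULL);
    long long m = value < 0 ? -(long long)value : value;
    const struct node *list = t->root[value < 0][m % 2];

    for (;;) {
        int d = (int)(m % 10);
        m /= 10;
        const struct node *n = list;
        while (n != NULL && n->digit != d)
            n = n->next;
        if (n == NULL)
            return 0;               /* digit missing at this stage */
        if (m == 0)
            return n->flag;         /* found only if a number ends here */
        list = n->child[m % 2];
    }
}

static void free_list(struct node *n)
{
    while (n != NULL) {
        struct node *next = n->next;
        free_list(n->child[0]);
        free_list(n->child[1]);
        free(n);
        n = next;
    }
}

void tree_free(struct tree *t)
{
    for (int s = 0; s < 2; s++)
        for (int p = 0; p < 2; p++) {
            free_list(t->root[s][p]);
            t->root[s][p] = NULL;
        }
}
```

With the sample data inserted, a search for 872 succeeds, while a search for 72 follows the existing path 2, 7 but fails because the status flag of the node for 7 is zero. In this sketch each list holds digits of a single parity, so it has at most five nodes, and a search makes at most five comparisons per digit of the number sought.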


Space complexity: the memory used by the arrangement is very large.

Retrieval of data: once the data are arranged, they cannot be retrieved in their original order.

Note: the behaviour is worst for very small data sets. The algorithm cannot handle hexadecimal values.


In the worst case, the amount of memory used at every stage increases in multiples of 5.

The complexity graph can be traced as 5a. The time complexity is of logarithmic order.




Hard Disk (Secondary storage): 160 GB
RAM (Primary memory): 2 GB
Processor: AMD Turion 64 X2 TL-58
Processor Speed: 1.90 GHz
Operating System: Windows Vista Basic (32 Bit)
Compiler: DevC++ (9.0.0), Turbo-C




The graph compares three different searching approaches applied to the same bulk of data on the same system and in the same environment. It shows the following:

Linear search, shown as the blue line, takes the least time of the three and hence appears at the bottom of the graph.

Time Saving Search, shown as the green line, takes a moderate time among the three and hence appears in the middle of the graph.

Binary search, shown as the red line, takes comparatively more time than the other two and hence appears at the top of the graph.

The reason behind the least time being taken by the linear search process is that it does not involve any complexities such as arranging the data in some predefined order so as to make the search easier and faster. As a matter of fact, linear search cannot be compared with the other searching methods, which involve sorting arrangements, since its worst-case complexity is O(n).

Binary search and Time Saving Search share the same methodology: the given bulk of data is first sorted and then searched. Between these two methods the time taken differs. For example, at a data load of 30K the time difference is 0.2 ms, and at the last data load point, 150K, the time difference is 0.25 ms. From this work, the optimum time is obtained with Time Saving Search rather than with the binary searching technique.


The inference from this work is that the algorithm overcomes the limitations of the binary search tree and reduces the number of comparisons made for huge sets of data given in any order. Although the memory used is high, the performance of this technique is fast compared to the binary search tree or any other searching technique used for numeric search.
