Moving object detection is an active research topic in computer vision. It is widely used in surveillance, guidance of autonomous vehicles, video compression, tracking of moving objects, automatic target recognition and so on [1]. The purpose of moving object detection is to separate the moving objects from the background. According to the motion of the camera, moving object detection methods can be divided into two types: detection in static scenes and detection in dynamic scenes. Detection in static scenes is comparatively simple; it is a mature technology that has been applied successfully in various systems. Detection in dynamic scenes, however, still has many key problems that have not been solved, and they become harder when the background is complicated. Therefore, methods for dynamic scenes have become a hot spot of current research in computer vision.
At present, there are two dominant methods for moving object detection: the optical flow method and the global motion compensation method. The optical flow method [2] has high computational complexity and poor noise immunity, and it is practical only with special hardware. The global motion compensation method [3] has therefore been widely used in the field. The main idea of this method is to estimate the global motion parameters of the camera between frames through image matching, and then compensate for the camera motion. In this way, detection in dynamic scenes is transformed into detection in static scenes. The difficulty of this method is to estimate the global motion parameters robustly, which involves feature point extraction and matching, removal of invalid matching points, and the optimal solution of the global motion parameters. In this paper, we adopt the latter method.
Several popular feature detectors, including the Harris corner detector [4], SIFT [5-6] and SURF [7], have been widely used in global motion compensation. The Harris corner detector is not scale invariant and responds differently under changes in gray level and illumination. The SIFT (Scale-Invariant Feature Transform) algorithm is widely used in many applications because its feature descriptor is comparatively invariant to changes in orientation, scale, illumination and contrast. However, SIFT cannot satisfy real-time requirements because of its large amount of computation and high time complexity. The SURF (Speeded Up Robust Features) algorithm, a faster alternative inspired by SIFT, is considerably better suited to real-time use. To meet the real-time requirement, we choose the SURF algorithm. Furthermore, to improve speed and precision further, we make some improvements to it.
The proposed algorithm proceeds as follows. First, feature points are extracted and matched between adjacent frames using the improved SURF algorithm and matching method. Then the affine transformation model is chosen to describe the global motion, RANSAC [8] is used to remove the invalid matching points, and the least squares method [9] is used to obtain the optimal global motion parameters (the affine transformation matrix). Finally, the previous frame is compensated using these parameters, and the objects are obtained from the difference between the current frame and the compensated frame. After morphological image processing, we get the accurate moving objects. The overall process of the proposed algorithm is summarized in Fig. 1.
Fig. 1. Flowchart of the proposed algorithm.
2 Improved SURF algorithm and matching method
2.1 SURF algorithm
This section reviews the original SURF algorithm, which was proposed by Bay, Tuytelaars and Van Gool in 2006 [7]. The algorithm is similar to SIFT but faster to compute. It relies on integral images to reduce the computation time, and its detector is also called the "Fast-Hessian" detector.
The SURF detector is based on the determinant of the Hessian matrix, which can be computed efficiently with the integral image. Given a point $X = (x, y)$ in an image $I$, the Hessian matrix $H(X, \sigma)$ at $X$ at scale $\sigma$ is defined as formula (1):
$$H(X, \sigma) = \begin{bmatrix} L_{xx}(X, \sigma) & L_{xy}(X, \sigma) \\ L_{xy}(X, \sigma) & L_{yy}(X, \sigma) \end{bmatrix} \qquad (1)$$
Here $L_{xx}(X, \sigma)$ refers to the convolution of the second-order Gaussian derivative $\partial^2 g(\sigma) / \partial x^2$ with the image $I$ at point $X$, and similarly for $L_{xy}(X, \sigma)$ and $L_{yy}(X, \sigma)$. As Gaussian filters are non-ideal in any case, and given Lowe's success with LoG approximations, Bay et al. push the approximation even further with box filters. These approximate second-order Gaussian derivatives, denoted $D_{xx}$, $D_{yy}$ and $D_{xy}$, can be evaluated very fast using integral images, independently of filter size. Formula (2) gives an accurate approximation of the Hessian determinant using the box-filter approximations:
$$\det(H_{\text{approx}}) = D_{xx} D_{yy} - (0.9\, D_{xy})^2 \qquad (2)$$
Here $D_{xx}$ refers to the convolution of the box filter with the image at point $X$, and similarly for $D_{yy}$ and $D_{xy}$.
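The reason box filters are cheap is that any rectangular sum costs four lookups in an integral image, whatever the rectangle size. The following is a minimal pure-Python sketch of that idea, not the authors' code: the function names are hypothetical, the image is a plain list of rows, and the weight 0.9 is the value used in the original SURF paper for the approximated determinant.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table with a zero first row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above plus the running sum of this row.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, height, width):
    """Sum of the pixels in rows top..top+height-1 and columns left..left+width-1."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def hessian_det_approx(dxx, dyy, dxy, weight=0.9):
    """Approximated Hessian determinant: Dxx * Dyy - (weight * Dxy)^2."""
    return dxx * dyy - (weight * dxy) ** 2
```

A box-filter response such as $D_{xx}$ is then a weighted combination of a few `box_sum` calls, so its cost does not grow with the filter size.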
Scale spaces are usually implemented as image pyramids. The image pyramid in SURF is constructed by changing the size of the box filters rather than by iteratively reducing the image size. The output of the 9×9 filter is considered the initial scale layer, to which we refer as scale σ = 1.2. The following layers are obtained by filtering the image with gradually bigger masks, such as 9×9, 15×15, 21×21, 27×27, etc. If the size of the box filter is N×N, the corresponding scale is σ = 1.2 × N/9. In order to localize interest points in the image and over scales, non-maximum suppression in a 3×3×3 neighborhood is applied: each pixel in the scale space is compared to its 26 neighbors, comprising the 8 points in the native scale and the 9 in each of the scales above and below. The maxima of the determinant of the Hessian matrix are then interpolated in scale and image space.
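The relation between filter size and scale is simple enough to check numerically. A minimal sketch follows; `scale_for_filter` is a hypothetical helper name, and the constants 9 and 1.2 are the ones quoted in the text.

```python
def scale_for_filter(n, base_size=9, base_scale=1.2):
    """Scale associated with an n x n box filter: base_scale * n / base_size."""
    return base_scale * n / base_size


# Scales of the first few layers mentioned in the text.
layer_scales = {n: scale_for_filter(n) for n in (9, 15, 21, 27)}
```

Under this mapping the 9×9, 15×15, 21×21 and 27×27 filters correspond to scales of about 1.2, 2.0, 2.8 and 3.6.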
Rotation invariance is achieved by detecting the dominant orientation of each feature point using Haar wavelet responses in the x and y directions within a circular neighborhood of radius 6s around the feature point, where s is the scale at which the feature point was detected. The size of the Haar wavelet kernel is scaled to 4s×4s. The responses are weighted with a Gaussian centered at the feature point; the Gaussian depends on the scale of the point and is chosen to have standard deviation 2.5σ. The dominant orientation is estimated by calculating the sum of the horizontal and vertical Haar wavelet responses within a sliding orientation window covering an angle of π/3. The two summed responses form a vector, and the longest such vector lends its orientation to the feature point.
To extract the descriptor, the first step is to construct a square window of size 20σ centered on the feature point and oriented along the dominant orientation. The window is then divided into 4×4 regular sub-regions. For each sub-region, Haar wavelet responses of size 2σ are computed at 5×5 regularly spaced sample points. $\sum d_x$ denotes the sum of the responses in the horizontal direction and $\sum d_y$ the sum in the vertical direction; $\sum |d_x|$ and $\sum |d_y|$ denote the sums of the absolute values of the responses in the horizontal and vertical directions, respectively. Hence, each sub-region has a four-dimensional descriptor vector $v = (\sum d_x, \sum d_y, \sum |d_x|, \sum |d_y|)$ for its underlying intensity structure. With 4×4 sub-regions in the window, each feature point has a 64-dimensional descriptor vector. Finally, the descriptor is normalized to a unit vector to achieve invariance to contrast.
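The descriptor assembly can be illustrated with a short sketch. It assumes the Haar responses have already been sampled: each sub-region is given as a list of 25 `(dx, dy)` pairs, and the function names are hypothetical. It shows only the summation and normalization steps, not the sampling or rotation of the window.

```python
import math


def subregion_vector(samples):
    """Four sums for one sub-region: (sum dx, sum dy, sum |dx|, sum |dy|)."""
    return [
        sum(dx for dx, _ in samples),
        sum(dy for _, dy in samples),
        sum(abs(dx) for dx, _ in samples),
        sum(abs(dy) for _, dy in samples),
    ]


def build_descriptor(subregions):
    """Concatenate the 4-vectors of all sub-regions and scale to unit length."""
    vec = []
    for samples in subregions:
        vec.extend(subregion_vector(samples))
    norm = math.sqrt(sum(v * v for v in vec))
    # An all-zero descriptor is returned unchanged to avoid dividing by zero.
    return [v / norm for v in vec] if norm > 0 else vec
```

With 16 sub-regions the result has 64 entries, and multiplying every response by the same positive factor (a contrast change) leaves the normalized descriptor unchanged.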
2.2 Improved SURF algorithm
(1) Limit the number of detected feature points
The SURF algorithm focuses on detection quality, without considering the number or position of the feature points. However, detecting many more feature points in a frame not only increases the time needed to compute the feature point descriptors, but also increases the matching time and the complexity of computing the optimal global motion parameters. The affine transformation matrix needs at least three pairs of matching points to determine the geometric transform between images. Hence, removing some matching points will not affect the final result, and it improves the efficiency of the whole algorithm.
When SURF detects feature points, it applies non-maximum suppression in a 3×3×3 neighborhood: each pixel in the scale space is compared to its 26 neighbors, comprising the 8 points in the native scale and the 9 in each of the scales above and below. When the image content is complex, this detects a large number of feature points, which increases the computation in the subsequent processing. Therefore, when detecting feature points, we apply non-maximum suppression in a 7×7×3 neighborhood. For the 7×7 region centered on a point, we compare the determinant to its 146 neighbors, comprising the 48 points in the native scale and the 49 in each of the scales above and below. With this larger neighborhood we detect an appropriate number of feature points that have stronger robustness, and the efficiency of the whole algorithm improves.
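The effect of widening the suppression neighborhood can be sketched as follows. This is an illustrative pure-Python version under stated assumptions: `responses[s][y][x]` is a hypothetical 3-D list of Hessian-determinant responses (scale layer, row, column), all layers have the same size, and `radius=1` gives the 3×3×3 case while `radius=3` gives the 7×7×3 case.

```python
def neighbour_count(spatial_size, scale_layers=3):
    """Number of neighbours compared against the centre pixel."""
    return spatial_size * spatial_size * scale_layers - 1


def is_local_max(responses, s, y, x, radius=1):
    """True if responses[s][y][x] is strictly larger than every neighbour in a
    (2*radius+1) x (2*radius+1) window on layers s-1, s and s+1."""
    layers, h, w = len(responses), len(responses[0]), len(responses[0][0])
    # Points too close to the border or on the first/last layer are rejected.
    if not (1 <= s < layers - 1 and radius <= y < h - radius and radius <= x < w - radius):
        return False
    centre = responses[s][y][x]
    for ds in (-1, 0, 1):
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                if ds == 0 and dy == 0 and dx == 0:
                    continue
                if responses[s + ds][y + dy][x + dx] >= centre:
                    return False
    return True
```

A weak peak two pixels away from a stronger one survives with `radius=1` but is suppressed with `radius=3`, which is how the larger neighborhood reduces the number of detected points.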
(2) A fast method for calculating the feature point's dominant orientation
To calculate a feature point's dominant orientation, SURF slides a window covering an angle of 60° around the circular region and calculates the sum of the horizontal and vertical Haar wavelet responses inside it. The two summed responses form a vector, and the longest vector lends its orientation to the feature point. The shifting step of the sliding window is chosen as 5°. As the window shifts, consecutive windows overlap heavily, so the sums of the responses are calculated repeatedly, which hurts the algorithm's efficiency. For example, the 0°-60° region and the 5°-65° region share the 5°-60° overlap, whose responses are summed twice.
We adopt a fast method for calculating the feature point's dominant orientation to increase the efficiency of the algorithm [10]. The procedure is as follows:
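(1) Calculate the sums of the horizontal and vertical Haar wavelet responses at each integer angle $\theta$ (0°-360°), and store them in $h_x(\theta)$ and $h_y(\theta)$.
(2) Calculate the integrals (cumulative sums) of $h_x$ and $h_y$, defined as $S_x$ and $S_y$:
$$S_x(\theta) = \sum_{k=0}^{\theta} h_x(k) \qquad (3)$$
The calculation of $S_y(\theta)$ is similar to that of $S_x(\theta)$.
(3) Calculate the sum of the Haar wavelet responses in the 60° sector region ending at any angle $i$:
$$W_x(i) = S_x(i) - S_x(i - 60) \qquad (4)$$
where angles wrap around at 360° when $i < 60$. The calculation of $W_y(i)$ is similar to that of $W_x(i)$. The local orientation vector can be calculated as formula (5):
$$m(i) = \sqrt{W_x(i)^2 + W_y(i)^2}, \quad \theta(i) = \arctan\frac{W_y(i)}{W_x(i)} \qquad (5)$$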
Finally, we choose the longest local orientation vector over all windows as the dominant orientation of the feature point.
With this algorithm, the repeated calculations are eliminated and the shifting step of the sliding window is reduced from 5° to 1°. Compared with the original algorithm, the improved algorithm decreases the complexity and increases the accuracy.
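The cumulative-sum idea can be sketched in a few lines. The version below is an illustration under stated assumptions rather than the method of [10] verbatim: `hx` and `hy` are hypothetical lists of 360 per-degree response sums, the window covers the 60 one-degree bins ending at each angle, and windows that cross 0° are handled by adding the tail of the cumulative sum.

```python
import math


def dominant_orientation(hx, hy, window=60):
    """Return (angle_in_degrees, length) of the longest window vector.

    hx[a], hy[a]: summed horizontal/vertical Haar responses in angle bin a."""
    n = len(hx)
    # px[k] = hx[0] + ... + hx[k-1], so any range sum is one subtraction.
    px, py = [0.0] * (n + 1), [0.0] * (n + 1)
    for a in range(n):
        px[a + 1] = px[a] + hx[a]
        py[a + 1] = py[a] + hy[a]

    best_len2, best_angle = -1.0, 0.0
    for end in range(n):                  # 1-degree step: one window per bin
        start = end - window + 1          # window covers bins start..end
        if start >= 0:
            wx = px[end + 1] - px[start]
            wy = py[end + 1] - py[start]
        else:                             # window wraps past 0 degrees
            wx = px[end + 1] + px[n] - px[start + n]
            wy = py[end + 1] + py[n] - py[start + n]
        len2 = wx * wx + wy * wy
        if len2 > best_len2:
            best_len2 = len2
            best_angle = math.degrees(math.atan2(wy, wx)) % 360.0
    return best_angle, math.sqrt(best_len2)
```

Each window sum costs a constant number of operations regardless of the window width, which is what makes the 1° step affordable.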
2.3 Improved feature point matching method
Two feature points are matched by comparing their descriptors. Global search and the KD-tree algorithm are currently the most widely used ways to search for matching points. Global search is easy to implement, but it must calculate the distances between all points in the two point sets, so it requires a large amount of computation and produces many invalid matching points. The KD-tree algorithm takes full advantage of the data structure of the feature points and, by constructing a KD-tree, calculates distances for only part of the points in the two sets. Although the KD-tree algorithm reduces the computational complexity and improves the accuracy, it costs additional time to build the tree. Studies show that when the number of feature points is small, the KD-tree gives no obvious speed-up.
Therefore, we propose an improved matching method based on global search. When searching for the corresponding point in the adjacent frame, we search only a certain region around the position of the feature point in the current frame instead of the entire image. The size of this region is determined by the speed of the background motion. This reduces the number of distance calculations between candidate points, decreases the number of invalid matching points, and reduces the complexity of the subsequent steps; in short, it improves speed and accuracy at the same time. The similarity between candidate points is measured in two steps. The first step is an initial match using the sign of the Laplacian (i.e. the trace of the Hessian matrix, which is already available from detection). If two points have the same sign of the Laplacian, we proceed to the subsequent similarity measure; otherwise we judge that the two points do not match. This minimal information allows faster matching and gives a slight increase in performance. The second step is to calculate the Euclidean distance between the two feature points' 64-dimensional descriptors; a matching pair is accepted if the distance to its nearest neighbor (closest Euclidean distance in descriptor space) is less than 0.65 times the distance to the second-nearest neighbor. Here 0.65 is an adjustable threshold: the smaller it is, the fewer matching pairs we get. The process of feature point matching is summarized in Fig. 2.
Fig. 2. Flowchart of feature point matching.
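The three matching rules (restricted search window, Laplacian sign check, nearest to second-nearest ratio test) can be sketched as below. The feature representation (a dict with keys `pt`, `laplacian` and `desc`) and the function name are assumptions made for illustration; the 0.65 ratio and the 60×60 window (half-width 30) are the values quoted in this paper. Accepting a feature that has only one candidate in its window is a choice of this sketch, not something the paper specifies.

```python
import math


def match_features(feats_a, feats_b, half_window=30, ratio=0.65):
    """Return index pairs (i, j) matching feats_a[i] to feats_b[j]."""
    matches = []
    for i, fa in enumerate(feats_a):
        best, second, best_j = math.inf, math.inf, -1
        for j, fb in enumerate(feats_b):
            # Rule 1: only search a square window around the feature position.
            if (abs(fb["pt"][0] - fa["pt"][0]) > half_window
                    or abs(fb["pt"][1] - fa["pt"][1]) > half_window):
                continue
            # Rule 2: the signs of the Laplacian must agree.
            if fb["laplacian"] != fa["laplacian"]:
                continue
            d = math.sqrt(sum((p - q) ** 2 for p, q in zip(fa["desc"], fb["desc"])))
            if d < best:
                best, second, best_j = d, best, j
            elif d < second:
                second = d
        # Rule 3: ratio test. With a single candidate, second stays infinite
        # and the candidate is accepted (an assumption of this sketch).
        if best_j >= 0 and best < ratio * second:
            matches.append((i, best_j))
    return matches
```

Lowering `ratio` makes the test stricter and yields fewer pairs, in line with the remark on the 0.65 threshold.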
3 Global motion compensation and object detection
In this section, the first step is to compensate for the global motion of the camera using the matching points. This step converts detection in dynamic scenes into detection in static scenes.
We choose the affine transformation model to describe the global motion. The affine transformation, with six parameters, can represent translation, rotation, scaling and stretching. In two-dimensional space, the affine transformation can be expressed as formula (6):
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix} \qquad (6)$$
Here, $(x, y)$ refers to a feature point in the previous frame and $(x', y')$ refers to the corresponding feature point in the current frame. $(t_x, t_y)^T$ represents translation, and $a_1, \dots, a_4$ represent rotation, scaling, stretching and so on.
The matching pairs obtained in Section 2 inevitably contain invalid matches. We adopt the RANSAC algorithm to remove them and obtain the best set of inliers. We then apply the least squares method to this inlier set to calculate the optimal global motion parameters (the affine transformation matrix), and compensate the previous frame using these parameters. After this step, the backgrounds of the previous frame and the current frame are aligned, and detection in dynamic scenes has been transformed into detection in static scenes. We therefore use the frame difference between the current frame and the affine-transformed frame to detect the moving objects. Finally, the binary image is processed morphologically to fill small holes, remove residual noise points, and smooth the objects' contours. At this point, we get the accurate moving objects.
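As a rough illustration of the estimation step, the sketch below fits an affine model with a basic RANSAC loop followed by a least-squares refit on the inliers. It is a minimal pure-Python sketch, not the authors' implementation: the function names, the 2-pixel inlier tolerance, the 200 iterations and the fixed random seed are all assumptions chosen for illustration. A model is stored as the tuple `(a1, a2, tx, a3, a4, ty)`.

```python
import math
import random


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, b):
    """Solve the 3x3 system m * x = b by Cramer's rule; None if singular."""
    d = det3(m)
    if abs(d) < 1e-12:
        return None
    cols = []
    for c in range(3):
        mc = [[b[r] if k == c else m[r][k] for k in range(3)] for r in range(3)]
        cols.append(det3(mc) / d)
    return cols


def fit_affine(pairs):
    """Least-squares affine fit from pairs [((x, y), (x2, y2)), ...]."""
    m = [[0.0] * 3 for _ in range(3)]
    bx, by = [0.0] * 3, [0.0] * 3
    for (x, y), (x2, y2) in pairs:
        v = (x, y, 1.0)
        for r in range(3):
            for c in range(3):
                m[r][c] += v[r] * v[c]      # normal-equation matrix
            bx[r] += v[r] * x2
            by[r] += v[r] * y2
    sx, sy = solve3(m, bx), solve3(m, by)
    if sx is None or sy is None:            # e.g. collinear points
        return None
    return (sx[0], sx[1], sx[2], sy[0], sy[1], sy[2])


def apply_affine(p, pt):
    x, y = pt
    return (p[0] * x + p[1] * y + p[2], p[3] * x + p[4] * y + p[5])


def reprojection_error(model, pair):
    px, py = apply_affine(model, pair[0])
    return math.hypot(px - pair[1][0], py - pair[1][1])


def ransac_affine(pairs, iters=200, tol=2.0, seed=0):
    """Return (model, inliers); model is None if no consensus is found."""
    if len(pairs) < 3:
        return None, []
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iters):
        model = fit_affine(rng.sample(pairs, 3))   # minimal sample: 3 pairs
        if model is None:
            continue
        inliers = [pr for pr in pairs if reprojection_error(model, pr) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 3:
        return None, []
    return fit_affine(best_inliers), best_inliers   # least-squares refit
```

The returned model can then be used to warp the previous frame before frame differencing; that warping step is outside the scope of this sketch.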
4 Experimental results and analysis
To make our experimental results more convincing, all simulation experiments were carried out under the following conditions. Hardware environment: Intel(R) Core(TM) i5 M520 CPU @ 2.40 GHz, 4 GB RAM, NVIDIA NVS 3100M. Software development tools: Microsoft VS 2008, OpenCV 2.3. The size of the video frames in this paper is 720×480, and the frame rate is 25 fps.
4.1 Results of the improved SURF
In this paper, we improve SURF in two ways. One is to limit the number of detected feature points by changing the range of non-maximum suppression; the results are shown in Fig. 3 and Table 1. The other is to adopt a fast method for calculating the feature point's dominant orientation; the results are shown in Fig. 4 and Table 2.
Fig. 3. Results of detected feature points: (a) 3×3×3 neighborhood; (b) 5×5×3 neighborhood; (c) 7×7×3 neighborhood.
Table 1. Results of limiting the number of feature points

| Range of non-maximum suppression | Number of feature points | Time/ms |
|---|---|---|
| 3×3×3 | 551 | 315 |
| 5×5×3 | 387 | 209 |
| 7×7×3 | 234 | 137 |
Fig. 3(a) shows the detection result when non-maximum suppression is applied in the 3×3×3 neighborhood, as in the original SURF algorithm. Fig. 3(b) and Fig. 3(c) show the detection results in the 5×5×3 and 7×7×3 neighborhoods, respectively. As shown in Fig. 3, most of the detected feature points lie on the background, which favors modeling the background. Moreover, the improved SURF effectively limits the number of feature points, and the remaining points are distributed evenly over the background with strong robustness. Table 1 also shows the results of limiting the number of feature points: the setting of Fig. 3(a) detects 551 feature points in 315 ms, Fig. 3(b) detects 387 in 209 ms, and Fig. 3(c) detects 234 in 137 ms. These data show that the improved SURF effectively limits the number of feature points and decreases the detection time. However, too few feature points would reduce the accuracy of object detection, so we choose the 7×7×3 neighborhood to get about 200 feature points.
Fig. 4. Results of the fast method for dominant orientation: (a) before improvement; (b) after improvement.
Table 2. Results of the fast method for dominant orientation

| Algorithm | Time/ms | Matching pairs |
|---|---|---|
| SURF | 137 | 177 |
| Improved SURF | 121 | 144 |
The improved SURF eliminates the repeated calculations and reduces the shifting step of the sliding window from 5° to 1°, which decreases the complexity and increases the accuracy compared with the original algorithm. As shown in Table 2, the fast method for calculating the feature points' dominant orientation saves 16 ms of detection time and removes some of the invalid matching pairs: the number of matching pairs drops from 177 to 144. Fig. 4 contrasts feature point matching with the original SURF and with the improved SURF. Comparing Fig. 4(a) and Fig. 4(b), it is evident that some invalid matching pairs are removed in Fig. 4(b), which demonstrates the effectiveness of the improved algorithm.
4.2 Results of the improved matching method
Fig. 5. Results of the improved matching method: (a) before improvement; (b) after improvement.
Table 3. Results of the improved matching method

| Metric | Global search method | Improved matching method |
|---|---|---|
| Matching pairs | 156 | 160 |
| Matching time/ms | 32 | 16 |
| Best set of inliers by RANSAC | 51 | 130 |
For feature point matching, we improve the global search method by limiting the search range, checking the sign of the Laplacian, and comparing the nearest neighbor with the second-nearest neighbor. In this paper, we search for a feature point's match in the adjacent frame within a 60×60 square centered on the feature point. Table 3 indicates that the improved matching method is much faster: the matching time drops from 32 ms to 16 ms. The two methods detect almost the same number of matching pairs. However, with the original global search method only 51 matching pairs remain in the best inlier set obtained by RANSAC, which means that RANSAC removes 105 invalid matching pairs. These numbers show that the original matching step produces too many invalid matching pairs, which degrades the accuracy of the global motion model. With the improved matching method, 130 matching pairs remain in the best inlier set, and RANSAC removes only 30 invalid matching pairs. The improved matching method therefore effectively reduces the number of invalid matching pairs. Moreover, with more matching pairs in the best inlier set, the global motion model established by the least squares method is more precise and the moving object detection result is more accurate. As shown in Fig. 5, for the same frame the moving object cannot be detected with the original matching method, whereas it is detected successfully with the improved matching method.
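The counts quoted from Table 3 can be cross-checked with a little arithmetic. The snippet below only recomputes numbers already given in the table; the helper name `outlier_stats` is hypothetical.

```python
def outlier_stats(matching_pairs, inliers):
    """Return (pairs removed by RANSAC, fraction of pairs kept as inliers)."""
    return matching_pairs - inliers, inliers / matching_pairs


global_removed, global_ratio = outlier_stats(156, 51)       # global search
improved_removed, improved_ratio = outlier_stats(160, 130)  # improved method
```

This reproduces the 105 and 30 removed pairs, and shows that the inlier fraction rises from about 32.7% to 81.25%.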
4.3 Results of global motion compensation and object detection
We use the proposed method based on improved SURF to detect the moving object in dynamic scenes. Fig. 6 shows the results for the 4th, 6th and 8th frames. Fig. 6(a) shows the original frames and Fig. 6(b) shows the results of global motion compensation for the previous frame. Comparing (a) and (b), the backgrounds of the current frame and the previous frame are aligned, which indicates that the global motion compensation works well and achieves the transformation from dynamic scenes to static scenes. Fig. 6(c) shows the detected object. The experimental results show that the proposed method is able to detect moving objects in dynamic scenes.
Fig. 6. Results of global motion compensation and object detection: (a) original frames; (b) compensated previous frames; (c) detected objects.
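Table 4. Time cost of the different algorithms

| Algorithm | Metric | Frame 4 | Frame 6 | Frame 8 |
|---|---|---|---|---|
| SIFT | Feature points | 308 | 276 | 323 |
| SIFT | Time/ms | 703 | 621 | 723 |
| SURF | Feature points | 551 | 513 | 578 |
| SURF | Time/ms | 367 | 331 | 382 |
| Improved SURF | Feature points | 234 | 212 | 245 |
| Improved SURF | Time/ms | 152 | 144 | 155 |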
Table 4 shows the time cost of the methods based on SIFT, SURF and improved SURF. The method based on SIFT takes roughly 700 ms on average to process a video frame, and the method based on SURF takes roughly 350 ms, while the proposed method based on improved SURF takes only about 150 ms to complete the object detection.
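For reference, the per-method averages can be recomputed directly from the three frame timings in Table 4. The snippet uses only the values in the table; the dictionary layout and variable names are assumptions for illustration.

```python
# Processing time in ms for frames 4, 6 and 8, copied from Table 4.
times_ms = {
    "SIFT": [703, 621, 723],
    "SURF": [367, 331, 382],
    "Improved SURF": [152, 144, 155],
}

mean_ms = {name: sum(t) / len(t) for name, t in times_ms.items()}
speedup_vs_sift = mean_ms["SIFT"] / mean_ms["Improved SURF"]
speedup_vs_surf = mean_ms["SURF"] / mean_ms["Improved SURF"]
```

The exact means over these three frames are about 682 ms, 360 ms and 150 ms, so on this sample the improved method is roughly 4.5 times faster than the SIFT-based one and roughly 2.4 times faster than the SURF-based one.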
These results indicate that the proposed moving object detection method based on improved SURF not only has high accuracy and robustness, but also has a clear advantage in time.
For moving object detection in dynamic scenes under a real-time requirement, an effective method based on an improved SURF algorithm was proposed. First, feature points are extracted by the improved SURF algorithm and matched by the improved matching method based on global search. We improved SURF in two ways: one is to limit the number of detected feature points by changing the range of non-maximum suppression; the other is to adopt a fast method for calculating the feature point's dominant orientation. Then the optimal global motion parameters (the affine transformation matrix) are calculated using RANSAC and the least squares method. Finally, the previous frame is compensated with these parameters, and the object is obtained by the frame difference method. After morphological image processing, we get the accurate moving object. The experimental results showed that the proposed method can successfully detect the moving object in dynamic scenes. It not only has high accuracy and robustness, but also has a clear time advantage compared with the methods based on SIFT and SURF.