
A Project Report on



OBJECT DETECTION USING MATLAB



Submitted in partial fulfillment of the requirements for the award of the degree of



BACHELOR OF TECHNOLOGY in ELECTRONICS & COMMUNICATION ENGINEERING



Submitted by



R. SADANAND



(16311A04V3)



CH. AMAN PRASAD



(16311A04W3)



Y. MUKESH REDDY



(16311A04W4)



Under the Guidance of



Dr. C.N. SUJATHA, Professor, Department of ECE



Ms. E. LAVANYA, Assistant Professor, Department of ECE






Department of Electronics and Communication Engineering
SREENIDHI INSTITUTE OF SCIENCE AND TECHNOLOGY
(Affiliated to Jawaharlal Nehru Technological University, Hyderabad)
Yamnampet (V), Ghatkesar (M), Hyderabad – 501301, A.P.
2019-2020



Department of Electronics and Communication Engineering
SREENIDHI INSTITUTE OF SCIENCE AND TECHNOLOGY
(Affiliated to Jawaharlal Nehru Technological University, Hyderabad)
Yamnampet (V), Ghatkesar (M), Hyderabad – 501301, A.P.



CERTIFICATE



This is to certify that the project entitled “OBJECT DETECTION USING MATLAB” is being submitted by



R. SADANAND



(16311A04V3)



CH. AMAN PRASAD



(16311A04W3)



Y. MUKESH REDDY



(16311A04W4)



in partial fulfillment of the requirements for the award of BACHELOR OF TECHNOLOGY to JNTU, Hyderabad. This is a record of bonafide work carried out by them under my guidance and supervision. The results embodied in this project report have not been submitted to any other university or institute for the award of any degree or diploma.



Internal Guide
Ms. E. Lavanya
Assistant Professor
Department of ECE



Project Co-ordinator
Dr. C.N. Sujatha
Professor
Department of ECE



Head of Department
Dr. S.P.V. Subba Rao
Professor & HOD
Department of ECE






DECLARATION



This is to certify that the work reported in the present thesis titled “OBJECT DETECTION USING MATLAB” is a record of work done by us in the Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, Yamnampet, Ghatkesar. No part of the thesis is copied from books, journals, or the internet; wherever material has been taken, it has been duly referenced in the text. The report is based on project work done entirely by us and is not copied from any other source.



R. SADANAND – 16311A04V3 CH. AMAN PRASAD-16311A04W3 Y. MUKESH REDDY-16311A04W4






ACKNOWLEDGMENT

We would like to express our sincere gratitude and thanks to Ms. E. Lavanya, Internal Guide, Department of Electronics and Communication Engineering, Sreenidhi Institute of Science and Technology, for allowing us to take up this project. We would especially like to thank Dr. C.N. Sujatha, Project Coordinator, Department of Electronics and Communication Engineering, for guiding us throughout the project.

We are very grateful to Dr. S.P.V. Subba Rao, Head of the Department of Electronics and Communication Engineering, for allowing us to take up this project. We are also grateful to Dr. Ch. Shiva Reddy, Principal, and Dr. P. Narasimha Reddy, Executive Director of Sreenidhi Institute of Science and Technology, for having provided the opportunity to take up this project.

We also extend our sincere thanks to our parents and friends for their moral support throughout the project work.






ABSTRACT

Object detection is the most prevalent step of video analytics, and performance at higher levels depends greatly on the accuracy of object detection. Various platforms are used for designing and implementing object detection algorithms, including C, MATLAB and Simulink, and OpenCV. Among these, MATLAB is the most popular with students and researchers due to its extensive features: matrix-based data processing, a set of toolboxes and Simulink blocks covering all technology fields, easy programming, and help topics with numerous examples. This report presents the implementation of object detection and tracking using MATLAB. It demonstrates the basic block diagram of object detection and explains various predefined functions and objects from different toolboxes that can be useful at each level of object detection. Useful toolboxes include Image Acquisition, Image Processing, and Computer Vision. This study helps new researchers in the object detection field to design and implement algorithms using MATLAB.






CONTENTS

Abstract
Chapter 1   Introduction
    1.1  Introduction to the project
    1.2  What is Object Detection?
    1.3  Why does Object Detection matter?
    1.4  How is it currently being used?
    1.5  What potential does it have?
Chapter 2   Literature Survey
Chapter 3   Block Diagram
    3.1  Significance of the Block Diagram
         3.1.1  Video Input
         3.1.2  Pre-Processing
         3.1.3  Object Detection
         3.1.4  Post-Processing
Chapter 4   Introduction to MATLAB
    4.1  MATLAB
    4.2  History
    4.3  Why use MATLAB?
    4.4  Syntax
    4.5  Variables
    4.6  Functions
         4.6.1  Anonymous Functions
         4.6.2  Primary and Sub-Functions
         4.6.3  Nested Functions
         4.6.4  Private Functions
         4.6.5  Global Variables
         4.6.6  Function Handles
Chapter 5   MATLAB Toolboxes
    5.1  Computer Vision
         5.1.1  Applications
         5.1.2  Computer Vision Toolbox
    5.2  Image Processing
    5.3  Image Sensors
    5.4  Image Compression
    5.5  Digital Signal Processing (DSP)
         5.5.1  Medical Imaging
    5.6  Image Acquisition Toolbox
Chapter 6   MATLAB Implementation
Chapter 7   Source Code
Chapter 8   Result
Chapter 9   Applications
Chapter 10  Conclusion
    10.1  Conclusion
    10.2  Future Scope
Chapter 11  References



Chapter 1
Introduction to Object Detection

1.1 INTRODUCTION

Video analytics is a popular segment of computer vision. It has enormous applications such as traffic monitoring, parking lot management, crowd detection, object recognition, unattended baggage detection, and secure area monitoring. Object detection is a critical step in video analytics: performance at this step is important for scene analysis, object matching and tracking, and activity recognition. Over the years, research has flowed towards innovating new concepts and improving or extending established work to improve the performance of object detection and tracking. Various object detection approaches have been developed based on statistics, fuzzy logic, neural networks, etc. Most approaches involve complex theory, which can be evolved further through thorough understanding, implementation, and experimentation. All these approaches can be learned by reading, reviewing, and taking a professor's expert guidance; however, implementation and experimentation require a good programmer.

Various platforms are used for the design and implementation of object detection and tracking algorithms, including C, OpenCV, and MATLAB. An object detection system to be used in real time should satisfy two conditions: first, the code must be short in terms of execution time; second, it must use memory efficiently. Programming in C or OpenCV, however, demands good programming skill, and it is time-intensive for a new researcher to develop such efficient code for real-time use. Considering all these facts, MATLAB is found to be a better platform for the design and implementation of such algorithms. It contains more than seventy toolboxes covering all possible fields in technology, all rich with predefined functions, System objects, and Simulink blocks. This helps to write short code and saves time in logic development at various steps of the system. MATLAB supports matrix operations, which is a huge advantage when processing an image or a frame in a video sequence, and MATLAB code is simple and easily learned by any new researcher.

This report presents the implementation of an object detection system using MATLAB and its toolboxes. The study explored various toolboxes and identified useful functions and objects that can be used at various levels in object detection and tracking; these toolboxes mainly include Computer Vision, Image Processing, and Image Acquisition. MATLAB 2012 was used for this study. The report is organized as follows: the second section describes the general block diagram of object detection; the third section covers MATLAB functions and objects useful in implementing an object detection system; sample code for object detection and tracking is presented in the fourth section; and the fifth section concludes.



1.2 What is Object Detection?

Object detection is the task of finding and identifying objects in an image or video sequence. The goal of instance-level recognition is to recognize a specific object or scene. It is a computer technology, related to computer vision and image processing, that deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images and videos.



1.3 Why does object detection matter?

Object detection is a key technology behind advanced driver assistance systems (ADAS) that enable cars to detect driving lanes or perform pedestrian detection to improve road safety. Object detection is also useful in applications such as video surveillance and image retrieval systems.



Today, images and video are everywhere; online photo sharing sites and social networks have them in the billions. The field of vision research [1] has been dominated by machine learning and statistics: using images and video to detect, classify, and track objects or events in order to "understand" a real-world scene. Programming a computer and designing algorithms for understanding what is in these images is the field of computer vision. Computer vision powers applications like image search, robot navigation, medical image analysis, photo management, and many more.

From a computer vision point of view, the image is a scene consisting of objects of interest and a background represented by everything else in the image. The relations and interactions among these objects are the key factors for scene understanding. Object detection and recognition are two important computer vision tasks. Object detection determines the presence of an object and/or its scope and location in the image; object recognition identifies which object class in the training database the object belongs to. Object detection typically precedes object recognition. It can be treated as two-class object recognition, where one class represents the object and the other represents non-objects. Object detection can be further divided into soft detection, which only detects the presence of an object, and hard detection, which detects both the presence and the location of the object. Detection is typically carried out by searching each part of an image: an object template is scanned across the image at different locations, scales, and rotations, and a detection is declared if the similarity between the template and the image is sufficiently high. The similarity between a template and an image region can be measured by correlation or by the sum of squared differences (SSD). Over the last several years it has been shown that image-based object detectors are sensitive to the training data.



1.4 How is it currently being used?

Object detection is breaking into a wide range of industries, with use cases ranging from personal security to productivity in the workplace. Facial detection is one form of it, which can be utilized as a security measure to let only certain people into a highly classified area of a government building, for example. Object detection can be used to count the number of people present in a business meeting and automatically adjust other technical tools, streamlining the time dedicated to that meeting. It can also be used within a visual search engine to help consumers find a specific item they are hunting for; Pinterest is one example of this, as the entire social and shopping platform is built around this technology. These features utilize people and object detection to create big data for a variety of applications in the workplace.



1.5 What potential does it have?

The possibilities are endless when it comes to future use cases for object detection. Sports broadcasting could use this technology to detect when a football team is about to score a touchdown and notify fans via their mobile phones or at-home virtual reality setups in a highly creative way. In video collaboration, business leaders will be able to count the number of participants in a meeting to automate processes further and monitor room usage, ensuring spaces are used properly. A relatively new "people counting" method that detects heads rather than bodies and motion will allow for more accurate detection, specifically in crowded places (IEEE), which will enable even more applications in the security industry.






The future of object detection has massive potential across a wide range of industries. We are thrilled to be one of the main drivers behind real-time intelligent vision, high-performance computing, artificial intelligence, and machine learning, which have allowed us to create a solution that will never distort video, enabling various AI capabilities that other companies simply cannot offer.



CHAPTER 2
LITERATURE SURVEY

The object detection task can be addressed by treating the video as an unrelated sequence of frames and performing static object detection.

In 2009, Felzenszwalb et al. [1] described an object detection system based on mixtures of multiscale deformable part models. Their system was able to represent highly variable object classes and achieved state-of-the-art results in the PASCAL object detection challenges. They combined a margin-sensitive approach for data-mining hard negative examples with a formalism they called latent SVM. This led to an iterative training algorithm that alternates between fixing latent values for positive examples and optimizing the latent SVM objective function. Their system relied heavily on new methods for discriminative training of classifiers that make use of latent information, and on efficient methods for matching deformable models to images. The described framework allows for exploration of additional latent structure: for example, one can consider deeper part hierarchies (parts with parts) or mixture models with many components.

In 2007, Leibe et al. [2] presented a novel method for detecting and localizing objects of a visual category in cluttered real-world scenes. Their approach considered object categorization and figure-ground segmentation as two interleaved processes that closely collaborate towards a common goal. The tight coupling between those two processes allows them to benefit from each other and improve the combined performance. The core part of their approach was a highly flexible learned representation for object shape that could combine the information observed on different training examples in a probabilistic extension of the Generalized Hough Transform. As they showed, the resulting approach can detect categorical objects in novel images and automatically infer a probabilistic segmentation from the recognition result. This segmentation was then in turn used to improve recognition by allowing the system to focus its efforts on object pixels and to discard misleading influences from the background. Their extensive evaluation on several large data sets showed that the proposed system was applicable to a range of different object categories, including both rigid and articulated objects. In addition, its flexible representation allowed it to achieve competitive object detection performance from training sets that were one to two orders of magnitude smaller than those used in comparable systems.

In the last decade, methods based on local image features have shown promise for texture and object recognition tasks. In 2006, Zhang et al. [3] presented a large-scale evaluation of an approach that represented images as distributions (signatures or histograms) of features extracted from a sparse set of key-point locations, and learned a Support Vector Machine classifier with kernels based on two effective measures for comparing distributions. They first evaluated the performance of the proposed approach with different key-point detectors and descriptors, as well as different kernels and classifiers. They then conducted a comparative evaluation with several modern recognition methods on 4 texture and 5 object databases. On most of those databases, their implementation exceeded the best reported results and achieved comparable performance on the rest. They also investigated the influence of background correlations on recognition performance.

In 2001, Viola and Jones [4], in a conference on pattern recognition, described a machine learning approach for visual object detection which was capable of processing images extremely rapidly while achieving high detection rates. Their work was distinguished by three key contributions. The first was the introduction of a new image representation called the "integral image", which allowed the features used by their detector to be computed very quickly. The second was a learning algorithm, based on AdaBoost, used to select a small number of critical visual features from a larger set and yield extremely efficient classifiers. The third contribution was a method for combining increasingly complex classifiers in a "cascade", which allowed background regions of the image to be quickly discarded while spending more computation on promising object-like regions. The cascade can be viewed as an object-specific focus-of-attention mechanism which, unlike some previous approaches, provided statistical guarantees that discarded regions were unlikely to contain the object of interest. In testing on face detection, the system yielded detection rates comparable to the best previous systems. Used in real-time applications, the detector ran at 15 frames per second without resorting to image differencing or skin color detection.

In 2000, Weber et al. [5] proposed a method to learn heterogeneous models of object classes for visual recognition. The training images they used contained a preponderance of clutter, and the learning was unsupervised. Their models represented objects as probabilistic constellations of rigid parts (features), with the variability within a class represented by a joint probability density function on the shape of the constellation and the appearance of the parts. Their method automatically identified distinctive features in the training set, and the set of model parameters was then learned using expectation maximization. When trained on different, unlabeled, and non-segmented views of a class of objects, each component of the mixture model could adapt to represent a subset of the views; similarly, different component models could specialize on sub-classes of an object class. Experiments on images of human heads, leaves from different species of trees, and motorcars demonstrated that the method works well over a wide variety of objects.



CHAPTER 3
BLOCK DIAGRAM



This section explains the general block diagram of object detection and the significance of each block in the system. Common object detection mainly includes video input, pre-processing, object segmentation, and post-processing, as shown in the figure.






The significance of each block is as follows:



Video Input: This can be stored video or real-time video.



Pre-processing: This mainly involves temporal and spatial smoothing, such as intensity adjustment and noise removal. For real-time systems, frame-size and frame-rate reduction are commonly used, which greatly reduces computational cost and time [1].



Object detection: This is the process of detecting change and extracting appropriate change for further analysis and qualification. Pixels are classified as foreground if they have changed; otherwise, they are considered background. This process is called background subtraction. The degree of "change" is a key factor in segmentation and can vary depending on the application. The result of segmentation is one or more foreground blobs, a blob being a collection of connected pixels [1].



Post-processing: This removes false detections caused by dynamic conditions in the background, using morphological operations and speckle-noise removal.
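A minimal MATLAB sketch of this pipeline is shown below: frame differencing against a running background estimate, followed by morphological post-processing. The input file name, learning rate, and threshold are illustrative assumptions, not values from this report.

reader = vision.VideoFileReader('viptraffic.avi'); % input video (file name assumed)
bg = [];                                % running background estimate
alpha = 0.05;                           % background learning rate (assumed)
T = 0.10;                               % change threshold (assumed)
while ~isDone(reader)
    frame = rgb2gray(step(reader));     % grayscale frame, single in [0,1]
    if isempty(bg)
        bg = frame;                     % initialize background from first frame
    end
    mask = abs(frame - bg) > T;         % changed pixels -> foreground
    mask = bwareaopen(mask, 50);        % post-processing: remove small speckles
    bg = (1 - alpha)*bg + alpha*frame;  % slowly absorb gradual scene changes
    imshow(mask); drawnow;              % display the foreground blobs
end
release(reader);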









Chapter 4
Introduction to MATLAB

4.1 MATLAB

MATLAB (matrix laboratory) is a multi-paradigm numerical computing environment and proprietary programming language developed by MathWorks. MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages. Although MATLAB is intended primarily for numerical computing, an optional toolbox uses the MuPAD symbolic engine, allowing access to symbolic computing abilities. An additional package, Simulink, adds graphical multi-domain simulation and model-based design for dynamic and embedded systems. As of 2018, MATLAB has more than 3 million users worldwide, coming from various backgrounds in engineering, science, and economics.



4.2 HISTORY

Cleve Moler, the chairman of the computer science department at the University of New Mexico, started developing MATLAB in the late 1970s. He designed it to give his students access to LINPACK and EISPACK without them having to learn Fortran. It soon spread to other universities and found a strong audience within the applied mathematics community. Jack Little, an engineer, was exposed to it during a visit Moler made to Stanford University in 1983. Recognizing its commercial potential, he joined with Moler and Steve Bangert. They rewrote MATLAB in C and founded MathWorks in 1984 to continue its development. These rewritten libraries were known as JACKPAC. In 2000, MATLAB was rewritten to use a newer set of libraries for matrix manipulation, LAPACK. MATLAB was first adopted by researchers and practitioners in control engineering, Little's specialty, but quickly spread to many other domains. It is now also used in education, in particular the teaching of linear algebra and numerical analysis, and is popular amongst scientists involved in image processing.



4.3 Why use MATLAB?

● Fast development: fast programming with fewer bugs compared with OpenCV, since a wide range of functions is available along with support for displaying and manipulating data. Fast coding is a strength of MATLAB that lets you develop vision applications quickly, though it is slower at execution time, which is a disadvantage.

● Fast debugging: MATLAB avoids low-level programming problems like manual memory allocation and can stop a script automatically when a problem is encountered. It also lets users execute code from the command line even after an error occurs and fix the error while the code is still in execution mode; being able to execute code during debugging is an advantage over other IDE tools.

● Clear code: MATLAB code is concise, which makes it easier to write, understand, and debug.

● Documentation: MATLAB has comprehensive documentation with many examples and explanations.



4.4 SYNTAX

The MATLAB application is built around the MATLAB programming language. Common usage involves using the "Command Window" as an interactive mathematical shell or executing text files containing MATLAB code.



4.5 VARIABLES

Variables are defined using the assignment operator, =. MATLAB is a weakly typed programming language because types are implicitly converted. It is an inferred-typed language because variables can be assigned without declaring their type, except if they are to be treated as symbolic objects, and their type can change. Values can come from constants, from computations involving values of other variables, or from the output of a function. For example:

>> x = 17
x =
    17

>> x = 'hat'
x =
hat

>> x = [3*4, pi/2]
x =
   12.0000    1.5708

>> y = 3*sin(x)
y =
   -1.6097    3.0000



A simple array is defined using the colon syntax: initial:increment:terminator. For instance:

>> array = 1:2:9
array =
     1     3     5     7     9



defines a variable named array (or assigns a new value to an existing variable with the name array) which is an array consisting of the values 1, 3, 5, 7, and 9. That is, the array starts at 1 (the initial value), increments from the previous value by 2 (the increment value), and stops once it reaches (or would exceed) 9 (the terminator value).

>> array = 1:3:9
array =
     1     4     7



The increment value can be left out of this syntax (along with one of the colons) to use a default value of 1:

>> ari = 1:5
ari =
     1     2     3     4     5



assigns to the variable named ari an array with the values 1, 2, 3, 4, and 5, since the default value of 1 is used as the increment. Indexing is one-based, which is the usual convention for matrices in mathematics, unlike zero-based indexing commonly used in other programming languages such as C, C++, and Java. Matrices can be defined by separating the elements of a row with blank space or commas and using a semicolon to terminate each row. The list of elements should be surrounded by square brackets []. Parentheses () are used to access elements and subarrays (they are also used to denote a function argument list).

>> A = [16 3 2 13; 5 10 11 8; 9 6 7 12; 4 15 14 1]
A =
    16     3     2    13
     5    10    11     8
     9     6     7    12
     4    15    14     1



>> A(2,3)
ans =
    11



Sets of indices can be specified by expressions such as 2:4, which evaluates to [2, 3, 4]. For example, a submatrix taken from rows 2 through 4 and columns 3 through 4 can be written as:

>> A(2:4,3:4)
ans =
    11     8
     7    12
    14     1



A square identity matrix of size n can be generated using the function eye, and matrices of any size with zeros or ones can be generated with the functions zeros and ones, respectively.

>> eye(3,3)
ans =
     1     0     0
     0     1     0
     0     0     1

>> zeros(2,3)
ans =
     0     0     0
     0     0     0

>> ones(2,3)
ans =
     1     1     1
     1     1     1



Transposing a vector or a matrix is done either by the function transpose or by adding dot-prime after the matrix (without the dot, prime will perform conjugate transpose for complex arrays):

>> A = [1 ; 2], B = A.', C = transpose(A)
A =
     1
     2
B =
     1     2
C =
     1     2

>> D = [0 3 ; 1 5], D.'
D =
     0     3
     1     5
ans =
     0     1
     3     5



Most MATLAB functions accept arrays as input and operate element-wise on each element. For example, mod(2*J,n) will multiply every element in J by 2 and then reduce each element modulo n. MATLAB does include standard for and while loops, but (as in other similar applications such as R) using the vectorized notation is encouraged and is often faster to execute. The following code, excerpted from the function magic.m, creates a magic square M for odd values of n (the MATLAB function meshgrid is used here to generate square matrices I and J containing 1:n):

[J,I] = meshgrid(1:n);
A = mod(I + J - (n + 3) / 2, n);
B = mod(I + 2 * J - 2, n);
M = n * A + B + 1;



Structures

MATLAB supports structure data types. Since all variables in MATLAB are arrays, a more adequate name is "structure array", where each element of the array has the same field names. In addition, MATLAB supports dynamic field names (field look-ups by name, field manipulations, etc.).
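A short sketch of a structure with a dynamic field look-up (the field names here are chosen purely for illustration):

s.name = 'frame1';      % structure with two fields
s.pixels = zeros(2,2);
f = 'name';             % field name chosen at run time
disp(s.(f))             % dynamic field look-up prints 'frame1'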



4.6 FUNCTIONS

A function is a group of statements that together perform a task. In MATLAB, functions are defined in separate files; the name of the file and of the function should be the same. Functions operate on variables within their own workspace, called the local workspace, which is separate from the workspace you access at the MATLAB command prompt, called the base workspace. Functions can accept more than one input argument and may return more than one output argument. The syntax of a function statement is:

function [out1,out2, ..., outN] = myfun(in1,in2,in3, ..., inN)



Example

The following function named mymax should be written in a file named mymax.m. It takes five numbers as arguments and returns the maximum of the numbers. Create a function file named mymax.m and type the following code in it:

function max = mymax(n1, n2, n3, n4, n5)
% This function calculates the maximum of the
% five numbers given as input
max = n1;
if(n2 > max)
   max = n2;
end
if(n3 > max)
   max = n3;
end
if(n4 > max)
   max = n4;
end
if(n5 > max)
   max = n5;
end

The first line of a function starts with the keyword function. It gives the name of the function and the order of arguments. In our example, the mymax function has five input arguments and one output argument. The comment lines that come right after the function statement provide the help text. These lines are printed when you type:

help mymax

MATLAB will execute the above statement and return the following result:

This function calculates the maximum of the
five numbers given as input

You can call the function as:

mymax(34, 78, 89, 23, 11)

MATLAB will execute the above statement and return the following result:

ans = 89



4.6.1 Anonymous Functions

An anonymous function is like an inline function in traditional programming languages, defined within a single MATLAB statement. It consists of a single MATLAB expression and any number of input and output arguments. You can define an anonymous function right at the MATLAB command line or within a function or script. This way you can create simple functions without having to create a file for them. The syntax for creating an anonymous function from an expression is:

f = @(arglist)expression



Example

In this example, we will write an anonymous function named power, which will take two numbers as input and return the first number raised to the power of the second number. Create a script file and type the following code in it:

power = @(x, n) x.^n;
result1 = power(7, 3)
result2 = power(49, 0.5)
result3 = power(10, -10)
result4 = power(4.5, 1.5)

When you run the file, it displays:

result1 = 343
result2 = 7
result3 = 1.0000e-10
result4 = 9.5459



4.6.2 Primary and Sub-Functions

Any function other than an anonymous function must be defined within a file. Each function file contains a required primary function that appears first and any number of optional sub-functions that come after the primary function and are used by it. Primary functions can be called from outside the file that defines them, either from the command line or from other functions, but sub-functions cannot be called from the command line or from other functions outside the function file. Sub-functions are visible only to the primary function and other sub-functions within the function file that defines them.

Example

Let us write a function named quadratic that calculates the roots of a quadratic equation. The function takes three inputs: the quadratic coefficient, the linear coefficient, and the constant term. It returns the roots. The function file quadratic.m will contain the primary function quadratic and the sub-function disc, which calculates the discriminant. Create a function file quadratic.m and type the following code in it:

function [x1,x2] = quadratic(a,b,c)
% this function returns the roots of
% a quadratic equation.
% It takes 3 input arguments
% which are the co-efficients of x2, x and the
% constant term
% It returns the roots
d = disc(a,b,c);
x1 = (-b + d) / (2*a);
x2 = (-b - d) / (2*a);
end % end of quadratic

function dis = disc(a,b,c)
% function calculates the discriminant
dis = sqrt(b^2 - 4*a*c);
end % end of sub-function

You can call the above function from the command prompt as:

quadratic(2,4,-4)

MATLAB will execute the above statement and return the following result:

ans = 0.7321



4.6.3 Nested Functions

You can define functions within the body of another function; these are called nested functions. A nested function contains any or all of the components of any other function. Nested functions are defined within the scope of another function and share access to the containing function's workspace. A nested function follows this syntax:

function x = A(p1, p2)
...
B(p2)
   function y = B(p3)
   ...
   end
...
end



Example

Let us rewrite the function quadratic from the previous example; this time, however, the disc function will be a nested function. Create a function file quadratic2.m and type the following code in it:

function [x1,x2] = quadratic2(a,b,c)
   function disc % nested function
      d = sqrt(b^2 - 4*a*c);
   end % end of function disc
disc;
x1 = (-b + d) / (2*a);
x2 = (-b - d) / (2*a);
end % end of function quadratic2

You can call the above function from the command prompt as:

quadratic2(2,4,-4)

MATLAB will execute the above statement and return the following result:

ans = 0.73205



4.6.4 Private Functions

A private function is a primary function that is visible only to a limited group of other functions. If you do not want to expose the implementation of a function, you can create it as a private function. Private functions reside in subfolders with the special name private and are visible only to functions in the parent folder.

Example

Let us rewrite the quadratic function; this time, however, the disc function calculating the discriminant will be a private function. Create a subfolder named private in the working directory and store the following function file disc.m in it:

function dis = disc(a,b,c)
% function calculates the discriminant
dis = sqrt(b^2 - 4*a*c);
end % end of sub-function

Create a function quadratic3.m in your working directory and type the following code in it:

function [x1,x2] = quadratic3(a,b,c)
% this function returns the roots of
% a quadratic equation.
% It takes 3 input arguments
% which are the co-efficients of x2, x and the
% constant term
% It returns the roots
d = disc(a,b,c);
x1 = (-b + d) / (2*a);
x2 = (-b - d) / (2*a);
end % end of quadratic3

You can call the above function from the command prompt as:

quadratic3(2,4,-4)

MATLAB will execute the above statement and return the following result:

ans = 0.73205



4.6.5 Global Variables

Global variables can be shared by more than one function. For this, you need to declare the variable as global in all the functions. If you want to access that variable from the base workspace, declare the variable at the command line. The global declaration must occur before the variable is actually used in a function. It is good practice to use capital letters for the names of global variables to distinguish them from other variables.

Example

Create a function file named average.m and type the following code in it:

function avg = average(nums)
global TOTAL
avg = sum(nums)/TOTAL;
end

Create a script file and type the following code in it:

global TOTAL;
TOTAL = 10;
n = [34, 45, 25, 45, 33, 19, 40, 34, 38, 42];
av = average(n)

When you run the file, it will display the following result:

av = 35.500



4.6.6 Function Handles

MATLAB supports elements of lambda calculus by introducing function handles, or function references, which are implemented either in .m files or as anonymous/nested functions.
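For illustration (a minimal sketch, not taken from this report), a handle can refer to a named function or an anonymous one and be passed to other functions like any value:

h = @sin;                        % handle to the built-in sin function
h(pi/2)                          % returns 1
q = integral(@(x) x.^2, 0, 1)    % pass an anonymous handle; q is 1/3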



Classes and object-oriented programming

MATLAB supports object-oriented programming including classes, inheritance, virtual dispatch, packages, pass-by-value semantics, and pass-by-reference semantics. However, the syntax and calling conventions are significantly different from other languages. MATLAB has value classes and reference classes, depending on whether the class has handle as a superclass (for reference classes) or not (for value classes).

Method call behavior is different between value and reference classes. For example, a call to a method

object.method();

can alter any member of object only if object is an instance of a reference class; otherwise, value class methods must return a new instance if they need to modify the object. An example of a simple class is provided below.

classdef Hello
   methods
      function greet(obj)
         disp('Hello!')
      end
   end
end

When put into a file named hello.m, this can be executed with the following commands:

>> x = Hello();
>> x.greet();
Hello!



Interface with other languages

MATLAB can call functions and subroutines written in the programming languages C or Fortran. A wrapper function is created allowing MATLAB data types to be passed and returned. MEX files (MATLAB executables) are the dynamically loadable object files created by compiling such functions. Since 2014, increasing two-way interfacing with Python has been added. Libraries written in Perl, Java, ActiveX or .NET can be directly called from MATLAB, and many MATLAB libraries (for example XML or SQL support) are implemented as wrappers around Java or ActiveX libraries. Calling MATLAB from Java is more complicated, but can be done with a MATLAB toolbox sold separately by MathWorks, or using an undocumented mechanism called JMI (Java-to-MATLAB Interface), which should not be confused with the unrelated Java Metadata Interface that is also called JMI. An official MATLAB API for Java was added in 2016.






As alternatives to the MuPAD-based Symbolic Math Toolbox available from MathWorks, MATLAB can be connected to Maple or Mathematica. Libraries also exist to import and export MathML.



Chapter 5
MATLAB Toolboxes



5.1 Computer Vision

Computer vision is an interdisciplinary scientific field that deals with how computers can be made to gain high-level understanding from digital images or videos. From the perspective of engineering, it seeks to automate tasks that the human visual system can do. Computer vision tasks include methods for acquiring, processing, analyzing and understanding digital images, and extraction of high-dimensional data from the real world in order to produce numerical or symbolic information, e.g. in the form of decisions. Understanding in this context means the transformation of visual images (the input of the retina) into descriptions of the world that can interface with other thought processes and elicit appropriate action. This image understanding can be seen as the disentangling of symbolic information from image data using models constructed with the aid of geometry, physics, statistics, and learning theory. The scientific discipline of computer vision is concerned with the theory behind artificial systems that extract information from images. The image data can take many forms, such as video sequences, views from multiple cameras, or multi-dimensional data from a medical scanner. The technological discipline of computer vision seeks to apply its theories and models to the construction of computer vision systems. Sub-domains of computer vision include scene reconstruction, event detection, video tracking, object recognition, 3D pose estimation, learning, indexing, motion estimation, and image restoration.

5.1.1 Applications

Applications range from tasks such as industrial machine vision systems which, say, inspect bottles speeding by on a production line, to research into artificial intelligence and computers or robots that can comprehend the world around them. The computer vision and machine vision fields have significant overlap. Computer vision covers the core technology of automated image analysis, which is used in many fields. Machine vision usually refers to a process of combining automated image analysis with other methods and technologies to provide automated inspection and robot guidance in industrial applications. In many computer-vision applications, the computers are preprogrammed to solve a particular task, but methods based on learning are now becoming increasingly common. Examples of applications of computer vision include systems for:

● Automatic inspection, e.g., in manufacturing applications;
● Assisting humans in identification tasks, e.g., a species identification system;
● Controlling processes, e.g., an industrial robot;
● Detecting events, e.g., for visual surveillance or people counting, e.g., in the restaurant industry;
● Interaction, e.g., as the input to a device for computer-human interaction;
● Modeling objects or environments, e.g., medical image analysis or topographical modeling;
● Navigation, e.g., by an autonomous vehicle or mobile robot; and
● Organizing information, e.g., for indexing databases of images and image sequences.



The classical problem in computer vision, image processing, and machine vision is that of determining whether or not the image data contains some specific object, feature, or activity. Different varieties of the recognition problem are described in the literature:

● Object recognition (also called object classification) – one or several pre-specified or learned objects or object classes can be recognized, usually together with their 2D positions in the image or 3D poses in the scene. Blippar, Google Goggles and LikeThat provide stand-alone programs that illustrate this functionality.
● Identification – an individual instance of an object is recognized. Examples include identification of a specific person's face or fingerprint, identification of handwritten digits, or identification of a specific vehicle.
● Detection – the image data are scanned for a specific condition. Examples include detection of possible abnormal cells or tissues in medical images or detection of a vehicle in an automatic road toll system. Detection based on relatively simple and fast computations is sometimes used for finding smaller regions of interesting image data which can be further analyzed by more computationally demanding techniques to produce a correct interpretation.



Currently, the best algorithms for such tasks are based on convolutional neural networks. An illustration of their capabilities is given by the ImageNet Large Scale Visual Recognition Challenge; this is a benchmark in object classification and detection, with millions of images and hundreds of object classes. Performance of convolutional neural networks on the ImageNet tests is now close to that of humans. The best algorithms still struggle with objects that are small or thin, such as a small ant on the stem of a flower or a person holding a quill in their hand. They also have trouble with images that have been distorted with filters (an increasingly common phenomenon with modern digital cameras). By contrast, those kinds of images rarely trouble humans. Humans, however, tend to have trouble with other issues: for example, they are not good at classifying objects into fine-grained classes, such as the particular breed of dog or species of bird, whereas convolutional neural networks handle this with ease. Several specialized tasks based on recognition exist, such as:

● Content-based image retrieval – finding all images in a larger set of images which have a specific content. The content can be specified in different ways, for example in terms of similarity relative to a target image (give me all images similar to image X), or in terms of high-level search criteria given as text input (give me all images which contain many houses, are taken during winter, and have no cars in them).
● Pose estimation – estimating the position or orientation of a specific object relative to the camera. An example application for this technique would be assisting a robot arm in retrieving objects from a conveyor belt in an assembly line situation or picking parts from a bin.
● Optical character recognition (OCR) – identifying characters in images of printed or handwritten text, usually with a view to encoding the text in a format more amenable to editing or indexing (e.g. ASCII).
● 2D code reading – reading of 2D codes such as data matrix and QR codes.
● Facial recognition.
● Shape Recognition Technology (SRT) in people counter systems, differentiating human beings (head and shoulder patterns) from objects.



The aim of image restoration is the removal of noise (sensor noise, motion blur, etc.) from images. The simplest possible approach for noise removal is various types of filters, such as low-pass filters or median filters. More sophisticated methods assume a model of how the local image structures look, to distinguish them from noise. By first analysing the image data in terms of the local image structures, such as lines or edges, and then controlling the filtering based on local information from the analysis step, a better level of noise removal is usually obtained compared to the simpler approaches. An example in this field is inpainting.

5.1.2 Computer Vision Toolbox in MATLAB

Computer Vision Toolbox™ provides algorithms, functions, and apps for designing and testing computer vision, 3D vision, and video processing systems. You can perform object detection and tracking, as well as feature detection, extraction, and matching. For 3D vision, the toolbox supports single, stereo, and fisheye camera calibration; stereo vision; 3D reconstruction; and lidar and 3D point cloud processing. Computer vision apps automate ground truth labeling and camera calibration workflows. You can train custom object detectors using deep learning and machine learning algorithms such as YOLO v2, Faster R-CNN, and ACF. For semantic segmentation you can use deep learning algorithms such as SegNet, U-Net, and DeepLab. Pretrained models let you detect faces, pedestrians, and other common objects. You can accelerate your algorithms by running them on multicore processors and GPUs. Most toolbox algorithms support C/C++ code generation for integrating with existing code, desktop prototyping, and embedded vision system deployment.
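As a small illustration of the pretrained detectors mentioned above (a sketch; vision.PeopleDetector is part of the toolbox, while the sample image name should be treated as an assumption):

I = imread('visionteam.jpg');            % sample image (file name assumed)
detector = vision.PeopleDetector;        % pretrained HOG-based people detector
[bboxes, scores] = step(detector, I);    % bounding boxes and confidence scores
J = insertObjectAnnotation(I, 'rectangle', bboxes, scores);
imshow(J); title('Detected people');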



5.2 Image Processing

In computer science, digital image processing is the use of computer algorithms to perform image processing on digital images. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing: it allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and signal distortion during processing. Since images are defined over two dimensions (perhaps more), digital image processing may be modeled in the form of multidimensional systems. The generation and development of digital image processing have been mainly affected by three factors: first, the development of computers; second, the development of mathematics (especially the creation and improvement of discrete mathematics theory); and third, increased demand for a wide range of applications in environment, agriculture, military, industry, and medical science.

5.3 Image Sensors

The basis for modern image sensors is metal-oxide-semiconductor (MOS) technology, which originates from the invention of the MOSFET (MOS field-effect transistor) by Mohamed M. Atalla and Dawon Kahng at Bell Labs in 1959. This led to the development of digital semiconductor image sensors, including the charge-coupled device (CCD) and later the CMOS sensor. The charge-coupled device was invented by Willard S. Boyle and George E. Smith at Bell Labs in 1969. While researching MOS technology, they realized that an electric charge was the analogy of the magnetic bubble and that it could be stored on a tiny MOS capacitor. As it was fairly straightforward to fabricate a series of MOS capacitors in a row, they connected a suitable voltage to them so that the charge could be stepped along from one to the next. The CCD is a semiconductor circuit that was later used in the first digital video cameras for television broadcasting. The NMOS active-pixel sensor (APS) was invented by Olympus in Japan during the mid-1980s. This was enabled by advances in MOS semiconductor device fabrication, with MOSFET scaling reaching smaller micron and then sub-micron levels. The NMOS APS was fabricated by Tsutomu Nakamura's team at Olympus in 1985. The CMOS active-pixel sensor (CMOS sensor) was later developed by Eric Fossum's team at the NASA Jet Propulsion Laboratory in 1993. By 2007, sales of CMOS sensors had surpassed CCD sensors.

5.4 Image Compression

An important development in digital image compression technology was the discrete cosine transform (DCT), a lossy compression technique first proposed by Nasir Ahmed in 1972. DCT compression became the basis for JPEG, which was introduced by the Joint Photographic Experts Group in 1992. JPEG compresses images down to much smaller file sizes and has become the most widely used image file format on the Internet. Its highly efficient DCT compression algorithm was largely responsible for the wide proliferation of digital images and digital photos, with several billion JPEG images produced every day as of 2015.

5.5 Digital Signal Processing (DSP)

Electronic signal processing was revolutionized by the wide adoption of MOS technology in the 1970s. MOS integrated circuit technology was the basis for the first single-chip microprocessors and microcontrollers in the early 1970s, and then the first single-chip digital signal processor (DSP) chips in the late 1970s. DSP chips have since been widely used in digital image processing. The discrete cosine transform (DCT) image compression algorithm has been widely implemented in DSP chips, with many companies developing DSP chips based on DCT technology. DCTs are widely used for encoding, decoding, video coding, audio coding, multiplexing, control signals, signaling, analog-to-digital conversion, formatting luminance and color differences, and color formats such as YUV444 and YUV411. DCTs are also used for encoding operations such as motion estimation, motion compensation, inter-frame prediction, quantization, perceptual weighting, entropy encoding, variable encoding, and motion vectors, and for decoding operations such as the inverse operation between different color formats (YIQ, YUV and RGB) for display purposes. DCTs are also commonly used for high-definition television (HDTV) encoder/decoder chips.

5.5.1 Medical Imaging

In 1972, the engineer Hounsfield from the British company EMI invented the X-ray computed tomography device for head diagnosis, which is what we usually call CT (Computed Tomography). The method is based on projections of a section of the human head, processed by computer to reconstruct the cross-sectional image; this is called image reconstruction. In 1975, EMI successfully developed a CT device for the whole body, which obtained clear tomographic images of various parts of the human body. In 1979, this diagnostic technique won the Nobel Prize. Digital image processing technology for medical applications was inducted into the Space Foundation Space Technology Hall of Fame in 1994.

Image Processing Toolbox in MATLAB

Image Processing Toolbox™ provides a comprehensive set of reference-standard algorithms and workflow apps for image processing, analysis, visualization, and algorithm development. You can perform image segmentation, image enhancement, noise reduction, geometric transformations, image registration, and 3D image processing. Image Processing Toolbox apps let you automate common image processing workflows. You can interactively segment image data, compare image registration techniques, and batch-process large data sets. Visualization functions and apps let you explore images, 3D volumes, and videos; adjust contrast; create histograms; and manipulate regions of interest (ROIs). You can accelerate your algorithms by running them on multicore processors and GPUs. Many toolbox functions support C/C++ code generation for desktop prototyping and embedded vision system deployment.



5.6 Image Acquisition Toolbox in MATLAB

Image Acquisition Toolbox™ provides functions and blocks for connecting cameras and lidar sensors to MATLAB® and Simulink®. It includes a MATLAB app that lets you interactively detect and configure hardware properties. You can then generate equivalent MATLAB code to automate your acquisition in future sessions. The toolbox enables acquisition modes such as processing in-the-loop, hardware triggering, background acquisition, and synchronizing acquisition across multiple devices. Image Acquisition Toolbox supports all major standards and hardware vendors, including USB3 Vision, GigE Vision®, and GenICam™ GenTL. You can connect to Velodyne LiDAR® sensors, machine vision cameras, and frame grabbers, as well as high-end scientific and industrial devices.



CHAPTER 6
MATLAB IMPLEMENTATION



Different toolboxes have been explored for functions and objects that can be useful at various levels of object detection. All such functions/objects are described in this section.






6.1 Video Input

Input video can come from two sources: stored video and real-time video. Stored video can be obtained from standard datasets available on the internet. Real-time video comes from a camera continuously monitoring a specific area. These videos must be read into MATLAB before they can be processed.



6.1.1 Stored Video

Some commonly used standard video datasets are as follows:

Wallflower Dataset [4]: Provided by Toyama et al., it contains seven canonical sequences with different background situations.

PETS Dataset: "Performance Evaluation of Tracking and Surveillance" (PETS) consists of various datasets such as PETS 2001, PETS 2003 and PETS 2006. They are more useful for tracking evaluation than for background evaluation.

ChangeDetection.net Dataset [5]: The CDW dataset presents a realistic video dataset consisting of 31 video sequences categorized into 6 different challenges. Color and thermal-IR videos are included.

BMC 2012 Dataset [6]: This dataset includes real and synthetic videos. It is mainly used for comparing different background subtraction techniques.

Fish4Knowledge Dataset [7]: The Fish4Knowledge dataset is an underwater benchmark dataset for target detection against complex backgrounds.

Carnegie Mellon Dataset [8]: The CMU sequence by Sheikh and Shah involves a camera mounted on a tall tripod; the wind caused the tripod to sway back and forth, causing vibration in the scene. This dataset is useful for studying the camera-jitter background situation.

Stored video needs to be read in an appropriate format before processing. Various related functions from the Image Processing (IP) and Computer Vision (CV) toolboxes can be used for this purpose.



Toolbox            Object/Function    Name                      Use
Image processing   Function           imread                    Read image from graphics file
Image processing   Function           imfinfo                   Information about graphics file
Image processing   Function           imwrite                   Write image to graphics file
Image processing   Function           imshow                    Display image
Computer vision    Object             vision.VideoFileReader    Read video frames and audio samples from video file
Computer vision    Object             vision.VideoFileWriter    Write video frames and audio samples to video file
Computer vision    Object             vision.VideoPlayer        Play video or display image
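For example, a stored video can be read and displayed frame by frame with the Computer Vision objects from the table above (a minimal sketch; the file name is an assumption):

reader = vision.VideoFileReader('viptraffic.avi'); % stored video (name assumed)
player = vision.VideoPlayer;                       % display window
while ~isDone(reader)
    frame = step(reader);   % read the next frame
    step(player, frame);    % show it
end
release(reader);
release(player);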



6.1.2 Real-Time Video



Image Acquisition is a widely used toolbox which allows real-time acquisition of video from a video acquisition device. Some commonly used functions are explained below.



Imaqtool:It launches an interactive GUI and allowsusersto explore, configure, and acquire data from image acquisition devices.



Videoinput: It can be used to create video input object.This object can further be used to acquire and display the image sequences.



Propinfo: It captures all the property information about image acquisition object. This information can be useful in further video processing.



Getsnapshot: It immediately returns one single imageframe, from the video input object. This function is useful to capture image at critical moment.



Trigger: Initiates data logging for the video inputobject. It can be used to initialize video at appropriate moment and collect a video data.



Triggerconfig: User can configure trigger properties of video input object.
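Putting these together, a minimal real-time acquisition sketch might look as follows; the 'winvideo' adaptor and device ID 1 are assumptions that depend on the installed camera.

% Grab a single frame from a live camera (a minimal sketch)
vid = videoinput('winvideo', 1);   % assumed adaptor name and device ID
triggerconfig(vid, 'manual');      % acquire only when triggered from code
start(vid);
frame = getsnapshot(vid);          % one image frame from the live stream
imshow(frame);
stop(vid);
delete(vid);                       % release the hardware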



6.2 Preprocessing
Data preprocessing is an important step in the data mining process. The phrase "garbage in, garbage out" is particularly applicable to data mining and machine learning projects. Data-gathering methods are often loosely controlled, resulting in out-of-range values (e.g., Income: −100). Analyzing data that has not been carefully screened for such problems can produce misleading results; the representation and quality of the data therefore come first, before any analysis is run. Often, data preprocessing is the most important phase of a machine learning project, especially in computational biology. If much irrelevant and redundant information is present, or the data is noisy and unreliable, then knowledge discovery during the training phase is more difficult. Data preparation and filtering steps can take a considerable amount of processing time. Data preprocessing includes cleaning, instance selection, normalization, transformation, feature extraction and selection, etc. The product of data preprocessing is the final training set. Data preprocessing may also affect the way in which outcomes of the final data processing are interpreted; this should be considered carefully when interpretation of the results is a key point, as in the multivariate processing of chemical data (chemometrics).



Tasks of preprocessing:
● Data cleansing
● Data editing
● Data reduction
● Data wrangling



Preprocessing may include the series of operations described below.



6.2.1 Video Type Conversion



After reading, the video needs to be converted to an appropriate data type. Useful objects and functions for video data type conversion are listed below.



Toolbox | Function/Object | Name | Use
CV | Object | vision.ImageDataTypeConverter | Converts and scales an input image to a specified output data type; the output type may be double, single, int8, uint8, int16, uint16, boolean, or custom
IP | Function | im2double, im2single, im2uint8, im2uint16 | Convert an image to the specified type
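For example, converting a uint8 frame to double is a one-liner; 'peppers.png' is a sample image that ships with MATLAB and is used here only for illustration.

frame  = imread('peppers.png');    % uint8 image, values in [0, 255]
frameD = im2double(frame);         % double image, values scaled to [0, 1]
class(frame), class(frameD)        % 'uint8' and 'double'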



6.2.2 Video Enhancement
This step may include noise removal, contrast adjustment and image correction. Useful functions and objects are summarized below.



Toolbox | Function/Object | Name | Use
CV | Object | vision.MedianFilter | 2-D median filtering (to remove salt-and-pepper noise)
CV | Object | vision.ImageFilter | Perform 2-D FIR filtering of input matrix
CV | Object | vision.ContrastAdjuster | Adjust image contrast by linear scaling
CV | Object | vision.HistogramEqualizer | Enhance contrast of images using histogram equalization
IP | Function | imadjust | Adjust image intensity values or colormap
IP | Function | imcontrast | Adjust Contrast tool
IP | Function | histeq | Enhance contrast using histogram equalization
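A short enhancement sketch using the IP-toolbox equivalents of the objects above; medfilt2 and histeq are standard Image Processing Toolbox functions, and 'peppers.png' is again only a stand-in input.

gray  = rgb2gray(imread('peppers.png'));
clean = medfilt2(gray, [3 3]);     % 2-D median filter suppresses salt-and-pepper noise
eq    = histeq(clean);             % histogram equalization stretches the contrast
imshowpair(gray, eq, 'montage');   % before and after, side by side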



6.2.3 Feature Extraction
Any object detection system performs segmentation based on one or more features of the scene. These may include color, corners, edges, shape, gradient, texture, and DCT or DFT coefficients. Useful functions and objects for feature extraction are listed below.



Toolbox | Function/Object | Name
IP | Function | rgb2gray
IP | Function | rgb2ycbcr
IP | Function | ycbcr2rgb
IP | Function | corner
IP | Function | edge
IP | Function | imgradient
IP | Function | entropyfilt
IP | Function | rangefilt
IP | Function | stdfilt
CV | Object | vision.ColorSpaceConverter
CV | Object | vision.DCT
CV | Object | vision.FFT
CV | Object | vision.EdgeDetector
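As an illustration of edge- and corner-based features, the sketch below runs two of the IP functions from the table on a sample image ('peppers.png' is again just a placeholder input).

gray = rgb2gray(imread('peppers.png'));
E = edge(gray, 'canny');           % binary edge map from the Canny detector
C = corner(gray, 50);              % up to 50 strongest corner points, one [x y] per row
imshow(E); hold on;
plot(C(:,1), C(:,2), 'r*');        % overlay corner points on the edge map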



6.3 Step-by-Step Process
Different toolboxes have been explored for functions and objects that are useful at various levels of object detection. The overall flow is as follows.

STEP 1: INPUT. Stored input needs to be read in an appropriate format before processing. Various related functions from the Image Processing and Computer Vision toolboxes can be used for this purpose, for example imread, imfinfo, imwrite and imshow, which read, get information about, write and display an image respectively.

STEP 2: PREPROCESSING. RGB-to-gray conversion and noise removal using a median filter. It includes the series of operations below:
● Input type conversion
● Enhancement
● Feature extraction



STEP 3: OBJECT DETECTION. Various object detection methods can be used to detect objects; they are classified as template-based, motion-based, classifier-based or feature-based. The Computer Vision Toolbox includes predefined objects which can be used to implement these methods, such as:
● vision.CascadeObjectDetector
● vision.OpticalFlow
● vision.PeopleDetector



STEP 4: POST-PROCESSING. Post-processing removes unwanted portions of the foreground mask that arise from false detections caused by a dynamic background; these may include speckle noise, small holes in the scene, etc. Detected objects can then be annotated for proper display. Useful functions in this step include:
● imclose
● imopen
● imfill
A minimal sketch tying the four steps together is shown below.
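The following sketch strings the four steps together with a pretrained Viola-Jones face detector; it is an illustration, not the project's own pipeline (which appears in Chapter 7). 'visionteam.jpg' is a sample image shipped with the Computer Vision Toolbox.

% STEP 1: input
img = imread('visionteam.jpg');
% STEP 2: preprocessing (grayscale conversion + median filter)
gray = medfilt2(rgb2gray(img), [3 3]);
% STEP 3: detection with a pretrained cascade (default model: frontal faces)
detector = vision.CascadeObjectDetector();
bboxes = step(detector, gray);
% STEP 4: post-processing / annotation
out = insertShape(img, 'Rectangle', bboxes, 'Color', 'green');
imshow(out);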






CHAPTER 7 SOURCE CODE



clc
close all

%% Test Two
% Features: Histogram of Oriented Gradients (HoG), histogram of pixel
% orientation, histogram of curvatures, eccentricity, area-ratio weights.

tic
load('newData.mat');                 % pre-trained data; provides svmmodel (and the
                                     % initial background frame 'fi' used below)
depth  = 6;
Params = [9 3 2 1 0.2];              % HoG descriptor parameters
% mmreader is deprecated in current MATLAB; VideoReader is the modern equivalent
video  = mmreader('F:\Thesis\Testing Datasets\test_videos\test_videos\3.avi');

for k = 3501:5:4000
    %% Read and crop one frame
    figure(1);
    image = imcrop(read(video, k), [7.5 18.5 345 224]);

    %% Grayscale conversion and Canny edge detection
    img = rgb2gray(image);
    BW  = edge(sqrt(double(img)), 'canny', 0.29);

    %% Foreground mask by differencing against the previous frame 'fi'
    img1 = sqrt(double(img)) - sqrt(double(fi));
    fore = zeros(size(img1));
    ind  = find(img1 > max(max(img1)) * 0.6);
    fore(ind) = 255;

    %% Edge linking and morphological clean-up
    [BW, AngleLeft, AngleRight] = edgelinking2_C(BW, 3, 3);   % custom function
    BW   = abs(BW);
    st   = strel('disk', 3);
    BW   = imopen(BW, st);
    fore = imdilate(fore, st);
    LabelsList = unique(BW(ind));
    toc

    hold off;
    figure(2);
    subplot(2,2,1),   imshow(BW);
    subplot(2,2,2),   imshow(fore);
    subplot(2,2,3:4), imshow(image); hold on;
    maximumLabel = max(max(BW));

    %% Classify each connected component
    for i = 2:numel(LabelsList)
        [x_A, y_A] = find(BW == LabelsList(i));
        if ~isempty(x_A)
            box    = boundingBox([y_A x_A]);                  % custom function
            height = box(4) - box(3);
            width  = box(2) - box(1);

            subImage = imcrop(img, [box(1) box(3) box(2)-box(1) box(4)-box(3)]);
            subImage = imresize(subImage, [35 20]);

            if ~isempty(subImage)
                hogs = HoG(double(subImage), Params);         % custom HoG descriptor

                % Shape features from the eigenvalues of the point covariance
                Z      = [x_A - median(x_A), y_A - median(y_A)];
                C      = cov(Z);
                [E, V] = eig(C);
                V      = sort(diag(V));
                stra   = V(2) / sum(V);
                ell    = inertiaEllipse([x_A y_A]);           % custom function
                OtherFeatures = [ell(4)/ell(3); ell(5); stra];

                % SVM classification of the combined feature vector (libsvm)
                H = [hogs; OtherFeatures];
                [predict_label, accuracy, prob_estimates] = svmpredict(1, H', svmmodel);

                drawBox(box, 'g');                            % custom function: draw green box
            end
        end
    end

    saveas(figure(2), strcat('Results\T4_', num2str(k), '.jpg'));
    time = toc;
    fi = rgb2gray(image);   % current frame becomes the background for the next iteration
end



CHAPTER 8 RESULT



Object detection is observed on the test video: detected objects are outlined with green bounding boxes, and the annotated frames are saved to the Results folder.



CHAPTER 9 APPLICATIONS



Here we discuss some current and future applications in detail.

1. OPTICAL CHARACTER RECOGNITION
Optical character recognition or optical character reader, often abbreviated as OCR, is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo (for example, the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image; characters are extracted from the image or video.






Widely used as a form of information entry from printed paper data records (passport documents, invoices, bank statements, computerized receipts, business cards, mail, printouts of static data, or any suitable documentation), it is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed online, and used in machine processes such as cognitive computing, machine translation and (extracted) text-to-speech.

2. SELF-DRIVING CARS
One of the best examples of why object detection is needed is autonomous driving. In order for a car to decide what to do next (accelerate, apply the brakes or turn), it needs to know where all the objects around it are and what they are. That requires object detection, and the car is essentially trained to detect a known set of objects such as cars, pedestrians, traffic lights, road signs, bicycles and motorcycles.



3. TRACKING OBJECTS
Object detection is also used for tracking objects, for example tracking a ball during a football match, tracking the movement of a cricket bat, or tracking a person in a video. Object tracking has a variety of uses, some of which are surveillance and security, traffic monitoring, video communication, robot vision and animation.



4. FACE DETECTION AND FACE RECOGNITION
Face detection and face recognition are widely used computer vision tasks. Notice how Facebook detects your face when you upload a photo; this is a simple application of object detection that we see in our daily lives. Face detection can be regarded as a specific case of object-class detection, where the task is to find the locations and sizes of all objects in an image that belong to a given class; examples include upper torsos, pedestrians, and cars. Face detection is a computer technology, used in a variety of applications, that identifies human faces in digital images. Face recognition is a biometric technology that goes beyond recognizing when a human face is present: it attempts to establish whose face it is. There are many applications of face recognition. It is already being used to unlock phones and specific applications, and it is also used for biometric surveillance; banks, retail stores, stadiums, airports and other facilities use facial recognition to reduce crime and prevent violence.



5. SMILE DETECTION
Facial expression analysis plays a key role in analyzing emotions and human behavior. Smile detection is a special task in facial expression analysis with various potential applications such as photo selection, user experience analysis and patient monitoring.



6. PEDESTRIAN DETECTION
Pedestrian detection is an essential and significant task in any intelligent video surveillance system, as it provides fundamental information for the semantic understanding of video footage. It has an obvious extension to automotive applications due to its potential for improving safety systems.



7. BALL TRACKING IN SPORTS
The growing number of sports lovers in games like football and cricket has created a need for digging out, analyzing and presenting ever more multidimensional information to them. Different classes of people require different kinds of information, which expands the space and scale of the information required. Tracking the ball's movement is of utmost importance for extracting any information from ball-based sports video sequences, and the video frame can be recorded automatically according to the movement of the ball.



8. OBJECT RECOGNITION AS IMAGE SEARCH
By recognizing the objects in an image, combining them, and passing the detected object labels in a URL, an object detection system can be turned into an image search.






9. AUTOMATIC TARGET RECOGNITION
Automatic target recognition (ATR) is the ability of an algorithm or device to recognize targets or other objects based on data obtained from sensors. Target recognition was initially done using an audible representation of the received signal: a trained operator would decipher the sound to classify the target illuminated by the radar. While these trained operators had success, automated methods have been developed, and continue to be developed, that allow for greater accuracy and speed in classification. ATR can be used to identify man-made objects such as ground and air vehicles as well as biological targets such as animals, humans, and vegetative clutter. This can be useful for everything from recognizing an object on a battlefield to filtering out interference caused by large flocks of birds on Doppler weather radar.



CHAPTER 10 CONCLUSION AND FUTURE SCOPE



10.1 DISCUSSION AND CONCLUSION
This report presents a basic object detection system. The MATLAB platform (MATLAB 2012) is used to implement the system. Different toolboxes have been explored and useful MATLAB functions and objects collected that are applicable at various stages; the toolboxes mainly include Image Acquisition, Image Processing and Computer Vision. Sample MATLAB code is presented for object detection. Each stage of the system has been implemented with available functions/objects from these toolboxes, which shows that implementation is straightforward and the code stays short thanks to the predefined objects/functions in MATLAB. This study may help new students and researchers in this field to study, implement and experiment with established research.



10.2 FUTURE SCOPE
Some steps in the object tracking process are still mostly done manually; feature selection is one example. The accuracy of object tracking could potentially increase by developing methods for a more automatic feature selection process. We know from experience that a human tends to make more mistakes than a computer program optimized for a certain purpose. Automatic feature selection has received attention in the area of pattern recognition, where methods for this purpose are divided into filter methods and wrapper methods [48]. However, these have not received the same attention in the area of object tracking, where feature selection is still mostly done manually. There could be room for improvement in object tracking by developing fast and accurate methods for automatic feature selection. A suitable continuation of a work like this would be an accessible, comprehensible summary of the most common object tracking algorithms, as an extension to this work.



CHAPTER 11 REFERENCES

● Video Analytics: http://www.dspdesignline.com/videoanalytics.html
● https://in.mathworks.com/matlabcentral/fileexchange/54092-object-detection
● https://en.wikipedia.org/wiki/Object_detection
● Shireen Y. Elhabian, Khaled M. El-Sayed, "Moving Object Detection in Spatial Domain using Background Removal Techniques".
● Jun-Wei Hsieh, Shih-Hao Yu, Yung-Sheng Chen, "An Automatic Traffic Surveillance System for Vehicle Tracking and Classification", IEEE Transactions on Intelligent Transportation Systems, Vol.





