

ISSN No. 2278-3083
Volume 1, No. 3, July-August 2012

International Journal of Science and Applied Information Technology


Available Online at http://warse.org/pdfs/ijsait01132012.pdf

A Content Based Approach for Image Retrieval from Relevance Feedback


A.V. Senthil Kumar, Director, Department of Computer Applications, Hindusthan College of Arts & Science, Coimbatore. [email protected]
J. Sivakami, Research Scholar, Department of Computer Applications, Hindusthan College of Arts & Science, Coimbatore. [email protected]

ABSTRACT

Content-Based Image Retrieval (CBIR) is mainly based on finding images of interest in a large image database using the visual content of the images. Most earlier approaches to image retrieval were text-based, where individual images had to be annotated with textual descriptions. Existing work has analyzed the performance of a number of clustering algorithms in image retrieval. This paper proposes a new partitional clustering algorithm based on fuzzy c-means. The partitional clustering algorithm is used to improve content-based image retrieval and to compare retrieval performance.

Keywords: Clustering, Partitional algorithm, CBIR, Fuzzy C-means

1. INTRODUCTION

Knowledge discovery in databases (KDD), a relatively young and interdisciplinary field of computer science, is the process of discovering new patterns from large data sets using methods at the intersection of artificial intelligence, machine learning, statistics and database systems [1]. The goal of data mining is to extract knowledge from a data set in a human-understandable structure. Data mining is the entire process of applying computer-based methodology, including new techniques for knowledge discovery [2], to data. Databases, text documents, computer simulations and social networks are the sources of data for mining. A cluster is a collection of data points that are similar to one another within the same cluster and dissimilar to data points in other clusters [4]. Clustering is a method of unsupervised classification, where data points are grouped into clusters based on their similarity. The goal of a clustering algorithm is to maximize the intra-cluster similarity and minimize the inter-cluster similarity [3]. Partitional and hierarchical clustering are the most widely used forms of clustering. In partitional clustering, the set of n data points is partitioned into k non-empty clusters, where k ≤ n. In hierarchical clustering, the data points are organized into a hierarchical structure [6], resulting in a binary tree or dendrogram. In this paper, we propose a new clustering algorithm, which comes under the category of partitional clustering algorithms.

Two commonly used methods for partitioning data points are the k-means method and the k-medoids method. In the k-means method, each cluster is represented by its centroid, the mean of all data points in the cluster [5]. In the k-medoids method, each cluster is represented by a data point close to the centroid of the cluster. Apart from these methods, there has been a lot of work on fuzzy partitioning methods and partition methods for large-scale data sets [8].

The proposed image retrieval system requests a number of iterative feedbacks to produce refined search results in a large-scale image database, and extracts color, space and text features for pattern mining. Finally, it retrieves images automatically, and the performance of the proposed system is compared. Whereas the existing paper analyzed the performance of a number of clustering algorithms in image retrieval, this work compares that performance with that of automatic feedback.

2. RELATED WORKS

Cluster analysis or clustering is the task of assigning a set of objects to groups (called clusters) [7] so that the objects in the same cluster are more similar (in some sense or another) to each other than to those in other clusters [4]. Clustering is a main task of explorative data mining, and a common technique for statistical data analysis used in many fields,
65 @ 2012, IJSAIT All Rights Reserved

A.V.Senthil Kumar et al., International Journal of Science and Advanced Information Technology, 1 (3), July August, 65-69

including machine learning, pattern recognition, image analysis [8], information retrieval, and bioinformatics. Clustering is a data mining (machine learning) technique used to place data elements into related groups without advance knowledge of the group definitions [3]. Popular clustering techniques include k-means clustering and Expectation Maximization (EM) clustering. A clustering is essentially a set of such clusters, usually containing all objects in the data set [7]. Additionally, it may specify the relationship of the clusters to each other, for example a hierarchy of clusters embedded in each other. Clusterings can be roughly distinguished as follows:

Hard clustering: each object either belongs to a cluster or does not.
Soft clustering (also fuzzy clustering): each object belongs to each cluster to a certain degree (e.g. a likelihood of belonging to the cluster).
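The difference between hard and soft clustering can be made concrete with a small sketch. It assumes Euclidean distance and the standard fuzzy c-means membership formula with fuzzifier m = 2; the function names and the sample points are illustrative only and do not come from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hard_assignment(point, centroids):
    """Hard clustering: the point belongs to exactly one cluster (the nearest)."""
    dists = [euclidean(point, c) for c in centroids]
    return dists.index(min(dists))


def fuzzy_memberships(point, centroids, m=2.0):
    """Soft clustering: fuzzy c-means style degrees of membership (they sum to 1).

    m > 1 is the fuzzifier; m = 2.0 is an assumed, commonly used default.
    """
    dists = [euclidean(point, c) for c in centroids]
    # A point lying exactly on a centroid belongs fully to that centroid.
    if 0.0 in dists:
        on_centroid = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [v / sum(on_centroid) for v in on_centroid]
    weights = [d ** (-2.0 / (m - 1.0)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    centroids = [(0.0, 0.0), (10.0, 0.0)]
    print(hard_assignment((2.0, 0.0), centroids))    # 0
    print(fuzzy_memberships((5.0, 0.0), centroids))  # [0.5, 0.5]
```

A point midway between two centroids is forced into one cluster by the hard rule, while the fuzzy rule records that it belongs equally to both.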

Learning-enhanced feedback has been one of the most active research areas in content-based image retrieval in recent years [8]. However, few feedback-based methods are currently available to process relatively complex queries on large image databases. For complex image queries [9], the feature space and the distance function of the user's perception are usually different from those of the system. This difference leads to the representation of a query with multiple clusters (i.e., regions) in the feature space. Therefore, it is necessary to handle disjunctive queries in the feature space. In this paper, we propose a new content-based image retrieval method using adaptive classification and cluster merging to find multiple clusters of a complex image query [10].

3. EXISTING SYSTEM

Clustering is a form of unsupervised classification that aims at grouping data points based on similarity. In the existing work, the authors propose a new partitional clustering algorithm based on the notion of contribution of a data point. They apply the algorithm to content-based image retrieval and compare its performance with that of the k-means clustering algorithm. Partitional clustering aims at partitioning a group of data points into disjoint clusters optimizing a specific criterion. When the number of data points is large, a brute force enumeration of all possible combinations would be computationally expensive. Instead, heuristic methods are applied to find the optimal partitioning. The most popular criterion function used for partitional clustering is the sum of squared error function given by

\[ E = \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - m_i \rVert^2 \]

where k is the number of clusters, C_i is the i-th cluster and m_i is its centroid. A widely used squared-error based algorithm is the k-means clustering algorithm. In this paper, we propose a clustering algorithm similar to the k-means algorithm.

They define the contribution of a data point belonging to a cluster as the impact that it has on the quality of the cluster. This metric is then used to obtain an optimal set of k clusters from the given set of data points. They now outline the proposed contribution-based clustering algorithm. It optimizes two measures: the intra-cluster dispersion and the inter-cluster dispersion, the latter defined over the k clusters relative to the mean of all centroids. The algorithm tries to minimize the intra-cluster dispersion and maximize the inter-cluster dispersion. We use the notion of contribution of a data point for partitional clustering. The resultant algorithm requires only three passes, and we show that the time complexity of each pass is the same as that of a single iteration of the k-means clustering algorithm. While the k-means algorithm optimizes only the intra-cluster similarity, our algorithm also optimizes the inter-cluster similarity.

Clustering has widespread applications in image processing. Color-based clustering techniques have proved useful in image segmentation [9]. The k-means algorithm is quite popular for this purpose. Clustering based on the visual content of images has been researched extensively for several years and finds application in image retrieval. Content-Based Image Retrieval (CBIR) aims at finding images of interest from a large image database using the visual content of the images. Traditional approaches to image retrieval were text-based, where individual images had to be annotated with textual descriptions. Since this is a tedious manual task, image retrieval based on visual content is essential [10].

At each feedback, the results are presented to the user and the related browsing information is stored in the log database. After long-term user browsing behaviors have accumulated, an offline knowledge discovery operation is triggered to perform navigation pattern mining and pattern indexing. The log database maintains separate details and feedback for each logged user, and it is used to recover images at retrieval. These logs keep records of database changes. If a database needs to be restored to a point beyond the last full, offline backup, logs are required to roll the data forward to the point of failure.


Distance measures based on color similarity are computed from a color histogram for each image, which identifies the proportion of pixels within the image holding specific values (that humans express as colors). Current research is attempting to segment color proportion by region and by spatial relationship among several color regions. Examining images based on the colors they contain is one of the most widely used techniques because it does not depend on image size or orientation. Color searches usually involve comparing color histograms, though this is not the only technique in practice.

4. PROPOSED SYSTEM

The proposed image retrieval system requests a number of iterative feedbacks to produce refined search results in a large-scale image database, and extracts color, space and text features for pattern mining. With relevance feedback (RF), high-quality image retrieval can be achieved in a small number of feedbacks. The approach aims to resolve the problems of current RF, such as redundant browsing and exploration convergence. The approximated solution takes advantage of exploited knowledge (navigation patterns) to assist the proposed search strategy in efficiently hunting the desired images. All positive images are considered for navigation pattern mining, which focuses on the discovery of relations among the users' browsing behaviors on RF. The frequent patterns mined from the user logs are regarded as useful browsing paths to optimize the search direction on RF. Navigation pattern mining extracts patterns from the user logs in the log database and indexes them. Finally, the system retrieves images automatically, and the performance of the proposed system is compared. Whereas the concept of contribution has been used to find the optimal cluster number, we use it in a different manner, for optimal partitioning of the data points into a fixed number of clusters.
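The color feature used by the retrieval system, and the histogram comparison described at the end of the previous section, can be illustrated with a minimal sketch. It assumes an image is given as a list of 8-bit (r, g, b) tuples, quantizes each channel into a small number of bins, and compares two histograms by histogram intersection. The function names, the bin count and the choice of histogram intersection are assumptions made for illustration; the paper does not specify them.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Return {color bin: proportion of pixels} for a list of (r, g, b) tuples.

    Each 8-bit channel value is mapped to one of `bins_per_channel` bins.
    Using proportions rather than counts makes the result independent of
    image size.
    """
    if not pixels:
        raise ValueError("image has no pixels")
    counts = Counter(
        tuple(channel * bins_per_channel // 256 for channel in pixel)
        for pixel in pixels
    )
    return {bin_: n / len(pixels) for bin_, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1.0 means identical color proportions."""
    return sum(min(p, h2.get(bin_, 0.0)) for bin_, p in h1.items())


if __name__ == "__main__":
    red_image = [(255, 0, 0)] * 4
    mostly_red = [(250, 10, 5)] * 3 + [(0, 0, 255)]
    print(histogram_intersection(color_histogram(red_image),
                                 color_histogram(mostly_red)))  # 0.75
```

Because the histogram ignores where a pixel sits in the image, rotating or resizing an image leaves it unchanged, which is the property the text attributes to color-based search.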

5. EXPERIMENTAL RESULTS

The images were clustered using our algorithm with the initial centroids chosen at random. The cluster whose centroid was closest in distance to the given test image was determined, and the images belonging to that cluster were retrieved. The results were then compared with images retrieved using the k-means clustering algorithm with the same set of initial centroids. In Figure 1, the basic partitional clustering algorithm for content-based image retrieval is compared with the proposed fuzzy partitional clustering algorithm for content-based image retrieval. As the number of clusters increases, the average precision increases. The average precision rate varies in intervals of 0.1, and the number of clusters is increased in steps of 1. The fuzzy partitional clustering algorithm gives better results than the existing clustering algorithm.
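The retrieval step described above (find the cluster whose centroid is nearest to the test image, return its members, then score the result) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: feature vectors are plain tuples, distance is Euclidean, cluster assignments are already available, and the names `nearest_centroid`, `retrieve` and `precision` are hypothetical.

```python
import math


def nearest_centroid(query, centroids):
    """Index of the centroid closest to the query feature vector."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(query, centroids[i]))


def retrieve(query, assignments, centroids):
    """Return the images assigned to the cluster nearest to the query.

    `assignments` maps an image name to the index of its cluster.
    """
    target = nearest_centroid(query, centroids)
    return [name for name, cluster in assignments.items() if cluster == target]


def precision(retrieved, relevant):
    """Fraction of the retrieved images that are relevant to the query."""
    if not retrieved:
        return 0.0
    return sum(1 for name in retrieved if name in relevant) / len(retrieved)


if __name__ == "__main__":
    centroids = [(0.0, 0.0), (10.0, 10.0)]
    assignments = {"a": 0, "b": 0, "c": 1, "d": 1}
    found = retrieve((1.0, 1.0), assignments, centroids)
    print(found)                         # ['a', 'b']
    print(precision(found, {"a", "c"}))  # 0.5
```

Averaging `precision` over a set of test images, for each number of clusters, would give the kind of average-precision curve the section compares.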


Figure 1: Comparison of K-means, FCM and Proposed method

In Figure 2, the two parameters are the method and the retrieval accuracy. The X axis represents the method and the Y axis denotes the retrieval accuracy.

Figure 2: Average precision and Number of Clusters

S.No   Method            Retrieval Accuracy %
1      K-means           80-85%
2      Fuzzy C-means     85-90%
3      Proposed system   90-95%

In our paper, retrieval accuracy is used as the parameter to measure the level of accuracy achieved in image retrieval. The retrieval of images is compared across three algorithms: K-means, Fuzzy c-means (FCM) and the proposed method. The graph shows that FCM has better retrieval accuracy than K-means, but the proposed system has a higher level of retrieval accuracy than the other two methods. Overall, the image retrieval accuracy is highest in the proposed system.

6. CONCLUSION

In this paper, the basic partitional clustering algorithm for content-based image retrieval is compared with the proposed fuzzy partitional clustering algorithm for content-based image retrieval. When the number of clusters increases, the cluster accuracy increases. The main feature of RF is to efficiently optimize the retrieval quality of interactive CBIR. On the one hand, the patterns derived from the users' long-term browsing behaviors are used as a good support for minimizing the number of user feedbacks. On the other hand, the proposed algorithm RF Search performs a pattern-based search to match the user's intention by merging three query refinement strategies. The proposed image retrieval system requests a number of iterative feedbacks to produce refined search results in a large-scale image database, and extracts color, space and text features for pattern mining. High-quality image retrieval on RF can be achieved in a small number of feedbacks. The approach aims to resolve the problems of current RF, such as redundant browsing and exploration convergence. The approximated solution takes advantage of exploited knowledge (navigation patterns) to assist the proposed search strategy in efficiently hunting the desired images. In the future, we will integrate users' profiles into NPRF to further increase the retrieval quality. We will also apply the NPRF approach to more kinds of applications in multimedia retrieval or multimedia recommendation. We present a set of experiments to evaluate the performance of the proposed approach against the existing one. The results also show that this enhanced approach performs better than conventional techniques.

7. REFERENCES

1. M.D. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Steele, and P. Yanker. Query by Image and Video Content: The QBIC System, Computer, Vol. 28, No. 9, pp. 23-32, Sept. 1995.
2. R. Fagin. Combining Fuzzy Information from Multiple Systems, Proc. Symp. Principles of Database Systems (PODS), pp. 216-226, June 1996.
3. R. Fagin. Fuzzy Queries in Multimedia Database Systems, Proc. Symp. Principles of Database Systems (PODS), pp. 1-10, June 1998.
4. J. French and X-Y. Jin. An Empirical Investigation of the Scalability of a Multiple Viewpoint CBIR System, Proc. Int'l Conf. Image and Video Retrieval (CIVR), pp. 252-260, July 2004.
5. D. Harman. Relevance Feedback Revisited, Proc. 15th Ann. Int'l ACM SIGIR Conf. Research and Development in Information Retrieval, pp. 1-10, 1992.
6. Y. Ishikawa, R. Subramanya, and C. Faloutsos. MindReader: Querying Databases through Multiple Examples, Proc. 24th Int'l Conf. Very Large Data Bases (VLDB), pp. 218-227, 1998.
7. Jin and J.C. French. Improving Image Retrieval Effectiveness via Multiple Queries, Multimedia Tools and Applications, Vol. 26, pp. 221-245, June 2005.
8. D.H. Kim and C.W. Chung. Qcluster: Relevance Feedback Using Adaptive Clustering for Content-Based Image Retrieval, Proc. ACM SIGMOD, pp. 599-610, 2003.
9. K. Porkaew, K. Chakrabarti, and S. Mehrotra. Query Refinement for Multimedia Similarity Retrieval in MARS, Proc. ACM Int'l Multimedia Conf. (ACMMM), pp. 235-238, 1999.
10. J. Liu, Z. Li, M. Li, H. Lu, and S. Ma. Human Behaviour Consistent Relevance Feedback Model for Image Retrieval, Proc. 15th Int'l Conf. Multimedia, pp. 269-272, Sept. 2007.
