Tag Cleaning

Ontology of Folksonomy -- Am I being too ambitious?

Abstract
Online content has grown huge and is scattered all over the internet. Much of it is neither classified nor linked properly, and is therefore hard to find. Current search engines search on the basis of keywords, and only a few keep context in mind. The Semantic Web steps in to solve the problem by classifying data according to sets of rules called ontologies and by establishing relations between data. Ontologies are explicit in nature, and some exist for fields such as research, medicine and journals, but the approach is unusable for an entity as vast as the World Wide Web, because it is not humanly possible to create an ontology of everything manually. This makes it difficult for the Semantic Web to enter the mainstream web. It therefore becomes important to have an engine which can classify incoming data with the help of a seed ontology and its own metadata.
Folksonomy is a new phenomenon in Web 2.0, where people have started labelling their content with metadata, usually called tags. This brings in a human element and so gives contextual data a chance to come into the picture. My idea is to read these tags and build an ontology/taxonomy on the basis of a seed ontology/taxonomy. The classification may not be perfect, but it will be automated and hence usable.
Drupal is an open source CMS which supports the concepts of taxonomy and vocabulary in its framework, and I plan to leverage and extend these to build a classifier. This engine will take data from various forums, community bulletins and portals like YouTube and Flickr, try to classify the data into an existing vocabulary, and establish relations between the data. Using the same technique I can also do tag cleaning, which simply removes data that doesn't fit in the class.
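To make the idea a bit more concrete, here is a rough sketch in Python (not Drupal code) of how classification against a seed vocabulary and tag cleaning could look. Everything in it is an assumption for illustration: the `SEED_VOCABULARY` data, the function names `classify` and `clean_tags`, and the choice of simple tag overlap as the matching rule. It also reads "tag cleaning" as dropping the tags that do not belong to the assigned class, which is only one possible interpretation.

```python
# Hypothetical seed vocabulary: term -> tags already known to belong to it.
# The terms and tags are made-up illustration data, not a real ontology.
SEED_VOCABULARY = {
    "photography": {"photo", "camera", "lens", "portrait"},
    "music": {"song", "guitar", "concert", "album"},
}


def normalize(tag):
    """Lower-case a tag and trim surrounding whitespace."""
    return tag.strip().lower()


def classify(tags, vocabulary):
    """Return the term whose known tags overlap most with the item's tags.

    Returns None when no term shares any tag with the item.
    Ties are resolved in favour of the alphabetically first term.
    """
    cleaned = {normalize(t) for t in tags if t.strip()}
    best_term, best_overlap = None, 0
    for term in sorted(vocabulary):
        overlap = len(cleaned & vocabulary[term])
        if overlap > best_overlap:
            best_term, best_overlap = term, overlap
    return best_term


def clean_tags(tags, term, vocabulary):
    """Keep only the tags that fit the assigned term (assumed meaning of tag cleaning)."""
    if term is None:
        return []
    known = vocabulary[term]
    return [t for t in map(normalize, tags) if t in known]


# Usage: one incoming item with noisy, user-supplied tags.
item_tags = ["Camera", "lens ", "holiday", "Guitar"]
term = classify(item_tags, SEED_VOCABULARY)
kept = clean_tags(item_tags, term, SEED_VOCABULARY)
```

With this sample data the item lands in "photography" and only "camera" and "lens" survive cleaning. A real engine would need something smarter than exact tag overlap (synonyms, misspellings, tags that belong to several terms), but the shape of the problem stays the same.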

LET'S DO IT!!
