Monday, November 5, 2007

Week 6: Enterprise Integration Technologies

The main idea behind the Semantic Web is to develop technologies and applications that enable machines to understand information semantically and perform
the desired tasks.
Introduction
“To date, the Web has developed most rapidly as a medium of documents for people rather than for data and information that can be processed automatically. The Semantic Web aims to make up for this.”
Present-day search engines such as Google and Yahoo! return a large number of hits, but many of them are irrelevant to the search query. The problem is that these engines rely mostly on statistical methods, such as the frequency of occurrence and co-occurrence of words, which match text without regard to its meaning. Although some search engines, including Google and Yahoo!, also use human-edited entries, they still come up with a large number of wrong hits. This is what led people to start talking about making the Web more meaningful. In other words, metadata should be attached to the content of the Web so that it becomes easier to retrieve.
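To illustrate why purely statistical matching can return irrelevant hits, here is a minimal sketch of ranking by word frequency. The documents, the query and the scoring function are all made up for this example; real search engines are far more elaborate.

```python
from collections import Counter


def term_frequency_score(query, document):
    """Score a document by how often the query words occur in it."""
    words = Counter(document.lower().split())
    return sum(words[term] for term in query.lower().split())


# Hypothetical documents: one about the animal, one about the car brand.
documents = {
    "zoo": "the jaguar is a big cat known for its speed",
    "cars": "the jaguar dealer lists jaguar sedan speed and jaguar coupe speed",
}

# The user is thinking of the animal, but the score only counts words.
query = "jaguar speed"
ranking = sorted(
    documents,
    key=lambda name: term_frequency_score(query, documents[name]),
    reverse=True,
)
print(ranking)
```

The car page comes first simply because the query words occur in it more often; nothing in the score reflects what either page is about.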
The concept of the Semantic Web was introduced by Tim Berners-Lee, the inventor of the
World Wide Web (WWW) and developer of HTML, the Hypertext Transfer Protocol (HTTP)
and Uniform Resource Identifiers (URIs). His vision of the Semantic Web is that in the
future we will have intelligent software agents that analyze a given situation and present
us with the best possible alternatives. In other words, the connectivity that is found today
only on PCs through the Web will become a part of our daily life.
The idea behind the Semantic Web is to develop technologies that make information more meaningful for machines to process, which in turn makes search and retrieval of information more effective for humans. For instance, available Web technologies include parsers that can validate Web documents by checking for syntactical errors. But as of now, computers are unable to understand the semantics underlying the documents. For example, a computer cannot tell whether a particular Web page is the homepage of an institute or of an individual, or that a hyperlink leads to the resume of a person.
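As a rough sketch of what machine-readable meaning could look like, the statements below describe pages as (subject, predicate, object) triples, the shape that RDF uses. Every URI, the `EX` vocabulary and the helper functions are hypothetical names chosen for this example.

```python
# Hypothetical vocabulary prefix; none of these URIs refer to real resources.
EX = "http://example.org/terms#"

statements = [
    ("http://example.org/institute/", EX + "isHomepageOf", EX + "ExampleInstitute"),
    (EX + "ExampleInstitute", EX + "type", EX + "Institute"),
    ("http://example.org/people/jane/", EX + "isHomepageOf", EX + "Jane"),
    (EX + "Jane", EX + "type", EX + "Person"),
    ("http://example.org/people/jane/cv.html", EX + "isResumeOf", EX + "Jane"),
]


def objects(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in statements if s == subject and p == predicate]


def homepage_kind(page):
    """Return the type(s) of whatever the page is the homepage of."""
    kinds = []
    for owner in objects(page, EX + "isHomepageOf"):
        kinds.extend(objects(owner, EX + "type"))
    return kinds


print(homepage_kind("http://example.org/institute/"))
print(homepage_kind("http://example.org/people/jane/"))
```

With such statements attached, a program can distinguish an institute's homepage from an individual's without interpreting the page text.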
The World Wide Web Consortium (W3C) gives the following two definitions for the
Semantic Web:
"The Semantic Web is the representation of data on the World Wide Web. It is a
collaborative effort led by W3C with participation from a large number of researchers
and industrial partners. It is based on the Resource Description Framework (RDF),
which integrates a variety of applications using XML for syntax and URIs (Uniform
Resource Identifiers) for naming."
Also, "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation."
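The phrase "XML for syntax and URIs for naming" can be illustrated with a small RDF/XML fragment read by Python's standard XML parser. This is only a sketch: the described page is a made-up example.org URL, `extract_triples` is a hypothetical helper, and it handles just the simple `rdf:Description` form with literal values, not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# XML supplies the syntax; every name (subject and properties) is a URI.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/institute/index.html">
    <dc:title>Example Institute Homepage</dc:title>
    <dc:creator>Example Institute</dc:creator>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Turn each property of each rdf:Description into a triple."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree writes qualified names as "{namespace}local";
            # joining the two parts gives the property's full URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            triples.append((subject, predicate, prop.text))
    return triples


for triple in extract_triples(RDF_XML):
    print(triple)
```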
The conception of the Semantic Web is characterized by the development of languages, tools, etc.
that enable machines to process information semantically. Another very important
aspect of the Semantic Web is the development of standards and protocols, as there is hardly any consensus among the people working on projects about what the future Semantic Web will be.

Underlying Technologies of Semantic Web
"The principal technologies of the Semantic Web fit into a set of layered specifications.
The current components of that framework are the RDF Core Model, the RDF Schema
language and the Web Ontology language. These languages all build on the foundation
of URIs, XML, and XML namespaces."
Most of the technologies involved in the development of the Semantic Web are still in
their infancy. Some that are already in use are URIs (to identify documents
uniquely and globally), XML (to semantically structure the data), RDF (to base the
structures of the documents on a common model) and ontologies (to define the
objects/entities and the interrelations between them).
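The role of an ontology, defining entities and the relations allowed between them, can be sketched with a toy class hierarchy and one property. All class, property and instance names here are invented for illustration. Note also that RDF Schema treats a property's domain and range as a basis for inference rather than as constraints, so the consistency check below is a deliberate simplification.

```python
# Toy ontology (hypothetical names): a subclass hierarchy and one property.
SUBCLASS_OF = {
    "Professor": "Person",
    "Student": "Person",
    "Institute": "Organization",
}
PROPERTIES = {"worksFor": ("Person", "Organization")}  # property -> (domain, range)
INSTANCE_OF = {
    "jane": "Professor",
    "example_institute": "Institute",
    "cv.html": "Document",
}


def is_a(cls, ancestor):
    """Walk up the subclass chain from cls looking for ancestor."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def conforms(subject, prop, obj):
    """Check that a statement respects the property's domain and range."""
    domain, range_ = PROPERTIES[prop]
    return is_a(INSTANCE_OF[subject], domain) and is_a(INSTANCE_OF[obj], range_)


print(conforms("jane", "worksFor", "example_institute"))
print(conforms("jane", "worksFor", "cv.html"))
```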

Conclusion
Tim Berners-Lee visualized the Semantic Web as a layered structure with Unicode (for
encoding characters) and URIs (for identifying resources) at its foundation. The next layer
consists of XML, together with namespaces and XML Schema, used to describe resources.
Above XML comes the RDF layer: RDF is used to harmonize the different descriptions
of Web resources.
An ontology defines the concepts and the relationships between them. On
the ontology layer sits the logic layer, which works at a more abstract level: the assertions
made on the Web can be used to derive new knowledge. The most important aspect of his
visualization is that the topmost levels are those of proof and trust. It is essential to
establish the validity and reliability of resources accessible over the Web. This can be
achieved with digital signatures, which can establish the origin of a document
and thus establish trust in a given resource on the Web.
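The idea that assertions can be used to derive new knowledge can be sketched with two simple rules applied repeatedly to a set of triples. The facts and rule set are assumptions made for this example, and the plain predicate names stand in for full URIs; real reasoners support far richer logic.

```python
def derive(facts):
    """Apply two rules until no new fact appears (simple forward chaining)."""
    facts = set(facts)
    while True:
        new = set()
        for (a, p1, b) in facts:
            for (c, p2, d) in facts:
                if b != c:
                    continue
                # Rule 1: subClassOf is transitive.
                if p1 == "subClassOf" and p2 == "subClassOf":
                    new.add((a, "subClassOf", d))
                # Rule 2: an instance of a class is also an instance
                # of that class's superclass.
                if p1 == "type" and p2 == "subClassOf":
                    new.add((a, "type", d))
        if new <= facts:
            return facts
        facts |= new


asserted = {
    ("Professor", "subClassOf", "Employee"),
    ("Employee", "subClassOf", "Person"),
    ("jane", "type", "Professor"),
}

# Only the facts that were not stated explicitly.
derived = derive(asserted) - asserted
for fact in sorted(derived):
    print(fact)
```

Nobody asserted that jane is a Person; that statement follows from the three asserted facts.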

References
1. Berners-Lee, Tim, et al. The Semantic Web. In Scientific American, May 2001. http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21
2. W3C Semantic Web. http://www.w3.org/2001/sw/
3. Semantic Web Activity Statement. http://www.w3.org/2001/sw/Activity
4. XSL Transformations (XSLT) Version 1.0. W3C Recommendation, 16 November 1999. http://www.w3.org/TR/xslt
5. Interactive Glossary of Internet Terms. www.walthowe.com/glossary/r.html
6. Resource Description Framework (RDF) Model and Syntax Specification. W3C Recommendation, 22 February 1999. http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/#intro
7. Namespaces in XML. World Wide Web Consortium, 14 January 1999. http://www.w3.org/TR/REC-xml-names/#sec-intro
8. Gruber, T. R. A Translation Approach to Portable Ontology Specifications. http://gicl.mcs.drexel.edu/people/regli/Classes/KBA/Readings/KSL-92-71.pdf
9. Berners-Lee, Tim. Semantic Web - XML2000. http://www.w3.org/2000/Talks/1206-xml2k-tbl/slide10-0.html
