Rediff Guide To The Net | Features | May 16, 2002

Let's talk your language

   Avina Lobo


You're familiar with screensavers that are more than just pretty faces. Distributed computing projects like SETI@Home and United Devices harness the idle computing power of millions of computers to perform complex number crunching in search of extraterrestrial life or a cure for cancer.

Now imagine that you are surfing during a break from work when up pops a window asking you to translate a word or a phrase from Kannada to German or Swahili.

This is 'distributed human computing', part of an open source initiative inspired by SETI@Home, called the World Wide Lexicon Project. It aims to link dictionaries, encyclopedias, translation servers, semantic networks and, most importantly, people throughout the Web into a huge multilingual dictionary service.

The objective: To make language services easily accessible to a wide range of Internet applications, thus helping people communicate more effectively in other languages and reducing communication barriers among cultures.

What's it all about?

The Web has thousands of different dictionaries, encyclopedias and translation servers. While all of them do the same thing, they each have a different front end. "WWL creates a simple mechanism for finding and communicating with them via a standard interface," says Brian McConnell, the chief designer on the WWL project, in an email interview with Rediff Guide to the Net.

"This creates the appearance of a single worldwide dictionary when, in fact, the user is talking to many different servers scattered around the world," he explains.

Rather like Gnutella, it creates a worldwide network of dictionaries without reinventing the wheel.

Sounds complicated? It isn't.

WWL does this through a simple protocol based on the SOAP interface. Just three generic lines of code allow you to locate WWL servers.

So you want a word translated into Polish? No problem! WWLP, or World Wide Lexicon Protocol, scours the various dictionaries and translation servers for words, synonyms, etc.
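
The idea of many servers appearing as one dictionary can be illustrated with a small sketch. It does not use SOAP or the actual WWLP calls, which this article does not spell out. The in-memory servers, the names `make_server` and `federated_lookup`, and the sample Polish entries are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a remote dictionary server: a function that
# takes (word, source language, target language) and returns translations.
DictionaryServer = Callable[[str, str, str], List[str]]


def make_server(entries: Dict[Tuple[str, str, str], List[str]]) -> DictionaryServer:
    """Build a toy in-memory server from a table of entries (assumption)."""
    def query(word: str, src: str, dst: str) -> List[str]:
        return entries.get((word.lower(), src, dst), [])
    return query


def federated_lookup(word: str, src: str, dst: str,
                     servers: List[DictionaryServer]) -> List[str]:
    """Ask every server and merge the answers, dropping duplicates
    while keeping first-seen order."""
    merged: List[str] = []
    for server in servers:
        for translation in server(word, src, dst):
            if translation not in merged:
                merged.append(translation)
    return merged


# Two toy servers with overlapping, made-up entries.
server_a = make_server({("house", "en", "pl"): ["dom"]})
server_b = make_server({("house", "en", "pl"): ["dom", "domek"]})

result = federated_lookup("House", "en", "pl", [server_a, server_b])
print(result)
```

The caller sees one merged list and never needs to know how many servers answered, which is the "single worldwide dictionary" effect McConnell describes.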

The second component is a project called GNUtrans. While SETI@Home taps the idle CPUs of millions of personal computers, the WWL enlists the help of Internet users or volunteers who are logged on, but not busy.

Volunteers will have to download a Lexicon@Home client, the programme that will prompt users to contribute to the system. It monitors keyboard and mouse activity to determine whether a user is busy or not.

"The Lexicon@Home client will normally be invisible, except when it is requesting the user to do work. The user will see a small dialog box or pop-up window asking if he or she can do some work," says McConnell. Users will be able to set preferences for when and how often they are prompted, and to choose which WWL server they want to contribute to.
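
As a rough illustration of the behaviour described above, the sketch below decides when a client might prompt a user: only after a period with no keyboard or mouse activity, and no more often than the user allows. The `IdleMonitor` class, its parameter names and the thresholds are hypothetical and are not taken from the Lexicon@Home client. Timestamps are passed in explicitly so the logic can be checked without real input events.

```python
from typing import Optional


class IdleMonitor:
    """Toy model of an idle-aware prompt scheduler (hypothetical)."""

    def __init__(self, idle_after: float = 120.0, min_gap: float = 1800.0) -> None:
        self.idle_after = idle_after  # seconds of inactivity that count as idle
        self.min_gap = min_gap        # user preference: minimum seconds between prompts
        self.last_activity = 0.0
        self.last_prompt: Optional[float] = None

    def record_activity(self, now: float) -> None:
        """Call whenever a key press or mouse movement is observed."""
        self.last_activity = now

    def should_prompt(self, now: float) -> bool:
        """Return True, and remember the prompt time, if the user is idle
        and has not been prompted too recently."""
        idle = now - self.last_activity >= self.idle_after
        rested = self.last_prompt is None or now - self.last_prompt >= self.min_gap
        if idle and rested:
            self.last_prompt = now
            return True
        return False


monitor = IdleMonitor(idle_after=60, min_gap=300)
monitor.record_activity(0)
first = monitor.should_prompt(30)    # user was active 30 seconds ago
second = monitor.should_prompt(90)   # idle for 90 seconds, never prompted
third = monitor.should_prompt(200)   # still idle, but prompted 110 seconds ago
print(first, second, third)
```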

McConnell says that GNUtrans will work by "dividing Web documents into a large number of small texts (a few sentences per text) and by asking Lexicon@Home users to translate them to various languages. The system will also ask users to score the translations provided by others".
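
The splitting step McConnell describes can be sketched in a few lines. This is only an approximation under stated assumptions: the function name `split_into_texts`, the default of three sentences per text and the naive punctuation-based sentence splitter are illustrative choices, not part of GNUtrans.

```python
import re
from typing import List


def split_into_texts(document: str, sentences_per_text: int = 3) -> List[str]:
    """Split a document into small texts of a few sentences each."""
    # Naive sentence splitter: break after '.', '!' or '?' followed by
    # whitespace. Real text (abbreviations, other scripts) needs more care.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_text])
        for i in range(0, len(sentences), sentences_per_text)
    ]


sample = "One. Two! Three? Four. Five."
chunks = split_into_texts(sample, sentences_per_text=2)
print(chunks)
```

Each resulting text is small enough to hand to a single volunteer, and the pieces can later be reassembled in order.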

He adds: "WWL simplifies the use of Internet dictionary and translation services and can be incorporated into a wide range of applications, including browser and text editor plug-ins, chat clients, email editors and more." Thus, there is a possibility of integration with a multilingual chat client to help provide real-time translation for other users.

Currently, McConnell and his team are promoting the project through word of mouth and recruiting software developers to incorporate WWL into widely used client applications such as instant messaging software. "These applications are used by millions of people worldwide. As more of them support WWL, the reach of the system will grow," says McConnell.

Interesting as the project is, there is some skepticism from users, going by a discussion on Slashdot. Quality control issues are at the top of their minds.

McConnell says that there are several mechanisms that can filter submissions from the public. Most WWL servers that allow public input will use some combination of editorial oversight and randomised peer review, in which randomly chosen Lexicon@Home users are asked to score or revise recent submissions.
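
Randomised peer review of this kind might look something like the sketch below. The function names, the 1-to-5 score scale and the acceptance threshold are assumptions made for illustration; the article does not say how WWL servers actually score or accept submissions.

```python
import random
from statistics import mean
from typing import List, Sequence


def pick_reviewers(users: Sequence[str], author: str, k: int,
                   rng: random.Random) -> List[str]:
    """Choose up to k distinct reviewers at random, never the author."""
    candidates = [u for u in users if u != author]
    return rng.sample(candidates, min(k, len(candidates)))


def accept(scores: Sequence[float], threshold: float = 3.5) -> bool:
    """Accept a submission when its average score reaches the threshold.
    A submission with no scores yet is not accepted."""
    return bool(scores) and mean(scores) >= threshold


rng = random.Random(42)  # seeded only so that repeated runs match
users = ["asha", "bola", "chen", "dmitri", "elif"]
reviewers = pick_reviewers(users, author="asha", k=3, rng=rng)
print(reviewers)
print(accept([4, 5, 3]), accept([2, 3, 3]))
```

Picking reviewers at random makes it harder for a contributor to arrange friendly scores, which is the point of combining this with editorial oversight.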

McConnell feels that the system will be most useful to people who speak less common languages because these are not served well by existing dictionary and translation tools. It will also cover slang terms.

The specification for the World Wide Lexicon Protocol has been published so that developers can start building client and server applications based on it. McConnell hopes to see WWL applications running this month.

As far as volunteers are concerned, McConnell says, "We don't know exactly how many people will participate in Lexicon@Home. However, SETI@Home proved that you could recruit many (they have over three million people). I will be happy if we attract 100,000 users worldwide by next year."

He adds: "The incentive for volunteers to participate in this is the knowledge that they are contributing to a public resource that will enable people worldwide to communicate across language barriers."

The Rosetta Project

An interesting attempt at linguistic documentation is the Rosetta Project, which is bringing together contributors and language specialists from around the world to create a unique archive of over 1,000 languages, including ones as unusual as Aleut or Koryak. It doesn't have just texts, but also descriptions, analytic materials and audio files.

This archive will be hosted online and will also be available as a micro-etched nickel disk with a 2,000-year life expectancy and as a reference book, in an attempt to create a new age Rosetta Stone to document existing languages.

The goal of this project is to create a platform for comparative linguistic research and education as well as a functional linguistic tool that might help in the recovery of lost languages in future.

© 2002 rediff.com India Limited. All Rights Reserved.