Machine Translation
=================================================================
INTRODUCTION
A sub-field of computational linguistics that uses software to translate text or speech from one natural language to another.
MT output still cannot match a human translator, especially on casual language.
HISTORY
Georgetown-IBM Experiment (1954)
Translated Russian sentences into English in the organic chemistry domain, using
6 grammar rules and 250 vocabulary items.
MT PROCESS (simplified)
decode the meaning of the source text -> re-encode that meaning in the target language
APPROACHES
1. Rule based
a. transfer based - uses an intermediate form that is language dependent
Build an internal representation of the source language, transfer it
into the target language's internal representation, then generate
the translation from that.
--Superficial transfer (or syntactic). This level is characterised by transferring "syntactic structures" between the source and target languages. It is suitable for languages in the same family or of the same type, for example in the Romance languages between Spanish, Catalan, French, Italian, etc.
--Deep transfer (or semantic). This level constructs a semantic representation that is dependent on the source language. This representation can consist of a series of structures which represent the meaning. In these transfer systems predicates are typically produced. The translation also typically requires structural transfer. This level is used to translate between more distantly related languages, or languages which have no genetic relationship at all (e.g. Spanish-English or Spanish-Basque, etc.)
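b. interlingual - uses an intermediate form that is language independent
One "new language" (the interlingua) serves as the intermediary between
source language and target language.
--adv: economical, since each language needs only one analyser into
the interlingua and one generator out of it, rather than a separate
module for every language pair.
--disadv: it is hard, perhaps impossible, to design an interlingua
that covers many languages, and hard to map the source language
into that intermediate representation.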
c. dictionary based
Translate word by word using a dictionary, with or without
morphological analysis or lemmatisation.
--adv: helps with simple translation tasks, e.g. a list of products,
and can speed up manual translation.
--disadv: the least sophisticated approach; poor results on sentences
or paragraphs.
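A minimal sketch of the word-by-word idea, assuming a toy dictionary. The entries, the function names and the plural-stripping "lemmatiser" are all made up for illustration and are not taken from any real system.

```python
# Toy bilingual dictionary (hypothetical entries, for illustration only).
TOY_DICT = {
    "red": "akai",
    "umbrella": "kasa",
    "small": "chiisai",
    "camera": "kamera",
}


def naive_lemma(word):
    """Hypothetical stand-in for real lemmatisation: strip a plural 's'."""
    if len(word) > 1 and word.endswith("s"):
        return word[:-1]
    return word


def translate_word_by_word(sentence, dictionary, lemmatise=False):
    """Look up each word on its own; unknown words pass through unchanged."""
    out = []
    for word in sentence.lower().split():
        target = dictionary.get(word)
        if target is None and lemmatise:
            # Second chance: look up the (naively) lemmatised form.
            target = dictionary.get(naive_lemma(word))
        out.append(target if target is not None else word)
    return " ".join(out)


print(translate_word_by_word("red umbrella", TOY_DICT))
print(translate_word_by_word("small cameras", TOY_DICT, lemmatise=True))
```

Each word is looked up in isolation, so word order and context are ignored. That is why the approach can work for product lists but breaks down on full sentences.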
2. Statistical Based
generate translations using statistical methods based on
bilingual text corpora.
3. Example Based
Characterised by its use of a bilingual corpus as its main
knowledge base at run-time. Translation is done by analogy, and can be
viewed as an implementation of case-based reasoning from machine learning.
Example pairs in the corpus:
1. How much is that X ? corresponds to Ano X wa ikura desu ka.
2. red umbrella corresponds to akai kasa
3. small camera corresponds to chiisai kamera
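Translation by analogy over these three pairs can be sketched as below. This is only an illustration: the names TEMPLATES, FRAGMENTS and translate_by_analogy are assumptions, and a real example-based system retrieves and recombines examples from a large corpus rather than from a hand-written table.

```python
# Example base built from the three pairs above; X marks the variable slot.
TEMPLATES = [("How much is that X ?", "Ano X wa ikura desu ka.")]
FRAGMENTS = {
    "red umbrella": "akai kasa",
    "small camera": "chiisai kamera",
}


def translate_by_analogy(sentence):
    """Match a stored template, then fill its slot with a stored fragment."""
    for source_template, target_template in TEMPLATES:
        prefix, suffix = source_template.split("X")
        if sentence.startswith(prefix) and sentence.endswith(suffix):
            slot = sentence[len(prefix):len(sentence) - len(suffix)]
            if slot in FRAGMENTS:
                return target_template.replace("X", FRAGMENTS[slot])
    return None  # no stored example is close enough


print(translate_by_analogy("How much is that red umbrella ?"))
# Ano akai kasa wa ikura desu ka.
```

The sentence "How much is that red umbrella ?" never appears in the corpus as a whole; it is assembled from example 1 plus example 2.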
MAJOR ISSUES
- Word sense disambiguation (WSD)
- Named entity recognition (NER)
APPLICATIONS
- Systems from the SYSTRAN company, used by AltaVista and Google
- Toggletext (Kataku -> transfer-based system)
- Google left SYSTRAN and moved to statistical MT
- US funding for MT research on Arabic, Pashto, and Dari
EVALUATION - on the output, not the performance/usability
1. Round-trip translation
Translate into the target language, then translate the result back
into the source language. A poor predictor of quality because it tests
2 systems (forward and backward) instead of 1. Not
appropriate for serious study of MT output.
2. Human evaluation
Done in the ALPAC and ARPA studies, using trained linguists or native speakers as judges.
3. Automatic evaluation
1. BLEU
BLEU uses a modified form of precision to compare a candidate
translation against multiple reference translations. The metric
modifies simple precision since machine translation systems
have been known to generate more words than appear in a
reference text.
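The "modified precision" can be sketched as follows: each candidate n-gram count is clipped to the largest count of that n-gram in any single reference. The function names are illustrative, and this is only the precision part. Full BLEU also combines several n-gram orders and applies a brevity penalty.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n=1):
    """Clipped n-gram precision of a candidate against several references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For every n-gram, the highest count seen in any one reference.
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


candidate = "the the the the the the the".split()
references = ["the cat is on the mat".split(),
              "there is a cat on the mat".split()]
print(modified_precision(candidate, references))  # 2/7, not 7/7
```

Simple precision would score this degenerate candidate 7/7, because every "the" occurs in a reference. Clipping to the maximum reference count (2) gives 2/7.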
2. NIST
The NIST metric is based on the BLEU metric, but with some
alterations. Where BLEU simply calculates n-gram precision,
giving equal weight to each one, NIST also calculates how
informative a particular n-gram is. That is to say when a
correct n-gram is found, the rarer that n-gram is, the more
weight it will be given.[12]
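A rough sketch of the information weight, assuming the usual formulation Info(w1..wn) = log2(count(w1..wn-1) / count(w1..wn)), with counts taken over the reference translations. For a unigram the numerator is the total number of reference words. The helper names are illustrative, and this is only the weighting step, not the full NIST score.

```python
import math
from collections import Counter


def ngram_counts(references, n):
    """Count every n-gram across all reference token lists."""
    counts = Counter()
    for ref in references:
        for i in range(len(ref) - n + 1):
            counts[tuple(ref[i:i + n])] += 1
    return counts


def information_weight(ngram, references):
    """log2(count of the (n-1)-gram prefix / count of the full n-gram)."""
    ngram = tuple(ngram)
    n = len(ngram)
    full = ngram_counts(references, n)[ngram]
    if full == 0:
        return 0.0  # n-gram never occurs in the references
    if n == 1:
        prefix = sum(len(ref) for ref in references)  # total word count
    else:
        prefix = ngram_counts(references, n - 1)[ngram[:-1]]
    return math.log2(prefix / full)


refs = ["the cat sat on the mat".split()]
print(information_weight(["the"], refs))  # frequent word, lower weight
print(information_weight(["cat"], refs))  # rarer word, higher weight
```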
3. Word Error Rate
Based on the Levenshtein distance. Where the Levenshtein
distance works at the character level, WER works at the word
level.
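A minimal sketch of WER as word-level Levenshtein distance divided by the number of reference words. Whitespace tokenisation and the function name are assumptions made for this example.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # ref word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
# one deleted word out of six reference words
```

Because the denominator is the reference length, a hypothesis with many inserted words can score above 1.0.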
4. METEOR
The METEOR metric is designed to address some of the
deficiencies inherent in the BLEU metric. The metric is based
on the weighted harmonic mean of unigram precision and unigram
recall. The metric was designed after research by Lavie (2004)
into the significance of recall in evaluation metrics. Their
research showed that metrics based on recall consistently
achieved higher correlation than those based on precision
alone, cf. BLEU and NIST.[13]
METEOR also includes some other features not found in other
metrics, such as synonymy matching, where instead of matching
only on the exact word form, the metric will also match on
synonyms. For example, if the word "good" appears in the
reference and the word "well" appears in the translation, this
will be counted as a match. The metric also includes a
stemmer, which lemmatises words and matches on the lemmatised
forms. The implementation of the metric is modular insofar as
the algorithms that match words are implemented as modules, and
new modules that implement different matching strategies may
easily be added.
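The recall-weighted harmonic mean can be sketched as below, assuming the weighting from the original METEOR formulation, Fmean = 10PR / (R + 9P). This sketch uses exact word matching only and leaves out the synonym and stemming modules as well as METEOR's fragmentation penalty, so it is not the full metric. The function names are illustrative.

```python
from collections import Counter


def unigram_matches(candidate, reference):
    """Exact-form unigram matches; each reference word is used at most once."""
    overlap = Counter(candidate) & Counter(reference)
    return sum(overlap.values())


def meteor_fmean(candidate, reference, recall_weight=9):
    """Weighted harmonic mean of unigram precision and unigram recall."""
    matches = unigram_matches(candidate, reference)
    if matches == 0:
        return 0.0
    precision = matches / len(candidate)
    recall = matches / len(reference)
    # recall_weight=9 gives 10PR / (R + 9P): recall dominates the score.
    return ((1 + recall_weight) * precision * recall
            / (recall + recall_weight * precision))


candidate = "the cat sat".split()
reference = "the cat sat on the mat".split()
print(meteor_fmean(candidate, reference))
```

Here precision is 1.0 and recall is 0.5. The score comes out close to the recall (about 0.53), below the balanced harmonic mean (about 0.67), which reflects the emphasis on recall described above.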