RELATED LANGUAGES

June 17, 2008

I translated the text “Beauty and the Beast” from French to Spanish, two closely related languages. These are the problems I found in the translation, which was produced by a program called Translendium.

 

The first mistake is in the title. Translendium translated it into Spanish as “la bonita y el animal”, which is wrong: the correct translation is “la bella y la bestia”. The program also uses the word “bonita” throughout the text, which is equally wrong. The same problem occurs with other characters, such as “Lumière”, which is translated as “luz”. This is wrong because it is a proper name, so it should not be translated.
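To make the problem with names more concrete, here is a minimal sketch of a word-for-word translator that keeps a do-not-translate list. The lexicon `FR_ES`, the set `PROTECTED_NAMES` and both functions are hypothetical and only serve as an illustration; this is not how Translendium works internally.

```python
# Hypothetical toy French-to-Spanish lexicon (NOT Translendium's dictionary).
FR_ES = {
    "la": "la",
    "belle": "bella",
    "et": "y",
    "bête": "bestia",
    "lumière": "luz",
}

# Hypothetical do-not-translate list of character names.
PROTECTED_NAMES = {"Lumière"}


def translate_word(word: str) -> str:
    """Translate one word, leaving protected proper names untouched."""
    if word in PROTECTED_NAMES:
        return word
    # Unknown words are copied through unchanged.
    return FR_ES.get(word.lower(), word)


def translate(sentence: str) -> str:
    """Translate a sentence word by word."""
    return " ".join(translate_word(w) for w in sentence.split())
```

With this sketch, the capitalized name “Lumière” is left as it is, while the common noun “lumière” still becomes “luz”. A real system would need something better than capitalization to recognize names, for example at the start of a sentence.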

 

Other phrases in the text are also incorrect. For example, “… persigue a bonita de sus asiduidades pero esconde….” makes no sense; a correct and understandable translation would be “….el le persigue a bella por sus asiduidades pero ella se esconde…”. There are many phrases like this throughout the text.

 

Finally, some words have several possible meanings in the text. For example, the French “clés” can be “llave” or “clave” in Spanish, but in this case the correct one is “clave”. There are other examples, such as chica/hija.

 


 

 


SOME SPECIALIZED TERMS

June 9, 2008

These are the differences between some specialized terms:

 

  • Machine translation: sometimes referred to by the abbreviation MT, it is a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another. The translation proper is performed by a computer, even if a human helps by pre-editing, post-editing or answering questions to disambiguate the source text.

 

  • Computer-assisted translation: is a form of translation wherein a human translator translates texts using computer software designed to support and facilitate the translation process. Computer-assisted translation is sometimes called machine-assisted, or machine-aided translation.

 

  • Multilingual content management: deals with managing content in several languages, which includes not only text but also video clips, audio clips and images.

 

  • Translation: is the interpretation of the meaning of a text and the subsequent production of an equivalent text, also called a translation, that communicates the same message in another language. It must take into account constraints that include context, the rules of grammar of the two languages, their writing conventions, and their idioms.

 


 

 

 

 

 


A translation example by MT systems

May 12, 2008

         

I'm going to use an MT system called Translendium to translate an example from Spanish into English, that is, into a less closely related language. Then I'm going to explain the grammatical errors and other errors that translation programs make.

 

As we can see, the first two lines of the Spanish version come out differently in the English version. The program has not understood the Spanish text well, so the word order in English has been changed incorrectly and the result makes no sense.

 

Apart from that, some Spanish words have more than one meaning in English, so all the candidate words appear in the translation, in this case both words: “step/crossing”.
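This behaviour can be sketched with a small dictionary-based translator: when a word has several candidate translations and nothing in the sentence helps to choose, it prints all of them separated by a slash. Everything here is an assumption made for illustration: the lexicon, the context clues, and even the source word “paso”, which I am only guessing is the word behind “step/crossing”.

```python
# Hypothetical lexicon: each Spanish word maps to its candidate translations.
LEXICON = {
    "paso": ["step", "crossing"],  # assumed source word for "step/crossing"
    "el": ["the"],
    "frontera": ["border"],
}

# Hypothetical context clues: a neighbouring word that favours one candidate.
CONTEXT_CLUES = {
    ("paso", "frontera"): "crossing",
    ("paso", "baile"): "step",
}


def translate_word(word, context):
    """Pick one candidate if the context helps, otherwise show all of them."""
    candidates = LEXICON.get(word, [word])
    if len(candidates) == 1:
        return candidates[0]
    for neighbour in context:
        choice = CONTEXT_CLUES.get((word, neighbour))
        if choice:
            return choice
    # No clue found: fall back to listing every candidate.
    return "/".join(candidates)


def translate(sentence):
    """Translate word by word, using the whole sentence as context."""
    words = sentence.lower().split()
    return " ".join(translate_word(w, words) for w in words)
```

Without any clue the sketch produces the same kind of output as the program (“the step/crossing”), and with a clue in the sentence it commits to one word.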

 

The same happens with subjects. Spanish sentences sometimes have no explicit subject, so several alternatives appear in the translation (“its/his/her/their…”), probably because the subject is not very clear in Spanish.

 

Finally, I have to mention another error that appears in the last paragraph. The Spanish version reads “… por una serie de murallas construidas y …”, but in the English version “…una serie…” is translated as “series”, which is incorrect because it does not mean the same thing.

 


 

 

 


Characteristics of a translation task according to the FEMTI report

May 11, 2008

According to the FEMTI report, the characteristics of a translation task refer to the information flow intended for the output, from the point of view of the human or other agent that receives the translation.

As the FEMTI report says, there are three characteristics of a translation task: assimilation, dissemination and communication. I'm going to explain each of them briefly:

 

  • Assimilation: The ultimate purpose of the assimilation task is to monitor a large volume of texts produced by people outside the organization, in several languages.

 

  • Dissemination: The ultimate purpose of the dissemination task is to deliver to others a translation of documents produced inside the organization. The translation quality required is generally high, but translation speed is usually not a factor.

 

  • Communication: The ultimate purpose of the communication task is to support multi-turn dialogues between people who speak different languages. The translation quality must be high enough for painless conversation, despite possible syntactically ill-formed input and idiosyncratic word and format usage.

 



EXPLANATION OF THREE RESEARCH TOPICS

April 19, 2008

I'm going to explain three of those research topics.

First of all, I'm going to talk about Corpus Linguistics. The word corpus may be used to refer to any text in written or spoken form. However, in modern linguistics this term is used to refer to a large collection of texts which represents a sample of a particular variety or use of language and is presented in machine-readable form. On the other hand, Corpus Linguistics is now seen as the study of linguistic phenomena through large collections of machine-readable texts: corpora. These are used within a number of research areas, ranging from the descriptive study of the syntax of a language to prosody or language learning, to mention but a few. The use of real examples of texts in the study of language is not a new issue in the history of linguistics. However, Corpus Linguistics has developed considerably in the last decades due to the great possibilities offered by the processing of natural language with computers.
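To give an idea of what studying language through machine-readable texts can look like, here is a minimal sketch that counts word frequencies and adjacent word pairs (a very rough stand-in for collocations) in a tiny corpus. The function names, the tokenization rule and the two example sentences are my own assumptions; real corpora are far larger and real tools are more careful.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-záéíóúñü']+", text.lower())


def word_frequencies(texts):
    """Count how often each word occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def bigram_frequencies(texts):
    """Count adjacent word pairs inside each text."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Tiny made-up corpus, only for illustration.
corpus = [
    "Machine translation is a sub-field of computational linguistics.",
    "Computational linguistics studies language with computers.",
]

print(word_frequencies(corpus).most_common(3))
print(bigram_frequencies(corpus).most_common(1))
```

Even in this tiny sample, the pair “computational linguistics” stands out because it occurs in both sentences, which is the kind of pattern that collocation studies look for at a much larger scale.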

 

Secondly, I have to mention Semantics, which is the study of meaning in communication. In linguistics, it is the interpretation of signs as used by agents or communities within particular circumstances and contexts. Semanticists differ on what constitutes meaning in an expression. Traditionally, the formal semantic view restricts semantics to literal meaning and relegates all figurative associations to pragmatics, but this distinction is increasingly difficult to defend. The degree to which a theorist subscribes to the literal-figurative distinction decreases as one moves from the formal semantic, semiotic and pragmatic traditions to the cognitive semantic tradition.

 

Finally, I'm going to mention Computer Assisted Language Learning (CALL), an approach to language teaching and learning in which computer technology is used as an aid to the presentation, reinforcement and assessment of material to be learned, usually including a substantial interactive element. The philosophy of CALL is that the lessons should allow learners to learn on their own using structured and/or unstructured interactive lessons. These lessons carry two important features: bidirectional (interactive) learning and individualized learning. CALL is not a method. It is a tool that helps teachers to facilitate the language learning process.

 


 

 

 


Research Topics (Q.2)

April 16, 2008

Here are some of the research topics of three European research centers:

First of all, the German Language Technology Lab, whose themes are elaborated in research, development and commercial projects:

  • Exploiting and automatically extending ontologies for content processing.
  • Tighter integration of shallow and deep techniques in processing.
  • Enriching deep processing with statistical methods.
  • Combining language checking with structuring tools in document authoring.
  • Document indexing for German and English.
  • Automatically associating recognized information with related information and thus building up collective knowledge.
  • Automatically structuring and visualizing extracted information.
  • Processing information encoded in multiple languages, among them Chinese and Japanese.

  

Secondly, we can mention the projects of the Edinburgh Language Technology Group, which conducts research and development in different areas.

  • Combining shallow semantics and domain knowledge.
  • Text mining for Biomedical Content Curation.
  • Cross-lingual Multi-agent Retail Comparison.
  • Smart Qualitative Data: Methods and Community Tools for Data Mark-up.
  • Machine Learning for Named Entity Recognition.
  • Named entity tagging of historical parliamentary proceedings.
  • Integrated Models and Tools for Fine-Grained Prosody in Discourse.
  • Joint Action Science and Technology.
  • AMI consortium projects that are developing technologies for meeting browsing and to assist people participating in meetings from a remote location.
  • Study of how pairs collaborate when planning a route on a map.

 

And finally, the National Centre for Language Technology, whose research areas are the following:

  • Computer Assisted Language Learning (CALL): Integrating CL/NLP/HLT Technology into CALL, CALL for Endangered Languages, CALL for Primary School Environments, CALL for Remedial Learners.
  • Corpus Linguistics: Collocation, Contrastive Computational Linguistics, Corpus-based Translation Studies.
  • Machine Translation and Translation Technology: SMT, RBMT, EBMT, TMs, MAT, CAT.
  • Treebank-Based Unification Grammar Acquisition: Automatic Feature Structure Annotation Algorithms, Subcategorisation Frame Extraction,…
  • Semantics: Discourse Representation Theory, Linear-Logic Based Semantics, Computation of Logical Forms from Treebanks, Open-Domain Question Answering Systems.
  • Speech Technology: Speaker Characterisation, Audio Classification, Retrieval and Coding, Human Computer Interfaces (HCIs).
  • Multilingual Information Retrieval/Extraction.
  • Language Evolution.

RESOURCES:

 

  • The Edinburgh Language Technology Group. http://www.ltg.ed.ac.uk/, accessed March 14, 2008, at 20:15.

 


European Research Centers

April 14, 2008

There are several European research centers for Human Language Technologies. Here are three examples:

  • Language Technology Lab., Germany.

           Their objective is the improvement of language technology through novel computational techniques for processing text, speech and knowledge, a deeper understanding of human language and thought, studying the needs of the users and the demands of the market. 

  • National Centre for Language Technology, Ireland.

               This centre conducts research into the processing of human language by computers, for example speech recognition and synthesis, machine translation and others.

  • Edinburgh Language Technology Group, Scotland, UK.

           It is a research group that has been working in the area of natural language engineering since the early 1990s.

          They focus on building practical solutions to real problems in text processing.


 

 

 


Hans Uszkoreit

April 14, 2008

Hans Uszkoreit is Professor of Computational Linguistics at Saarland University. He also serves as Scientific Director at the German Research Center for Artificial Intelligence (DFKI), where he heads the DFKI Language Technology Laboratory. Apart from that, he is Professor in the Computer Science Department.

He studied Linguistics and Computer Science at the Technical University of Berlin and the University of Texas at Austin. During part of his life he worked as a research associate in a large machine translation project at the Linguistics Research Center. He also worked as a computer scientist at the Artificial Intelligence Center of SRI International in Menlo Park and at other similar institutions.

Hans is a member of many academies and committees. For example, he is a permanent member of the International Committee of Computational Linguistics (ICCL), a member of the European Academy of Sciences, and a member of the Executive Board of the European Network of Language and Speech, among others.

Here are some of the recent publications of this professor:

  • Uszkoreit, H. (2007) Methods and Applications for Relation Detection. In: Proceedings of the Third IEEE International Conference on Natural Language Processing and Knowledge Engineering, Beijing, 2007.
  • Uszkoreit, H., F. Xu, W. Liu (2007) Challenges and Solutions of Multilingual and Translingual Information Service Systems. To appear in Proceedings of HCI International 2007, 12th International Conference on Human-Computer Interaction, Beijing, 2007.
  • Busemann, S. and H. Uszkoreit (2004) Predicting the Future: Technology Roadmapping. In: ELSNews, (3) 2004.


 

 

 

 


DEFINITION OF HUMAN LANGUAGE TECHNOLOGIES

March 28, 2008

According to Wikipedia, natural language processing, also called human language technologies, is a subfield of artificial intelligence and linguistics. It studies the problems of automated generation and understanding of natural human languages. On the one hand, natural language generation systems convert information from computer databases into normal-sounding human language. On the other hand, natural language understanding systems convert samples of human language into more formal representations that are easier for computer programs to manipulate.
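The two directions described in this definition can be sketched in a few lines: one function turns a database-like record into a sentence (generation), and another turns that kind of sentence back into a structured record (understanding). The record fields, the sentence template and the function names are all hypothetical, and real systems handle far more varied language than a single fixed pattern.

```python
import re


def generate(record):
    """Generation: turn a structured record into a natural-sounding sentence."""
    return (
        f"The temperature in {record['city']} is "
        f"{record['temp_c']} degrees Celsius."
    )


# Pattern that mirrors the template used in generate().
PATTERN = re.compile(
    r"The temperature in (?P<city>.+) is (?P<temp_c>-?\d+) degrees Celsius\."
)


def understand(sentence):
    """Understanding: turn a sentence into a formal record, or None if unknown."""
    match = PATTERN.fullmatch(sentence)
    if match is None:
        return None
    return {"city": match.group("city"), "temp_c": int(match.group("temp_c"))}
```

For example, `generate({"city": "Bilbao", "temp_c": 18})` produces a sentence, and passing that sentence to `understand` gives back the original record, while any sentence that does not fit the pattern returns `None`.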

Another definition is the one given by Hans Uszkoreit in his book “Language Technology: A First Overview”. In this book he explains that language technologies are information technologies that are specialized for dealing with the most complex information medium in our world: human language. Therefore these technologies are also often subsumed under the term Human Language Technology. Whereas speech is the oldest and most natural mode of language communication, complex information and most of human knowledge is maintained and transmitted in written texts. [...] Nevertheless, language technology had to create formal representation systems that link language to concepts and tasks in the real world.

    
