Automatic alignment of bilingual sentences: the case of English and Serbian

(2017)

Files

Senicic776715002017.pdf
  • Open access
  • Adobe PDF
  • 1.6 MB

Senicic776715002017Annexe1.pdf
  • Open access
  • Adobe PDF
  • 3.02 MB

Senicic776715002017Annexe2.zip
  • Open access
  • Unknown
  • 31.66 MB

Details

Abstract
The aim of this thesis is to explore current systems for sentence alignment and adapt one for the automatic extraction and pairing of sentences in Serbian and English. For this purpose, the EXtraction and ALignment Pipeline (EXALP) was developed. EXALP takes unstructured data as input, extracts the sentences, aligns them, and presents them in a readable format. Evaluated against manually extracted and aligned data, EXALP achieves an accuracy of 84%. The pipeline can easily be adapted to other language pairs, and its output can be applied in various domains of NLP and linguistics.
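
The three stages named in the abstract (extraction, alignment, presentation) and the comparison against manually aligned data can be illustrated with a minimal sketch. This is not EXALP's code: the function names are hypothetical, the regex splitter and positional pairing are deliberately naive stand-ins for a real sentence splitter and aligner, and the accuracy definition (share of gold pairs recovered) is an assumption, since the abstract does not state how the 84% figure was computed.

```python
import re


def split_sentences(text):
    """Naive extraction: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def align_by_position(source_sentences, target_sentences):
    """Placeholder aligner: pair the i-th source with the i-th target sentence.

    A real aligner would also handle merged, split, and missing sentences.
    """
    return list(zip(source_sentences, target_sentences))


def format_pairs(pairs):
    """Readable output: one tab-separated sentence pair per line."""
    return "\n".join(f"{src}\t{tgt}" for src, tgt in pairs)


def alignment_accuracy(predicted, gold):
    """Assumed metric: fraction of gold pairs that appear among the predictions."""
    gold_set = set(gold)
    if not gold_set:
        return 0.0
    return len(set(predicted) & gold_set) / len(gold_set)


if __name__ == "__main__":
    english = "The cat sleeps. Is it raining? Yes!"
    serbian = "Mačka spava. Da li pada kiša? Da!"
    pairs = align_by_position(split_sentences(english), split_sentences(serbian))
    print(format_pairs(pairs))
```

Positional pairing only works when both texts contain the same sentences in the same order, which is why a dedicated alignment step is needed for real bilingual data.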