Understanding Drafting

Introduction

SIL International’s Natural Language Processing team has developed a tool that assists in Bible translation by creating a rough first draft for translators to edit and refine. As of January 1, 2024, this tool is available within Scripture Forge, a Scripture editing platform that is closely integrated with Paratext.

All of the drafts that are created contain errors that need to be corrected by the translation team. For some projects, the quality will be too low to be useful. However, extensive field testing has shown that a significant number of teams find the drafts to be very helpful in their work, and of sufficient quality to use as a starting point for the team to edit.

Ultimately, the measure of success for the drafts is in their usefulness as a tool to assist the translation team in their work, not their ability to stand alone as a finished product.

How it works

In order to use the tool effectively, it’s important to understand how it works. The drafts are created using a two-step process:

  1. Learn the language.
  2. Translate the text.

The most important step is the first one, the learning of the language. The quality of the draft that is created depends almost entirely on how well this step goes.

The system learns by seeing the same sentence written in two languages: one that it already understands, and the one you are translating into.

The most important point to remember is that these sentences need to say the same thing in both languages. This is the same principle that allowed scholars to decipher Egyptian hieroglyphs using the Rosetta Stone: the same text was written in three scripts (hieroglyphic, Demotic, and Ancient Greek), allowing comparison between a language that was already understood and a script that was not.

Text that is available in both the language you are translating from, and the language you are translating to, is known as “parallel text”. In general, the more parallel text you have, the better the system will learn the language. For most projects, this parallel text will be your reference text, and the translation work you have already completed. At this time, we recommend that you have at least the New Testament translated.
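
To make the idea of parallel text concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that each text is a dictionary mapping a verse reference to the verse text. The function name `build_parallel_text` and the sample data are hypothetical and do not reflect how Scripture Forge stores or processes projects. The point it shows is that only verses with text in both languages can serve as parallel text.

```python
def build_parallel_text(source, target):
    """Pair verses that have non-empty text in both languages.

    `source` and `target` are hypothetical dicts of verse reference -> text.
    """
    pairs = []
    for ref, source_verse in source.items():
        target_verse = target.get(ref, "").strip()
        if source_verse.strip() and target_verse:
            pairs.append((ref, source_verse.strip(), target_verse))
    return pairs


# Placeholder data only: two translated verses and one verse not yet translated.
reference_text = {
    "MAT 1:1": "source verse A",
    "MAT 1:2": "source verse B",
    "GEN 1:1": "source verse C",
}
project_text = {
    "MAT 1:1": "target verse A",
    "MAT 1:2": "target verse B",
    "GEN 1:1": "",
}

parallel = build_parallel_text(reference_text, project_text)
for ref, source_verse, target_verse in parallel:
    print(ref, "|", source_verse, "|", target_verse)
```

In this sketch, GEN 1:1 is left out because the project has no translation for it yet, which is why more completed translation work gives the system more to learn from.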

Example use cases

A basic example

Suppose a translation team is translating the Bible into a local language, using the English NIV as a reference text. The team has completed the entire New Testament, and is beginning work on Genesis. The system would generate a draft as follows:

  1. Compare the English NIV New Testament with the local language New Testament, in order to learn the language.
  2. Having learned the language, translate Genesis from the English NIV into the local language.

After generating the book of Genesis, the team would edit the draft to correct any errors. Afterwards, they would be able to generate a draft of the next book they plan to work on, and the system would use the translation of Genesis to improve the quality of the next draft.

A more complex example

In the first example, the team was using the English NIV as a reference text, and the system learned the language by comparing the English NIV with the local language New Testament. However, in many cases a better quality draft can be generated by using a different text than the one the team is translating from.

For example, it is often possible to improve the quality of the draft by training on a back translation of the local language into the source language. If the team has back translated the project into English, the back translation could be used in place of the English NIV for learning the language, while Genesis is still translated from the English NIV. A back translation is usually much more literal than a normal translation, and therefore makes it easier for the system to learn how the local language maps to English. In this example, the system would generate a draft as follows:

  1. Compare the English back translation with the local language New Testament, in order to learn the language.
  2. Having learned the language, translate Genesis from the English NIV into the local language.
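
The difference between the two examples above can be sketched as a small data structure. This is an illustrative sketch only: the `DraftingSetup` class and its field names are hypothetical and are not part of Scripture Forge. It simply records which text is used for learning the language and which text the draft is translated from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DraftingSetup:
    """Hypothetical record of one drafting run; not a Scripture Forge type."""

    training_source: str  # text paired with the local language to learn it
    training_books: str   # books available in both texts (the parallel text)
    drafting_source: str  # text the new draft is translated from
    book_to_draft: str

    @property
    def trains_on_drafting_source(self):
        """True when the same text is used for learning and for drafting."""
        return self.training_source == self.drafting_source

    def steps(self):
        """Return the two steps of the process as plain sentences."""
        return [
            f"Learn: compare the {self.training_source} with the local "
            f"language {self.training_books}.",
            f"Translate: draft {self.book_to_draft} from the "
            f"{self.drafting_source} into the local language.",
        ]


basic = DraftingSetup(
    training_source="English NIV",
    training_books="New Testament",
    drafting_source="English NIV",
    book_to_draft="Genesis",
)
with_back_translation = DraftingSetup(
    training_source="English back translation",
    training_books="New Testament",
    drafting_source="English NIV",
    book_to_draft="Genesis",
)

for setup in (basic, with_back_translation):
    for step in setup.steps():
        print(step)
    print()
```

In both setups the translation step is identical; only the text used in the learning step changes.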

Determining the ideal setup for a project is a complex process, and it’s not something you will need to learn. The SIL Natural Language Processing team is developing tools to determine the ideal setup, and can assist teams during the onboarding process.

Generating back translations

In addition to creating drafts into the vernacular language, the system can also generate back translations into supported source languages. In order to generate a back translation draft, the team needs to already have back translated at least a few books from the vernacular language into the source language.

In this example, suppose a team has translated the four gospels into the vernacular language, and has back translated Matthew, Mark, and Luke into English. To generate a back translation draft of John, the system would do the following:

  1. Compare the English back translations of Matthew, Mark, and Luke with the vernacular language versions of Matthew, Mark, and Luke, in order to learn the language.
  2. Having learned the language, translate John from the vernacular language into English.
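
As a rough illustration of the example above, the following Python sketch works out which books can be used for learning and which can be drafted. The function `plan_back_translation` is hypothetical and only models the reasoning in the text: books that exist in both the vernacular and the back translation are parallel text, and books that exist only in the vernacular are candidates for a back translation draft.

```python
def plan_back_translation(vernacular_books, back_translated_books):
    """Split vernacular books into training books and draftable books.

    Hypothetical helper: input order is preserved so that the result
    follows the order in which the books were listed.
    """
    back_translated = set(back_translated_books)
    training = [b for b in vernacular_books if b in back_translated]
    draftable = [b for b in vernacular_books if b not in back_translated]
    return training, draftable


training, draftable = plan_back_translation(
    vernacular_books=["Matthew", "Mark", "Luke", "John"],
    back_translated_books=["Matthew", "Mark", "Luke"],
)
print("Learn from:", training)
print("Can draft:", draftable)
```

Here Matthew, Mark, and Luke form the parallel text, and John is the book for which a back translation draft can be generated.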

Back translation drafts will also contain errors and need to be edited, but the quality is usually substantially higher than for the vernacular drafts.

Getting started

Generating back translation drafts is currently open and available to all Paratext users. Generating drafts into the vernacular, due to the complexity involved in setup, requires a team to be onboarded by the SIL Natural Language Processing team. Please fill out the translation drafting registration form, and a member of the team will assess whether your project is a good candidate for generating drafts.

Regardless of whether you are generating back translation drafts or vernacular drafts, you can begin by connecting your Paratext project to Scripture Forge by following these steps:

  1. Log in to Scripture Forge, using your Paratext credentials.
  2. Connect your Paratext project by following the Connect a Paratext Project guide. When you connect the project, select your reference text as the source. For a back translation, the source text should be the vernacular.
  3. After connecting your project, click “Generate draft” in the sidebar.
  4. If you are generating a draft into the vernacular, this is as far as you can go on your own, and you will need to fill out the translation drafting registration form. If your project has already been onboarded, or you are working with a back translation, click “Generate draft” to start the process.
  5. Select the books you want to translate, and then select the books you want to use as training data.
  6. Click “Generate draft” to start the process.

The draft generating process can take anywhere from several hours to several days.

Once you have a draft generated, you can preview the draft and import individual chapters into your project.

Supported languages for back translation drafting

Back translation drafts can be generated from any language, but the back translation itself must be in one of the following languages.

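| Language name | Codes listed (from the ISO 639-1, ISO 639-2, and ISO 639-3 columns, in that order) |
| --- | --- |
| Achinese | ace |
| Mesopotamian Arabic | acm |
| Ta'izzi-Adeni Arabic | acq |
| Tunisian Arabic | aeb |
| Afrikaans | af, afr, afr |
| South Levantine Arabic | ajp |
| Akan | ak, aka |
| Amharic | am, amh, amh |
| North Levantine Arabic | apc |
| Standard Arabic | ar, arb, ara |
| Najdi Arabic | ars |
| Moroccan Arabic | ary |
| Egyptian Arabic | arz |
| Assamese | as, asm, asm |
| Asturian | ast |
| Awadhi | awa |
| Aymara | ayr |
| South Azerbaijani | azb |
| North Azerbaijani | azj |
| Bashkir | ba, bak, bak |
| Bambara | bm, bam, bam |
| Balinese | ban |
| Belarusian | be, bel, bel |
| Bemba | bem |
| Bengali | bn, ben, ben |
| Bhojpuri | bho |
| Banjar | bjn |
| Tibetan | bo, bod, tib |
| Bosnian | bs, bos, bos |
| Buginese | bug |
| Bulgarian | bg, bul, bul |
| Catalan | ca, cat, cat |
| Cebuano | ceb |
| Czech | cs, ces, cze |
| Chokwe | cjk |
| Central Kurdish | ckb |
| Crimean Turkish | crh |
| Welsh | cy, cym, wel |
| Danish | da, dan, dan |
| German | de, deu, ger |
| Dinka | dik |
| Dyula | dyu |
| Dzongkha | dz, dzo, dzo |
| Greek | el, ell, gre |
| English | en, eng, eng |
| Esperanto | eo, epo, epo |
| Estonian | et, est, est |
| Basque | eu, eus, baq |
| Ewe | ee, ewe, ewe |
| Faroese | fo, fao, fao |
| Fijian | fj, fij, fij |
| Finnish | fi, fin, fin |
| Fon | fon |
| French | fr, fra, fre |
| Friulian | fur |
| Nigerian Fulfulde | fuv |
| Scottish Gaelic | gd, gla, gla |
| Irish | ga, gle, gle |
| Galician | gl, glg, glg |
| Guarani | gn, grn, grn |
| Gujarati | gu, guj, guj |
| Haitian Creole | hat |
| Hausa | ha, hau, hau |
| Hebrew | he, heb, heb |
| Hindi | hi, hin, hin |
| Chhattisgarhi | hne |
| Croatian | hr, hrv, hrv |
| Hungarian | hu, hun, hun |
| Armenian | hy, hye, arm |
| Igbo | ig, ibo, ibo |
| Iloko | ilo |
| Indonesian | id, ind, ind |
| Icelandic | is, isl, ice |
| Italian | it, ita, ita |
| Javanese | jv, jav, jav |
| Japanese | ja, jpn, jpn |
| Kabyle | kab |
| Kachin | kac |
| Kamba | kam |
| Kannada | kn, kan, kan |
| Kashmiri | ks, kas, kas |
| Georgian | ka, kat, geo |
| Central Kanuri | knc |
| Kazakh | kk, kaz, kaz |
| Kabiye | kbp |
| Kabuverdianu | kea |
| Khmer | km, khm, khm |
| Kikuyu | ki, kik, kik |
| Kinyarwanda | rw, kin, kin |
| Kyrgyz | ky, kir, kir |
| Kimbundu | kmb |
| Northern Kurdish | kmr |
| Kongo | kg, kon, kon |
| Korean | ko, kor, kor |
| Lao | lo, lao, lao |
| Ligurian | lij |
| Limburgish | li, lim, lim |
| Lingala | ln, lin, lin |
| Lithuanian | lt, lit, lit |
| Lombard | lmo |
| Latgalian | ltg |
| Luxembourgish | lb, ltz, ltz |
| Luba-Lulua | lua |
| Luganda | lg, lug, lug |
| Luo | luo, luo |
| Lushai | lus, lus |
| Latvian | lv, lvs, lav |
| Magahi | mag, mag |
| Maithili | mai, mai |
| Malayalam | ml, mal, mal |
| Marathi | mr, mar, mar |
| Minangkabau | min, min |
| Macedonian | mk, mkd, mac |
| Plateau Malagasy | plt, plt |
| Maltese | mt, mlt, mlt |
| Manipuri | mni, mni |
| Halh Mongolian | khk, khk |
| Mossi | mos, mos |
| Maori | mi, mri, mao |
| Burmese | my, mya, bur |
| Dutch | nl, nld, dut |
| Norwegian Nynorsk | nn, nno, nno |
| Norwegian Bokmål | nb, nob, nob |
| Nepali | npi, npi |
| Northern Sotho | nso, nso |
| Nuer | nus, nus |
| Chichewa | ny, nya, nya |
| Occitan | oc, oci, oci |
| West Central Oromo | gaz, gaz |
| Odia | ory, ory |
| Pangasinan | pag, pag |
| Punjabi | pa, pan, pan |
| Papiamento | pap, pap |
| Persian | fa, pes, per |
| Polish | pl, pol, pol |
| Portuguese | pt, por, por |
| Dari | prs, prs |
| Southern Pashto | pbt, pbt |
| Quechua | quy, quy |
| Romanian | ro, ron, rum |
| Rundi | rn, run, run |
| Russian | ru, rus, rus |
| Sango | sg, sag, sag |
| Sanskrit | sa, san, san |
| Santali | sat, sat |
| Sicilian | scn, scn |
| Shan | shn, shn |
| Sinhala | si, sin, sin |
| Slovak | sk, slk, slo |
| Slovenian | sl, slv, slv |
| Samoan | sm, smo, smo |
| Shona | sn, sna, sna |
| Sindhi | sd, snd, snd |
| Somali | so, som, som |
| Sotho, Southern | st, sot, sot |
| Spanish | es, spa, spa |
| Tosk Albanian | sq, als, als |
| Sardinian | sc, srd, srd |
| Serbian | sr, srp, srp |
| Swazi | ss, ssw, ssw |
| Sundanese | su, sun, sun |
| Swedish | sv, swe, swe |
| Swahili | sw, swh, swh |
| Silesian | szl, szl, szl |
| Tamil | ta, tam, tam |
| Tatar | tt, tat, tat |
| Telugu | te, tel, tel |
| Tajik | tg, tgk, tgk |
| Tagalog | tl, tgl, tgl |
| Thai | th, tha, tha |
| Tigrinya | ti, tir, tir |
| Tamashek | tmh, taq, taq |
| Tok Pisin | tpi, tpi, tpi |
| Tswana | tn, tsn, tsn |
| Tsonga | ts, tso, tso |
| Turkmen | tk, tuk, tuk |
| Tumbuka | tum, tum, tum |
| Turkish | tr, tur, tur |
| Twi | tw, twi, twi |
| Tamazight | tzm, tzm, tzm |
| Uighur | ug, uig, uig |
| Ukrainian | uk, ukr, ukr |
| Umbundu | umb, umb, umb |
| Urdu | ur, urd, urd |
| Uzbek | uz, uzn, uzn |
| Venetian | vec, vec, vec |
| Vietnamese | vi, vie, vie |
| Waray | war, war, war |
| Wolof | wo, wol, wol |
| Xhosa | xh, xho, xho |
| Yiddish | yi, ydd, yid |
| Yoruba | yo, yor, yor |
| Cantonese | zh, yue, yue |
| Chinese | zh, zho, chi |
| Malay | ms, zsm, zsm |
| Zulu | zu, zul, zul |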