Here's How Russian, French, Portuguese, And German Native Speakers Screw Up English

Computer scientists at MIT and Israel's Technion recently developed an algorithm that analyzes quirks and errors in English-as-a-second-language (ESL) essays to predict the author's native tongue, potentially unlocking new tools for translation and teaching and also producing an alternative map of language similarity.

The breakthrough got us thinking: How do native speakers of different languages typically screw up English?

Researchers Dr. Boris Katz and graduate student Yevgeni Berzak offered the following examples over email:

In Russian, there are no determiners (no words equivalent to "a" and "the" in English). As a result, native speakers of Russian tend to omit determiners in English. For example, a native speaker of Russian uses the following sentence in our data: "I was champion of swimming competition in Russia," instead of, "I was the champion of a swimming competition in Russia." Another example is "Everyone has now a car" which would be the correct order of modifiers if translated word-to-word to Russian, instead of the proper form in English "Everyone has a car now."

In French, adjectives follow nouns (in most cases). Consequently, native speakers of French use phrases such as "lessons very important" instead of "very important lessons,""afternoon free" instead of "free afternoon" and so forth. Another pattern that is common with French natives is preference for the construction "noun preposition noun" over "noun noun" compounds which are less frequent in French. Hence, French natives may prefer "licence for sailing" over "sailing licence," and "manager of Bruce Springsteen" instead of "Bruce Springsteen's manager."

With natives of German we have the example "The pollution will more and more increase," which would be the correct position of the verb in German.

Two other types of ESL patterns that are influenced by the native language are prepositions, and count/mass noun distinctions.

ESL learners would often direct-translate the preposition from their native language (for example, Portuguese natives confuse "in" and "on" in a systematic way, based on the usage of the equivalent prepositions in Portuguese).

Different languages have different count/mass noun distinctions. For example "information", "furniture" and "homework" are count nouns in French, and so natives of French may say "informations", "furnitures" and "homeworks" in English.

Here's how the researchers identified and used this information:

The algorithm receives ESL essays by native speakers of different languages as input. It starts by learning linguistic patterns in ESL that are characteristic of each of these languages. The kind of patterns we take into consideration are word order, syntactic relations between words, and morphological suffixes and prefixes.

Using these patterns it is possible to predict the native language of a given ESL document, but in fact our method reveals something even deeper: based on these patterns, the algorithm estimates the similarity between all the language pairs, and clusters them into a tree. It turns out that this tree is very similar to a tree that one would obtain when using similarities according to the linguistic properties of these languages as manually documented by linguists. These linguistic properties are called "typological features" and refer to things like word order, valid syntactic constructions, how to form negation, are there morphological suffixes and prefixes or not, etc.

Finally, the algorithm uses the ESL-based similarity tree for prediction of typological features of languages for which we have no prior linguistic data. For example, imagine that I know nothing about the typological features of Portuguese (I don't know the correct word order, how to form negation etc), but I have a bunch of ESL documents written by natives of Portuguese. Using the similarity tree from ESL, I can determine that Portuguese is in fact similar to Spanish and Catalan. Assuming I do know the typological features for these two languages, I can now predict the typological features of Portuguese based on the features of Spanish and Catalan. Of course, this method is not perfect, but it's a pretty good approximation: in 72% of the cases the predicted typological features are correct.

To sum up, the practical value of the algorithm is the ability to predict native languages, their similarity structure, and their typological features directly from ESL. The fact that we can do this also tells us something new and interesting about language: there is a strong relation between native language typology and second language usage. In other words, the properties of your native language systematically affect the way you use a foreign language.

Based entirely on ESL essays, the researchers were able to create a language similarity tree, showing the close links between, for instance, Japanese and Korean. Their language similarity tree looks remarkably similar to one based on data from The World Atlas of Language Structures (WALS).

“The striking thing about this tree is that our system inferred it without having seen a single word in any of these languages,” Berzak said.

Here's a comparison of the ESL tree and the WALS tree: mit computing language

And an artistic rendering of the ESL tree:

language tree

Join the conversation about this story »

Here's How Russian, French, Portuguese, And German Native Speakers Screw Up English

Trending Articles

Practice Sheet of Right form of verbs for HSC Students

Download: FK ft Shenky – Nakuyewa ”Prod by: Shenky”

How to win at Markstrat (Markstrat Tips and Tricks) – Vodites

Ominde Commission Report and Recommendations – Ominde Report of 1964

Bureau of Internal Revenue: Regional Offices (Directory)

GO 53 on Enhancement of Ex-gratia upto 5 Lakhs Toddy Tappers in Telangana

Cakewalk CA-2A Leveling Amplifier v2.0.1.97 WiN, v2.0.1.96 OSX Incl Keygen

Mp3 Download: Mdu - Kunjenjenjena

How the kill the job , when DTP request running for long hours.

Microsoft Intune から展開しているアプリのアップデートについて

18-year-old girl was beaten for half an hour by two Northampton men in 'an...

Car crash in Dunton Bassett leaves driver in critical condition

Macky 2, Two Others In Road Accident

Application log 00000000000000089514: Could not convert queue DLVST90CLNT

Detroit mafia: D’Anna Brothers agree to plea deal

Delivery block field greyed out using VA02

Muloraki Au

【個人撮影】スマホのプライベート映像♪「中に出さないで///」カラオケ屋での生ハメ撮りが流出ｗ【リベンジポルノ】＠PornHub

BREAKING NEWS: Diamond Platnumz Is Reported Dead After Ghastly Car Accident

FIAT 500 B0111 B0112