Welcome to ...

The place where the world comes together in honesty and mirth.
Windmills Tilted, Sacred Cows Butchered, Lies Skewered on the Lance of Reality ... or something to that effect.


Sunday, July 7, 2013

What Are the Weirdest Languages in the World?

According to Idibon, a company that makes language processing applications, these are the weirdest languages on different continents:
In North America: Chalcatongo Mixtec, Choctaw, Mesa Grande Diegueño, Kutenai, and Zoque; in South America: Paumarí and Trumai; in Australia/Oceania: Pitjantjatjara and Lavukaleve; in Africa: Harar Oromo, Iraqw, Kongo, Mumuye, Ju|’hoan, and Khoekhoe; in Asia: Nenets, Eastern Armenian, Abkhaz, Ladakhi, and Mandarin; and in Europe: German, Dutch, Norwegian, Czech, and Spanish.
But is weirdness relative? Maybe the World Atlas of Language Structures provides a source for objective evaluation. Here's what Idibon did with it:
For each value that a language has, we calculate the relative frequency of that value for all the other languages that are coded for it. So if we had included subject-object-verb order then English would’ve gotten a value of 0.355 (we actually normalized these values according to the overall entropy for each feature, so it wasn’t exactly 0.355, but you get the idea). The Weirdness Index is then an average across the 21 unique structural features. But because different features have different numbers of values and we want to reduce skewing, we actually take the harmonic mean (and because we want bigger numbers = more weird, we actually subtract the mean from one). In this blog post, I’ll only report languages that have a value filled in for at least two-thirds of features (239 languages).
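For the curious, the recipe quoted above can be roughed out in a few lines of Python. This is only a sketch under stated assumptions, not Idibon's actual code: the function name `weirdness_index`, the toy data, and the feature names are all made up for illustration. The entropy normalization is left out because the quote does not say how it was done, and each language is counted in its own feature frequencies, which is one possible reading of "all the other languages that are coded for it".

```python
from collections import Counter
from statistics import harmonic_mean


def weirdness_index(languages, min_coverage=2 / 3):
    """Score each language as 1 minus the harmonic mean of how common its feature values are.

    `languages` maps a language name to a dict of {feature: value};
    features a language is not coded for are simply absent.
    Assumption: no entropy normalization is applied (details not given in the quote).
    """
    features = {f for feats in languages.values() for f in feats}

    # Count how often each value occurs among the languages coded for each feature.
    counts = {f: Counter() for f in features}
    for feats in languages.values():
        for feature, value in feats.items():
            counts[feature][value] += 1

    scores = {}
    for name, feats in languages.items():
        # Skip languages coded for fewer than min_coverage of the features.
        if len(feats) / len(features) < min_coverage:
            continue
        freqs = [
            counts[feature][value] / sum(counts[feature].values())
            for feature, value in feats.items()
        ]
        # Rare values pull the harmonic mean down, so 1 - mean goes up: "more weird".
        scores[name] = 1 - harmonic_mean(freqs)
    return scores


# Hypothetical toy data: four invented languages, two invented features.
toy = {
    "A": {"order": "SVO", "tone": "no"},
    "B": {"order": "SVO", "tone": "no"},
    "C": {"order": "SOV", "tone": "no"},
    "D": {"order": "VSO", "tone": "yes"},
}

for name, score in sorted(weirdness_index(toy).items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

In the toy data, language D has the rarest value for both features, so it comes out on top, while A and B share the most common values and tie at the bottom. The harmonic mean matters here because a single very rare value drags the average down much more than it would with an ordinary mean.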
