Thursday, January 12, 2006

Dialects within a language

There are Wikipedias in many languages; so far there are some 212. Most of these languages have an ISO-639 code. Two versions of this code are "official", and there is one version that is workable: ISO/DIS-639-3, which is currently maintained by SIL International. However, workable does not mean perfect. Today there were two moments where the current practices relating to ISO-639 were the issue.

Java uses ISO-639 for its language codes. The codes used are those of ISO-639-1. Consequently the Neapolitan language, which has no ISO-639-1 code, is not known to it. OmegaT is an open source CAT tool; it uses the languages known to Java as the languages that it can translate. So in order to translate into Neapolitan you have to pretend that it is a different language. Not nice. So the nice people at Sun were asked about this, and we have great expectations.
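To make the limitation concrete, here is a minimal sketch, assuming a standard Java runtime. The class name IsoLanguageCheck and the helper isKnownToJava are made up for this example and are not taken from OmegaT's code; Locale.getISOLanguages() is the standard call that lists the two-letter ISO-639 language codes Java knows about.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class IsoLanguageCheck {
    // Hypothetical helper: true if the code appears in Java's built-in
    // list of two-letter ISO-639 language codes.
    public static boolean isKnownToJava(String code) {
        List<String> known = Arrays.asList(Locale.getISOLanguages());
        return known.contains(code);
    }

    public static void main(String[] args) {
        // "it" is the two-letter code for Italian.
        System.out.println("it:  " + isKnownToJava("it"));
        // "nap" is the three-letter code for Neapolitan; it has no two-letter code.
        System.out.println("nap: " + isKnownToJava("nap"));
    }
}
```

Since the list only holds two-letter codes, a three-letter code such as "nap" cannot appear in it, which is why a tool that offers only the languages from this list has no entry for Neapolitan.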

Today there was a request on Meta, the website about the Wikimedia Foundation's projects, for a new Wikipedia. The request is for Tarantino, which is considered a dialect of Neapolitan. This request is problematic because Tarantino does not even have an ISO-639 code. Consequently there is little chance of a Wikipedia being created for it. Now, with the new namespace manager, it is possible to create a separate namespace within nap.wikipedia.org for the Tarantino dialect. This is also a solution for the problematic request for a Lower Saxon Wikipedia that will be in an orthography that is not German.

It is sobering to see that standards can both enable things and prevent them from happening. Good standards are vital and ISO/DIS 639-3 is a big move forward.

Thanks,
GerardM

5 comments:

Minh Nguyễn said...

Simply using the namespace manager to maintain multiple dialects of a language on one wiki just doesn’t sound right to me. For one thing, you’d probably only leave enough room for articles and talk pages in that other language, but what about having image description pages for different dialects, for example? Also, it would be helpful to have a separate localization for that dialect as an option in Preferences.

I haven’t seen the request to set up the Tarantino wiki yet, but is Tarantino so different from Neapolitan that it warrants even a separate namespace? After all, requests to form a separate Brazilian Portuguese Wikipedia have been denied repeatedly in the past, and I’ve read that there are notable differences between the Portuguese of Brazil and that of Portugal.

GerardM said...

Having articles for a dialect in a specific namespace is not without problems. Having room for these articles is, however, not a problem. Image description pages are also not a problem, as they should be in Commons in the first place.

Localising the UI in a dialect would be nice; this means that we have to be able to address dialects.

When you consider American and English, there are differences in the locale data, and as such they need to be addressed. This is different from having a separate wiki for them. I would only argue for a solution like this when it solves a problem.

Thanks,
GerardM

Minh Nguyễn said...

Right, my previous comment didn’t quite come out the way I intended it: I’m not in favor of creating a separate Tarantino Wikipedia, because I don’t see how it’s so different that Tarantino-only speakers can’t communicate with Neapolitan-only speakers. Even the proposal to create a separate Brazilian Wikipedia – Brazilian Portuguese seems to have more differences as a dialect than Tarantino does – failed repeatedly, so that already set a precedent not to honor requests for dialect editions.

GerardM said...

My first question would be: what do you know about Tarantino? I do not know more than I am told by people who know about it, like Sabine. She lives in that area. She cares about languages.

I would not presume that something is like Brazilian; I do not know Brazilian Portuguese, and I do not presume about that either.

Given that at least the orthography is said to be substantially different, it does not make sense to compare.

Thanks,
GerardM

SabineWanner said...

Well, as for the differences between Tarantino and Neapolitan: they are huge. I have friends from that area, and when they talk among themselves neither I nor my husband (who has an aunt from Apulia and is therefore used to "one" of the Apulian dialects) can really understand them. You can catch the subject they are talking about, but not much more.
The problem is that it is not considered a language in its own right, so I could imagine a language Wikipedia with several portals and, if possible, a localised interface for each of the minor languages (even if that has to be implemented step by step). The thing is: even if there is only one person who wants to write in a language, he/she should have the possibility to do so. And if one day he/she doesn't continue, it is not a problem, since the work is not lost and sooner or later someone else will come along and carry on.

Now let's imagine a main page for the nap Wikipedia that says "this is the main page of the Wikipedia for Neapolitan and closely related languages", showing the map from http://it.wikipedia.org/wiki/Lingua_napoletana, from which you can see that nap covers quite a huge region (with maaaaany linguistic differences). And there, like it was on Wikisource, we have links to the portal pages of the single "minority languages" of the Neapolitan-speaking region. If one day all these "languages" should be considered separate languages, the contents can easily be moved, but maybe then it would not be necessary anymore, since we already have "that solution" that makes everybody happy.

Of course it is problematic to fit all this within one Wikipedia, and of course we will have problems, but maybe it would be a way to open article writing to many other people who otherwise would have no chance. And maybe, besides offering that unique possibility of having encyclopaedic articles in really any language, we can help to conserve cultural diversity. And: I suppose it makes sense to try this on a relatively small Wikipedia.

Do you remember the Andalusian thingie? Well, on Wikicities they have their "Wikipedia" and it already has over 600 pages (http://andalu.wikicities.com/wiki/Portada). They are doing a good job, and I would be all for giving them their own domain name within Wikipedia. This would give their work much more value, because they would be connected to all other Wikipedias and they would have more chances to find native speakers who contribute.

Interwiki links are then one of the next probs ... but problems are there to be solved :-)

Ciao, Sabine
*****
Sabine Cretella
s.cretella@wordsandmore.it
skype: sabinecretella