Friday, July 31, 2009

New Project IV

I'm currently looking at the field labels on the various forms and putting the "hooks" in place to permit captions in any Latin script (exempli gratia, what if someone asks for Italian?).

Not a quick job, then, but because it entails re-visiting almost the entire system, it has already unearthed a few daft bugs ... and no doubt there will be more!

So, inshallah it will all be worthwhile, if only because of that.

Most likely (apart from a few easy ones for the purpose of proving the method), I will leave the various "translations" blank ... and let the users fill them in for themselves. That should keep them quiet for a few days, at least! I have also included a "custom" column, so that users can change labels to any ASCII characters that their little hearts desire! So (for instance), if they don't like the word Equipment, but prefer to use Device (or System, Kit, or Junk ... or whatever), then that will be up to them.
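
For anyone wondering how a "custom" column might sit alongside the translations, here is a minimal sketch in Python (not the language the system is written in). The table layout, the `caption` function and the order of precedence (custom text first, then the translation, then the English original) are all assumptions made for illustration.

```python
# Hypothetical label table: one row per English caption, with optional
# translations and an optional user-supplied "custom" caption.
LABELS = {
    "Equipment": {"es": "", "it": "", "custom": "Device"},
    "Manufacturer": {"es": "Fabricante", "it": "", "custom": ""},
}


def caption(english, language):
    """Pick the caption to show: custom text, else translation, else English."""
    row = LABELS.get(english, {})
    # Blank cells are empty strings, so "or" falls through to the next choice.
    return row.get("custom") or row.get(language) or english
```

With this layout a blank translation does no harm: the form simply shows the English caption until a user fills the cell in.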

Tuesday, July 28, 2009

New Project III

Drew came back with:-

I hear what you're saying about rewriting an application and the time it takes. Accurately estimating the time required for such an undertaking is hard. Even if you double an original estimate, with feature creep and unexpected problems of all sorts you can still end up slipping the release date several times.

Good luck with it and if you post your progress updates, I'll read them with interest.

Meanwhile, I've just done a prototype form where I can switch between English and Spanish (or French, German or Italian) labels. Oh yes, we're all good European Comrades now (NOT)!

Sistema Imprenta anybody?

Yes ... it all takes forever. And users lose interest, patience, and all the rest. Meanwhile, it's a lonely pursuit, best suited to English summers (yes, it's raining again).

And a million little decisions to be made along the way. Like:- would a Spanish-speaking user be au fait with a push-button marked Code Page, or should we try to squeeze in Código pagina? I went for simply Código as there was no space for anything else ... then moved on.*

The real problem is that of the width of field labels and push-buttons. Whilst they may be OK for (sometimes terse) captions in English, they're often not wide enough ... especially for those er, rather long, German words! If we're not careful, what were previously quite nicely crafted forms can end up looking like a Hund's Frühstück!

* I could, of course, equally well have mentioned Codice pagina abbreviated to Codice ... but anyway, what the Heck do I know about any of that stuff? Way back in school, when the clever (but spotty, I seem to recall) dudes were taking French, s'il vous plaît, I was generally to be found chiseling happily away in the woodwork shop!

New Project II

A guy called Drew Hodge, out in Canada, has asked:-

Have you considered converting your existing help source into XML? Because of their inherent separation of content and structure, XML documents are often preferred by human translators, and perhaps by machine translators these days too. Although XML source is UTF-8 by default, you can specify other character encodings if necessary.

I like the sound of your project and would enjoy collaborating with you. I don't have any language skills, but I've recently come back into the Biomed field after a dozen or so years working as a technical writer, a second career for me after starting out in Biomed in the late 70s. During a stint with IBM, I wrote help documentation in XML and prepared it for translation into 13 languages, so that experience might be useful.

What Drew's saying is interesting. I'll bear it in mind (chew it over, or whatever). At the moment I'm thinking in terms of a straightforward look-up table being called each time a form is opened that will populate the labels on the form according to the language setting (as a technical writer, no doubt you'll be groaning at such a long sentence)!

That is, Spanish (say) will call the Spanish labels from a simple .dbf file which can be edited, improved, or otherwise tweaked by the end user. So if a guy out in Mexico, say, doesn't like my choice of a Spanish word on a form, then he can change it to something he's less ashamed of. This is the way I prefer to do most of my stuff ... let the users have reasonable access to the underlying data to use and mould to their own particular (or weird even) requirements.
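
As a rough sketch of that look-up idea, written in Python rather than the language the system actually uses: the label table is re-read each time a form opens, and any caption without a translation falls back to English. A small CSV text stands in for the .dbf file here, purely because Python's standard library can read CSV directly. The column names, the "Serial no" row and the function names are all made up for illustration.

```python
import csv
import io

# Stand-in for the user-editable label file (assumed layout: one English
# column plus one column per language code).
LABEL_FILE = """english,es
Eqpt type code,Codigo de tipo
Manufacturer,Fabricante
Serial no,
"""


def load_labels(text, language):
    """Return {english: caption}, using English where no translation exists."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["english"]: (row.get(language) or row["english"]) for row in reader}


def open_form(field_names, language):
    """Look up the captions afresh on every open, so user edits show up at once."""
    labels = load_labels(LABEL_FILE, language)
    return [labels.get(name, name) for name in field_names]
```

Calling `open_form(["Manufacturer", "Serial no"], "es")` gives the Spanish caption for the first field and the English one for the second, because its translation cell has been left blank.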

But yes, help file narratives are a different kettle of fish, as there is a lot of text in there. I haven't given it a great deal of thought as yet, but (off the top of my head) I may just provide my usual English file, plus a blank "page" for my more helpful users to fill in for themselves. In actual fact, users at the highest level of access can edit (or screw up, as they wish) my help file anyway! So I have the option there to "leave as is", as they (architects, usually) say.

In truth, my stuff needs a total re-write in a modern language. But who has time for that? Not to mention that dreaded word "funding"! As you must know, most users haven't got a clue how much work it all takes.

Monday, July 27, 2009

A New Project

Well ... not really new as such ... more like a continuation of one I made a start on 18 years ago!

One of my more thoughtful "users" has recently enquired about the possibility of translating the famous TaskMaster system into Spanish. Tarea de Maestro? I've had a quick play and reckon I have code already in place that could form the basis of a "look-up table" to translate the field labels that appear on the forms. Basically, I would adapt an existing datafile which is already used to enforce standardised descriptions, correct "typos", and such.

"Eqpt type code" would appear as "Codigo de tipo" if a Country field was set to "ES", for instance. And "Manufacturer" would appear as "Fabricante", and all the rest. The bulk of the work required would be in the translation of the Help file, however. I have zero Spanish (or any other language) skills myself, of course, so I would need to rely upon an on-line translator. I see a long Summer of work stretching ahead of me (so much, then, for any thoughts about digging out my bucket and spade)!

All this has re-kindled memories of an attempt I made on a (much) earlier version to "translate" technical words in a dBASE database from English to Arabic. Because that needed an Arabic character set, and the program was running under good old DOS back in those days, I had used an old version of Nafitha. But I had to rely upon Arab friends (who were, in actual fact, not all that reliable, especially as they never seemed able to agree upon this word or that), coupled with a fair amount of detective work on my own, to work out my Arabic translations. First I made a list of phonetic spellings (so "maintenance" became "al-seena", for example) then used Nafitha to provide the corresponding Arabic script (right to left, of course). That was a long time ago now (yes, 18 years, as already mentioned), but I reckon it would be easier now with modern tools (and character sets)!

Those dBASE databases were quite useful to me at the time. There were three fields. When in a browse you would see the English word in the left-hand column ("grease", let's say), then in the next column we would have entered the phonetic "Arabic" spelling ("sha-ham")* and in the third (thanks to the magic of Nafitha) we would put in the Arabic script. This was very handy for non-Arabic speakers (like me) as I could pronounce the Arabic word whilst understanding what it meant! Ultimately, my aim was to produce the Arabic billing documents required by our client, automatically from my program (and my data).

Why am I mentioning all this? Well, perhaps the time has come to have another go. English to French, Italian, Spanish ... should be straightforward enough. But what about Polish, Tagalog and Turkish, for example? And what about languages that don't use Latin characters (eg, Arabic, Malayalam, Urdu)? Would a simple phonetic "translation" be any use? At the moment I don't see any easy way for the programming languages that I use to call another character set (Arabic, for example) directly to the screen, but this may well be possible. For now, though, I have to rely upon the extended ASCII character set, and switching between Code Pages if necessary and/or possible (that is, nothing esoteric like Unicode).
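
To illustrate the Code Page limitation, here is a small Python sketch. Python works in Unicode internally, which the system described here does not, so this only demonstrates the principle. The codec names `cp850`, `cp1252` and `cp1256` are standard Python encodings, while `fits_code_page` is a made-up helper.

```python
def fits_code_page(text, code_page):
    """Return True if every character of text exists in the given code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


# "Código" as stored under DOS code page 850: the accented o is one byte, 0xA2.
raw = "C\u00f3digo".encode("cp850")

# The same bytes read under a different code page show a different character.
misread = raw.decode("cp1252")

# A single Arabic letter (sheen) is absent from cp850 but present in cp1256.
latin_only = fits_code_page("\u0634", "cp850")
arabic_ok = fits_code_page("\u0634", "cp1256")
```

The point of the sketch is that a caption file only makes sense together with the code page it was typed under. The same byte is an accented "o" in one page and a cent sign in another.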

So, if anyone has any translation data already available, or indeed would like to collaborate on this Great International Gesture, feel free to shout (crier, gridare, gritar ... whatever)!

I wonder whether HTML is the answer?

* In cases where more than one word was available to correspond to the English, I would always default to the common, or "street" version!