Here is a discussion that appeared on the official mailing list:
" I've never been happy with the standard sorting algorithm when dealing with
lists of names. The human eye expects the names to be listed
alphabetically, overlooking spaces, hyphens, accented characters, ...
Assume the following names:
- Benoizy
- Benoît
- Benï Lewis
- Benoix
- Ben Underwood
Sorting them using the default sort method results in the following list:
- Ben Underwood
- Benoix
- Benoizy
- Benoît
- Benï Lewis
Ben Underwood comes first because the space has a lower UTF-8 value than
any letter.
Benoît comes last in the Benoi series of names because î has a higher
UTF-8 value than the unaccented i.
Benï Lewis comes last in the list because ï has a higher UTF-8 value than
the o that follows "Ben" in the other names.
Using the function Alfabet to sort the list, the end result using the
original strings appears in the form :
- Benï Lewis
- Benoît
- Benoix
- Benoizy
- Ben Underwood
which is the normal order a human expects to see when you ignore
spaces, accents, umlauts, ...
In the annex (see attachment) I have sent a function I wrote that strips
all non-letter characters from a string and returns a pure ASCII string with
all characters in the range "a"-"z". I have also included a small snapshot of
an actual list sorted by alphabet in one of my programs. As you will notice,
the names are truly listed 'by Alphabet'.
Feel free to use the function, or maybe the concept could be incorporated in a future build of Gambas.
Alain J. Baudrez "
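The attachment itself is not reproduced here, so the following is only a minimal sketch of the idea Alain describes, written in Python rather than Gambas. The name alfabet_key and the use of Unicode decomposition to remove accents are assumptions for illustration, not his actual code. It compares the default ordering with the ordering obtained by sorting on a stripped "a"-"z" key.

```python
import unicodedata

def alfabet_key(name: str) -> str:
    """Hypothetical stand-in for Alfabet: reduce a name to the letters a-z."""
    # Decompose accented characters, e.g. "î" becomes "i" + a combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    # Lowercase, then keep plain ASCII letters only; this drops spaces,
    # hyphens, digits and the combining marks left by the decomposition.
    return "".join(ch for ch in decomposed.lower() if "a" <= ch <= "z")

raw_names = ["Benoizy", "Benoît", "Benï Lewis", "Benoix", "Ben Underwood"]
# Use the precomposed form (NFC) so the default sort compares single
# code points, as in the mailing-list example.
names = [unicodedata.normalize("NFC", n) for n in raw_names]

# Default sort: plain code-point comparison.
default_order = sorted(names)
# "By alphabet": compare the stripped keys, but keep the original strings.
alphabet_order = sorted(names, key=alfabet_key)

print(default_order)
print(alphabet_order)
```

Note that letters without a Unicode decomposition, such as "ø" or "ß", are simply dropped by this sketch; a real implementation would need an explicit replacement table for them.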
" Thanks for that interesting approach. I did a similar thing some years
ago to sort lists of (mainly German) names. But it is not all that easy
in every country.
You have to know that our umlauts are sorted like vowel + "e", i.e. "ä"
= "ae" (which is its historical representation). And in office files (I
mean the paper ones) or telephone registers, we use separate tabs for "St"
and "Sch".
But beware: not everybody does it that way. The Swedes, for
instance, handle the umlauts as separate letters, i.e. they appear at
the end of the list. So in a Swedish dictionary, you will find a word
starting with an "ä" after the words with "z". (And I would expect
"Hägar" to appear after "Hazufel".)
My own algorithm sorts strictly the German way, except that it does not
group "St" and "Sch" separately.
Rolf "
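Rolf's algorithm is not shown either, so the two conventions he mentions can only be sketched under assumptions. In the Python below, german_key and swedish_key are hypothetical names; the German key expands each umlaut to vowel + "e", and the Swedish key ranks "å", "ä", "ö" after "z" (the "å" is an addition of mine, following the usual Swedish alphabet order). The "St" and "Sch" grouping is not handled.

```python
import unicodedata

# German convention from the message above: umlaut = vowel + "e".
GERMAN_TABLE = str.maketrans({
    "\u00e4": "ae",  # ä
    "\u00f6": "oe",  # ö
    "\u00fc": "ue",  # ü
})

def german_key(name: str) -> str:
    """Hypothetical key: lowercase, then expand umlauts to vowel + 'e'."""
    return unicodedata.normalize("NFC", name).lower().translate(GERMAN_TABLE)

# Swedish convention: the extra letters are separate and come after "z".
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"  # ... å, ä, ö
SWEDISH_RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}

def swedish_key(name: str) -> list[int]:
    """Hypothetical key: rank of each letter in the Swedish alphabet."""
    text = unicodedata.normalize("NFC", name).lower()
    # Characters outside the alphabet (spaces, hyphens, ...) are ignored.
    return [SWEDISH_RANK[ch] for ch in text if ch in SWEDISH_RANK]

names = ["Hägar", "Hazufel"]
german_order = sorted(names, key=german_key)
swedish_order = sorted(names, key=swedish_key)
print(german_order)
print(swedish_order)
```

With the German key "Hägar" is compared as "haegar" and comes before "Hazufel"; with the Swedish key it comes after, as Rolf expects. The same list therefore has no single correct order, which is why a built-in "alphabetical" sort would need to know the locale.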