List Archives > 
Maple User Group List Archive > 
Archive by date > 
This Month By Date > 
This Month By Topic
[MUG] Re: sorting alphanumeric lists
| [MUG] Re: sorting alphanumeric lists |
|
Author: Maple User Group
Posted: Wed, 27 Nov 2002 10:37:35 -0500
|
>> From: Maple User Group "maple_gr"
-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-
Date: Sat, 23 Nov 2002 20:10:47 -0500 (EST)
From: Carl Devore "devore"
To: maple-list "maple-list"
Subject: sorting alphanumeric lists
| From: Robert Israel "israel"
|
| I'm somewhat puzzled, and would like to see a concrete example where
| lexorder gives a wrong answer on integers turned into names.
I believe that Bertfried wasn't converting them into names. I showed him
that in private email. Still, that does not sort correctly if you allow
negative numbers. Bertfried wants integers to sort -- as integers --
before names. In this case, I believe that you need to use a custom
ordering procedure to select between `<` and lexorder.
| Are you sure the problem isn't simply that lexicographic order is
| different from numerical order? For example, `10` comes before `9` in
| lexicographic order.
This can be corrected by using nprintf with a padding width. For example,
if you know that everything will fit into 20 spaces:
sortorder:= (a,b)-> lexorder(nprintf("%20a", a), nprintf("%20a", b));
This still doesn't work for negative numbers if you want them to sort as
numbers. Also, the sorting of names will be changed by the padding. You
could set it up so that integers pad and names don't. It's getting too
convoluted! So the simplest and fastest sort ordering procedure I can
come up for the following rules:
1. integers sort as integers
2. names sort lexicographically
3. integers come before names
is
sort_order:= (a,b)->
a::integer and (b::integer implies a<b)
or
not b::integer and lexorder(a,b)
;
Note that because of the McCarthy evaluation rules (see ?boolean),
complicated nestings of if-then-else that only return a true or false can
always be reduced to a boolean expression as above, and, in my experience,
it is usually saves some execution time.
-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-=*=-
Date: Sun, 24 Nov 2002 01:13:38 -0500 (EST)
From: Carl Devore "devore"
To: maple-list "maple-list"
Subject: sorting alphanumeric lists
On Tue, 19 Nov 2002, Bertfried Fauser wrote:
| I have a problen in sorting object. Maple provides two buildin functions
| sort and lexorder, which are pretty fast. I need to compare two lists, to
| compute the number of transpositions of a permutation
I am not sure if what you mean is that you want to count the number of
tranpositions required to transform one list into another. If that is
what you want, it can be done in near linear time (no sorting or
order relations required) as
CountTranspositions:= proc(L1::list, L2::list)
option `Copyright (c) 2002, Carl DeVore. All rights reserved.`;
local i,P;
P:= table([seq](L1[i]=i, i= 1..nops(L2)));
add(nops(i)-1, i= convert([seq](P[i], i= L2), disjcyc))
end proc:
`convert/disjcyc` is superlinear, but it works well for small lists. I
wrote a linear version of it that uses array operations rather than the
set and list operations used by `convert/disjcyc`. My version seems to be
faster for lists longer than 127, which I would guess has something to do
with hash bucket size. My version is only slightly slower for the shorter
lists, and is several times faster when the lists are in the thousands.
Here is my version. I maintained the structure and variable names of
`convert/disjcyc` as much as possible so you can compare:
DisjointCycles:= proc(p::list)
option `Copyright (c) 2002, Carl Devore. All rights reserved.`;
# Adapted from Maple's `convert/disjcyc`, but uses linear time.
local i,s,c,z,n,d,cp;
n:= nops(p);
s:= rtable(1..n, datatype= integer[1]); c:= array(1..n); d:= 1;
cp:= 1;
for i to n do
if s[i]=0 then
z:= p[i];
if z<>i then
c[cp]:= i;
for cp from cp+1 while z<>i do
c[cp]:= z; s[z]:= 1; z:= p[z]
od;
d:= d,cp
fi
fi
od;
[seq([seq(c[cp], cp= d[i-1]..d[i]-1)], i= 2..nops([d]))]
end proc;
| Note that there are two possibilities. the active one, which works well
| for lists of interges, since is(int1>int2) can compare integers (reals
| etc), _but_ fails to compute on letters (strings)
|
| > is(3>2);
| true
| > is(a>b),is(b>a);
| false,false
A name is not the same as a string. If you want to compare them as
strings, pass them through sprintf.
sprintf("%s", a);
"a"
Ordinary `<` will work for both strings and integers, and `is` doesn't
process names in the way that you want, so there is no need to use `is`,
which is a fairly slow high-level procedure anyway.
| which is nonsense, since a comparative is either true or false but cannot
| be false on both possibilities.
`is` uses universal quantification. is(a>b) means "for all values of a
and b satisfying the current assumptions, is it the case that a > b."
Clearly this false, and is(b<a) is also false.
| The (in the code above dissabled) idea to use the build in lexorder does
| not work either. While lexorder works on
|
| > lexorder(`2`,`3`),lexorder(`3`,2`);
| true,false
| > lexorder(a,b),lexorder(b,a);
| true,false
|
| Note that lexorder compares for `<`, hence we had to change the order in
| its arguments to achieve the same output.
|
| It is necessary to input `2` etc since lexorder is not able to handle
| numeric input like > lexorder(2,3);
You can acheieve this in the general by passing the argument through
nprintf:
nprintf("%a", 2);
2 [in italics]
| However, if applied to random listes of interges only, the function
| miscomputes. (I can check against a well tested function permsign from the
| Clifford package url: http://math.tntech.edu/rafal ) which computes the
| sign of a permutaion by looking at its cycle structures (which is however
| less information than the number of transposition N since it is just
| (-1)^N )
|
| I am entirely puzzled, why the above tow versions of teh function do not
| even work out to be equivalent on lists of integers. Indeed I would like
| to compute such a thing on lists of alpha numeric symbols,
| list{integer,symbol} hence I need lexorder.
To count the number of transpositions to make L2 from L1:
1. Let n = |L1|.
2. Contruct a function P : L1 -> {1..n} such that P(L1[i]) = i for all i.
3. Map this function over L2, thereby generating a permutation p of
[1..n].
4. Represent p in disjoint cycle form.
5. The number of transpositions in each cycle is one less than its length.
That is the process followed by my procedure CountTranspositions above.
However, I believe that what you are calling transpositions are what I
have seen called inversions by Knuth ([1] pp. 11-22): "If i<j and
a[i]>a[j], the pair (a[i],a[j]) is called an inversion of the
permutation."
Note that any sort (i.e., permutation) of a list can be achieved using at
most n-1 transpositions ** if you already have the sorted list to comapre
it to **. You could just modify the procedure given above to return the
transpositions from the disjoint cycles. Also note that sorting is
O(n*log(n)) whereas the number of inversions can be as high as n*(n-1)/2
if the list is completely reversed. It follows that a true sort does not
perform a transposition for every inversion. Therefore, modifying Maple's
`sort` to return the number of transpositions it performs cannot help you
in your quest to compute the number of inversions unless Maple is using a
bubble sort (O(n^2)).
However, Knuth ([1] p. 592) does give a O(n*log(n)) algorithm for counting
the number of inversions!! I think this is remarkable considering the
last paragraph. I encoded it in Maple. This works on permutations of
[$1..n]:
CountInversions:= proc(p)
option `Copyright (c) 2002, Carl DeVore. All rights reserved.`;
# Adapted from Knuth _Art_ vol 3, ed. 2, p. 592, prob. 5.1.1.6
local n,s,x,r,j,`2^k`,bits,cnt;
n:= nops(p); cnt:= 0; bits:= ilog2(n); `2^k`:= 2^bits;
x:= array(0..iquo(n,2));
to bits+1 do
for j from 0 to iquo(n,2*`2^k`) do x[j]:= 0 od;
for j in p do
s:= iquo(iquo(j,`2^k`), 2, 'r');
if r=0 then cnt:= cnt+x[s] else x[s]:= x[s]+1 fi
od;
`2^k`:= iquo(`2^k`, 2)
od;
cnt
end proc;
To get it to work on permutations of arbitrary lists, use the same
technique that I used in CountTranspositions to first transform it into a
permutation of [$1..n].
Compare CountInversions with the O(n^2) naive BubbleCount:
BubbleCount:= proc(p)
local i,j,n,cnt,pi;
n:= nops(p); cnt:= 0;
for i to n-1 do
pi:= p[i];
for j in p[i+1..n] do if j<pi then cnt:= cnt+1 fi od
od;
cnt
end proc:
CountInversions begins to be faster than BubbleCount when n=40.
Reference:
Donald E. Knuth _The Art of Computer Programming: Volume 3: Sorting and
Searching_ 2nd ed., Addison-Wesley 1998.
|
[View Complete Thread]
Previous by date: [MUG] Re: isqrt command, Maple User Group
Next by date: [MUG] Re: graphing parametric equations in 3d, Kamminga
Previous thread: [MUG] convolution operator, MTK-Adem Kilicman Dr
Next thread: [MUG] graphing parametric equations in 3d, Han Wesseling
|