There is huge range of reasonable textbooks on data structures and algorithms. Standish (1994), recommended in DS&A 1, covers a fair amount of the course, but has relatively little on string processing algorithms. Sedgewick (1992) is another established text, and the material on string processing in this course is based on this text.
Before starting these main sections I'll briefly introduce a new List ADT, which as well as being important in its own right, provides a review of some of the term 1 ideas, and is the basis of the first assessed exercise.
A given programming language will have a number of data types available, along with some operations on them. Pascal includes boolean, reals, integers, characters, records etc. Operations on booleans include and, or, etc. We don't normally worry about how these datatypes and operations are implemented, but just know what we can do with them.
However, most real programming tasks require more complex and sophisticated data types. If your programming language doesn't provide them, you'll have to define them yourself. Once you've defined such data types, you again don't want to worry again about how they're implemented, just know what operations are allowed on that data type. The definition of your abstract data type (ADT) should specify the operations allowed, but the implementation details should be hidden from the ``user'' of that abstract data type.
As a simple example, mentioned in DS&A 1, some high level programming languages will provide a ``stack'' data type built in, with operations ``initialise'', ``empty'', ``push'' and ``pop'' (or some equivalents) and maybe ``full''. In a language such as C or Pascal that does not provide such a thing, a stack data type can easily be defined, and implemented using either an array or a linked representation. Once implemented, we no longer need be concerned about which implementation technique is used, just what operations are allowed and what they do.
The notion of abstract data types obviously allows much more complex
data types to be defined, even in languages that do have a good range of
built in data types. So, for example, Common Lisp
has hash tables, stacks, lists, but does not have a graph ADT (discussed
later in this course). If a programming problem requires the use of a graph,
it should be defined and implemented in such a way that the main program
only uses a set of well defined operations on that datastructure (independent
of implementation decisions). It also helps if these operations are fairly
standard, so that standard algorithms can be used on that graph.
The operations we want to allow on such a structure are similar to those we might want on a normal linear list (though the names I use will be a bit different):
[figure39]
Figure 1.1: Partial Generalised List Implementation
[figure44]
Figure 1.2: Linked Representation of Generalised List (1 (2
3))
A simple extension of the above representation allows us to have nodes of different types (e.g.,integer, character and string), allowing list structures like ((1234 Fred) (2345 Jim)). We'd just extend the node-types to explicitly include these, rather than just the category simplenode. This is essentially what is done in the Lisp programming language. (A diagram illustrating this is given in Standish, fig 8.13.) Providing such flexible datatypes may be useful in some applications, though possibly at the expense of efficiency and some compile-time typechecking (which may catch certain errors).
Now that we can construct and do some basic operations on generalised lists we can use these to define higher level operations. Two common list processing operations are ``append'' and ``reverse''. These can straightforwardly be implemented using the existing primitives. It is also clearly necessary to write some reading and writing functions/procedures -- creating a list using all these Cons's is a bit awkward, so we'd really like to be able to say something like l := ReadList('(1 (2 3))'). This can be viewed as a very simple kind of parsing problem, and will be set as part of the assessed exercise.
Once we have defined a useful library of higher level operations we can then consider what algorithms can be used to manipulate the generalised list to solve our programming task. For example, we might use a generalised structure to represent logical expressions, such as (forall X (man x) -> (mortal x)). (The AtomType is now string). Then we could construct a theorem prover to use these structures to derive new conclusions.
However, this representation is obviously wasteful for small strings and inflexible for strings more than 255 characters. So another approach would be to use a linked representation with its end marked with a null character. This approach is obviously much more flexible, but some operations (like length) are more expensive. Turbo Pascal's strings unit uses this implementation.
Standish discusses how strings are represented in text files, in terms of the physical organisation of the storage medium (e.g., hard disk). However, I'll just stick to the basic logical operations on strings, and string processing algorithms which use these.
Sometimes we might want to search in binary strings (ie, sequences of 0s and 1s). For example the ``pbm'' graphics format is based on sequences of 1s and 0s. We could express a task like ``find a wide white stripe in the image'' as a string searching problem.
In all these applications the naive algorithm (that you might first think of) is rather inefficient. There are algorithms that are only a little more complex which give a very substantial increase in efficiency. In this section we'll first introduce the naive algorithm, then two increasingly sophisticated algorithms that give gains in efficiency. We'll end up discussing how properties of your string searching problem might influence choice of algorithm and average case efficiency, and how you might avoid having to search at all!
Procedure NaiveSearch(s1, s2: String): Integer;
{Returns an index of S2, corresponding to first match of S1 with S2, or}
{ -1 if there is no match}
var i, j: integer;
match: boolean;
begin
i:=1; match := FALSE;
while match = FALSE and i <= length(s1) do
begin
j := 1;
{Try matching from start to end of s2, quitting when no match}
while GetChar(s1, i) = GetChar(s2, j) and (j <= length(s2)) do
begin
j := j+1; {Increment both pointers while it matches}
i := i+1
end;
match := j>length(s2); {There's a match if you got to the end of s2}
i:= i-j+2 {move the pointer to the first string back to
{just after where you left off}
end;
if match
then NaiveSearch := i-1 {If it's a match, return index of S1}
{corresponding to start of match
else NaiveSearch := -1
end;
We'll illustrate the algorithms by considering a search of string 'abc'
(s2) in document 'ababcab' (s1). The following diagram should be fairly
self-explanatory (the îndicates where the matching characters are
checked each round):
'ababcab' i=1,j=1 'abc' ^ matches: increment i and j 'ababcab' i=2.j=2 'abc' ^ matches: increment i and j 'ababcab' i=3,j=3 'abc' ^ match fails: exit inner loop, i=i-j+2 'ababcab' i=2, j=1 'abc' match fails, exit inner loop, i=i-j+2 ^ 'ababcab' i=3, j=1 'abc' matches: increment i and j ^ 'ababcab' i=4, j=2 'abc' matches: increment i and j ^ 'ababcab' i=5, j=3 'abc' matches: increment i and j ^ i=6, j=4, exit loop (j>length(s2)), i = 4, match=TRUE, NaiveSearch = 3Note that for this example 7 comparisons were required before the match was found. In general, if we have a string s1 or length N, s2 of length M then a maximum of approx (M-1)*N matches may be required, though often the number required will be closer to N (if few partial matches occur). To illustrate these extremes consider: s1= 'aaaabaaaabaaaaab', s2 = 'aaaaa', and s1= 'abcdefghi', s2= 'fgh'.
The Knuth-Morris-Pratt (KMP) algorithm uses information about the characters in the string you're looking for to determine how much to `move along' that string after a mismatch occurs. To illustrate this, consider one of the examples above: s1= 'aaaabaaaabaaaaab', s2 = 'aaaaa'. Using the naive algorithm you would start off something like this:
'aaaabaaaabaaaaab' 'aaaaa' ^ 'aaaabaaaabaaaaab' 'aaaaa' ^ 'aaaabaaaabaaaaab' 'aaaaa' ^ 'aaaabaaaabaaaaab' 'aaaaa' ^ 'aaaabaaaabaaaaab' 'aaaaa' ^ match fails, move s2 up one.. 'aaaabaaaabaaaaab' 'aaaaa' ^ etc etcbut in fact if we look at s2 (and the 'b' in s1 that caused the bad match) we can tell that there is no chance that a match starting at position 2 will work. The 'b' will end up being matched against the 4th character in s2, which is an 'a'. Based on our knowledge of s2, what we really want is the last iteration above replaced with:
'aaaabaaaabaaaaab' 'aaaaa' ^We can implement this idea quite efficiently by associating with each element position in the searched for string the amount that you can safely move that string forward if you get a mismatch in that element position. For the above example..
a 1 If mismatch in first el, just move string on 1 place. a 2 If mismatch here, no point in trying just one place, as that'll involve matching with the same el (a) so move 2 places a 3 a 4 a 5In fact the KMP algorithm is a little more cunning than this. Consider the following case:
'aaaab' i=3,j=3 'aab' ^We can only move the second string up 1, but we KNOW that the first character will then match, as the first two elements are identical, so we want the next iteration to be:
'aaaab' i=3,j=2 'aab' ^Note that i has not changed. It turns out that we can make things work by never decrementing i (ie, just moving forward along s1), but, given a mismatch, just decrementing j by the appropriate amount, to capture the fact that we are moving s2 up a bit along s1, so the position on s2 corresponding to i's position is lower. We can have an array giving, for each position in s2, the position in s2 that you should backup to in s2 given a mismatch (while holding the position in s1 constant). We'll call this array next[j].
j s2[j] next[j] 1 a 0 2 b 1 3 a 1 4 b 2 5 b 3 6 a 1 7 a 2In fact next[1] is treated as a special case. In fact, if the first match fails we want to keep j fixed and increment i. One way to do this is to let next[1] = 0, and then have a special case that says if j=0 then increment i and j.
'abababbaa' i=5, j=5 'ababbaa' ^ mismatch, so j = next[j]=3 'abababbaa' i=5, j=3 'ababbaa' ^ ------------------- 'abaabbaa' i=4, j=4 'ababbaa' ^ mismatch, so j = next[j]= 2 'abaababbaa' i=5, j=2 'ababbaa' ^ ------------------- 'bababbaa' i=1, j=1 'ababbaa' ^ mismatch, so j = next[j]= 0, increment i and j. 'abaababbaa' i=2, j=1 'ababbaa' ^It's easy enough to implement this algorithm once you have the next[..] array. The bit that is mildly more tricky is how to calculate next[..] given a string. We can do this by trying to match a string against itself. When looking for next[j] we'd find the first index k such that s2[1..k-1] = s2[j-k+1..j-1], e.g:
'ababbaa' s2[1..2] = s2[3..4] 'aba....' so next[5] = 2. ^(Essentially we find next[j] by sliding forward the pattern along itself, until we find a match of the first k-1 characters with the k-1 characters before (and not including) position j).
The detailed implementations of these algorithms are left as an exercise for the reader - it's pretty easy, so long as you get the boundary cases right and avoid out-by-one errors.
The KMP algorithm is extremely simple once we have the next table:
i := 1; j:= i; while (i <= length(s1) and j <= length(s2)) do begin if (j = 0) or (GetChar(s1,i) = GetChar(s2, j))) then begin i := i+1; j:= j+1 end else j := next[j] end;(If j = length(s2) when the loop exits we have a match and can return something appropriate, such as the index in s1 where the match starts).
The Boyer-Moore algorithm is significantly better, and works by searching the target string s2 from right to left, while moving it left to right along s1. The following example illustrates the general idea:
'the caterpillar' Match fails:
'pill' There's no space (' ') in the search string, so move it
^ right along 4 places
'the caterpillar' Match fails. There's no e either, so move along 4
'pill'
^
'the caterpillar' 'l' matches, so continue trying to match right to left
'pill'
^
'the caterpillar' Match fails. But there's an 'i' in 'pill' so move along
'pill' to position where the 'i's line up.
^
'the caterpillar' Matches, as do all the rest..
'pill'
^
This still only requires knowledge of the second string, but we require
an array containing an indication, for each possible character that may
occur, where it occurs in the search string and hence how much to move
along. So, index['p']=1, index['i']=2, index['l'] = 4 (index the rightmost
'l' where repetitions) but index['r']=0 (let the value be 0 for all characters
not in the string). When a match fails at a position i in the document,
at a character C we move along the search string to a position where the
current character in the document is above the index[C]th character in
the string (which we know is a C), and start matching again at the right
hand end of the string. (This is only done when this actually results in
the string being moved right - otherwise the string is just moved up one
place, and the search started again from the right hand end.)
The Boyer-Moore algorithm in fact combines this method of skipping over characters with a method similar to the KMP algorithm (useful to improve efficiency after you've partially matched a string). However, we'll just assume the simpler version that skips based on the position of a character in the search string.
It should be reasonably clear that, if it is normally the case that a given letter doesnt appear at all in the search string, then this algorithm only requires approx N/M character comparisons (N=length(s1), M=length(s2)) - a big improvement on the KMP algorithm, which still requires N. However, if this is not the case then we may need up to N+M comparisons again (with the full version of the algorithm). Fortunately, for many applications we get close to the N/M performance. If the search string is very large, then it is likely that a given character WILL appear in it, but we still get a good improvement compared with the other algorithms (approx N*2/alphabet_size if characters are randomly distributed in a string).
The best choice of algorithm (and the average case efficiency of your algorithm) will depend on your 'alphabet size' (how many different characters may appear in a string), and the amount of repetition in the strings. For normal natural language text (e.g., an English document), the KMP algorithm gives little advantage over the naive algorithm, but Boyer-Moore does very significantly better. For a binary string where subsequences of the search string may occur frequently, KMP may do quite alot better than the naive algorithm, while Boyer-Moore may do little better than KMP.
For applications such as text retrieval, where you may be (conceptually) searching though many text files for the occurence of a given string, approaches that go through each file from start to end are likely to be too inefficient. The only practical thing is to create a giant index which indicates for each word/character sequence just which files contain this sequence (and where in the file). You could have a large hash table, that allowed you to rapidly get from a given sequence to the list of files and locations. This obviously requires alot of preprocessing of files, and a fair amount of storage for the index/hash table. However, for any application where many users will be searching the same (fixed) set of texts such indexing will be worthwhile. I won't discuss details of how you'd do it here, just point out that in such cases we would pre-process the files to create indexes rather than searching them on-the-fly from the raw text.
Matching against patterns is important for many text processing applications. New languages have been developed whose major feature is good string manipulation features such as pattern matching. A good example is the PERL language, which is widely used to process and generate text files to be used on the World Wide Web.
To allow matches against such patterns we need to consider both how to describe such patterns (so that they are both easy to specify by the user and easy to process by the machine), and how we can do efficient matches and searches given such patterns. The next section will briefly discuss how we can represent patterns, and then we'll look at algorithms to do the matching.
This simple language is in fact very flexible. We can build up quite complex expressions, such as '[ab[tex2html_wrap_inline1276]c]*z' to match 'cz', 'abababz', 'ccabccz' and a whole lot more.
A regular expression may be represented graphically as a finite state machine as illustrated below. (The numbers in brackets are just so we can go discuss it more easily). Each possible string corresponds to a path from start (right arrow on LHS) to end node (box on RHS) of the network. So, for example 'abcabz' corresponds to following the path between nodes numbered 1,2,3,5,1,4,5,1,2,3,5,6.
[figure122]
Figure 2.1: finite state machine for pattern '[ab|c]+z'
We can implement a datastructure based on this graphical representation by having an array holding, for each state, the possible next states. So, state 1 has possible next states 2 and 4. (We could use a Nx2 array, where N is number of states, and agree on what to put in the second position where there is only one successor). There is then a fairly simple algorithm for checking matches of strings with this representation of a pattern. It is based on using a variant of the list datastructure to hold a representation of the states in the network to check when matching along the string. A list such as (1 2 + 5 6) means ``Possible states to go to in order to process the current character are 1 and 2, and possible next states assuming current character already matched are 5 and 6.''. So the representation allows you to explore many paths simultaneously, by keeping the alternatives next states on the list. The special symbol + is used essentially to separate possible states corresponding to the current character from states corresponding to the next character.
Put '+' on list; State := initial-state; quit := false; While not state=final-state and position <= length(string) and not quit do begin if State = '+' then increment position in string and put a new '+' at end of list. else if State contains current character then put the next state at end of list else if State contains choice point, put both next states at front of list If list is empty then quit := true else pop State off front of list. endA match succeeds if we reach the final state. The position in the string when the loop exits will indicate the shortest initial substring matching the pattern (we'd need slightly different exit conditions to check if the complete string matched the pattern).
As an illustration, consider what would happen given the network above if we try to match the pattern against 'abcz'. To simplify things we'll not show 'State', but just assume it is the first element of the previous list (which is 'popped off' after each iteration):
String position List of states to process 1 (1 +) Choice point: put next states at front 1 (2 4 +) State 2 contains 'a', so put next state at end 1 (4 + 3) State 4 doesnt match 'b' so no action. 1 (+ 3) '+', so increment string pointer, put + at end 2 (3 +) State 3 contains 'b', so put next state at end 2 (+ 5) '+', so increment string pointer etc 3 (5 +) Choice: put next states at front 3 (1 6 +) Choice: Put next states at front 3 (2 4 6 +) State 2 doesnt match 'c' so no action 3 (4 6 +) .. but State 4 does, so put next state at end 3 (6 + 5) no action.. 3 (+ 5) increment string pointer 4 (5 +) Choice: put next states at front 4 (1 6 +) Choice: Put next states at front 4 (2 4 6 +) State 2 doesnt match 'z' so no action 4 (4 6 +) State 2 doesnt.. 4 (6 +) but state 6 does, so put next state at end 4 (+ 7) increment string pointer.. 5 (7 +) REACHED END! Must match pattern.Think for a minute what one of the lists corresponds to. The list (6 + 5) given string position 3 means: another node to check for the 3rd character is '6'; but there's a next node 5 corresponding to a path we've already found containing the third character (1 2 3 5 1 4).
The algorithm outlined works reasonably efficiently: with a small modification to avoid the possibility of having redundant repeats of the same state, we have a worst case on [tex2html_wrap_inline1280] where M is the number of states and N the length of the string.
To complete our pattern matcher we need code to translate our regular expressions into a finite-state machine (our network). This is a non-trivial problem in itself.
As a very simple example, we might parse an expression ``1 + 2 * 3'', determine that it is a legal arithmetic expression, and return a tree structure such as:
+ / \ 1 * / \ 2 3In this section we will give a very brief introduction to parsing. However, it is a very large subject, and so the introduction here will be somewhat superficial, and the methods presented should not be blindly adopted for more serious applications without further study of the subject.
Parsing is also important for the interpretation of ``natural'' language such as English, and for other applications such as the processing of ``marked up'' text files (SGML). We'll include some examples of parsing natural language, as the grammar involved should be more familiar.
Programming languages are often described using a particular type of grammar known as a context-free grammar. For example, the following grammar rules define legal arithmetic expressions involving * and +:
<expression> ::= <term> | <term> + <expression> (rule 1) <term> ::= <factor> | <factor> * <term> (rule 2) <factor> ::= ( <expression> ) | v (rule 3)This grammar is meant to allow expressions like `1 + (1 + 3) * 6', but NOT expressions like `1 + * 2 ('. The grammar is such that the right structure will be assigned to the expression, as well as invalid expressions rejected.
The rules above can be read as follows: ``An expression consists of a term OR a term followed by a `+' followed by an expression. A term consists of a factor OR a factor followed by a `*' followed by a term. A factor consists of an expression in brackets OR a single letter/digit.'' (`v' stands for any letter or digit). So in the notation above, `::=' can be read as `consists of' or 'is a' and `[tex2html_wrap_inline1282]' means OR. In the above grammar, '+' and '*' are examples of terminal symbols corresponding to symbols in the language described, while 'expression' and 'term' are examples of non-terminal symbols, which are internal to the grammar. One non-termal is distinguished as the start symbol (in this case `expression'). A sequence is recognised by the grammar if we can rewrite it into a sequence containing just the start symbol, by repeatedly replacing sequences matching the right hand side of a rule with sequences matching the left hand side of the rule.
[figure163]
Figure 2.2: Top Down and Bottom up Parses, for a fragment of
English
Both top-down and bottom-up parsing are based on the idea of repeatedly modifying a sequence of symbols (based on allowed rewrites in the grammar rules), with the aim of getting from a sequence consisting of just the start symbol, to our sequence to be parsed. This can be illustrated as follows:
Top Down (Sentence) apply rule 1 (NounPhrase VerbPhrase) apply rule 2 (Article Noun VerbPhrase) apply rule 4 (the Noun VerbPhrase) apply rule 5 (the dog VerbPhrase) apply rule 3 (the dog Verb) apply rule 6 (the dog jumps) SUCCEEDS Bottom up (the dog jumps) apply rule 6 (the dog Verb) apply rule 3 (the dog VerbPhrase) apply rule 5 (the Noun VerbPhrase) apply rule 4 (Article Noun VerbPhrase) apply rule 2 (NounPhrase VerbPhrase) apply rule 1 (Sentence) SUCEEDSNow, of course, the grammar in the figure is a bit limited! In fact, it can ONLY recognise the sentence 'The dog jumps'. There are no options or alternatives. In any real grammar there will be many alternative structures to explore, and that's where the difficulty in parsing arises. This is particularly a problem when parsing natural language. For artificial languages such as programming languages it is not so bad, but we still need ways of choosing between alternatives. For example, if we have the rule:
<factor> ::= ( <expression> ) | vand we are doing a top-down parse, how do we know whether to rewrite <factor> to ( <expression> ) or to v (letter/digit). For programming languages, it is often possible to do this by looking ahead one character. In the above rule, if the next character in the sequence you are parsing is '(' then the first option is correct, otherwise it is the second option. Often, in order to make this one-char lookahead work, the grammar must be modified into an equivalent, but easier to process form.
As an example, the grammar rule above could be implemented as:
Procedure ParseFactor();
begin
If GetChar(S, j) = '(' then
begin
j := j+1;
ParseExpression;
If GetChar(S, j) = ')' then
j := j+1
else
error
end
else if LetterOrDigit(GetChar(S, j)) then
j := j+1
else error()
end;
This assumes that there is some error function that will, at least, set
a flag to indicate failure. Although rather inelegant, the approach is
at least simple, and illustrates the 'one char lookahead' very explicitly
in selecting between alternatives in the relevant grammar rule.
The basic mechanism can best be illustrated using an example. (Note that this is similar to the example near the beginning of this section, but we are specifying more precisely the actions to be taken at each iteration). I'll give examples both for natural language, and for arithmetic expressions (I'll use square brackets in the arithmetic expressions as otherwise its confusing).
Stack Input Sequence () (the dog jumps) (the) (dog jumps) SHIFT word onto stack (Art) (dog jumps) REDUCE using grammar rule (Art dog) (jumps) SHIFT.. (Art Noun) (jumps) REDUCE.. (NounPhrase) (jumps) REDUCE (NounPhrase jumps) () SHIFT (NounPhrase Verb) () REDUCE (NounPhrase VerbPhrase)() REDUCE (Sentence) () SUCCESS () (2 * [ 1 + 3 ]) SHIFT (2) (* [ 1 + 3 ]) REDUCE using rule 3 (<Factor>) (* [ 1 + 3]) SHIFT (<Factor> *) ([ 1 + 3]) SHIFT (<Factor> * [) (1 + 3]) SHIFT (<Factor> * [ 1) (1 + 3]) REDUCE (2 times..) (<Factor> * [ <Term>) (+ 3 ]) SHIFT (twice) (<Factor> * [ <Term> + 3) ( ]) REDUCE (3 times) (<Factor> * [ <Term> + <Expression>) ( ]) REDUCE using rule 1 (<Factor> * [ <Expression>) ( ]) SHIFT (<Factor> * [ <Expression> ]) () REDUCE using rule 3 (<Factor> * <Factor>) () REDUCE using rule 2 (<Factor> * <Term>) () REDUCE using rule 2 (<Term>) () REDUCE using rule 1 (<Expression>) () SUCCESSIn the second example, there were many cases where it was unclear whether to shift or reduce. This is a complicated issue. We can build shift-reduce parsers (for certain types of grammar) that use some look-ahead to determine whether to shift or reduce, in a similar manner to that outlined in the example for top down parsers. However, we'll leave it as an important issue to be discussed in further courses.
Note that now we have got part way towards writing a simple LISP interpreter. A Lisp interpreter takes an input string such as '(first (quote (a b c)))', constructs a generalised list representation of it, then will evaluate the expression by applying a function corresponding to the first item in the list to the results of evaluating the remaining items in the list. (The 'quote' function suppresses evaluation, so the list is treated literally).
As a simple example of how we might compress a file, suppose we replace every occurence of the word 'the' with the symbol '#' (and deal with actual occurences of '#' in some way!). The resulting file would take up less space, but it would be easy to recreate the original. In general we can try and replace common sequences with shorter 'codes' for those sequences.
There are four broad classes of compression algorithm, each good for different types of data. These will be discussed in turn.
aaaaabbbbbbbbbccccccould be represented more concisely as:
5a 9b 5cFor binary files we can be even more concise:
11111111111000000011111111111111could be represented simply as
11 8 14as we can assume that we are alternating between 1s and 0s. Of course, numbers take up more space (bits) than 1s/0s to represent, but we can still get good savings if we have quite long runs of 1s or 0s (as may be common in some simple image formats, for example).
Run-length encoding isn't much good for text files, as repeated sequences aren't that common. However, it is quite useful for images. Many standard image-file formats are essentially compressed versions of the simple bitmap representation -- run-length encoding is one compression method used.
One problem with this is that we have to include spaces between the codes somehow, in order to tell where one code begins and another ends. If we had an encoding such as 101 we wouldnt know whether this represented 'h' or 'te'. One way to avoid this problem is if we avoid using two codes such that one code consists of the first few characters of another (e.g., 101 10).
An acceptable set of codes can be represented as leaves in a binary tree:
[tex2html_wrap1290]
A code for a letter corresponds to the path from the root to the leaf labelled with the letter. So, we have T: 00, H: 010, A: 011, E: 1. If we have an encoding such as '000101' we can uniquely determine what it represents and translate back to the uncompressed form. The decoding can even be done efficiently, using the binary tree as a datastructure to traverse to find the decoding of a given bit sequence.
To actually construct an optimal tree, we can use data on the frequencies of difference characters. A general method for finding optimal codes was developed by D. Huffman, and the coding scheme is referred to as Huffman encoding.
The method involves building up a binary tree, bottom up, storing in each node the total frequency count for the characters below that node. Lets say we're only dealing with the letters t, a, e and h, and the frequencies are 10, 5, 15 and 3. We start off by creating leaf nodes for each of these:
t:10 a:5 e:15 h:3Then we pick the two with lowest frequency, and combine them, creating a new node with a combined frequency count:
*:8 / \ h:3 a:5 e:15 t:10We then repeat this, but now the nodes we consider are the new one, e and t:
*:18 / \ *:8 \ / \ \ h:3 a:5 t:10 e:15and finally:
*:33 / \ *:18 \ / \ \ *:8 \ \ / \ \ \ h:3 a:5 t:10 e:15We now have an efficient encoding scheme where the most common letter 'e' has a single bit code (1), the next most common 't' has a 2 bit code,
The code to construct such a tree, and use it in encoding/decoding is fairly simple, and described in Sedgewick, ch 22.
Huffman encoding is worth doing for text files, though you won't get any very dramatic compression (maybe 20-30%). It may also be useful for image files, where (for example) a small range of colours may be much more common than others.
The LZ78 family of compressors
LZ78-based schemes work by entering phrases into a *dictionary* and then, when a repeat occurrence of that particular phrase is found, outputting the dictionary index instead of the phrase. The dictionary is constructed as the string is traversed. New strings are only added if they extend a current string in the dictionary. By this means, you will tend only to get long strings in the dictionary if they occur frequently in the text.
The LZ77 family of compressors
LZ77-based schemes keep track of the last n bytes of data seen, and when a phrase is encountered that has already been seen, they output a pair of values corresponding to the position of the phrase in the previously-seen buffer of data, and the length of the phrase.So, we might replace the string 'the cat ate the dog' with 'the cat ate (1,4)dog'.
Substitutional compressors are good for strings with frequently reoccuring subsequences. So, they are useful for text files, but also for graphics files where a certain texture or colour may be represented by a repeated sequence. Many popular compression and archive programs (e.g., zip, gzip, compress) are based primarily on substitututional methods.
JPEG allows you to vary the degree of ``lossiness''. If you want a very small file size, but can put up with a slightly poor quality image, you can set parameters appropriately.
JPEG is jolly complicated. However, there is one key idea that is worth explaining. The basic idea is that you transform (bits of) the image into frequency space by applying a kind of fourier transform. Then you throw away some of the details of the high (spatial) frequencies. The result of this is that gradual changes are kept, but fast or abrupt changes may be lost. (If you've never heard of fourier transforms this probably won't make much sense, but it would take to long to explain).
As an illustration, here's a Piglet before and after its been compressed with JPEG with quality set very low:
[tex2html_wrap1292]
[tex2html_wrap1294]
The size of the file has been reduced by a factor of 20, but the image quality is poor. With higher quality settings any decrease in quality may be barely perceivable by human eye, yet the image size may still be reduced by a factor of 5 or so.
MPEG is a lossy compression algorithm for video. Very roughly, it uses JPEG like compression for a frame, but does uses difference information between frames to reduce the information needed for each frame. (Two successive frames may be virtually identical).
Any lossy compression algorithm is useless for text. It's no good saving your essay, only to find your compression program has removed all the 'details'! However, typically lossy algorithms can give greater compression than lossless methods, and for video, audio or graphics the result may not be noticeably poorer than the original.
In practice many compression programs combine aspects of each approach. For example, Huffman encoding can be used in combination with any of the other techniques, achieving further compression.
Play with an image processing program such as XV, looking at the resulting size of some different image formats (GIF, JPEG, Sun Raster etc). Again, try both with artifificial and natural images.
We'll briefly discuss a number of encryption methods, from the simplest, to an outline of current techniques. We'll be focusing on the encoding methods (cryptography) rather than methods of trying to break the codes (cryptanalysis).
The aim is to develop encryption methods such that someone can't work out the message given the code (cyphertext) without knowing the key.
In practice, for many applications we might be willing to have a system where, in principle, one could decode it without the key, but it would be so very expensive it wouldnt be worth it. (Internet financial applications fit this: People are probably not going to use many computer-years of effort to obtain credit card details which might result in a few hundred pounds gain).
Encryption methods can be divided into symmetric and asymmetric systems. In symmetric systems the same key is used by both the sender and the receiver. In asymmetric systems the sender uses one key (the public) key, and the receiver uses another (the private key. Symmetric systems are simpler, so will be described first, but asymmetric methods are now more important for most applications.
message: attack code: cvvcemHowever, this is pretty easy to decode: Just try the 26 possibilities until you get one that results in a meaningful message.
A better approach is to use a key that gives a mapping from letters of the alphabet (plus spaces) to letters that should replace them:
abcdefghij.... zqjprtoisva... message: a cab code: qzpqjbut this is also pretty easy to break: we could use letter frequency information (e.g., spaces, ts and es are common) to guess at the key.
To avoid this we could use a method closer to the first, but with a repeated key to decide on the value of K.
key: abcabc message: attack code: bvwbenor a method that combines these two ideas, with an offset depending both on a repeated key and the letter itself.
If the key is as long as the message itself (and only used once) then this encyption method is provably secure. But there is the problem of distributing the key. It is only really useful in applications where a long key can be distributed ahead of time, in order to allow urgent messages to be safely transmitted at a later date.
For simple encryption methods it is often possible to decode a message if you know (or can guess) part of the message. Perhaps you know that an email message starts with a standard header (From: ... Return-Path ..). Obviously if a simple repeated key encryption method is used, with a short key, then this would allow someone to break it. Speech signals often have large areas of silence. This too may be used to find the key.
Many encryption methods involve ``generating'' a long key from a shorter key. So, rather than just repeating a short key, we have some cunning method for creating a longer key out of the shorter one, that creates a pseudo random key that is hard to break, even if some plaintext is known.
If you don't know any of the plaintext, breaking the code will be harder whatever encryption method: possible strategies are brute force methods (try all keys until one produces something with dictionary words in), and statistical methods (use knowledge of letter/word frequencies). However, these strategies will also tend to be foxed if a long pseudo key has been generated.
In a public key system, different keys are used to encrypt and decrypt. The receiver has a private key that can be used to decode any message encoded using his public key. But the public key is NOT enough to decode the message. To send a message to someone, you look up their public key (which is, as you might expect, quite public), encode your message with it, and sent it. The receiver can decode it using his private key. We can express this semi-formally as follows. If P(M) is the encoding of M using the public key, S(M) is the decoding using the private key, we want a system where S(P(M))=M (for all M), but where S cannot be (easily) derived from P.
A method that enables this is RSA public-key encryption. Here, the public and private keys are just pairs of BIG numbers. Let the public key be (N, p) and the private key (N, s). The plain text is broken up and translated into numbers (by whatever method), then encryption is just: [tex2html_wrap_inline1296], decryption [tex2html_wrap_inline1298]. For this to work we want keys such that [tex2html_wrap_inline1300] (and such that S cant be derived easily from P). It turns out that if we generate 3 random prime numbers x,y,and z, make S be the largest (say z), N the product of x and y, and P a number such that PS mod (X-1)(Y-1)=1, then we have suitable keys. (You can try this out with small primes, but really we'd use prime numbers with hundreds of digits!). Finding suitable keys (and trying to break codes!) using this system requires number theory algorithms, such as algorithms for finding greatest common divisors, large primes, prime factors etc.
Encryption and decryption using this method is relatively time consuming, compared with other methods. It is therefore common to effectively use public key encryption just to encode a session key to be passed with the message to the receiver, and to encode the message itself using simpler methods (using the session key).
Netscape (a WWW browser) uses RSA public-key encryption to encode session keys that are then used in less costly symmetric encryption algorithms. If, say, a bookstore wanted to setup so that people could securely send orders which included their credit card details and address, they would get hold of suitable public and private keys, then tell their customers their public key (or rather, incorporate the key in the electronic order form so that it is automatically used at the client end to encode messages sent with forms from the server). Now, if a client sends a message using that public key, that message can (in principle) only be read by the bookstore, so it is only the store which has the credit card details.
[figure259]
Figure 3.1: Example Graphs
Graphically we can think of a graph as a collection of points, some of which are connected. These connections can be two way or one way (illustrated by arrows, for one way connections). Graphs with one way connections are referred to as ordered. In an unordered graph, if there is a connection from vertex 1 to vertex 2, there also is from node 2 to node 1. Ordered graphs may be cyclic or acyclic depending on whether it is possible to get back to a vertex by following arrowed links. Two nodes are adjacent if an edge connects them, and a path is a sequence of adjacent vertices. Graphs may be connected or unconnected as illustrated in the figure. Vertices may be labelled as illustrated in the first example in the figure, where the labels are just integers.
We often use family tree terminology for graphs. The children of a node n1 are those nodes such that there is an edge from n1 to the child node. (These are also sometimes called the neighbours of the node, particularly when talking about general graphs rather than trees). The parents of a node is the reverse. Ancestors, siblings and descendants are what you might expect.
Formally, a graph G = (V, E) where V is a set of vertices, E is a set of edges, where each edge is a pair of vertices from V. Standish gives formal definitions for paths, cycles, adjacency, connectivity etc in terms of this formal description.
Procedure CreateGraph(var G: graph; size: integer);
{ Creates and initialises a graph of given size (no. of nodes)}
Procedure AddEdge(var G: graph; N1, N1: integer);
{ Adds edge from N1 to N2}
{Pre: N1 and N2 are nodes in graph}
Procedure RemoveEdge(var G: graph; N1, N2: integer);
{ Removes an edge}
{ Pre: There is an edge in G from N1 to N2}
Function EdgeExists(var G: graph; N1, N2: integer): boolean;
{ True if there is an edge in G from N1 to N2}
Procedure SetLabel(var G: graph; L: graphlabel; N: integer);
{ Adds a label L to node N}
{Pre: N is a node in G}
Function GetLabel(var G: graph; N: integer): Graphlabel;
{ Returns the label of a node}
{ Pre: N is node of G}
Function Neighbours(var G: graph; N: integer): EdgeList;
{Returns a linear list containing all the neighbours of N}
{Pre: N is a node in G}
(This assumes type definitions for Graphlabel and EdgeList; GraphLabel
could be a fixed length string, while EdgeList could be a fairly standard
one way linear list or stream).
Other operations might be useful, but this provides a reasonable minimal set.
The first graph in the figure would be represented as follows:
1 2 3 4 5 1 F T F T F 2 F F T F F 3 F F F F F 4 F F T F T 5 F F F F FThe details of this are left as an exercise. In fact, exercise 2. All the ADT operations specified above can be straightforwardly implemented using this representation.
For the same graph as the one illustrated above this would give:
1 (2 4) 2 (3) 3 () 4 (3 5) 5 ()Now comes the question of how to represent the list of nodes. One way is to use linked lists. The graph would involve an array, where each element of that array is a linked list. Some of the graph operations are a little more tricky using this representation. However if you have a linked list ADT defined, you can just 'use' this, rather than redefining basic operations - In pascal unit terms you would have one unit using another (more primitive) unit.) You can actually go further in re-using existing ADTs, as the table ADT gives much what is required.
You might think that an easier method would be to use Pascal's set datatype rather than a linked list. But, in fact, Pascal sets are implemented as bit-arrays (e.g., 1000100, for a set containing item 1 and item 5) so this in fact reduces back to the previous representation.
[figure292]
Figure 3.2: Depth first and depth first search
For trees, there is a very simple algorithm for depth first search, which uses a stack of nodes:
push(start-node, stack); {(assumes stack initially empty..)}
repeat
current-node := pop(stack);
visit(current-node);
for {each node n in Neighbours(graph, current-node)} do
push(n, stack);
until empty(stack)
Pseudocode has been used to describe putting all neighbours of the node
on the stack; you'd actually need to either use list traversal of neighbours
or check through all possible nodes and add them if they are a neighbour.
'Visit' is assumed to be some procedure that processes each node. If, rather
than processing a node, you want to find a given node, the code would be
the same, but would exit when the current-node is the one you're looking
for.
We can see how this will work for the tree in the figure above. (The trace below shows 'stack' on entering the body of the loop each time, and 'current node' after it has been popped off stack).
Stack Current-node Neighbours (1) 1 2,3,4 (2 3 4) 2 5,6 (5 6 3 4) 5 none (6 3 4) 6 none (3 4) 3 7 (7 4) 7 none (4) 4 8,9 (8 9) 8 none (9) 9 none ()Note that the order of current-nodes is correct for depth first.
The algorithm for breadth first is exactly the same BUT we use a queue rather than a stack of nodes: put the neighbours on the BACK of the queue, but remove current-node from the front. I'll leave it as an exercise for the reader to verify that this leads to the right search. Depth first search also has a simple recursive version of the algorithm, but using a stack makes it more explicit what is going on.
Note that for any of these algorithms, it will only result in nodes in a connected region of the graph being traversed. To traverse the whole of a not fully connected graph you would have to try different starting nodes.
push(start-node, stack);
visited := [];
repeat
current-node := pop(stack);
if not (current-node in visited) then begin
visit(current-node);
visited := visited + [current-node];
for {each node n in Neighbours(graph, current-node)} do
if not (n in visited) then
push(n, stack)
end
until empty(stack)
It turns out that there is a simple trick that can be used to find the path. Whenever we push a node onto the stack, we also make a record of that node's parent (ie, the current-node). As we only push nodes on the stack if they aren't already visited, this will result in a single parent being recorded for each node (so we can just hold the parent record in a 1-d node array). Once the target has been found, we can find its parent, its parent's parent, that node's parent, and so on to the start node. That sequence will be a path from start to target.
This is illustrated in the modification of the algorithm given below. A 'writepath' procedure would also be required to write out the resulting path, given path array, start node and target node.
push(start-node, stack);
visited := [];
repeat
current-node := pop(stack);
if not (current-node in visited) then begin
visit(current-node);
visited := visited + [current-node];
for {each node n in Neighbours(graph, current-node)} do
if not (n in visited) then begin
push(n, stack);
parent[n] := current-node
end
end
until empty(stack) or current-node=target;
Such a graph is referred to as a weighted graph. A weighted graph is generally implemented using an adjacency matrix: whereas before the matrix was an NxN array of booleans, now we use an NxN array of integers or reals, where the number indicates the weight of an edge (where there is no edge between two nodes this corresponds in principle with a weight of infinity, but some convention can be adopted for what to put in the matrix for such cases). The ADT definition would be much as before, but with further operations to modify and access weights.
Given a weighted graph, we may want to find the shortest path between two nodes, where the 'length' of the path is just defined as the sum of the weights on the relevant edges. This might be used to, say: find the shortest route between cities; the quickest train journey; the cheapest way to do a sequence of tasks.
If the weights on every edge are the same (e.g., a weight of 1) then we already have an algorithm that will do the trick: Breadth first search. This always explores paths of length N before paths of length N+1, and so will find the shortest path first.
For arbitrary weights, things are a little more complex (at least, if we want to do the task reasonably efficiently). It is necessary to keep track, as we search, of the shortest distance we have found so far connecting the start node to each other node. This information can be kept in an array (say, called shortest). Then, as we search, given a number of possible nodes to explore, the one that is the shortest distance from the start node is explored first. (This is like breadth first, but it is kind of 'shortest' first). If we do this we find that when we find the target node, the path we have effectively traversed is guaranteed to be the shortest.
I'll go through how this works with an example (modified from Standish's example). Suppose we are searching the graph in figure ?, in order to find the shortest path from node 1 to node 5. We'll use a set W to denote the nodes so far examined, and I'll give entries in the wet as Node:PathWeight, so initially :
[figure319]
Figure 3.3: Example weighted graph
W=[1:0].
We then look at the nodes connected to 1, and record the shortest distance from node 1 to these nodes, ie:
shortest[2] = 3; shortest[6]=5
The one with the shortest distance is added to W, and the 'shortest' array updated to also hold information about the nodes connected to 2:
W = [1:0,2:3]; shortest[3]=10
Now, the node with the shortest distance, not already in W is 6, so this is added, and shortest updated as before:
W = [1:0,2:3,6:5]; shortest[4]=7
note that we will also have discovered another path to node 3 via 6, but the distance is 13, so more than shortest[3]=10 (added earlier). So this path is ignored.
Now the best node not in W is 4:
W= [1:0,2:3,6:5,4:7]; shortest[5]=7+6=13
Best is now node 3:
W = [1:0,2:3,6:5,4:7,3:10]; shortest[5]=11 (overwriting old value)
Best is now node 5, with shortest path length 11. If that's the node we were looking for, we now know its shortest path from node 1. Or we could continue to find the shortest paths for all the nodes. With a small extension to the approach, we could have recorded the path taken, which could be returned.
The actual algorithm for this is as follows. T[i,j] gives the weight between two nodes; NodeSet is the set of all nodes; Max is the largest number that we need consider (ie, same as using infinity). I've only elaborated the last part of the loop: more details on the rest of the stages are in Standish, if it is not obvious.
{Initialise shortest array so shortest[i]=MAX if i<>start, else 0.}
W := [start-node];
V := NodeSet - W;
while W<>V do begin
{find the vertex w in V-W at minimum distance from start-node,
ie, where shortest[w] is lowest}
{add w to W}
{Update the shortest distances to vertices in V-W}
for {each u in V-W} do
Shortest[u] := min(shortest[u], shortest[w]+T[w,u])
end;
It is possible to prove that this algorithm (known as Dijkstra's
algorithm) works correctly. Essentially you use induction to show
that if the shortest distances are correct for all the nodes in W at one
stage, they are still correct after you add a new node to W as specified.
The same problem arises in spreadsheets. To recalculate cells after some of the values are changed, it is necessary to consider how one cell's value depends on the value in another, e.g.,
Cell Contents Value 1 100 100 2 (Cell 1) + 10 110 3 (Cell 1) * 2 200 4 (Cell 2) + (Cell 3) 310In the above example, if you change the value in cell 1, you need to first recalculate the values in cell 2 and 3, and only when that is done can you recalculate the value in cell 4.
Both these problems are essentially equivalent. The data of both can be represented in a directed graph (see fig ?). In the first each node is a module; in the second example each node is a spreadsheet cell. Directed edges occur when one node depends on the other, because of prerequisite relationships among courses or dependencies among nodes. The problem in both is to find an acceptable ordering of the nodes satisfying the dependencies. This is referred to as a topological ordering.
[figure331]
Node Dependent Nodes 1 none 2 1 3 1 4 2,3You would first remove any node that has NO dependencies, as an acceptable first node. That node could then be removed from the dependency lists of the other nodes:
Node order: 1 Node Dependent Nodes 2 none 3 none 4 2,3The process then repeats, with either node 2 or 3 being chosen. Let's say node 2.
Node Order: 1,2 Node Dependent Nodes 3 none 4 3This continues in the obvious manner, with the final order being 1,2,3,4.
In fact, to implement this we don't even need to store lists of dependent nodes, just the NUMBER of dependent nodes. So you start by finding the number of dependent nodes, for each node. Then when a node (with zero dependent nodes) is removed, you decrement the number associate with each of the dependent nodes:
ie,
Node Dependent Nodes 1 none 2 1 3 1 4 2 remove 1, then 2 and 3 have 0 dependent. remove 2, then 4 has 1 dependent remove 3, then 4 has 0 dependent remove 4To implement this, we can use a list datastructure L to return the topologically sorted nodes, and a queue data structure Q to hold the nodes will zero dependent nodes waiting to be processed. When a node N is removed from Q and added to L, all of N's dependent naighbours have their number decremented, and any that are then zero are added to Q. Something like:
Q L 1 1 2 3 1 2 3 1 2 3 4 1 2 3 4
What key ideas should you remember from all this? First, you may, when solving some computational problem, be able to spot that the data can be best represented as a graph, and that solving the problem involves analysing the graph in some way. You can then look up graph algorithms in a major textbook (like Sedgewick) to find out if there is a standard algorithm that fits. Second, the very basic algorithms mentioned here are part of the basic `vocabulary' of computer scientists, so might be referred to when people describe their system or method. A basic familiarity with the algorithms should help in understanding such descriptions.
Geometric problems are often easy for humans, but hard to find good algorithmic solutions. As an example, given a polygon defined by its vertices, it is easy for humans to: decide whether a given point falls within the polygon; find the smallest convex polygon (or convex hull) that encloses the vertices (see figure ?).
[figure339]
Figure 4.1: Polygon with point inside (A) and outside (B), and
'convex hull'
Finding out whether a given point falls within a specified polygon is obviously essential for drawing packages - for example, most of them will allow you to create an arbitrary polygon and have it 'filled' with a given colour. Finding the 'convex hull' surrounding a set of points could be useful when working out the shortest boundary enclosing some set of objects (e.g., a fence around a number of trees..).
In this course I'll only give a very brief introduction to geometric algorithms, roughly corresponding to chapter 24 in Sedgewick. You should, at least, be aware of this category of algorithms, so you know where to look things up if you need such a thing in your programming and problem solving.
type point = record x, y: integer end; line = record p1, p2: point end; polygon = array[0..Max] of point.We will require that the end point in a polygon will be the same as the first point, so it is possible to know where the polygon array ends.
\ / vs \ / \ / \/ \ / /\ \ /One way is basically to use school maths (y=mx+c etc) to find the equations, and then intersection points of the lines, and then find out if these intersection points lie 'within' each line.. Another approach, given in Sedgewick, which is introduces ideas useful in other algorithms is as follows.
First define a function that, given three points, tells you whether you have to turn clockwise or counter clockwise when travelling from the first to the second to the third. Call this CCW: it will return TRUE if it is counter clockwise. Now, two lines l1 and l2 will intersect if [tex2html_wrap_inline1312]AND [tex2html_wrap_inline1314]. Basically, this just means that both endpoints of each line are on different 'sides' of the other line.
The CCW function can be defined fairly straightforwardly, by calculating the gradient between point 1 and point2, point 2 and point 3, and checking if one gradient is larger than the other. (The implementation must cope with the fact that the slopes may be infinite, or that they are equal, but that's just a minor fiddle - see Sedgewick).
The following gives what might be an initial attempt at both ccw and intersect. I'll leave it as an exercise for the reader to work out how to make it deal with infinite slopes (ie, dx=0), and the case where two lines are collinear.
function ccw(p1, p2, p3: point): boolean;
{Slightly deficient function to determine if the two lines p1, p2 and
p2, p3 turn in counter clockwise direction}
var dx1, dx2, dy1, dy2: integer;
begin
dx1 := p2.x - p1.x; dy1 := p2.y - p1.y;
dx2 := p3.x - p2.x; dy2 := p3.y - p2.y;
CCW := (dy1/dx1 < dy2/dx2)
end;
function intersect(l1, l2: line): boolean;
begin
intersect :=
((ccw(l1.p1, l1.p2, l2.p1) <> ccw(l1.p1, l1.p2, l2.p2))
and (ccw(l2.p1, l2.p2, l1.p1) <> ccw(l2.p1, l2.p2, l1.p2)))
end;
[figure348]
Figure 4.2: Determining if point is included in polygon
This is slightly complicated by the fact that the line might intersect a vertex of the polygon. If so, it may sometimes be counted as two counts and sometimes as one, as illustrated by the two such examples in the figure. Without this complication, the algorithm to determine inclusion is the following (which ignores such points). 'n' is the number of points in the polygon.
function inside(p: point; poly: polygon; n: integer): boolean;
var i, j, count: integer;
lt, lp: line;
begin
count := 0; j := 0;
lt.p1 := p; lt.p2 := p; lt.p2.x := MAXINT; {Defines horizontal line
out from p}
for i := 0 to n do begin
lp.p1 := p[i]; lp.p2 := p[i+1]; {Create line for segment in
polygon}
if intersect(lp, lt) then count := count +1;
end;
inside := (count mod 2 <> 0)
end;
To handle the case where our line intersects a vertex, we need to check
whether the two points in the polygon either side of the vertex are on
the same 'side' of the test line. (Convince yourself that this is right).
Implementation details are left to the reader, or to look up in Sedgewick.
[figure356]
Figure 4.3: Convex hulls
A function for finding a convex hull takes an array of points, and returns a polygon (another array of points) (a subset of the points in the first array). Sometimes it may be easier to modify the original point array, rather than returning a new one.
The most natural algorithm for finding the convex hull is referred to as package wrapping. Starting with some point guaranteed to be on the hull (say, the one with smallest y coordinate), take a horizontal line in positive direction and sweep it up until it hits a point. This point is guaranteed to be on the hull. Record this point, anchor the line at this point, and continue sweeping until another point is hit. Continue in this way until the `package' is fully `wrapped', as illustrated.
[figure360]
Figure 4.4: `Package Wrapping' a Convex Hull
So, how do we find the next point on the hull, assuming we have a suitable line which is to be `wrapped' round to this point? Basically, the next point is the one for which the angle between the current line, and the line from the last point on the hull to the point considered, is a minimum. (This angle is labelled theta ([tex2html_wrap_inline1318]) on the diagram). If we have a function theta, that takes two points and returns the angle between the associate line and the horizontal, then this results in the following algorithm:
procedure wrap(var points: polygon; n: integer);
var i, min, m: integer;
th, v: real; {v will be current 'sweep' angle;}
begin {th will keep track of minimum angle}
min := 0;
for i := 1 to n-1 do
if p[i].y < p[min].y then min := i; {find starting point}
th := 0.0; p[n] := p[min]; {initialise 'th' and p[n]}
m := 0; finished := false;
while m < n and not finished begin {we'll be looking at the m'th point on hull}
swap(points, m, min); {Move the next point into the M'th position;}
and vice versa}
min := n; v := th; th := 360; {set sweep angle, initialise min & th}
for i:= m+1 to n do
if (theta(p[m],p[i]) > v) and (theta(p[m], p[i]) < th) then begin
{if the angle to this point is > current sweep angle, and is
the lowest such angle come across, then set is as a new min}
min := i; th := theta(p[m], p[i])
end;
if min=N then finished := true;
m := m+1;
end;
end;
Some steps of the above may not be obvious. As points on the hull are found,
they are swapped with an actual point in the array, so at any point the
array contains the first m points of the hull plus the rest of the points.
So that the algorithm can exit when the hull is complete, the starting
point in the hull is placed in the N'th position in the array. When the
current point on the hull just found is that one, then exit.
The algorithm should look vaguely like the sort algorithms which you should have already met. This is not surprising, as we are basically sorting a bunch of points according to their position around the hull. Like other simplistic sort procedures (selection sort), this algorithm doesnt have very good worst case complexity (N*N). Other methods for finding convex hull can have N log N complexity, and have analogs in other sorting methods.
Efficiently implementing this kind of interface to a database clearly relies on geometric notions, and some knowledge of geometric algorithms may enable it to be done efficiently. (Very roughly, range searching can either be done by, say, searching all points to find those falling within range in first dimension, then searching those points to find those falling in range, and so on; or mor efficiently by storing the data in a (binary tree like) form that enables comparisons to be done very quickly.)
Anyway, that's it for geometric algorithms, and indeed pretty much for algorithms in general. We've looked at three main categories: string processing, graph algorithms and geometric algorithms. In practical terms, this should be useful if you have to solve problems that might involve finding a suitable data representation and algorithm: you might vaguely think `maybe a geometric algorithm could be used for this' and then go away and look it up in your favourite algorithm book. In examination terms of course a little more is expected: you should be able to explain and work through the algorithms mentioned, giving pseudo code (or Pascal) where this is straightforward, and be able to at least suggest how algorithms might be extended, combined, or applied to given tasks.
The basic idea of dynamic programming can be illustrated by looking at the `knapsack' problem. In this problem, a thief robbing a safe finds it filled with N items of various size and value. But his bag has only limited capacity (M). How does he choose which items to take to maximise his loot's value.
As an example, perhaps there are four objects in the safe of following sizes/values: size 12, value 20; size 5 value 8; size 3 value 10; size 6 value 10. His bag's capacity is 12. He should clearly go for the first object and scrap the rest. But what if his bag has capacity 15. This time he should leave the big object, and go for the smaller ones. This problem is clearly of some practical relevance, for example in real packing (e.g., shipping) applications.
In a dynamic programming approach we do a rather non-intuitive thing. We calculate the best combination of objects for ALL knapsack sizes up to M. This can be done very efficiently, as illustrated in the following program (which assumes that we have a potentially number of items of each given size/value, these sizes/values being stored in an array).
for j:=1 to N do {Go through each item}
for i := 1 to M do begin {Consider each size knapsack}
if i >= size[j] then
if (cost[i] < cost[i-size[j]] + value[j]) then begin
cost[i] := cost[i-size[j]] + value[j];
best[i] := j
end;
Cost[i] is the highest value that can be achieved with a knapsack of capacity
i and is initialised to zero; best[i] is the last item that was added to
achieve that maximum. First we calculate the best we can do only using
objects of type 1 (j=1). Then we calculate the best considering items of
type 1 and 2 (using our result for just type 1). And so on.
The revised calculation of 'cost' when considering a new item is very simple. We can find out the value (cost) of items that we'd be able to take of the other kinds considered so far IF we removed enough to leave room for our new kind of item (cost[i-size[j]]). If that value PLUS the value of the item we're considering adding is greater than the old cost just with all the old items then replacing the old item(s) with this new one is a good idea, and so cost (and best) are updated to reflect this. (We can see parallels with shortest path algorithms).
As an example, suppose we have the following items, and are looking to fill a bag of size 4:
Item Size Value 1 3 6 2 2 5 3 1 2Initially we just consider item 1. For a bag of size i=1 and 2 the item won't fit in the bag. For a bag of size 3, we can fit one item and as cost[3]<cost[0]+value[1], that one is added. So cost[3] is now 6; best[3]=1. For a bag of size 4, we check if cost[4] < cost[1]+value[1] - this is true, so for this size too we have item 1 in the bag, and cost[4]=6; best[4]=1.
Now we consider item 2. For bag size 1, it won't fit. For bag size 2 we check whether cost[2] < cost[0] + value[2]. This is clearly the case (as cost[2] is zero, as item 1 wouldnt fit in the bag) so we make that the `last' (and only) item in this bag, and cost[2]=5. However, for size 3 we could fit a type 1 object - so when we check if cost[3] < cost[1]+value[2] the answer is no. For size 4 we check if: cost[4] < cost[2]+value[2]. Now cost[4] is 6, cost[2] has just been calculated as 5, so the check succeeds, which means it is worth throwing out the previous contents of bag 4, and replacing the top item with item 2 (and assuming the rest of teh bag is te same as what is `best' for bags of size 2). So now we have: cost[0]=0; cost[1]=0; cost[2]=5; cost[3]=6; cost[4]=10.
Now, considering item 3. Item 3 fits in a bag of size 1, so cost[1]=2. It is not worth using for bags of size 2, but for a bag of size 3 we have: cost[3] < cost[2]+value[3], so the top item in this bag is item 3 (and we assume the rest is the best we can get for a bag of size 2). For a bag of size 4, this is not worth doing, so at the end of the day we have: cost[0]=0; cost[1]=2; cost[2]=5; cost[3]=7; cost[4]=10. We also have best[4]=2; best[3]=1; best[2]=2; best[1]=3.
It turns out that when the calculation is complete we can find out the contents of the knapsack using the 'best' array. Suppose best[M] is item k, of size size[k]. Then the next item in the bag must be the last item in a bag of size[M-size[k]]. And so on. So, in the above case, best[4]=2, so we have an item 2 in the bag. best[4-size[2]]= best[2]=2, so we have another item of type 2 in the bag.
All this can be proved correct by induction. ie, we assume that it gives the right result for N items, it gives the right result for N+1 items. Then we just check it is correct for 1 item. We check, for N items, that if it gives right result for bag of size M it gives right result for bag of size M+1. I'll leave that as an exercise for the reader.
This approach to the knapsack problem takes NxM steps, which isn't bad. Try thinking of other solutions to the problem. However, the approach does have a limitation - it only works for bags whose size can be represented as an integer (or where this is an acceptable approximation). The more fine-grained a result we want, in terms of bag sizes, the more partial solutions we need calculate and store for smaller sizes. For dynamic programming it is necessary to think hard about the time and space requirements of the approach. However, the general idea, of solving for simpler cases and then using these results to solve the harder case, is clearly a very widely applicable technique, and it should at least be viewed as a possible problem solving technique to be used for all sorts of hard problems.
Turbo Pascal and C++ support both imperative and object-oriented programming. You can write C++ programs which don't exploit its object oriented features, however very often there will be a cleaner solution that uses objects.
This simple feature introduces a great deal of flexibility, and more importantly makes it relatively easy to re-use code in different applications, where minor variations of a basic approach are required. Libraries of generally useful datatypes (objects) are made available, but the programmer can easily define special customised versions with slightly different behaviour.
The key concepts in object oriented programming are encapsulation and inheritance. Encapsulation has been met before for ADTs - the internals of the data representation should be hidden, and all communication with the object should be via the specified methods. Inheritance is where methods and data of parent (or ancestor) classes are used by instances of a given class.
Suppose we wanted to write a simple drawing program that could (among other things) display different shapes on the screen. The shapes to handle are circles and rectangles. They each have a position and size, though this will be defined differently for the different shapes. A circle will be defined by center point and radius, rectangle by a center point and the length and width. Shapes also have various characteristics such as colour.
Useful methods for these objects are: Initialise; Move; SetColour; Draw. Initialise should set the position and size, while Move, SetColour and Draw should do the obvious. It should be clear that Initialise and Draw will be different for circles and rectangles, while SetColour and Move could be defined in a shape-independent way.
The first thing to do in object oriented programming is decide on a class hierarchy - that is, a hierarchy of kinds of objects, from very general ones to very specific kinds. In this simple example we can have a general class of 'shapes' and more specific classes for 'circles' and 'rectangles' (a realistic program would of course have many more such classes). A particular circle with specific size, colour etc will be an instance of the circle class. We can now decide on which operations and datafields can be defined for the most general class (shape). The following gives the type definition, and an example method.
type Shape = object procedure Move(x,y: integer); procedure SetColour(col: ColourType); private xpos,ypos: integer; colour: colourtype; end; procedure Shape.SetColour(col: ColourType); begin colour := col end;Note that in the SetColour procedure 'colour' is not treated as a global variable, but as the value of the colour field of the object whose colour is being set. We can also access these fields when calling further procedures within the procedure (e.g., we could call a procedure MovePen(xposn, yposn) and the x and y position of the object in question would be accessed, without these needing to be specified as arguments to the procedure.) Note too that the position and colour of the shape are `private' and can only be accessed or modified via any defined operations.
Now, a circle and a rectangle are both kinds of shape. To specify this we put 'Shape' in brackets after 'object'. Class definitions for Circle and Rectangle could be:
type Circle = object(Shape) procedure Initialise(posn, rad: integer); procedure Draw; private radius: integer; end; Rectangle = object(Shape) procedure Initialise(posn, width, length: integer); procedure Draw; private width, length: integer; end;Circle and rectangle would inherit the procedures and datafields from Shape, so it is not necessary to repeat them. I'll just assume that the relevant procedures can be straightforwardly written. Note that the class 'Shape' is not a class that you'd have instances of. It is just defined so that specific shapes like circle and rectangle can be defined much more simply. Such classes are referred to as abstract classes.
If we now had an object instance Ob of type Circle, and wanted to set its colour, we could call Ob.SetColour. It would use the SetColour procedure inherited from Shape. We can think of this as sending a message SetColour tothe object Ob.
Now, suppose we wanted to deal with a specialised kind of rectangle: say, a rounded rectangle (one with rounded corners). This can use the standard rectangle's initialise method, but will need a special draw method. It will also need an extra field to specify the radious of the roundedness, so to speak:
type RoundedRectangle = object(Rectangle) procedure Draw; override; private roundedness: integer; end;Note that we explicitly state that the new Draw procedure should override, or replace, the one that would be inherited from the standard rectangle object.
This example pretty much covers the basics of object oriented programming. The last assessed exercise involves developing the example. However, there are many subtleties that must be mentioned. These will be discussed next, and then we'll see how some of the List ADT stuff could be reworked as objects with inheritance.
link_list = object
procedure Init;
function IsEmpty: boolean;
function IsFull: boolean;
function Length: integer;
procedure Insert(el: item; var success: boolean);
{Inserts element at current position}
procedure InsertAtEnd(el: item; var success: boolean);
{Inserts element at end}
procedure AccessFirst;
procedure AccessNext;
{Advances current position}
procedure AccessLast;
procedure GetValue(var el:item; var success: boolean);
{Accesses contents of current node}
procedure Delete;
{Deletes current node}
procedure Clear;
private
start, current, previous, end: listpointer;
size: integer;
end;
procedure link_list.InsertAtEnd(el: item; var success: boolean);
{ Inserts item el as the contents of new node at end of list, and
makes the current node be this new one.
Pre: List is initialised
Post: If `start' previously nil, it now points to new element else
New element is successor of one previously pointed to be 'end'}
begin
new(node); {Create a new node}
if start=nil then begin
node^.successor := nil;
start := node; end := node
end else begin
end^.successor := node;
node^.successor := nil;
end := node;
end;
node^.contents := El;
size := size+1;
end;
Have defined a fairly general purpose linked list object, we can now define
a stack object, making the linked list its parent. Again, I'll give the
type declaration, and an example procedure.
type stack = object(link_list)
procedure Push(x: item);
procedure Pop(var x: item)
end;
procedure stack.push(x: item);
{ Pushes x onto stack
Pre: Stack is initialised
Post: First & Current point to new node containing x, which has the
previous first node as its successor}
begin
AccessFirst;
Insert(x)
end;
Note that useful stack operations like Init, IsEmpty, IsFull and the data
fields are all inherited from link_list, and further that push (and pop)
can make use of link_list operations to simplify their implementation.
Although we can achieve this sort of code-reuse using Pascal units,
it is cleaner through objects.
A queue object can be defined in a similar manner. This is left as an exercise for the reader.
In the example above, none of the linear list methods are overridden -- new ones are just added. However, it may be appropriate to redefine some of the methods. For example, GetValue might be redefined in a stack/queue implementation where the first item could be examined without it being removed. For ordered lists (mentioned below), insert might be redefined to insert a new item such as to retain the order of items in the list.
There is one (arguable) problem with the approach outlined -- that is, that all of the linear list's other methods are inherited by stacks and queues, not just the ones we really want. But if we make these methods private then the link_list is less useful in itself (we might want to define instances of linear list, and require all the operations). We could get around it by re-organising things so that link_list is an abstract class with (mainly) private methods, and, say, owl_list is another descendant.
[tex2html_wrap1324]
A C++ or Turbo Pascal programmer would normally have access to class libraries providing a range of datatypes such as those above. The programmer could then develop specialised versions of a particular datastructure by creating a new class within the hierarchy. This requires knowledge of the classes available in the library (or browsing/search tools to rapidly navigate the hierarchy), an understanding of the underlying (documented) implementation, and an understanding of inheritance.
Now, if you DID have one, then you could double click on any of the objects in the hierarchy, and get information about that object. The information would normally include the declarations for all the objects methods and data fields, with an indication of where they were inherited from (and perhaps further info, such as whether they are declared virtual methods). If the source code was available, then by selecting one of the methods you should be taken directly to the place in the source code where that method is defined.
An object browser allows you to rapidly move within the text files defining a large object oriented system. If, say, you're editing a procedure which calls the `draw' method, and want to check out/modify that method, you need to go straight to just that version of the draw method that would be used there. Lets say the procedure you're editing is ob1.display. You'd find ob1 in the graphical class hierarchy, select that, find its draw method, select that, and then be at the appropriate draw method. Simple search tools (that may allow you to search for the procedure draw's definition) are just not adequate. There may be many versions of draw, and ob1 may inherit its version from any of its ancestors.
Object browsers allow static browsing of the classes. However, many systems also provide more run-time debugging tools, showing the object instances created at run-time, their inherited methods, and their location inthe hierarchy.
[figure410]
Figure 6.1: Test program illustrating Static methods
------------ Parent Value: 3 ------------This is perhaps unexpected, as it was a child object that was used. You might hope that the `child' writeout method would be used. But no. This is because writeout is (by default) a static method. This means that which version is used at any point is fully determined at compile time. And when a procedure, say, P1 is called from within a procedure Ob.P2 the compiler determines that it is Ob.P1 that should be used at that point. In this case, within parent.display, the compiler determined that the call to writeout should really be parent.writeout. This process is sometimes referred to as early binding. The called routine is bound to the calling method at run-time.
Now if we want the method used to always that appropriate to the actual instance we can define that method to be virtual. Then the choice of which method is used will be made at run-time, based on the class of the actual instance. Declaring a method to be virtual involves modifying the declarations in the type definitions for parent and child as follows:
procedure WriteOut; virtual;(Declarations for all occurences of Writeout must be modified). If you do this, and run the program then you get the expected output:
------------ Child Value: 3 ------------In fact, for virtual methods to work another small change has to be made to the program. We have to declare the initialisation procedure to be a `constructor' and ensure that it gets run before the virtual methods are accessed. When the constructor procedure is run, some initialisation is done to ensure that the virtual methods get set up correctly. Anyway, the new `virtual method' program would be:
type parent = object constructor init(value: integer); procedure Display; procedure WriteOut; virtual; private x: integer; end; child = object(parent) procedure WriteOut; virtual; end; constructor parent.init; begin x := value end;This process of determining which method is to be called at run time is referred to as late binding.
procedure child.display;
begin
parent.display;
writeln('Some extra stuff for child objects')
end;
OR
procedure child.display;
begin
inherited display;
writeln('Some extra stuff for child objects')
end;
For this example either of these would result in the output:
------------ Child Value: 3 ------------ Some extra stuff for child objectsThe former version would be used if you specifically wanted the method associated with a particular class. The latter would be used if you just wanted to use first the method that would normally be inherited by the child (from its parents or ancestors), but then add some more stuff.
A final basic concept to introduce here is the special identifier self. As things stand, within a procedure (or function) there is no way to get hold of the actual object instance which the method is being applied to. This contrasts with non-OOP where the `object' would be passed explicitly as a parameter. Normally this doesn't matter, as we're interested in the data fields (and methods) of the object, and the method will always access the data fields and methods of the instance in question. But occasionally we DO want to have a variable whose value is the object itself. The indentifier `self' can be used in these situations. Then self.x could refer to the x datafield of the relevant instance, rather than just `x'. Or we can set a local variable to `self', as illustrated below, where it is used in a procedure to write out the elements of a simple linear list datastructure defined as an object.
procedure list.writeels; var temp: list; begin temp := self; while (not temp=nil) do begin write(temp.value); temp := temp.successor end; end;
type graphlabel = string[20]; node = 1..maxnodes; graph = object procedure CreateGraph(size: integer); procedure AddEdge(n1, n2: node); procedure RemoveEdge(n1, n2: node); function EdgeExists(n1, n2: node): boolean; private numnodes: integer; nodearray: array[node,node] of boolean; end; labelledgraph = object(graph) procedure creategraph(size: integer); procedure SetLabel(l: graphlabel; n: node); function GetLabel(n: node): graphlabel; private labels: array[node] of graphlabel; end; implementation procedure graph.CreateGraph(size: integer); var i, j: integer; begin numnodes := size; for i := 1 to size do for j := 1 to size do nodearray[i,j] := false; end; procedure labelledgraph.creategraph(size: integer); var i: integer; begin inherited creategraph(size); for i := 1 to size do labels[i] := emptylabel end;Exercise for reader: Consider how to create A) Undirected graphs and B) Weighted graphs as subclasses of graph.
Anyway, we can modify existing programs very simply to use dynamic objects, without having to make any changes in the actual object declarations:
program drawtriangles;
uses shapes;
var t1: ^triangle; {t1 is a pointer to a triangle object}
x,y: integer;
begin
win := createsimplewin;
x := 100; y := 100;
new(t1); {allocate memory for a triangle}
t1^.create(x, y, 50); {Create a triangle (note the ^)}
t1^.changecolour(red);
t1^.changelinestyle(dashed);
t1^.draw(win);
dispose(t1); {De-allocate the memory}
stopwin;
end.
If a dynamic object contains virtual methods then the object's constructor
must be called after the new procedure allocates space, but before any
method calls are made. To simplify this the `new' procedure has been modified
to allow the constructor call as a second parameter. So the following is
legal, and does the obvious thing:
new(t1, create(x,y,50))When using virtual methods you should also really declare one of the methods as a destructor. When that is invoked the virtual method stuff for that object is `switched off'. Like new, dispose can be given an extra argument to specify the destructor method:
triangle = object(shape) destructor destroy; ..... ..... dispose(t1, destroy)
type colour = (red, green, blue);
....
Typecast Result
integer(green) 1
integer('a') 97
boolean('red') False
colour(True) green
char(100) d
Turbo Pascal also permits Typecasts of record and object datatypes. However,
the results may be meaningless unless the `shapes' of the datastructures
are the same (there is a a requirement that the total size of the two datastructures
is the same, but not that the individual field sizes match - odd results
occur if they don't).
type rec1 = record
a: integer;
b: char
end;
ob2 = object
c: integer;
d: char;
end;
var r1: rec1; r2: ob2;
begin
r1.a := 1; r1.b := 'a';
r2 := ob2(r1); {Covert r1 into something of type ob2 and assign to r2}
writeln(r2.c, r2.d);
ob2(r1).c := 2; {Convert r1 into an ob2 and put 2 into field c}
rec1(r2) := r1; {Convert r2 into a rec1 and out r1 in it}
writeln(r2.c, r2.d);
end.
Typecasting can be used in this way to convert between any two objects,
of different classes. It doesn't matter if some of the fields of the object
are inherited, but it DOES matter that the resulting field sizes match
up. This is a severe restriction, but fortunately got around through the
use of dynamic objects.
type squareptr: ^square; shapeptr: ^shape; var square1: squareptr; shape1: shapeptr; begin square1^.create; shape1 := square1 end;We can also use typecasting when allocating space on the heap (using new) so that, when allocating space for a `shape' it in fact sneakily allocates some extra space so that a `square' will fit. So really now the shape pointer points to something which is a square!
The following illustrates this:
var shape1: shapeptr;
begin
new(squareptr(shape1)); {Create space for a square}
shape1^.create(100,100,100); {Shape1 is now square shaped! So create it}
shape1^.draw; {And draw the square}
end.
The typecast operation means that first shape1 is converted to a
square pointer, and then the appropriate space on the heap is allocated
for the square.
Alternative valid forms of the first two statements above include:
shape1 := new(squareptr); shape1^.create; and new(squareptr(shape1), create(100,100,100));This new capability of using a general shape pointer to store squares is jolly handy. For example, we could let the user choose the shape to display, and then dynamically create the appropriate datastructure to handle it:
var s: shapeptr;
choice: char;
begin
write('Choose a shape: ');
readln(choice);
if choice = 's' then new(squareptr(s))
else if choice = 'c' then new(circleptr(s))
else if choice = 'r' then new(rectangleptr(s));
....
s^.draw;
this is developed a bit more in Standish (but ignore what he says about
handles, which is specific for a different version of Pascal which uses
objects).
Note: In order for the right method to be used when a variable may point to subclasses of the defined type, it is necessary for the methods in question to be dynamic methods. At compile time there is no way of knowing in examples like the above whether 's' is a square or a circle or a triangle, so the right method cannot be determined.
genlist = object ... atom = object(genlist) ... private contents: atomtype; end; list = object(genlist) ... private first: ^genlist; rest: ^genlist;
A special purpose design tool might make more, um, special purpose `shapes' available. For example, a tool for office layout might have built in classes for desk, filing cabinet and so on. The user would be able to manipulate their attributes (size, colour) but would have available standard methods for displaying, moving etc the objects.
Generally the interface objects themselves are defined as object classes. This provides advantages for the developer of the interface toolkit, as inheritance can be exploited both to simply the programming and also (as an added extra) encourage some uniformity between different user interface objects. Typically user interface objects such as buttons and check boxes (as illustrated above) are subclasses of the class of `dialog item'. All dialog items share the code concerning how they are positioned in a window, sized, have colour set, and so on. A particular dialog item (e.g., button) will have additional data fields (e.g., the procedure to call) and special display/draw methods.
[tex2html_wrap1326]
Dialog items themselves are often a subclass of a a more general class, which might include (among its subclasses) the class of windows. A window, like buttons and things, can be positioned, has a particular text font, and so on. It also needs special code to allow windows to be opened and closed, to be brought to the front, iconified, etc. A particular toolkit might also provide special purpose subclasses for, say, a text window or a graphics window.
Although these different user interface objects are defined as classes, using inheritance, a naive user of a given toolkit may only be vaguely of the class hierarchy, just selecting options from a menu/pallette to create their application interface. However, a more sophisticated applications programmer would probably want to get under the surface a bit, and be able to create special purpose subclasses with slightly different behaviour to the that provided in the toolkit/library. The underlying OO implementation of the interface objects should make that easy.
Here's an example C++ class, defining an array-based stack:
class Stack
{
private:
itemtype *stack {The Stack data field is a pointer to
something of type itemtype}
int p; {p is an integer indicating position in stack}
public:
stack(int max=100)
{ stack = new itemType[max]; p=0}
{stack is the constructor proc. It allocates
a new array, and sets the position to 0}
~stack()
{ delete stack;} {~stack is the destructor, and deletes it}
void push(itemtype v)
{ stack[p++] = v} { push increments p, and sets the array item to v}
itemtype pop()
{ return stack[p--] } { pop decrements p and returns the array el}
};
It should be reasonably clear that this is pretty much the same as the
equivalent Pascal version, just with different syntax.
As a contrast, I'll give an example of OOP in CLOS (Common Lisp Object System).
(defclass square(shape)
((size :initform 100)
(name :initform ``A Square'')))
(defmethod printshape(s square) {a print procedure for squares}
(print (slotvalue s 'name)))
The syntax here is very different, but we still define square a subclass
of shape, give it various data fields (with initial values), and define
methods that are used for instances of a given class.
Multiple inheritance is a mixed blessing, as there are often conflicts concerning which version of a procedure to inherit. Suppose X is a subclass of both Y and Z. Both Y and Z have different `display' procedures. Which one should be used by X? Or maybe Y inherits its method from another class Q. One way to determine which class to inherit from is to use topological sorting described earlier in the course. Suppose we have the following class hierarchy:
S / \ Q R | | Y Z \ / XA topological sort of these nodes would give: X, Y, Q, Z, R, S. Or equally well: X, Z, R, Y, Q, S or X,Y,Z,Q,R, S. There has to be some mechanism to choose between these alternatives, and the details depend on the particular OO language. One language uses the following: Each subclass must give parent classes in a precedence order (e.g., X might have Y before Z). Use this order to decide between alternatives. If this isnt enough, use the following rule: select the class which has a subclass rightmost in the sorted list so far. These two rules would result in the first of the alternatives above.
This sorted list is called the class precedence list. If the system is deciding which version of a method (or data field) to use, it will use the one associated with the first class it comes across in this precedence list that has the relevant method/field. Sp, if Q, Z and S all had a display method, X would inherit Q's version.
OOP has the following advantages over conventional approaches:
Objects are normally accessed via pointers, with memory allocated dynamically. Using this approach, a variable of a given (pointer) type can contain pointers to subclasses of the declared type. Explicit typecasting may be needed to allow this. Dynamic memory allocation also means that space can be allocated and deallocated in a flexible way. (You could have memory allocated and de-allocated for an object by declaring the object variable locally within a procedure - when that procedure exited the memory would be de-allocated. But this is less flexible).
Object oriented programming is generally considered good for software re-use and maintenance, as objects with clearly defined interfaces can be used and modified with relatively little effort. It has been particularly widely used in creating user interface libraries and toolkits. Although there are some time penalties in OOP, these are becoming less significant as machine speed increases.
However, to navigate round a large object-oriented program requires the use of good object browsers. It is very difficult to follow a large OO program simply by inspecting the code.