GUI Fest Challenges

Explode

(Colin Runciman)

Explode is a game for two or more players, in which each player has an inexhaustible supply of stones of their own distinctive colour. The game is played by placing the stones on the nodes of a finite connected graph, according to the rules below (in which we say a node is `full' when it holds as many stones as it has incident arcs).

Initially there are no stones on the graph. Players take it in turns to make a move. Each move increases the number of stones on the graph by one. A player takes a stone from their supply and places it on any node n not already containing stones of another colour. If this does not make n full, the move finishes. If it does make n full, then n `explodes': each adjacent node is invaded by one of the stones. The colour of any stones already in the invaded nodes turns to that of the invaders. Any of the invaded nodes that is now full also explodes in turn, and so on until either this move has caused every node in the graph to explode (and the player who made the move wins) or there are no more full nodes (and it is the next player's turn to move).

Explode is often played on a rectangular board divided into squares. The squares are the nodes, and there is an arc between squares with a common edge (ie., assuming at least a 3x3 board, corner squares have two arcs, other edge squares three, and non-edge squares four).

An `explode program' should provide a purely graphical interface to a rectangular board, depicted on the screen, with a height and width in squares set when the program is run. The program should support a contest between two players, displaying the effect of each move.

Combination Lock

(Sigbjørn Finne & Simon Peyton Jones)

Take an existing application and replace one of its buttons with a "combination lock"; that is, a combination of buttons which you have to press in the right order to complete one "press" of the button you are replacing.

Viewports

(Sigbjørn Finne & Simon Peyton Jones)

Enclose your entire Explode game in a viewport, so that you can deal with a board that doesn't fit on the screen. Did you have to change anything about the description of the Explode game?

Sliders

(Sigbjørn Finne & Simon Peyton Jones)

Change the "finger" in a slider (the bit you drag) so that it displays in numerical form where the slider is. How easy is it to change the feedback form? Eg instead of displaying a number, change the colour of the slider; or make the number display in application-specific units (eg kbytes) rather than interface-specific units (eg percentage of full-scale).

HTML browser

(Sigbjørn Finne & Simon Peyton Jones)

A bigger task: an HTML forms processor

This challenge is to build an HTML browser for HTML documents. The documents are not required to be loaded using the HTTP protocol, but can be assumed to exist in a file. A very open-ended challenge, but an HTML browser for visualising HTML-documents is not only a Cool application with considerable Superhighway street cred., but it shows up some interesting issues from a user interface architecture point of view:

How straightforward is it to describe formatted text? The existing HTML standards contain a relatively small set of operators for structuring text and document - how hard is it to map the HTML to graphical output in your system?
Mixing of graphics and text. Although it is not necessary to handle images for a compliant HTML browser, HTML documents tend to contain a large number of images. How easy is it to extend the output model to cope with the mixing of graphics and text?
Active document. When a hypertext link is selected, how do you map from the displayed output to the link. Did you have to wire in the hypertext capabilities into the component that displays the document, or is it possible to factor out the handling of the active links? Could you reuse the resulting solution to handle the browsing of other ``data-structures'' than the HTML document,e.g. browsing of a graph or abstract syntax tree?
HTML Level
Advanced, and well out of scope, but very interesting nonetheless. In the extension of the active document, would it be possible for the document to contain code that could be executed on-the-fly? Current work in this area (Java), Dylan and Oblique) have a strong imperative feel to them - do FPLs have a story to tell here? If not, why?

This is a very open-ended challenge since HTML is an evergrowing language of whizzy features, so as to focus matters we partition the challenge into three sub-challenges (the specification is minimal, for more detail refer to reference material available via http://www.w3.org/)

Sub-Challenge 1: Core HTML

The viewer should be able to accept the whole of HTML, but only has to interpret the following components of an HTML-file:

Document structure elements
- Headers (<HEAD>...</HEAD>)
- Body (<BODY>...</BODY>)
- Title (<TITLE>...</TITLE>)
Data elements
- Line breaks (<BR>)
Character Format Elements None - no attributes for the physical or logical meaning of text has to be handled.
Hyperlink Elements
- Anchors (<A >...</A>) start or destination of hypertext link, the following anchor attributes should be handled:
  - HREF="URL" The viewer should emit the URL upon the user clicking it, i.e. not required to fetch it via the network address contained in the URL.
Block structuring elements
- Paragraph (<P>...</P>)
- Pre-formatted text (<PRE>..</PRE>)
- Address (<ADDRESS>..</ADDRESS>)
- Block-quote (<BLOCKQUOTE>...</BLOCKQUOTE>)
- Headings (<H1>...</H1> through to <H6>..</H6>) no distinction between the different header levels.
- List elements
  - Ordered lists (<OL>..</OL>)
  - Unordered list (<UL>..</UL>)
  + handle the <LI> items contained within the list elements.

Sub-challenge 2: Formatted text

In the addition to the features that should be supported by a browser from sub-challenge 1, this browser should also deal with the following:

Inline images
Character Format Elements
Citation
Code
Emphasis
Keyboard
Sample
Strong
Variable
Bold
Italic
Teletype
Headings
- handle the formatting of different header levels.
List elements
- definition list (<DL>..</DL>)
- directory list (<DIR>...</DIR>)
- menu list (<MENU>...</MENU>)

Sub-challenge 3: Forms

A browser for this sub-challenge should comply to the HTML Level 2, and handle forms that allows you to include common user interface elements into your documents. See

http://www.w3.org/hypertext/WWW/MarkUp/html-spec/L2index.html

for overview of the elements that should be supported. Minimally, the browser could just handle text-input fields and command buttons.

If you get this far and want more, add support for the <META> tag and use it to add some whizzy Netscape-like features special to your browser. Now start your own company..:-)

Game of Life

(Peter Aachten)

Challenge: 'Concurrent' interactive components with global state.

This challenge concerns a program based on the Game of Life. For those who are not familiar with the Game of Life a brief explanation can be found below.

The features of our challenge program are:

The program starts with one Game of Life window.
In each Game of Life window the user can:
- draw life cells in the window (using some pointing device),
- start the computation and rendering of generations of life,
- halt this computation,
- open a new Game of Life window,
- close itself, not quitting the program,
- close all windows, thereby quitting the program,
- open a status window:
  - it displays a list of all currently open Game of Life windows,
  - for each Game of Life window it displays whether it is computing a simulation or not,
  - by selecting one of the Game of Life windows from this list, the user can bring the window 'to the front' (whatever is appropriate given the screen management of your system),
  - it contains one special field that gives the size in which life cells are to be displayed in every window. If the user changes this field, then all current Game of Life windows render life cells in the new size (refreshing their display if necessary).
Game of Life windows run 'concurrently': i.e.: there is not a central unit in the program code that one by one computes the generations for Game of Life windows in turns.

The questions raised in this example are:

How is the dynamic behaviour of the program programmed?
How can each Game of Life instance keep track of all other Games of Life?
How achieves the program that a local change has a global effect? (Such as changing the size of all displayed life cells causes all Games of Life change display mode).
Is the method used by the program in question (3) synchronous or asynchronous? With synchronous we mean a change that has immediate effect . With asynchronous we mean that it takes various, arbitrary delays for the change to have effect in all Games of Life.
What is needed to turn this program in a distributed application in which Game of Life run in parallel on different machines?

Explanation of the Game of Life

The Game of Life is played on an infinite two dimensional space. A cell is iden tified by its Cartesian position. A cell is either alive or dead. An initial ge neration of alive cells is sown. Each following generation is computed given th e current one by two rules:

Only if an alive cell has two or three alive neighbour cells, it survives in the next generation.
Only if a dead cell has three alive neighbour cells, it becomes alive in the next generation.

Graph Editor

(Emden Gansner)

I. One class of application that capitalizes on GUIs consists of structured browsers and editors. A typical instance involves some collection of "model" data, such as abstract graphs, documents, system specifications or a program. Usually, the model data is stored in some abstract, structured representation, often with semantic connections between various parts. For example, in a programming environment, there would be a way to go from the use of a variable to the definition of the variable.

The application then allows the user to view and interact with the underlying data in a variety of ways. For example, again using a programming environment example, a piece of a program may be displayed in one view in the standard concrete syntax and edited as text. In another window, the same piece might be presented as an abstract syntax tree, drawn and edited as a tree. A third view might offer a relational database view of the code, drawn and edited as a table or form. Using any view, the user should be able to alter the underlying data and have the change reflected in the remaining views. Of course, there might be multiple views of the same type open at the same time. An additional feature that is sometimes supported allows another layer between the model data and the view. For example, there might be multiple, different layouts of an abstract graph in the plane, and then multiple windows on each layout.

As a challenge, build a simple graph editor for creating, viewing and editing directed acyclic graphs. The program should support at least two different types of concrete representations of a graph. Representations can be either textual or graphical. Changes to the graph should be propagated to all of the views.

II. Despite how easy we know it is to put together a GUI using our respective toolkits, there will always be those people who will feel that our toolkits are too complex and require too much typing. To help these people and prevent them from just using TCL/TK and getting their work done, create a prototype interface builder that supports some subset of your toolkit's components. The interface builder should allow the user to construct an interface graphically, picking components and inserting them into the layout. In addition, the user needs to be able to customize the look of the components, dynamically setting colors, fonts, labels, etc. If desired, the interface builder could be extended to handle semantic actions and component interconnections, even becoming a Visual{FP}, where FP is your favorite functional language.

This challenge is probably too broad and open-ended, but I believe it is significant, even as a thought experiment, in that it addresses a major concern in building a GUI using certain functional languages, namely the tension between the flexibility and fluidity required for tailoring interfaces easily and dynamically, and the strong typing and immutability propelled by the language.

Changes and their Costs

(Phil Gray)

My challenge has to do with changes and their costs (not a million miles from Roger's concerns). That is, if I have built my interactive system in a particular way, what will it cost me to change it? For the purposes of creating some challenges, I suggest the following application area, types of changes, and cost measures:

The Application Area

Two of our challenges have to do with simple(!) board games of a relatively modest implementation scale. I'll stick to these as well. So, you can apply the suggested changes to either an explode game or a noughts-and-crosses game as appropriate.

The Changes

Assuming you have implemented a version of either game, make one or more of the following changes:

add a player to the game
add a time-out to a player turn (e.g., after 30 seconds the player loses their turn)
[this one is actually courtesy of Simon] change the board topology. For example, make the explode board a DAG. Make the noughts-and-crosses board 3D.
Allow one or both players batch several moves together.
Offer some semantic feedback such as highlighting, as the user moves the cursor onto them, squares which would lead at least to a double explosion or would lead to a win on the next turn for the other player.

Measures of Cost

Number of lines of code which had to be changed
Number of new lines of code
Time to perform the change
Did change require modifications to UI code, application code, or both (assuming this can be determined)?

Separation between GUI and application

(Roger Took)

The problems of interactive system design that engage me are architectural ones: how is a system best modularised so that we maximise the possibility of code (or concept) reuse? Coming from an imperative, object-oriented background, principally in C++, appears to give me a powerful means of expressing modularity and incremental development. My basic challenge therefore is the same as is implicit in Goguen's paper*: demonstrate that a functional approach to the construction of interactive systems is as flexible and expressive as an object-oriented one (or more so!).

Even using object-orientation, you will be glad to hear, not everything is resolved. The fundamental problem of GUI design, it strikes me, is to support both logical and physical separation, particularly between the GUI component (which I call the `surface') and the application semantics, at the same time as providing highly granular direct manipulation of the objects on the surface. By logical separation I mean relationships of inheritance between modules or classes, typically expressed statically. By physical separation I mean relationships of instantiation between classes and objects, and messaging and containment relationships between objects, typically expressed dynamically. The problems of merging these two are especially acute over a distributed network, where the user interface may run locally on a workstation and the application on a server (maybe on the other side of the world via WWW!). In addition to this, providing separated direct manipulation is difficult since on the one hand we can regard the details of the display layout as an implementation of the more abstract application, or, on the other hand, we can regard the application as a set of concrete constraints on the behaviour of a more general display management system. Either way suggests the two should be tightly coupled or encapsulated. But then it is difficult to reuse either the application with a new interface or the GUI with a new application. There are also issues of control and input channeling: where does the interactive initiative lie? - with application objects, with surface objects, with a surface IO manager, or with the application as a monolithic whole. All sorts of synchronisation problems arise.

I am particularly interested in exploring the design space here: what are the fundamental approaches? I would be good if a functional approach could clarify the distinctions (that are still woolly to me). I would like to see examples of the following four approaches:

The surface module inherits from the application module, so that surface objects are created already knowing their application behaviour.
The application module inherits from the surface module, so that application objects are created already knowing how to display themselves.
Application and surface objects are created separately, but application objects are more active, receive input, and know which surface object(s) represent them. Surface objects, however, are ignorant of the application. Application objects manage feedback by sending messages to surface objects.
Application and surface objects are created separately, but surface objects are more active, receive input, and know which application object(s) they represent. Application objects are ignorant of the precise state of the surface. Surface objects inform application objects of any directly manipulated changes to their state, and application objects can reply by requesting other changes.

I suggest two possible instantiations of these architectures:

Noughts-and-Crosses: Two players play in turn, preferably by dragging a stock of O and X objects around a grid. The application essentially decides when the game has been won. Ideally we want a system that we can easily modify, perhaps dynamically, for example to change the number of cells on the grid, the number of players (and so the number of different tokens) etc.
A Simple Arcade Game: Animated objects fly around the screen, and the user has to manoeuvre by direct manipulation a sight or a ship in order to destroy them. This introduces the extra problem of animation and synchronisation of input with output.

I apologise if I have not adopted a functional tone in the above! It may also all be far too ambitious. I will try to bring along to the GUIFEST examples of the above written using C++ and my own display system Presenter.

Roger Took

*J. A. Goguen, "Higher-order functions considered unnecessary for higher-order programming", in "Research Topics in Functional Programming", pp. 309-352, ed. D. A. Turner, Addison-Wesley, 1990.

Stream Merging

(Ian Holyer, Pascal Serrarens)

Introduction

The idea was to create a challenge which captures a potential problem in concurrent systems: merging streams of information. Many implementations will depend on some variation of the nondeterministic merge, but will this give a consistent system?

The idea is based on the following scheme:


         ,-----------.       Meanings:
         | display 1 |
         `-----------'       display 1 : a field displaying a number; the
               |                         value can be changed by the user.
          ,----^----.
          |         |        func 1    : a button connected to a complex
          v         v                    function which takes as input the
     ,--------. ,--------.               value in display 1 and produces
     | func 1 | | func 2 |               a number
     `--------' `--------'
          |         |        func 2    : a button connected to a function
          `----&----'                    which takes the value in display 1
               |                         and immediately returns it.
               v
         ,-----------.       display 2 : a field displaying the results of
         | display 2 |                   func 1 and 2.
         `-----------'

The problem lies in merging the inputs into display 2 (marked & above).

Suppose the user first clicks on the 'func 1'-button and immediately after on the 'func 2'-button. The result of func 2 will be available very quickly, but the result of func 1 takes significantly more time to return. Merges based on availability will fail here, because with those, the result of func 2 will be displayed first, before that of func 1. Of course, users expect to see result 1 first, because they clicked on the 'func 1'-button first.

To make things just a bit more complicated, we will introduce a loop into the system:


         ,-----------.
         | display 1 |<-----.       Meanings:
         `-----------'      |
               |            |       display x : see above
          ,----^----.       |
          |         |       |       func x    : see above
          v         v       |
     ,--------. ,--------.  |       copy      : copies the value displayed in
     | func 1 | | func 2 |  |                   display2 to display 1
     `--------' `--------'  |
          |         |       |
          `----v----'       |
               |            |
               v            |
         ,-----------.   ,------.
         | display 2 |-->| copy |
         `-----------'   `------'

Besides the construction of the loop itself, this may introduce some more complications. When copy is pressed, and after that one of the func-buttons, will the old or new value of display 2 be copied into display 1?

Without concurrency, this little system can be quite hard to program neatly, though it should give correct results. With concurrency, most systems rely on nondeterminism, which makes it difficult to predict what results will be given. We would like to see a construction which is both programmed neatly and gives deterministic results.

Dialog Boxes

(Magnus Carlsson & Thomas Hallgren)

The idea is to find a set of combinators for constructing "dialog boxes", .i.e., those windows that pop up and ask the user to enter some information.

The starting point when designing a dialog box is a data type. The dialog box allows values of this type to be edited by the user.

As an example, consider a dialog box for entering a user name and a password when logging in to a computer:

+------------------------------------+
|           +---------------------+  |
| Username: |                     |  |
|           +---------------------+  |
|                                    |
|           +---------------------+  |
| Password: |                     |  |
|           +---------------------+  |
|                                    |
|  +----+               +--------+   |
|  | OK |               | Cancel |   |
|  +----+               +--------+   |
+------------------------------------+

The data type:

data Login = Login UserName Password
type UserName = String
type Password = String

From this we would like to construct a fudget (or whatever):

loginPopupF :: F Login (Maybe Login)

The input of the fudget is of type Login so that the program can set default values that will be displayed when the dialog box first pops up. The output is Nothing if the user presses the Cancel button and Just (Login name password), where name and password are the contents of the respective fields, if the user presses the Ok button.

As another example, consider the following piece from a dialog, with a radiogroup of two buttons:

|                        			    |
|	/\    Print all pages			    |
|	\/    					    |
|						    |
|						    |
|	                        +---+      +---+    |
|	/\    Print only pages	|   |  to  |   |    |
|	\/			+---+      +---+    |
|						    |

If "Print all pages" is selected, then the integer input elements are inactive. If one tries to enter something in them, the radiogroup flips over to "Print only pages".

This corresponds to the datatype

data Pages = PrintAll | PrintPages Int Int

Now, we would like to have a set of combinators by which we easily could assemble such dialogs (confer parsing combinators, for example).

Construct a set of combinators for the types Either _ _, () and (_,_), as well as a few basic types. (Can type classes be used?)
Generalise to other datatypes. Can the combinators be expressed in Haskell, or is some kind of language extension needed?

There are a number of problems that we haven't considered so far:

How is the text "Print all pages" and "Print only pages .. to .." specified?
The password input field is special in that the characters should not be echoed.
How is the layout specified?
We might have semantic restrictions on the data (such as the last page should not be smaller than the first page in the print dialog), that the types do not capture. It is an advantage if the user immediately gets feedback when trying to input invalid data. How is this expressed?
If the data structure is large, perhaps the combinators should only output the changes that the user has done to it. How should this be represented?

The last two points show that the challenge overlaps a lot with some other challenges. Our dialogs are perhaps just a special case of the problem of presenting structured data, letting the user manipulate it in certain ways. Confer e.g. Emden Gansner's challenge.