Galaxy Communicator Tutorial:

MITRE Demo: A Sample End-to-End System

License / Documentation home / Help and feedback

The best way to understand how a configuration of Galaxy Communicator servers might work together is to watch an example in action. For pedagogical purposes, we've constructed a set of dummy components which follow a script of message sequences. In this lesson, we'll follow a single interaction with this set of components, to illustrate one plausible flow of control for messages in a Communicator-compliant system, which we call our toy travel system.



The servers

We will focus our attention on seven dummy servers, plus the Hub's Builtin server:
There are two other dummy servers in the toy travel system. One is a dummy server which is intended to emulate typed input and output; we will not use this server in this course. The other server, called IOMonitor, monitors both input and output and reports what is said by each side; we do use this server in this lesson, but we omit it from the illustrated flow of control for the sake of simplicity.


Setting up the toy travel demo

Starting the process monitors

We'll be using the process monitor once more. Start up the demonstration as follows:
[Toy travel demo exercise 1]

Unix:

% process_monitor $GC_HOME/tutorial/toy-travel/short-toy-travel.config

Windows:

C:\> python %PM_DIR%\process_monitor.py %GC_HOME%\tutorial\toy-travel\short-toy-travel.config

You'll get a process monitor in its compressed configuration, with three button rows of three processes each. Select "Process Control -> Restart all" to start all the processes in the order in which their buttons are presented. This is the order described in the tutorial on starting up a Galaxy Communicator configuration: first the servers which listen for connections, then the Hub, then the servers which connect to the Hub (in this case, the dummy audio server). More precisely, the audio server gets a second process monitor window of its own, which allows you to start it separately (we'll do that in just a minute).

At this point, you should have two process monitors on the screen, like so:

Understanding what you see: server startup

The "Recognizer" button is selected in the "Toy travel single exchange" process monitor, so this pane shows the output of the recognizer. We can see three lines of output. The last two tell us, first, that the recognizer server is available for connections on port 11000, and second, that the recognizer server has accepted a connection (from the Hub; remember, it's started up already).

You'll find that if you select any of the first six processes, you'll see a variation of this output; all of these servers will have started listening (each on a different port, of course), and all of these servers will have accepted a connection. However, the last three processes will look different.

Controlling how much you see

Select the "IOMonitor" button. The only indication that the server is running is that the Start/Stop button now reads "Stop"; there will be nothing in the output pane except the original command line:
[IOMonitor pane]

[exec $DEMO_ROOT/$BINDIR/IOMonitor -verbosity 0 2>&1]

There's no other output because we've limited the verbosity of the server to 0.

For Communicator-compliant servers, verbosity is set on a scale from 0 to 6, and the status messages in the Galaxy Communicator infrastructure can be made sensitive to this level. 0 is the most restrictive; no verbosity-sensitive status messages of any sort are printed. 3 is the default; at this level, you'll see normal status messages, indicating when connections are established and lost, and what messages are being sent and received. 6, the most verbose, provides debugging information and a full dump of the encoded message traffic. In this tutorial, we'll only use verbosity levels 0 and 3 (the default).

You can control the verbosity in two ways. First, as shown here, Communicator-compliant servers all accept the -verbosity command-line argument, as does the Hub. Second, you can set the environment variable GAL_VERBOSE to the verbosity level you desire. In this tutorial, we'll only use the command-line argument. (Notice that the command line for the audio server in the "Audio client" process monitor also limits the verbosity to 0.)
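
As a rough illustration of these two mechanisms, here is a minimal Python sketch of how a launcher script might work out which verbosity a server ends up with. The helper name resolve_verbosity is hypothetical, and the precedence it uses (command line first, then GAL_VERBOSE, then the default of 3) is an assumption made for this example; this lesson does not say which setting wins when both are present.

```python
DEFAULT_VERBOSITY = 3  # the default level named in this lesson


def resolve_verbosity(argv, environ):
    """Return the verbosity a server would run at (hypothetical helper).

    Assumption: -verbosity on the command line takes precedence over
    GAL_VERBOSE; the tutorial does not state the real precedence.
    """
    if "-verbosity" in argv:
        idx = argv.index("-verbosity")
        if idx + 1 < len(argv):
            return int(argv[idx + 1])
    if "GAL_VERBOSE" in environ:
        return int(environ["GAL_VERBOSE"])
    return DEFAULT_VERBOSITY


# The IOMonitor command line shown above passes -verbosity 0.
print(resolve_verbosity(["IOMonitor", "-verbosity", "0"], {}))
# With no command-line argument, the environment variable is consulted.
print(resolve_verbosity(["IOMonitor"], {"GAL_VERBOSE": "6"}))
```

Passing the argument list and environment in explicitly, rather than reading sys.argv and os.environ inside the function, keeps the sketch easy to try out with different combinations.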

The IOMonitor will print out a transcription of the dialogue you're about to trace through. Make sure that the "IOMonitor" button is still selected, and press "Detach this pane" to detach the IOMonitor.

Understanding what you see: Hub startup

Now select the "Hub" button in the "Toy travel single exchange" process monitor. We'll now examine the output of the Hub up to this point. Detach the Hub pane, and enlarge it for easier viewing. First you'll see printouts informing you that the Hub is reading and loading the program file:
[Hub pane]

Reading /usr/local/GalaxyCommunicator/tutorial/toy-travel/toy-travel.pgm
Done reading /usr/local/GalaxyCommunicator/tutorial/toy-travel/toy-travel.pgm (264 lines)
9 service types
7 service providers
9 programs
Notice that the Hub distinguishes between service types, which are named collections of behavior (e.g., the service named Parser provides the operation Parse), and service providers, which are actual processes which are instances of service types. So the Parser server in our configuration, from the Hub's point of view, is a provider for the Parser service.
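
If it helps to see the distinction in code, the following Python sketch models it with two small classes. The class and attribute names are invented for this illustration and are not part of the Galaxy Communicator API; the only details taken from this lesson are the Parser service, its Parse operation, and the localhost:10000 address the Hub reports for the Parser provider.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceType:
    """A named collection of behavior, e.g. Parser offering Parse."""
    name: str
    operations: set = field(default_factory=set)


@dataclass
class ServiceProvider:
    """An actual process which is an instance of a service type."""
    service_type: ServiceType
    host: str
    port: int

    def provides(self, operation):
        # A provider offers whatever operations its service type names.
        return operation in self.service_type.operations


parser_type = ServiceType("Parser", {"Parse"})
parser_provider = ServiceProvider(parser_type, "localhost", 10000)
print(parser_provider.provides("Parse"))
```

In this picture, adding a second Parser process would mean creating another ServiceProvider that points at the same ServiceType.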

Next, you'll see printouts informing you that the Hub is listening for connections from the servers which might contact it (the audio server, in this case). Observe that the Hub reports setting up this connection listener just as the Recognizer server does:

[Hub pane]
 
--------------------------------------------------
service type: Audio : 2800
Opening listener Audio
Trying to set up listener on port 2800
Opened listener on port 2800
Using listener on port 2800
Finally, the Hub contacts all the other servers, and exchanges some crucial connectivity information (which we don't care about). Here's an example for the dummy Parser server:
[Hub pane]

Sending new message to localhost
{c handshake
   :conn_type 1
   :protocol_version 1 }
Received reply from localhost
{c Parser
   :signatures ( ( "Parse"
                   ... ) )
   :properties {c server_properties ... }
   :extra_service_types (  )
   :protocol_version 1 }
Connected to provider for Parser @ localhost:10000
There are two things which commonly appear which we haven't seen so far: processing the initial token in the Hub, and processing a reinitialize message in the server. We'll see the reinitialize message in later lessons; there are no examples of use of the initial token in this tutorial.

Starting the interaction

At this point, we're ready to begin. The message script we're going to follow models the following simple dialogue:
System: Welcome to Communicator. How may I help you?
User: I WANT TO FLY FROM BOSTON TO LOS ANGELES
System: American Airlines flight 115 leaves at 11:44 AM, and United flight 436 leaves at 2:05 PM
We're not going to step through the initial greeting in detail; we're going to pay detailed attention only to the user request and system reply. In the following lessons, we'll learn in detail how this exchange is constructed.

Press the "Start" button on the "Audio client" process monitor to start the dummy audio server, which will contact the Hub. The reason the audio server contacts the Hub is that we want to manage access to our speech-based service dynamically, so that users can contact it from any one of a number of phone lines or desktop microphones. Since every one of the audio connections needs to be a Communicator-compliant server, we can either choose a fixed number (from fixed locations) when we write the Hub program file, or we can set up a listener so that any appropriate service can contact the Hub.

Now that you've pressed the "Start" button, a number of things have happened. The Audio server sent a message which caused the Dialogue server to send a greeting. The Hub has reported its message processing, which we will ignore for the moment. The detached "IOMonitor" window contains the resulting textual output, and the "Audio client" process monitor shows the (dummy) audio segments which are being streamed to the user:

[IOMonitor window]

In session session-998661263.227: system said "Welcome to Communicator. How may I help you?"

[Audio client process monitor]

[Audio data to user (1024 samples)]
[Audio data to user (1024 samples)]
[Audio data to user (1024 samples)]
[Audio data to user (1024 samples)]
[Audio data to user (1024 samples)]
[Audio data to user (1024 samples)]
[Audio data to user (1024 samples)]
[Audio data to user (1024 samples)]
[Audio data to user (1024 samples)]
[Audio data to user (1024 samples)]
[Audio data to user (1024 samples)]
[Audio data to user (1024 samples)]
[Audio data to user (1024 samples)]
[Audio data to user (1024 samples)]
[Audio data to user (184 samples)]
[Audio data to user is finalized (14520 samples).]
Hit <return> to send speech:

The "Audio client" process monitor has a button called "Input <return>". You'll use this button to move the dialogue forward, to send the next user utterance. The "Hub" pane also has an "Input <return>" button; the program file the Hub is running has breakpoints inserted at crucial points, and you'll use this button to step through these breakpoints.


Step 1: Audio available

Press the "Input <return>" button on the "Audio client" process monitor, which simulates telling a push-to-talk audio server that it should start listening. As a result, the Audio server informs the Hub that audio is available, and the Hub informs the Recognizer to start processing:

At this point, the Hub will be paused at a breakpoint.

Understanding what you see: Hub processing

For this first step, let's take a detailed look at the Hub output. First, the printout informs us that the Hub has received a new message from the Audio server. This output will be at the very bottom of the Hub pane. The Hub constructs a token from this new message, and the token index of the new token is 4 (that is, it's the fourth new message the Hub has received since startup). This new token contains several key-value pairs, which we will ignore for the moment:
[Hub pane]

Got new message from provider for Audio (id 8)
Created token 4

----------------[  4]----------------------
{c FromAudio
   :sample_rate 8000
   :encoding_format "linear16"
   :proxy "[broker proxy: call ID 129.83.10.107:5478:0, host 129.83.10.107, port 15010, type GAL_INT_16]"
   :session_id "session-998661263.227"
   :tidx 4 }
----------------------------------------

When the Hub receives a new message, it looks for a program by the same name, and starts comparing the new token state with the conditions on the rules in that program. When it finds a rule whose conditions are satisfied by the token state, it invokes the associated dispatch function (which the Hub calls an operation) by constructing and sending a message to the appropriate server.
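
The lookup just described can be sketched in a few lines of Python. This is a simplified model, not the Hub's implementation: the table layout, the find_operation helper, and the condition used for the FromAudio rule (the presence of a :proxy key) are all assumptions made for illustration, since we have not yet looked at the real program file.

```python
def find_operation(programs, message_name, token):
    """Return the operation of the first rule whose condition holds."""
    # A program is looked up by the name of the incoming message.
    for condition, operation in programs.get(message_name, []):
        if condition(token):
            return operation
    return None


# Hypothetical program table: message name -> list of (condition, operation).
programs = {
    "FromAudio": [
        (lambda tok: ":proxy" in tok, "Recognizer.Recognize"),
    ],
}

# A cut-down version of token 4; the proxy value is a placeholder.
token = {
    ":sample_rate": 8000,
    ":encoding_format": "linear16",
    ":proxy": "<broker proxy>",
    ":tidx": 4,
}

print(find_operation(programs, "FromAudio", token))
```

A message with no program of the same name, or a token that satisfies no rule, yields None in this sketch.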

Under some conditions, the Hub can fire multiple rules in immediate sequence. In this case, the Hub finds and fires two rules: the first invokes the Recognizer.Recognize operation, and the second invokes Builtin.hub_break, which forces the breakpoint. In each case, there is only one service provider for the relevant service type.

[Hub pane]

Found operation for token 4: Recognizer.Recognize
Found operation for token 4: Builtin.hub_break
Serving message with token index 4 to provider for Recognizer @ localhost:11000
---- Serve(Recognizer@localhost:11000, token 4 op_name Recognize)
Serving message with token index 4 to provider for Builtin
---- Serve(Builtin@<none>:-1, token 4 op_name hub_break)
Finally, the Builtin server within the Hub receives the hub_break message, and triggers the breakpoint:
Received new message from <local>
{c Builtin.hub_break
   :session_id "session-998661263.227"
   :hub_opaque_data {c admin_info ... } }
Invoking dispatch function: hub_break
(h for help, c or <return> to continue) -->
From this point on, we'll ignore the calls to hub_break.

Press the "Input <return>" button on the "Hub" pane to continue.


Step 2: Reroute to handle general input

Once the recognizer is done processing the audio, it sends a new message reporting its results. This is one form of textual input. This configuration of servers, under other circumstances, can also handle typed input and output. In this next step, we use the Builtin.call_program dispatch function to invoke a new Hub program, which unifies the processing of text input.

We see this reflected in the Hub printout:

[Hub pane]

Got new message from provider for Recognizer @ localhost:11000
Created token 5

----------------[  5]----------------------
{c FromRecognizer
   :input_string "I WANT TO FLY FROM BOSTON LOS ANGELES"
   :session_id "session-998661263.227"
   :tidx 5 }
----------------------------------------

Found operation for token 5: Builtin.call_program
Found operation for token 5: Builtin.hub_break
Serving message with token index 5 to provider for Builtin
---- Serve(Builtin@<none>:-1, token 5 op_name call_program)

The Hub finds two operations to perform, and then performs the first one.

Understanding what you see: server processing

We can take this dispatch to the Builtin server as an opportunity to examine a little more closely what server-side printouts look like. Immediately after the Hub reports that it's fired the rule which calls Builtin.call_program, the Builtin server "takes over" the printout and reports how it processes the message. First, the server reports the message it receives, and the fact that it's found a dispatch function for the message:
[Hub pane]

Received new message from <local>
{c call_program
   :session_id "session-998661263.227"
   :program "UserInput"
   :input_string "I WANT TO FLY FROM BOSTON LOS ANGELES"
   :hub_opaque_data {c admin_info ... } }
Invoking dispatch function: call_program
Next, the server prints out anything related to what it does to process the message. In this case, it sends a new message to the Hub:
[Hub pane]

Sending new message to <local>
{c UserInput
   :session_id "session-998661263.227"
   :input_string "I WANT TO FLY FROM BOSTON LOS ANGELES"
   :hub_opaque_data {c admin_info ... } }
Finally, it reports its return value. In this case, there is none:
[Hub pane]

No result frame for <local>
This structure for printouts is identical to the printout for any server at the default level of verbosity.
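
To make the call_program exchange above concrete, here is a small Python sketch of the transformation visible in the printout: the incoming frame names a program in its :program key, and the outgoing message takes that name and carries the remaining keys along. The function below is modeled only on what the Hub pane shows; it is an assumption about the observable effect, not a description of how the Builtin server is actually written.

```python
def call_program(frame):
    """Build the new message a call_program-style function would send."""
    # The new message is named after the value of the :program key.
    new_name = frame[":program"]
    # Every other key is carried over unchanged.
    new_frame = {k: v for k, v in frame.items() if k != ":program"}
    return new_name, new_frame


incoming = {
    ":session_id": "session-998661263.227",
    ":program": "UserInput",
    ":input_string": "I WANT TO FLY FROM BOSTON LOS ANGELES",
}

name, outgoing = call_program(incoming)
print(name, sorted(outgoing))
```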

Press the "Input <return>" button on the "Hub" pane to continue from the current breakpoint.


Step 3: Send to dialogue

At this point, the Hub will receive the new message sent by Builtin.call_program, named UserInput, and will route this message through the parser to the dialogue manager:

At each point where the Hub receives a message reply, it will print out the token state for the updated token. So we can extract the following sequence from the Hub output (there will be various other messages interspersed, such as calls to the IOMonitor to print out the output, to the breakpoint function, and replies which the Hub program does not need and will ignore):

[Hub pane]

Got new message from provider for Builtin
Created token 6

----------------[  6]----------------------
{c UserInput
   :session_id "session-998661263.227"
   :input_string "I WANT TO FLY FROM BOSTON LOS ANGELES"
   :tidx 6 }
----------------------------------------

Found operation for token 6: Parser.Parse
Serving message with token index 6 to provider for Parser @ localhost:10000
---- Serve(Parser@localhost:10000, token 6 op_name Parse)

Got reply from provider for Parser @ localhost:10000 : token 6

----------------[  6]----------------------
{c UserInput
   :session_id "session-998661263.227"
   :input_string "I WANT TO FLY FROM BOSTON LOS ANGELES"
   :tidx 6
   :frame {c flight ... } }
----------------------------------------

You can see that the token state is evolving as the program proceeds; so token 6 has a value for the :frame key after the call to the Parser which it didn't have before.
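
One way to picture this evolution is as a dictionary which gains keys as replies come back. The Python sketch below merges a reply into a copy of the token. It assumes that every key in the reply is stored in the token, which is a simplification; the apply_reply helper and the placeholder value for :frame are hypothetical.

```python
def apply_reply(token, reply):
    """Return a copy of the token with the reply's pairs merged in."""
    updated = dict(token)
    updated.update(reply)
    return updated


token_before = {
    ":session_id": "session-998661263.227",
    ":input_string": "I WANT TO FLY FROM BOSTON LOS ANGELES",
    ":tidx": 6,
}

# Placeholder for the frame the Parser returns; its contents are elided
# in the Hub printout, so we do not try to reproduce them here.
parser_reply = {":frame": "<flight frame>"}

token_after = apply_reply(token_before, parser_reply)
print(sorted(token_after))
```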

Press the "Input <return>" button on the "Hub" pane to continue from the current breakpoint.


Step 4: Dialogue consults backend

At this point, the Dialogue server will consult the Backend server and retrieve a database response. This is accomplished by the Dialogue server sending a new message to the Hub with an indication that it wants a response. The Hub creates a token, finds an appropriate Hub program, executes the program, and returns the updated token state to the Dialogue server when the program ends:

The current breakpoint is immediately after the Backend responds to the Hub, but immediately before the end of the program. Press the "Input <return>" button on the "Hub" pane to continue from the breakpoint; you'll see the token state returned to the Dialogue server immediately afterward:

[Hub pane]

Done with token 7 --> returning to owner Dialogue@localhost:18500
Destroying token 7
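
The lifecycle of token 7 (created for a request that wants a response, updated by the program, returned to its owner, then destroyed) can be modeled with a toy Hub in Python. Everything named here is hypothetical: MiniHub, fake_backend, and the :query and :db_result keys are invented for this sketch, and the real Hub is far more involved. Only the owner string is taken from the printout above.

```python
class MiniHub:
    """Toy model of a Hub serving a request that expects a reply."""

    def __init__(self):
        self.tokens = {}      # live tokens, keyed by token index
        self.next_tidx = 1

    def handle_request(self, owner, message, program):
        # Create a token from the incoming message.
        tidx = self.next_tidx
        self.next_tidx += 1
        token = dict(message)
        self.tokens[tidx] = token
        # Run each program step; a step returns key-value pairs to store.
        for step in program:
            token.update(step(token))
        # Program is done: destroy the token, hand its final state back.
        del self.tokens[tidx]
        return owner, token


def fake_backend(token):
    # Hypothetical stand-in for the Backend server's database lookup.
    return {":db_result": "rows for " + token[":query"]}


hub = MiniHub()
owner, final_state = hub.handle_request(
    "Dialogue@localhost:18500",
    {":query": "flights from BOSTON to LOS ANGELES"},
    [fake_backend],
)
print(owner, sorted(final_state))
```

After handle_request returns, the toy Hub holds no tokens, which mirrors the "Destroying token 7" line in the printout.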

Step 5: Dialogue reply to generation and synthesis

Now, the Dialogue manager decides that it's time to say something to the user (in this case, to list flights). So it sends a new message to the Hub, which is routed through the Generator server to the Synthesizer server:

At this point, the IOMonitor has been notified of what the system is about to say, and you should be able to see the entire three-turn dialogue in the IOMonitor window.

The current breakpoint is set immediately before the call to the Synthesizer server. Press the "Input <return>" button on the "Hub" pane to continue to the final step.


Step 6: Audio output

Finally, the Synthesizer server produces a new message to notify the Audio server that audio is available, and the Audio server fetches the audio:

You can see the result in the "Audio client" process monitor:

[Audio client process monitor]

[Audio data to user (1024 samples)]
[Audio data to user (1024 samples)]
[Audio data to user (1024 samples)]
...
[Audio data to user (1024 samples)]
[Audio data to user (1024 samples)]
[Audio data to user (252 samples)]
[Audio data to user is finalized (35068 samples).]
Hit <return> to send speech:
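
The sample counts in the two audio printouts are consistent with audio being delivered in chunks of 1024 samples followed by one shorter final chunk. The Python sketch below checks that arithmetic. The chunk_sizes helper is hypothetical, and the assumption that every chunk but the last holds exactly 1024 samples comes only from the lines visible in the printouts.

```python
def chunk_sizes(total_samples, chunk=1024):
    """Split a sample count into full chunks plus one final partial chunk."""
    full, rest = divmod(total_samples, chunk)
    sizes = [chunk] * full
    if rest:
        sizes.append(rest)
    return sizes


greeting = chunk_sizes(14520)   # total reported for the system greeting
reply = chunk_sizes(35068)      # total reported for the flight listing

print(len(greeting), greeting[-1])
print(len(reply), reply[-1])
```

Under that assumption, the greeting breaks down into 14 full chunks plus a final chunk of 184 samples, and the flight listing into 34 full chunks plus a final chunk of 252 samples, which matches the last chunk sizes shown in each printout.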

At this point, the exchange is over. Press the "Input <return>" button on the "Audio client" process monitor; since the server has no further inputs in its message script, it reports that audio is no longer available and shuts down.

Select "File -> Quit" in the "Toy travel single exchange" process monitor to shut down the toy travel demo.


Summary

In this lesson, you've seen how a plausible exchange between a user and system might proceed in a Communicator-compliant system. You've also learned a few more terms. In the next lessons, we'll learn more about these terms, and about how this demo is constructed.

Next: Introducing frames and objects


Last updated August 8, 2002