GlassFish Wiki : SailfinCafeAPIDesign

Communications are fully meshed

In a communication all objects (participants or joinables) communicate in a fully meshed configuration, as far as their capabilities allow.

E.g., consider a conference with three participants: Alice has audio only, Bob has audio and video, and Carol has video only. Alice and Bob will then communicate using audio, and Bob and Carol using video. Alice and Carol will not communicate directly.
Cafe introduces the necessary JSR 309 objects in order to make this possible.

E.g., in the above case three network connections and a mixer are needed.
Economy dictates that the most efficient media solution will be established, but finding it is not always easy or possible. For example, in the above-mentioned conference no mixer would be needed, since the same result could also be accomplished using two bridges.
The proposal is that, for clarity, all communications are fully meshed and that Cafe will take care of this.
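
To make the fully meshed rule concrete, the following is a minimal sketch of how the pairwise links could be derived from the capabilities of each object. It assumes a simple capability model (a set of media types per object); the class MeshSketch, the Media enum and the mesh method are hypothetical names for illustration and are not part of Cafe or JSR 309.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names belong to Cafe or JSR 309.
final class MeshSketch {
    enum Media { AUDIO, VIDEO }

    // For every pair of objects, keep the media that both sides support.
    // Pairs without a common medium get no link at all.
    static Map<String, Set<Media>> mesh(Map<String, Set<Media>> capabilities) {
        List<String> names = new ArrayList<>(capabilities.keySet());
        Map<String, Set<Media>> links = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            for (int j = i + 1; j < names.size(); j++) {
                Set<Media> shared = EnumSet.noneOf(Media.class);
                shared.addAll(capabilities.get(names.get(i)));
                shared.retainAll(capabilities.get(names.get(j)));
                if (!shared.isEmpty()) {
                    links.put(names.get(i) + "-" + names.get(j), shared);
                }
            }
        }
        return links;
    }

    // The three-party conference from the example above.
    static Map<String, Set<Media>> example() {
        Map<String, Set<Media>> caps = new LinkedHashMap<>();
        caps.put("Alice", EnumSet.of(Media.AUDIO));
        caps.put("Bob", EnumSet.of(Media.AUDIO, Media.VIDEO));
        caps.put("Carol", EnumSet.of(Media.VIDEO));
        return caps;
    }

    public static void main(String[] args) {
        // Prints {Alice-Bob=[AUDIO], Bob-Carol=[VIDEO]}
        System.out.println(mesh(example()));
    }
}
```

The sketch only decides which pairs share media; how Cafe would realise those links with network connections, mixers or bridges is a separate question.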

Communications can contain both Participants and joinables

A communication can contain a mix of participants, representing external parties assumed to be humans, and JSR 309 joinables (e.g., recorders or announcement machines).

The rules for fully meshed communication also apply in such a case.

Some communications must contain at least one human participant (this might not be strictly needed in the case of a conference?)

The proposal is to model media groups as objects in a communication, so that by default everything is fully meshed. Applications that want more have to manipulate the JSR 309 objects explicitly.

Type of communication determines visible objects

The type of interface determines which JSR 309 objects are visible to the application.

Transparent call: no JSR 309 objects are visible.
Non-transparent call: network connections are visible and can be used as joinables in JSR 309.
Conference: network connections and one mixer are visible.

Using these JSR 309 objects the application can create arbitrarily complex connection graphs between the objects. E.g., a conference application might introduce subconferences etc.
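
The visibility rules can be written down as a small lookup, sketched here under the assumption that each communication type simply exposes a fixed set of object kinds. VisibilitySketch, ObjectKind and CommunicationType are hypothetical names chosen for this illustration; they are not Cafe or JSR 309 types.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch: these enums are illustrative, not Cafe or JSR 309 API.
final class VisibilitySketch {
    enum ObjectKind { NETWORK_CONNECTION, MIXER }

    enum CommunicationType {
        TRANSPARENT_CALL(EnumSet.noneOf(ObjectKind.class)),
        NON_TRANSPARENT_CALL(EnumSet.of(ObjectKind.NETWORK_CONNECTION)),
        CONFERENCE(EnumSet.of(ObjectKind.NETWORK_CONNECTION, ObjectKind.MIXER));

        private final Set<ObjectKind> visible;

        CommunicationType(Set<ObjectKind> visible) {
            this.visible = Collections.unmodifiableSet(visible);
        }

        // Kinds of JSR 309 objects the application may see for this type.
        Set<ObjectKind> visibleObjects() {
            return visible;
        }
    }

    public static void main(String[] args) {
        for (CommunicationType type : CommunicationType.values()) {
            System.out.println(type + " -> " + type.visibleObjects());
        }
    }
}
```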

TODO - what if the app introduces a mixer in a non-transparent call: how does Cafe affect its lifecycle then? And the other way around, what if the app gets the mixer in a conference and releases it...

Allow migrations between communication types

To allow the application access to the visible objects, it might be necessary to 'upgrade' a communication. This means that the communication is migrated, e.g., from a transparent call to a non-transparent call, or to a conference.

Downgrading might also be considered, e.g., transforming a two-party conference into a transparent call.

Note that an application always has the possibility to migrate by itself, by creating a new communication and moving the participants to it. However, this might lead to breaks in the media communication between the parties, which can be avoided (as far as possible) if the migration is performed by Cafe.
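
The difference between the two migration paths can be illustrated with a toy model. This is only a sketch under a strong simplifying assumption: media is (re)established exactly once each time a participant is added to a communication. MigrationSketch, Communication, migrateTo and moveAll are hypothetical names, not Cafe API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; the counting rule is an assumption, not Cafe behaviour.
final class MigrationSketch {
    enum Type { TRANSPARENT_CALL, NON_TRANSPARENT_CALL, CONFERENCE }

    static final class Communication {
        private Type type;
        private final List<String> participants = new ArrayList<>();
        int mediaSetups = 0; // times media had to be (re)established

        Communication(Type type) {
            this.type = type;
        }

        void add(String participant) {
            participants.add(participant);
            mediaSetups++;
        }

        void remove(String participant) {
            participants.remove(participant);
        }

        Type type() {
            return type;
        }

        List<String> participants() {
            return new ArrayList<>(participants); // defensive copy
        }

        // Migration performed in place: only the type changes,
        // the participants stay joined.
        void migrateTo(Type target) {
            type = target;
        }
    }

    // Migration performed by the application: create a new communication
    // and move every participant over, re-establishing media for each.
    static Communication moveAll(Communication old, Type target) {
        Communication fresh = new Communication(target);
        for (String participant : old.participants()) {
            old.remove(participant);
            fresh.add(participant);
        }
        return fresh;
    }
}
```

In this model an in-place migrateTo leaves mediaSetups untouched, while moveAll sets up media again for every participant it moves.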

Policies

Note that the previous descriptions mention no difference between a transparent call, a non-transparent call and a conference other than the visibility of some JSR 309 objects. Nothing (yet) implies that calls can only have a limited number of participants.

So what is a two-party call?
It is a communication which has a limitation on the number of participants.
Such a policy might be established because of economics, e.g., a mixer can be avoided if the number of objects in a call is limited to a maximum of two. It might be to protect the application from making mistakes (e.g., accidentally creating a conference because it forgets to remove a participant). Finally, it might be because of familiarity: a two-party call is by far the most common occurrence, and that is something you might want to model separately.
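
A participant limit of this kind could look roughly as follows. This is a minimal sketch, assuming the policy is a plain upper bound checked when an object is added; PolicySketch, Communication and twoPartyCall are hypothetical names, and throwing IllegalStateException is just one possible way to signal the violation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a participant-limit policy; not Cafe API.
final class PolicySketch {
    static final class Communication {
        private final int maxParticipants;
        private final List<String> participants = new ArrayList<>();

        Communication(int maxParticipants) {
            this.maxParticipants = maxParticipants;
        }

        // Rejects the addition instead of silently turning the call
        // into a conference.
        void add(String participant) {
            if (participants.size() >= maxParticipants) {
                throw new IllegalStateException(
                        "limit of " + maxParticipants + " participants reached");
            }
            participants.add(participant);
        }

        int size() {
            return participants.size();
        }
    }

    // A two-party call is a communication limited to two objects.
    static Communication twoPartyCall() {
        return new Communication(2);
    }

    public static void main(String[] args) {
        Communication call = twoPartyCall();
        call.add("Alice");
        call.add("Bob");
        try {
            call.add("Carol"); // the application forgot to remove someone
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```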

Routing of incoming calls

For outgoing calls, it is the application that directly determines the type of the call (transparent, non-transparent, conference).
For incoming calls, however, we also need a way to determine the type of the call. There are two obvious possibilities.

Incoming calls always start out as transparent calls. It is then up to the application to upgrade these calls to the appropriate type.
We provide some kind of 'trigger criteria' that determine the type of an incoming call.

For the moment, I think that the first option suffices.

Optimal media routing vs minimal signaling

There is a conflict between trying to keep the media optimal for every configuration and trying to keep the signaling to a minimum.

For example, when a user removes a participant from a transparent two-party call and then adds an announcement, there is (for a very short time) a one-party call without any media flowing. During this time the optimal solution would be to remove all media towards the remaining participant. But when the media group is added later, the media has to be established again.
With respect to media handling, this solution is optimal for each configuration. With respect to signaling, however, it would have been better to connect the remaining participant to the media server straight away if an announcement is to be played immediately afterwards (but not if another participant, rather than an announcement, is added to the communication).

Such conflicts can be resolved by changing the API to allow a combination of actions (e.g., replacing a participant by a media group, instead of removing a participant and adding a media group). Another, more technically involved, solution would be to delay acting on some methods while waiting for the following actions. But this would get very tricky and is not recommended.
The proposal here is to start with atomic actions and simply accept the additional signaling costs, then later try to optimise, e.g., by introducing such a replace method.
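
The trade-off can be sketched with a toy model that counts only the signaling towards the remaining party. The message strings and the counting rules are assumptions made for this illustration, and ReplaceSketch, Call and replace are hypothetical names rather than Cafe API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; it logs only signaling towards remaining parties.
final class ReplaceSketch {
    static final class Call {
        final List<String> members = new ArrayList<>();
        final List<String> signaling = new ArrayList<>();

        Call(String first, String second) {
            members.add(first);
            members.add(second);
        }

        // Atomic removal: a lone remaining party has its media released.
        void remove(String member) {
            members.remove(member);
            if (members.size() == 1) {
                signaling.add("release media towards " + members.get(0));
            }
        }

        // Atomic addition: media to the waiting party is set up again.
        void add(String member) {
            members.add(member);
            if (members.size() == 2) {
                signaling.add("establish media towards " + members.get(0));
            }
        }

        // Combined action: the remaining party is redirected once.
        void replace(String oldMember, String newMember) {
            int index = members.indexOf(oldMember);
            if (index < 0) {
                throw new IllegalArgumentException("not in call: " + oldMember);
            }
            members.set(index, newMember);
            for (String member : members) {
                if (!member.equals(newMember)) {
                    signaling.add("redirect media of " + member);
                }
            }
        }
    }

    public static void main(String[] args) {
        Call atomic = new Call("Alice", "Bob");
        atomic.remove("Bob");
        atomic.add("announcement");
        Call combined = new Call("Alice", "Bob");
        combined.replace("Bob", "announcement");
        System.out.println("atomic:   " + atomic.signaling);
        System.out.println("combined: " + combined.signaling);
    }
}
```

Both paths end in the same configuration; in this model the atomic path costs one extra signaling step towards the remaining participant.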

Implicit vs explicit participant creation

For incoming calls the participant indicates the party (s)he wants to communicate with. This can either be the application itself (e.g., some SPI) or it can be another user. The application must have the possibility to influence the routing.
There are two possibilities:

Cafe implicitly creates a second participant based on the To header in the incoming request. Rerouting would then mean replacing this implicitly created participant with a new, explicitly created participant.
The application always has to create the second participant explicitly, but it can easily access the To header from the incoming request.

The proposal is to only have the second alternative, which is more consistent with the outgoing call creation scenarios.
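
With the second alternative, application-side routing could look roughly like this. It is a sketch only: the incoming call is modelled as a one-party communication carrying the To header value, and the forwarding table stands in for whatever routing logic the application has. ExplicitParticipantSketch, IncomingCall and route are hypothetical names, and the SIP addresses are made-up examples.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of explicit second-participant creation; not Cafe API.
final class ExplicitParticipantSketch {
    static final class IncomingCall {
        final String toHeader; // value of the To header of the request
        final List<String> participants = new ArrayList<>();

        IncomingCall(String caller, String toHeader) {
            this.toHeader = toHeader;
            participants.add(caller); // starts as a one-party call
        }
    }

    // The application creates the second participant itself: by default
    // the address from the To header, or a forwarding target if one exists.
    static void route(IncomingCall call, Map<String, String> forwarding) {
        String target = forwarding.getOrDefault(call.toHeader, call.toHeader);
        call.participants.add(target);
    }

    public static void main(String[] args) {
        IncomingCall call =
                new IncomingCall("sip:alice@example.com", "sip:bob@example.com");
        route(call, Collections.singletonMap(
                "sip:bob@example.com", "sip:carol@example.com"));
        System.out.println(call.participants);
    }
}
```

Rerouting needs no special case here: the application simply picks a different address before creating the participant.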

Incoming call rejection

Since each incoming call is represented as a new one-party call towards the application, the application can easily reject a participant by rejecting the call.
MORE....