Nandeshou: Remote Services

This is a bit of a technical deep dive, but if you’re interested in the internal workings of Nandeshou then hopefully you’ll enjoy this insight.

When I began implementing Nandeshou, the initial services were all in a single process, so if a service needed to return another service then I could return a reference to it and it worked great. Until it didn’t.

My apologies to those reading this on a mobile device; the code samples don’t render very well. I recommend viewing this on a larger screen if you are interested in the code.

Some Context

As soon as you start splitting your services into separate processes, returning a reference no longer works. That’s no surprise, so I knew I would need a solution before I got to that point.

A good example of this problem in Nandeshou is the Agent Service, which possibly should be renamed because it’s actually an Agent Factory Service. Its role is to instantiate AI agents and Copilots for specific roles and conversations with persistent history.

The UI Page Service is a consumer of a UI/UX Copilot which is well versed in UI/UX design and knows how to manipulate user-interface pages and components. When a client application wants to use this Copilot for a specific UI page or component, the Page Service calls the Agent Service and makes a request for the appropriate Copilot.

This Copilot quite likely resides in another process, and is most likely running in a container or instance on an entirely different machine, so returning a reference would not be appropriate, except that’s exactly what I wanted to do.

If I couldn’t do it, I at least wanted to make it appear like that’s what I’m doing.

Sometimes I dislike behind-the-scenes code, which is one of the reasons I truly dislike decorators and annotations that have side-effects (glaring at you, Spring), but, in this case, I’m making things work the way you’d expect, even if it shouldn’t work without a bit of magic.

My Solution

My solution to this ended up being fairly graceful.

For service to service / agent to agent / machine to machine communications, I am already using a compact and efficient dto backed by protocol buffers.

Note: dto is a data transfer object and generally is used to transfer information between client and server, or producer to consumer. Sometimes the client is another service.

I still de/serialize (shorthand for serialize/deserialize, sometimes shortened further to deser) with JSON when communicating between a web client and a REST API server (another story, don’t judge me🙈). This makes things a little verbose, since I need to define the Go dto and the TypeScript dto, both of which support JSON, as well as the protocol buffer message for the efficient binary dto.

But, verbose isn’t very painful when you’re generating code, right? The goal is to keep things legible and graceful so that the humans reviewing the code can understand it.

Here’s the dto used to return the Copilot Chat service in Go:

type CreateChatResponse struct {
	dto.Response
	Chat *Chat
}

The dto.Response is common among all response messages and contains success/failure status and a human friendly message to provide details upon failure.

The tricky part is the *Chat which is a reference to the returned Chat Service.

Obviously this won’t work when we try to marshal this to transfer across the wire, so let’s add a bit of magic.

First, let’s define a Service that’s part of our protocol buffer message definition. This is a generic Service dto, so we won’t need to define a dto for every service:

message Service {
  string oid = 1;
  string type = 2;  
  string channel = 3;
}

And do one little update on our dto:

type CreateChatResponse struct {
	dto.Response
	Chat Chat `bdto:"Chat:Service"`
}

This bdto tag is used by the Binary DTO Un/Marshal code. Chat is the name of the field, which is optional here because the name can also be gleaned via reflection.

The important part is the ":Service" which tells the BDTO un/marshal code that it’s a special object of a type which implements this interface:

type Service interface {
	GetOID() string
	GetType() string
	GetChannel() string
}

Why are only the getters necessary? Because the receiving end, generally a microservice proxy, needs a factory that implements the following interface, and that factory sets the values when it constructs the service:

type ServiceFactory interface {
	CreateService(oid string, serviceType string, channel string) interface{}
}

The OID (object ID) is used when sending messages from the proxy to the service. It’s part of the payload and identifies which service is the recipient of any given message. It can also be used by the service factory to cache objects.

The Type is used by the factory, too, in case there is more than one type and the factory is a Factory of Factories.
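The two ideas above, caching by OID and dispatching on Type, could be combined roughly as follows. This is a sketch under assumptions rather than Nandeshou's implementation: `RegistryFactory`, `chatFactory`, `chatProxy`, and the `"chat"` type string are hypothetical names; only the `ServiceFactory` interface comes from the text.

```go
package main

import (
	"fmt"
	"sync"
)

// ServiceFactory mirrors the interface shown above.
type ServiceFactory interface {
	CreateService(oid string, serviceType string, channel string) interface{}
}

// chatProxy is a hypothetical placeholder for a real proxy type.
type chatProxy struct{ oid, channel string }

// chatFactory only knows how to build chat proxies.
type chatFactory struct{}

func (chatFactory) CreateService(oid, serviceType, channel string) interface{} {
	return &chatProxy{oid: oid, channel: channel}
}

// RegistryFactory is a "factory of factories": it picks a factory by
// service type and caches the objects it creates by OID.
type RegistryFactory struct {
	mu        sync.Mutex
	factories map[string]ServiceFactory
	cache     map[string]interface{}
}

func NewRegistryFactory() *RegistryFactory {
	return &RegistryFactory{
		factories: make(map[string]ServiceFactory),
		cache:     make(map[string]interface{}),
	}
}

// Register associates a service type with the factory that builds it.
func (r *RegistryFactory) Register(serviceType string, f ServiceFactory) error {
	if f == nil {
		return fmt.Errorf("registry: nil factory for type %q", serviceType)
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	r.factories[serviceType] = f
	return nil
}

// CreateService returns the cached object for oid when there is one.
// Otherwise it delegates to the factory registered for serviceType and
// caches the result. It returns nil for an unknown type.
func (r *RegistryFactory) CreateService(oid, serviceType, channel string) interface{} {
	r.mu.Lock()
	defer r.mu.Unlock()
	if svc, ok := r.cache[oid]; ok {
		return svc
	}
	f, ok := r.factories[serviceType]
	if !ok {
		return nil
	}
	svc := f.CreateService(oid, serviceType, channel)
	r.cache[oid] = svc
	return svc
}

func main() {
	reg := NewRegistryFactory()
	if err := reg.Register("chat", chatFactory{}); err != nil {
		fmt.Println(err)
		return
	}
	a := reg.CreateService("oid-1", "chat", "agents.chat")
	b := reg.CreateService("oid-1", "chat", "agents.chat")
	fmt.Println("same object for same OID:", a == b)
}
```

Because the registry itself satisfies `ServiceFactory`, it can be passed anywhere a single factory is expected.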

Channel is used as a way to identify the endpoint where the service resides (in this case, the endpoint where the Chat service resides). This could be any protocol, but by default I use NATS. It’s much faster, more flexible, and more resilient than using HTTPS.

This factory isn’t constructing a service. Rather, it’s constructing a proxy service which packages requests, sends them to the service endpoint, and then unpacks the responses.

This provides a mechanism for the bdto to unmarshal the Service as a proxy, and the client code doesn’t need to know it’s using a proxy because the proxy implements the same interface as the requested service.
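A proxy of this kind could look roughly like the sketch below. Everything here is an assumption made for illustration: the `Chat` interface with its `Send` method, the `Transport` function type, and the newline-separated payload are hypothetical. A real proxy would marshal a protocol buffer request and send it over NATS or another transport; an in-memory function stands in for that so the example runs on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// Chat is a hypothetical service interface shared by the real chat service
// and its proxy, so client code cannot tell them apart.
type Chat interface {
	Send(message string) (string, error)
}

// Transport is an assumed request/reply function: it delivers payload to
// the endpoint named by channel and returns the reply.
type Transport func(channel string, payload []byte) ([]byte, error)

// ChatProxy implements Chat by forwarding each call to the remote service.
type ChatProxy struct {
	oid, serviceType, channel string
	send                      Transport
}

func (p *ChatProxy) GetOID() string     { return p.oid }
func (p *ChatProxy) GetType() string    { return p.serviceType }
func (p *ChatProxy) GetChannel() string { return p.channel }

// Send packages the request with the OID so the remote end knows which
// object the message is for, sends it, and unpacks the reply.
func (p *ChatProxy) Send(message string) (string, error) {
	if p.send == nil {
		return "", errors.New("chat proxy: no transport configured")
	}
	payload := []byte(p.oid + "\n" + message) // stand-in for a protobuf request
	reply, err := p.send(p.channel, payload)
	if err != nil {
		return "", fmt.Errorf("chat proxy: %w", err)
	}
	return string(reply), nil
}

// ProxyFactory has the same CreateService shape as the ServiceFactory
// interface above and hands out proxies instead of real services.
type ProxyFactory struct{ Send Transport }

func (f *ProxyFactory) CreateService(oid, serviceType, channel string) interface{} {
	return &ChatProxy{oid: oid, serviceType: serviceType, channel: channel, send: f.Send}
}

func main() {
	// In-memory stand-in for the remote endpoint.
	echo := func(channel string, payload []byte) ([]byte, error) {
		return []byte("[" + channel + "] " + string(payload)), nil
	}
	factory := &ProxyFactory{Send: echo}
	chat, ok := factory.CreateService("oid-1", "chat", "agents.chat").(Chat)
	if !ok {
		fmt.Println("factory did not return a Chat")
		return
	}
	reply, err := chat.Send("hello")
	fmt.Printf("%q %v\n", reply, err)
}
```

The caller only ever sees the `Chat` interface, which is what lets a remote service be swapped in for a local one.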

The Results

Voila! Remote instantiation of services, and using remote services as seamlessly as local in-process services.

For an extremely simple partial example:

package agent

type Service interface {
	GetOID() string
	GetType() string
	GetChannel() string
}

type CreateChatResponse struct {
	dto.Response
	Chat *ChatService `bdto:"Chat:Service"`
}

type ChatService struct {
	oID           string
	objectType    string
	remoteChannel string
	// more stuff for the chat service ...
}

func (s *ChatService) GetOID() string {
	return s.oID
}

func (s *ChatService) GetType() string {
	return s.objectType
}

func (s *ChatService) GetChannel() string {
	return s.remoteChannel
}

type ServiceFactory struct {
}

func (f *ServiceFactory) CreateService(oid string, serviceType string, channel string) interface{} {
	return &ChatService{
		oID:           oid,
		objectType:    serviceType,
		remoteChannel: channel,
	}
}

And without any context, here’s an example of how one would call the Unmarshal function:

  err := bdto.Unmarshal(buf, &r2, &ServiceFactory{})
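To show what the factory argument is for, here is a rough sketch of the step an unmarshal function might perform once the protocol buffer bytes are decoded. It is not the real `bdto.Unmarshal`: `fillServices`, `WireService`, and the map of decoded services keyed by field name are hypothetical stand-ins, and the types are trimmed copies of the ones above.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// WireService mirrors the fields of the protobuf Service message.
type WireService struct{ OID, Type, Channel string }

// ServiceFactory mirrors the interface shown earlier.
type ServiceFactory interface {
	CreateService(oid string, serviceType string, channel string) interface{}
}

// ChatService and CreateChatResponse are trimmed copies of the example types.
type ChatService struct{ oID, objectType, remoteChannel string }

type CreateChatResponse struct {
	Chat *ChatService `bdto:"Chat:Service"`
}

type chatFactory struct{}

func (chatFactory) CreateService(oid, serviceType, channel string) interface{} {
	return &ChatService{oID: oid, objectType: serviceType, remoteChannel: channel}
}

// fillServices walks the fields of the struct dst points to. For each field
// tagged ":Service" it asks the factory for an object built from the decoded
// wire data and assigns it to the field.
func fillServices(dst interface{}, wire map[string]WireService, f ServiceFactory) error {
	v := reflect.ValueOf(dst)
	if v.Kind() != reflect.Ptr || v.IsNil() || v.Elem().Kind() != reflect.Struct {
		return fmt.Errorf("fillServices: dst must be a non-nil pointer to a struct")
	}
	v = v.Elem()
	t := v.Type()
	for i := 0; i < t.NumField(); i++ {
		field := t.Field(i)
		name, kind, _ := strings.Cut(field.Tag.Get("bdto"), ":")
		if kind != "Service" {
			continue
		}
		if name == "" {
			name = field.Name // name gleaned via reflection
		}
		ws, ok := wire[name]
		if !ok {
			continue // nothing on the wire for this field; leave it zero
		}
		svc := reflect.ValueOf(f.CreateService(ws.OID, ws.Type, ws.Channel))
		if !svc.IsValid() || !v.Field(i).CanSet() || !svc.Type().AssignableTo(field.Type) {
			return fmt.Errorf("fillServices: cannot set field %s from factory result", field.Name)
		}
		v.Field(i).Set(svc)
	}
	return nil
}

func main() {
	var resp CreateChatResponse
	wire := map[string]WireService{
		"Chat": {OID: "oid-1", Type: "chat", Channel: "agents.chat"},
	}
	if err := fillServices(&resp, wire, chatFactory{}); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(resp.Chat.oID, resp.Chat.objectType, resp.Chat.remoteChannel)
}
```

Checking assignability before calling `Set` turns a mismatched factory result into an error instead of a reflection panic.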

Finally, the only other part missing is the protocol buffer message definition for the CreateChatResponse. This would be necessary even if I were passing values of primitive types. The only “extra” is that now the bdto can un/marshal services.

syntax = "proto3";

package example;

option go_package = "./proto";

import "service.proto";
import "response.proto";

message CreateChatResponse {
  Response response = 1;
  Service chat = 2;
}

Conclusion

So there you have it. A fairly simple solution to an otherwise complicated problem.