Tracing Spring asynchronous code with New Relic – a better way

In my earlier post about tracing Spring asynchronous code with New Relic I showed a simple solution using a subclass of ApplicationEvent to carry a New Relic token. It has some disadvantages:

  1. Code that uses it must explicitly declare New Relic tracing using the @Trace annotation, must create subclasses of TracedEvent and must call the TracedEvent#linkToken method on the event object.

  2. Each token can only be expired once, even if an event is listened to by multiple listeners.

A better way

This method uses an implementation of java.util.concurrent.Executor that wraps a delegate instance.

  1. The NewRelicTraceExecutor#execute method is called in the parent thread. It constructs a TracedRunnable that wraps the Runnable instance it is given.

  2. The TracedRunnable#run method is called in the child thread. It calls Token#linkAndExpire before calling run on its delegate Runnable.

All the New Relic-specific code is in this one class, which can be wired into a Spring Boot application to be used with ApplicationEventMulticaster. Each event listener has its own Runnable instance with its own New Relic token.

package com.example.tracing;

import com.newrelic.api.agent.NewRelic;
import com.newrelic.api.agent.Token;
import com.newrelic.api.agent.Trace;

import java.util.concurrent.Executor;

public class NewRelicTraceExecutor implements Executor {

    private final Executor delegate;

    public NewRelicTraceExecutor(Executor delegate) {
        this.delegate = delegate;
    }

    @Override
    public void execute(Runnable command) {
        Token token = NewRelic.getAgent().getTransaction().getToken();
        delegate.execute(new TracedRunnable(command, token));
    }

    static class TracedRunnable implements Runnable {

        private final Runnable delegate;
        private final Token token;

        TracedRunnable(Runnable delegate, Token token) {
            this.delegate = delegate;
            this.token = token;
        }

        @Trace(async = true)
        @Override
        public void run() {
            token.linkAndExpire();
            delegate.run();
        }
    }
}
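
The hand-off between the two threads can be seen without a New Relic agent. The following is a minimal sketch using only the JDK: ThreadRecordingToken and ThreadHandOffSketch are hypothetical stand-ins for Token and NewRelicTraceExecutor, and the token does nothing except record the thread it was created in and the thread it was linked in.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a trace token: it only records where it was created and linked. */
class ThreadRecordingToken {
    final String createdIn = Thread.currentThread().getName();
    volatile String linkedIn;

    void linkAndExpire() {
        linkedIn = Thread.currentThread().getName();
    }
}

/** Same shape as NewRelicTraceExecutor, with the New Relic calls swapped for the stand-in. */
class ThreadHandOffSketch implements Executor {

    private final Executor delegate;
    volatile ThreadRecordingToken lastToken; // kept only so the demonstration can inspect it

    ThreadHandOffSketch(Executor delegate) {
        this.delegate = delegate;
    }

    @Override
    public void execute(Runnable command) {
        ThreadRecordingToken token = new ThreadRecordingToken(); // runs in the parent thread
        lastToken = token;
        delegate.execute(() -> {
            token.linkAndExpire(); // runs in the child thread, before the wrapped task
            command.run();
        });
    }

    /** Returns true when the token was created in one thread and linked in another. */
    static boolean handsOffAcrossThreads() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            ThreadHandOffSketch executor = new ThreadHandOffSketch(pool);
            CountDownLatch done = new CountDownLatch(1);
            executor.execute(done::countDown);
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("task did not run in time");
            }
            ThreadRecordingToken token = executor.lastToken;
            return token.linkedIn != null && !token.createdIn.equals(token.linkedIn);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(handsOffAcrossThreads());
    }
}
```

The single-thread pool plays the role of the multicaster's task executor here; any Executor that runs tasks on another thread would show the same thing.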

As before, there is a dependency on the New Relic API. In Gradle:

    implementation 'com.newrelic.agent.java:newrelic-api:5.11.0'

Tracing Spring asynchronous code with New Relic

I have been working with Spring Boot microservices in an environment that is monitored using New Relic. Applications instrumented by New Relic are deployed with agents that send status and other information to a central server for monitoring and analysis.

New Relic’s Distributed Tracing enables complex request flows to be traced through multiple services instrumented with its agents. This is a powerful tool for quickly finding interesting or anomalous traces so they can be examined. We instrumented the Spring Boot services with New Relic and were able to follow synchronous calls made to downstream services.

The problem

But it didn’t trace all calls to other services. We executed some code asynchronously using Spring’s custom application events. Events are published by an ApplicationEventMulticaster configured with a task executor, and subscribed to by asynchronous listeners. We found that New Relic trace context was not being transferred with the events to the listeners in different threads.

When the asynchronous listener code called other services, those services were not recognised by New Relic as participating in the same distributed trace.

A simple solution

Our solution was to extend the Spring ApplicationEvent class to carry with it a New Relic trace token, and for the listener code to link that token to its New Relic context.

Prerequisite

Include the New Relic API in the project’s dependencies. In Gradle:

    implementation 'com.newrelic.agent.java:newrelic-api:5.10.0'

The TracedEvent class

package com.example.events;

import com.newrelic.api.agent.NewRelic;
import com.newrelic.api.agent.Token;
import org.springframework.context.ApplicationEvent;

public class TracedEvent extends ApplicationEvent {
    
    private Token traceToken;
    
    TracedEvent(Object eventObject) {
        super(eventObject);
        traceToken = NewRelic.getAgent().getTransaction().getToken();
    }
 
    public void linkToken() {
        traceToken.linkAndExpire();
    }
}

There is no need for null checking on New Relic classes because NewRelic.getAgent() always returns a usable object. When the code executes without an actual agent connected, it returns an instance of NoOpAgent that returns a safe instance of Transaction that itself returns a safe, do-nothing instance of Token.
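
The chain of safe defaults can be pictured with a small null-object sketch. Everything here is hypothetical and heavily simplified: the SketchAgent, SketchTransaction and SketchToken interfaces stand in for the much larger New Relic API, and the return value of the do-nothing token is an assumption made for illustration.

```java
/** Hypothetical, simplified interfaces; the real New Relic API has many more methods. */
interface SketchToken {
    boolean linkAndExpire();
}

interface SketchTransaction {
    SketchToken getToken();
}

interface SketchAgent {
    SketchTransaction getTransaction();
}

class NoOpSketch {

    // Each do-nothing object hands out the next do-nothing object, so no call returns null.
    static final SketchToken NO_OP_TOKEN = () -> false; // assumption: nothing was linked
    static final SketchTransaction NO_OP_TRANSACTION = () -> NO_OP_TOKEN;
    static final SketchAgent NO_OP_AGENT = () -> NO_OP_TRANSACTION;

    // Never assigned in this sketch, which models running without an agent attached.
    private static volatile SketchAgent installed;

    /** Returns the installed agent, or the do-nothing agent when none is installed. */
    static SketchAgent getAgent() {
        SketchAgent agent = installed;
        return agent != null ? agent : NO_OP_AGENT;
    }

    public static void main(String[] args) {
        // The whole chain can be called without a null check.
        System.out.println(getAgent().getTransaction().getToken().linkAndExpire());
    }
}
```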

Listener code

Important parts of the code:

public class SomeEvent extends TracedEvent {
    // etc.
}

import com.newrelic.api.agent.Trace;

@Service
public class ExampleListener {

    @Trace(async = true) // Ensure New Relic traces this method’s thread
    public void onEvent(SomeEvent event) {
        event.linkToken(); // Do this first

        // Act on the event
    }
}

Future improvements

This simple solution was adequate for our immediate purposes but is not complete. With ApplicationEventMulticaster an event may be listened to by multiple listeners but the token will be expired by the first listener that uses it. In our case each event had only one listener.

It is valid to retrieve multiple tokens from a single New Relic transaction and use each one independently. We could fetch a token for each listener, or fetch a token for each thread used by the event multicaster’s task executor.
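
The difference between one shared token and one token per listener can be sketched with a stand-in. SingleUseToken is hypothetical: it models only the "expires on first use" behaviour described above and is not the New Relic class.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a token that can be linked and expired exactly once. */
class SingleUseToken {
    private final AtomicBoolean active = new AtomicBoolean(true);

    /** Returns true for the first call only; later calls find the token already expired. */
    boolean linkAndExpire() {
        return active.compareAndSet(true, false);
    }
}

class TokenSharingSketch {

    /** Counts the listeners that link successfully when they all share one token. */
    static int linkedWithSharedToken(int listenerCount) {
        SingleUseToken shared = new SingleUseToken();
        int linked = 0;
        for (int i = 0; i < listenerCount; i++) {
            if (shared.linkAndExpire()) {
                linked++;
            }
        }
        return linked;
    }

    /** Counts the listeners that link successfully when each one gets its own token. */
    static int linkedWithTokenPerListener(int listenerCount) {
        int linked = 0;
        for (int i = 0; i < listenerCount; i++) {
            SingleUseToken own = new SingleUseToken();
            if (own.linkAndExpire()) {
                linked++;
            }
        }
        return linked;
    }

    public static void main(String[] args) {
        System.out.println(linkedWithSharedToken(3));
        System.out.println(linkedWithTokenPerListener(3));
    }
}
```

With a shared token only the first listener links; with a token per listener they all do.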

It is better to use Spring configuration to automatically fetch tokens and use them in new contexts. Spring Cloud Sleuth uses this technique to ensure tracing information is propagated to new threads.

Octopus Deploy server in AWS and polling tentacles

I am using Octopus Deploy on a current project to deploy to a number of targets in tightly-controlled, on-premises environments. We are using polling tentacles so we don’t need to get ingress firewall rules manually created for every deployment target.

Tentacle-to-server communication

Octopus Deploy server is deployed into AWS and needs to be configured to securely accept connections from polling tentacles:

  • the Octopus web portal on its assigned port
  • the Octopus server for tentacle instructions, usually on port 10943

The first connection is HTTP or HTTPS and can be secured simply in AWS with any load balancer that presents a certificate and offloads TLS, forwarding HTTP requests to the server.

The second connection is HTTPS but must be secured from end to end. On installation, both server and tentacle generate a self-signed certificate, which they use to secure all communication with each other. This means the tentacle connection cannot pass through a device that offloads TLS.

AWS Load Balancers

The current generation of AWS Elastic Load Balancers come in two types: Application Load Balancers and Network Load Balancers.

Application Load Balancers can route traffic based on host, header, path etc. and are very flexible. But they can only accept HTTP and HTTPS connections and always offload TLS in the latter case.

Network Load Balancers do not support complex routing rules, but each listener can either offload TLS or pass TCP traffic through unchanged. This solution meets our needs:

  • A TLS listener on port 443 offloads TLS on requests to the web portal, which are forwarded over HTTP.

  • A TCP listener on port 10943 passes requests through unchanged to port 10943 on the same server.

Security Group differences

AWS application and network load balancers work differently with security groups. Application load balancers have security groups attached to them and apply ingress rules. In contrast, at the time of writing network load balancers do not have security groups attached; the security group rules of the target instances apply instead, on the ports those instances listen on.

In our configuration the security group for the server specifies port 10943 for the traffic that passes through the load balancer, and port 80 for the web portal traffic.

Docker-in-Docker builds in TeamCity agents on AWS ECS

I have been experimenting with running TeamCity in AWS, using the CloudFormation stack provided by JetBrains. This stack uses Docker images from JetBrains and runs them in AWS Elastic Container Service.

However, the default configuration does not allow Docker-in-Docker builds. This is the situation where a TeamCity agent, itself running in a Docker container, needs to run a build step with Docker or Docker Compose.

The page for the JetBrains Docker image for agents gives two options for starting the agent container from the command line:

  • Docker from the host
  • Run in privileged mode

This post is about how to modify the JetBrains CloudFormation template to start the agent container in those two ways.

Docker from the host

This technique maps /var/run/docker.sock from the Docker host into the running container. Declare a host volume and mount it in the container (see comments):

  AgentTaskDefinition:
      Type: AWS::ECS::TaskDefinition
      Condition: ShouldLaunchAgents
      DependsOn:
        - PublicLoadBalancer
        - TCServerNodeService
      Properties:
        PlacementConstraints:
          - Type: memberOf
            Expression: attribute:teamcity.node-responsibility == buildAgent
        # Define the host volume to map.
        Volumes:
          - Name: "dockerSock"
            Host:
              SourcePath: "/var/run/docker.sock"
        ContainerDefinitions:
          - Name: 'teamcity-agent'
            Image: !Join [':', ['jetbrains/teamcity-agent', !Ref 'TeamCityVersion']]
            Cpu: !Ref AgentContainerCpu
            Memory: !Ref AgentContainerMemory
            Essential: true
            Environment:
              - Name: SERVER_URL
                Value: "https://teamcity.tawh.net"
            LogConfiguration:
              LogDriver: 'awslogs'
              Options:
                awslogs-group: !Ref ECSLogGroup
                awslogs-region: !Ref AWS::Region
                awslogs-stream-prefix: 'aws/ecs/teamcity-agent'
            # Mount the host volume in the container.
            MountPoints:
              - ContainerPath: "/var/run/docker.sock"
                SourceVolume: "dockerSock"

I have used this method successfully.

Run in privileged mode

Set privileged mode in the ECS task definition for the agent by adding Privileged: true to the container definition:

  AgentTaskDefinition:
      Type: AWS::ECS::TaskDefinition
      Condition: ShouldLaunchAgents
      DependsOn:
        - PublicLoadBalancer
        - TCServerNodeService
      Properties:
        PlacementConstraints:
          - Type: memberOf
            Expression: attribute:teamcity.node-responsibility == buildAgent
        ContainerDefinitions:
          - Name: 'teamcity-agent'
            Image: !Join [':', ['jetbrains/teamcity-agent', !Ref 'TeamCityVersion']]
            Cpu: !Ref AgentContainerCpu
            Memory: !Ref AgentContainerMemory
            # Run this container in privileged mode.
            Privileged: true
            Essential: true
            Environment:
              - Name: SERVER_URL
                Value: !GetAtt [PublicLoadBalancer, DNSName]
            LogConfiguration:
              LogDriver: 'awslogs'
              Options:
                awslogs-group: !Ref ECSLogGroup
                awslogs-region: !Ref AWS::Region
                awslogs-stream-prefix: 'aws/ecs/teamcity-agent'

I have not tried this method yet.

Capturing multi-part paths in Spring controllers

A while back I worked on a Spring Boot application that stores and works with Swagger files. It has a controller that needs to capture file paths at the end of request URIs.

We need to extract three variables: project, repo and path, where path may traverse multiple layers. Some examples:

URI                                            project   repo         path
/swagger/AAA/domain-api/swagger.yaml           AAA       domain-api   swagger.yaml
/swagger/BBB/domain-api/swaggers/domain.json   BBB       domain-api   swaggers/domain.json

This naive attempt at a solution does not work:

@GetMapping(path = "/swagger/{project}/{repo}/{path}")
public ResponseEntity<String> swaggerFile(@PathVariable("project") String project,
                                          @PathVariable("repo") String repo,
                                          @PathVariable("path") String path) {

With this, when using the first URI, the path variable gets the value swagger without the file extension. The second URI does not match at all.

The file extension can be captured by using a regular expression, for example:

@GetMapping(path = "/swagger/{project}/{repo}/{path:.+}")

Here, given the first URI, the path variable gets the value swagger.yaml as desired. But the second URI still does not match at all.

There is no simple way to extract a deep path into a single variable. I think this is because the Spring classes split the path into segments using slash characters before matching each segment with a regular expression.
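
If that guess is right, the behaviour can be reproduced with a toy matcher. This is a simplified illustration of segment-by-segment matching, not Spring's implementation; SegmentMatchSketch and its methods are hypothetical, and it handles only literal segments, {name} and {name:regex}.

```java
import java.util.regex.Pattern;

class SegmentMatchSketch {

    /** Splits template and URI on slashes, then matches them one segment at a time. */
    static boolean matches(String template, String uri) {
        String[] templateSegments = template.split("/");
        String[] uriSegments = uri.split("/");
        // A variable can never absorb a slash, so the segment counts must be equal.
        if (templateSegments.length != uriSegments.length) {
            return false;
        }
        for (int i = 0; i < templateSegments.length; i++) {
            if (!segmentMatches(templateSegments[i], uriSegments[i])) {
                return false;
            }
        }
        return true;
    }

    /** Matches one URI segment against a literal, a {name} or a {name:regex} template segment. */
    static boolean segmentMatches(String templateSegment, String uriSegment) {
        if (templateSegment.startsWith("{") && templateSegment.endsWith("}")) {
            String body = templateSegment.substring(1, templateSegment.length() - 1);
            int colon = body.indexOf(':');
            // Without an explicit regex, accept any non-empty segment.
            String regex = colon < 0 ? "[^/]+" : body.substring(colon + 1);
            return Pattern.matches(regex, uriSegment);
        }
        return templateSegment.equals(uriSegment);
    }

    public static void main(String[] args) {
        String template = "/swagger/{project}/{repo}/{path:.+}";
        System.out.println(matches(template, "/swagger/AAA/domain-api/swagger.yaml"));
        System.out.println(matches(template, "/swagger/BBB/domain-api/swaggers/domain.json"));
    }
}
```

Because the regular expression only ever sees one segment, .+ has no chance to match across the slash in swaggers/domain.json.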

A working solution is to get the entire path from the servlet request object and parse it locally.

@GetMapping(path = "/swagger/{project}/{repo}/**")
public ResponseEntity<String> swaggerFile(@PathVariable("project") String project,
                                          @PathVariable("repo") String repo,
                                          HttpServletRequest request) {
    String prefix = String.format("/swagger/%s/%s", project, repo);
    // Skip the prefix and the single slash that follows it.
    String path = request.getRequestURI().substring(prefix.length() + 1);
    // Use path
}
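
The parsing step can be pulled into a small helper and checked against the example URIs. This is a sketch under two assumptions: the application is served without a context path, and the URIs contain no URL-encoded characters (getRequestURI returns the undecoded URI including any context path). SwaggerPathSketch and remainingPath are hypothetical names.

```java
class SwaggerPathSketch {

    /** Returns everything after "/swagger/{project}/{repo}/" in the request URI. */
    static String remainingPath(String requestUri, String project, String repo) {
        // Include the trailing slash in the prefix so the result starts at the file path.
        String prefix = String.format("/swagger/%s/%s/", project, repo);
        if (!requestUri.startsWith(prefix)) {
            throw new IllegalArgumentException("URI does not start with " + prefix);
        }
        return requestUri.substring(prefix.length());
    }

    public static void main(String[] args) {
        System.out.println(remainingPath("/swagger/AAA/domain-api/swagger.yaml", "AAA", "domain-api"));
        System.out.println(remainingPath("/swagger/BBB/domain-api/swaggers/domain.json", "BBB", "domain-api"));
    }
}
```

Checking the prefix before taking the substring avoids silently returning a wrong path if the URI and the path variables ever disagree.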