Middleware

Many web API frameworks have a concept called “middleware”, although each language or framework may name it differently (for example, filters). Essentially, a middleware performs a specific function on the HTTP request or response before or after the handler runs. Common tasks to offload to middleware include logging, authorization, and body compression.

RestRserve comes with several built-in middlewares (AuthMiddleware, CORSMiddleware, ETagMiddleware) and a generic Middleware class that lets users create custom middleware.
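As a minimal sketch of what a custom middleware built on the generic Middleware class can look like, the following tags every response with the id of the request that produced it. The `/ping` endpoint, the `X-Request-Id` header name, and the names `demo_app`, `request_id_middleware`, and `demo_resp` are illustrative assumptions, not part of RestRserve.

```r
library(RestRserve)

# hypothetical app with a single illustrative endpoint
demo_app = Application$new(content_type = "text/plain")
demo_app$add_get("/ping", function(.req, .res) {
  .res$set_body("pong")
})

request_id_middleware = Middleware$new(
  # nothing to do before the handler runs
  process_request = function(.req, .res) {
    invisible(TRUE)
  },
  # after the handler: copy the request id into a response header
  process_response = function(.req, .res) {
    .res$set_header("X-Request-Id", .req$id)
  },
  id = "request_id"
)
demo_app$append_middleware(request_id_middleware)

demo_resp = demo_app$process_request(Request$new(path = "/ping", method = "GET"))
demo_resp$headers[["X-Request-Id"]]
```

A middleware like this keeps cross-cutting behaviour out of the handlers, which never need to know the header exists.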

Let’s see it in action in the example below.

Logging

Let’s say you have a simple app with a single endpoint that converts query string parameters into JSON and sends them back:

library(RestRserve)

app = Application$new(content_type = "application/json")

backend = BackendRserve$new()

app$add_get("/foo", function(.req, .res) {
  body = RestRserve::to_json(.req$parameters_query)
  .res$set_body(body)
  # specify that there is no need to specially encode the body as
  # we've already set it to a JSON
  .res$encode = identity
})

See it in action:

req = Request$new(path = "/foo", method = "GET", parameters_query = list(key1 = "value1", key2 = "value2"))
resp = app$process_request(req)
resp$body
#> [1] "{\"key1\":\"value1\",\"key2\":\"value2\"}"

Assume you would like to analyze how your web service behaves. For that you may need to log every request and response, in order to see whether the service replies with errors and what causes them. This is a perfect task for a middleware, and here is how you can achieve it with RestRserve:

logging_middleware = Middleware$new(
  process_request = function(.req, .res) {
    msg = list(
      middleware = "logging_middleware",
      request_id = .req$id,
      request = list(headers = .req$headers, method = .req$method, path = .req$path), 
      timestamp = Sys.time()
    )
    msg = RestRserve::to_json(msg)
    cat(msg, sep = '\n')
  },
  process_response = function(.req, .res) {
    msg = list(
      middleware = "logging_middleware",
      # we would like to have a request_id for each response in order to correlate
      # request and response
      request_id = .req$id,
      response = list(headers = .res$headers, status_code = .res$status_code, body = .res$body),
      timestamp = Sys.time()
    )
    msg = to_json(msg)
    cat(msg, sep = '\n')
  },
  id = "logging"
)

app$append_middleware(logging_middleware)

Let’s test again:

req = Request$new(path = "/foo", method = "GET", parameters_query = list(key1 = "value1", key2 = "value2"))
resp = app$process_request(req)
#> {"middleware":"logging_middleware","request_id":"50b06e54-171b-11ef-b3c7-faffc2676f47","request":{"headers":{},"method":"GET","path":"/foo"},"timestamp":"2024-05-21 10:39:15"}
#> {"middleware":"logging_middleware","request_id":"50b06e54-171b-11ef-b3c7-faffc2676f47","response":{"headers":{"Server":"RestRserve/1.2.3; Rserve/1.8.13"},"status_code":200,"body":"{\"key1\":\"value1\",\"key2\":\"value2\"}"},"timestamp":"2024-05-21 10:39:15"}

Let’s see what happens if we send a request to a nonexistent endpoint:

req = Request$new(path = "/foo2", method = "GET", parameters_query = list(key1 = "value1", key2 = "value2"))
resp = app$process_request(req)
#> {"middleware":"logging_middleware","request_id":"50b26506-171b-11ef-b3c7-faffc2676f47","request":{"headers":{},"method":"GET","path":"/foo2"},"timestamp":"2024-05-21 10:39:15"}
#> {"middleware":"logging_middleware","request_id":"50b26506-171b-11ef-b3c7-faffc2676f47","response":{"headers":{"Server":"RestRserve/1.2.3; Rserve/1.8.13"},"status_code":404,"body":"404 Not Found"},"timestamp":"2024-05-21 10:39:15"}

Later you can find all error responses in the log by filtering on status_code >= 400, and locate the corresponding requests via the request_id field.
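The filtering and correlation described above can be sketched in plain R with jsonlite. The log lines here are shortened stand-ins for the output shown earlier, and the ids `req-1` and `req-2` are hypothetical placeholders for the real request ids.

```r
library(jsonlite)

# shortened stand-ins for the lines printed by logging_middleware
log_lines = c(
  '{"middleware":"logging_middleware","request_id":"req-1","request":{"method":"GET","path":"/foo"}}',
  '{"middleware":"logging_middleware","request_id":"req-1","response":{"status_code":200}}',
  '{"middleware":"logging_middleware","request_id":"req-2","request":{"method":"GET","path":"/foo2"}}',
  '{"middleware":"logging_middleware","request_id":"req-2","response":{"status_code":404,"body":"404 Not Found"}}'
)
entries = lapply(log_lines, fromJSON, simplifyVector = FALSE)

# [[ ]] matches names exactly; e$request would partially match "request_id"
is_error = vapply(entries, function(e) {
  !is.null(e[["response"]]) && e[["response"]][["status_code"]] >= 400
}, logical(1))
error_ids = vapply(entries[is_error], function(e) e[["request_id"]], character(1))

# find the request entries that share an id with an error response
is_failed_request = vapply(entries, function(e) {
  !is.null(e[["request"]]) && e[["request_id"]] %in% error_ids
}, logical(1))
failed_paths = vapply(entries[is_failed_request], function(e) e[["request"]][["path"]], character(1))

print(failed_paths)
#> [1] "/foo2"
```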

Middleware order

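It is important to understand that middlewares process requests in the order you added them (that’s why the method is called append_middleware) and process responses in the reverse order, as the log output in the compression example confirms. The flow is shown in the diagram below.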
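The flow can be sketched in plain R without RestRserve. This sketch assumes, as the log output in the compression example suggests, that response hooks run in the reverse order of request hooks. `make_middleware` and `run_chain` are hypothetical helpers written for this illustration only.

```r
# hypothetical stand-in for a middleware: each hook appends its name to a trace
make_middleware = function(name) {
  list(
    process_request = function(trace) c(trace, paste0(name, ":request")),
    process_response = function(trace) c(trace, paste0(name, ":response"))
  )
}

# same order of registration as in this document: logging first, then gzip
middlewares = list(make_middleware("logging"), make_middleware("gzip"))

run_chain = function(middlewares) {
  trace = character(0)
  # request hooks run in the order the middlewares were appended
  for (mw in middlewares) trace = mw$process_request(trace)
  trace = c(trace, "handler")
  # response hooks run in the reverse order
  for (mw in rev(middlewares)) trace = mw$process_response(trace)
  trace
}

trace = run_chain(middlewares)
print(trace)
#> [1] "logging:request"  "gzip:request"     "handler"          "gzip:response"    "logging:response"
```

The middleware appended first is therefore the outermost layer: it sees the request first and the response last.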

Compression

To demonstrate the order in which middlewares are called, let’s consider another example.

Sometimes it is useful to compress the response body in order to send less data over the wire. Here we will implement a middleware that compresses the response with gzip.

gzip_middleware = Middleware$new(
  process_request = function(.req, .res) {
    msg = list(
      middleware = "gzip_middleware",
      request_id = .req$id,
      timestamp = Sys.time()
    )
    msg = to_json(msg)
    cat(msg, sep = '\n')
  },
  process_response = function(.req, .res) {
    
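    # compress body; note that memCompress(type = "gzip") emits a zlib stream
    # (RFC 1950) rather than a gzip file stream (RFC 1952), so check what your
    # clients expect before relying on this header outside of a demo
    .res$set_header("Content-encoding", "gzip")
    .res$set_body(memCompress(.res$body, "gzip"))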
    
    msg = list(
      middleware = "gzip_middleware",
      request_id = .req$id,
      timestamp = Sys.time()
    )
    msg = to_json(msg)
    cat(msg, sep = '\n')
  },
  id = "gzip"
)
app$append_middleware(gzip_middleware)
req = Request$new(path = "/foo", method = "GET", parameters_query = list(key1 = "value1", key2 = "value2"))
resp = app$process_request(req)
#> {"middleware":"logging_middleware","request_id":"50b7433c-171b-11ef-b3c7-faffc2676f47","request":{"headers":{},"method":"GET","path":"/foo"},"timestamp":"2024-05-21 10:39:15"}
#> {"middleware":"gzip_middleware","request_id":"50b7433c-171b-11ef-b3c7-faffc2676f47","timestamp":"2024-05-21 10:39:15"}
#> {"middleware":"gzip_middleware","request_id":"50b7433c-171b-11ef-b3c7-faffc2676f47","timestamp":"2024-05-21 10:39:15"}
#> {"middleware":"logging_middleware","request_id":"50b7433c-171b-11ef-b3c7-faffc2676f47","response":{"headers":{"Server":"RestRserve/1.2.3; Rserve/1.8.13","Content-encoding":"gzip"},"status_code":200,"body":"eJyrVspOrTRUslIqS8wpTTVU0gHxjWB8I6VaAK3KCjs="},"timestamp":"2024-05-21 10:39:15"}

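Note that the logging middleware recorded the compressed body (shown base64-encoded in the log), because its response hook ran after the gzip one. Now let’s check the actual decoded response body: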

rawToChar(memDecompress(resp$body, "gzip"))
#> [1] "{\"key1\":\"value1\",\"key2\":\"value2\"}"
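The middleware above compresses every response unconditionally. A variant that compresses only when the client advertises gzip support could look like the sketch below. It assumes that header names are stored lowercased in `.req$headers`. The names `demo_app` and `conditional_gzip_middleware` are hypothetical, and the app is a separate one so it does not interfere with the example above.

```r
library(RestRserve)

# same kind of endpoint as in the example above, on a separate hypothetical app
demo_app = Application$new(content_type = "application/json")
demo_app$add_get("/foo", function(.req, .res) {
  .res$set_body(to_json(.req$parameters_query))
  .res$encode = identity
})

conditional_gzip_middleware = Middleware$new(
  process_request = function(.req, .res) {
    invisible(TRUE)
  },
  process_response = function(.req, .res) {
    # assumption: header names are stored lowercased in .req$headers
    accepted = .req$headers[["accept-encoding"]]
    client_accepts_gzip = !is.null(accepted) && any(grepl("gzip", accepted, fixed = TRUE))
    # only compress bodies that are still plain text
    if (client_accepts_gzip && is.character(.res$body)) {
      .res$set_header("Content-encoding", "gzip")
      .res$set_body(memCompress(.res$body, "gzip"))
    }
  },
  id = "conditional_gzip"
)
demo_app$append_middleware(conditional_gzip_middleware)

query = list(key1 = "value1")
plain_req = Request$new(path = "/foo", method = "GET", parameters_query = query)
gzip_req = Request$new(path = "/foo", method = "GET", parameters_query = query,
                       headers = list("accept-encoding" = "gzip"))

# read the body right after each call, before sending the next request
plain_body = demo_app$process_request(plain_req)$body
gzip_body = demo_app$process_request(gzip_req)$body

is.raw(gzip_body)
rawToChar(memDecompress(gzip_body, "gzip")) == plain_body
```

Clients that do not send the header keep receiving the plain JSON body, so the middleware can be appended without breaking existing consumers.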