Copy value with jq

I use jq heavily in my day to day work. It's a powerful tool but not always easy, so I have piles of notes about how to do things with it.

I had to do a few iterations of copying a value from one JSON file to another the other day, and the files are large, and copying and pasting and launching editors was getting old, so I reached for jq. And after digging for a while and reading Stack Overflow answers interpreting the manual page for me, here's how to do it, in fish:

jq --argjson js (jq .path.field a.json) '.path.field = $js' b.json | sponge b.json

The subshell invoked by () reads the value from a.json, --argjson sets the value the variable js without extra quotes or escapes, as it's already a valid JSON value, and then the assignment just sets the value to the document read from b.json.

It also requires sponge because the jq developers don't want it to have an -i option. Oh well.

Parsing and evaluating mathematical expressions in Swift / Part 3: Interpreter

Ever needed to interpret mathematical expressions with variables, like a.field1 + (a.field2 - b.field1) * 2, in Swift? I did. This series of blog posts will walk you through my solution. This is part 3 of the series:

  1. Tokenization
  2. Building a syntax tree
  3. Evaluating the syntax tree

Evaluating the syntax tree

To recap, we started with the string a.field1 + (a.field2 - b.field1) * 2 and ended up with this Swift structure:

ExpressionTreeNode.operation(
    .addition,
    .atom(.name(["a", "field1"])),
    .operation(
        .multiplication,
        .operation(
            .subtraction,
            .atom(.name(["a", "field2"])),
            .atom(.name(["b", "field1"]))
        ),
        .atom(.number(2))
    )
)

Now we want to compute the value of that tree. That requires two things: a depth-first walk of the tree to compute the nested values first, and a context where we can find values for the names. Let's look at the name context first.

Binding values to names is something we're very well equipped for. It's just a simple dictionary. One thing we do have to consider is the types of values. My implementation deals in floating point values, but we have those names as arrays, allowing nested contexts. You might be able to go with enum values, but in this case I ended up with plain dynamic typing: values typed as Any, types checked at runtime. So our dictionary is typed as [String: Any].

Do we want to build up dictionaries of all our values? I suppose we could. But as you might have guessed from the names in our expressions, the values are actually derived from some actual live objects. It might be nicer to feed those objects directly to the evaluator, right? We can use a protocol to make it happen. Swift's dictionaries use Swift's subscript functionality to allow access to values, and you can adopt that in your own types, too. So let's define a protocol:

protocol ExpressionEvaluatorNameContext {
    subscript(name: String) -> Any? { get }
}

So what do we want, the dictionary I discussed earlier or types implementing that protocol? Swift allows us to have our cake and eat it too:

extension Dictionary: ExpressionEvaluatorNameContext where Key == String, Value == Any {}

And if we want to wrap our objects in that protocol, we can define simple wrapper types:

struct EvaluatorContextUIViewWrapper {
    private let view: UIView

    init(view: UIView) {
        self.view = view
    }
}

extension EvaluatorContextUIViewWrapper: ExpressionEvaluatorNameContext {
    subscript(name: String) -> Any? {
        switch name {
        case "height": return Double(self.view.bounds.height)
        default: return nil
        }
    }
}

Now that we have our context types defined, we can start digging for the values. First let's define a couple of error types that we'll use in our evaluator. EvaluationError is what we'll use to signal problems outside, and NameLookupInContextFailed is used to accumulate errors in recursive name resolution.

enum EvaluationError: Error {
    case emptyName
    case unknownName([String], Error)
    case invalidOperation(ExpressionOperator, Any, Any)
}

struct NameLookupInContextFailed: Error {
    let name: String
    let cause: Error?
}

Then define a couple of functions for finding values in those nested contexts. One provides the outside interface, the other recursively calls itself with names from the list until reaches the end or runs into a problem:

func contextValue(for name: [String], context: ExpressionEvaluatorNameContext) throws -> Any {
    guard let nameHead = name.first else { throw EvaluationError.emptyName }
    do {
        return try contextValue(for: nameHead, tail: name.dropFirst(), in: context)
    } catch {
        throw EvaluationError.unknownName(name, error)
    }
}

func contextValue(
    for nameHead: String,
    tail: ArraySlice<String>,
    in context: ExpressionEvaluatorNameContext
) throws -> Any {
    let maybeValue = context[nameHead]

    guard let value = maybeValue else {
        throw NameLookupInContextFailed(name: nameHead, cause: nil)
    }

    if let tailHead = tail.first {
        guard let subContext = value as? ExpressionEvaluatorNameContext else {
            throw NameLookupInContextFailed(name: nameHead, cause: nil)
        }

        do {
            return try contextValue(for: tailHead, tail: tail.dropFirst(), in: subContext)
        } catch {
            throw NameLookupInContextFailed(name: nameHead, cause: error)
        }
    } else {
        return value
    }
}

That's pretty straightforward — look for a value, check if we have still more parts of the name to resolve, if so recurse, otherwise return the value, check types and throw errors as necessary. With those functions in place, we can resolve names in our expressions.

We can now move on to expression node evaluation. We have two kinds of expression nodes: ones with just a simple value, and ones with an operation on other nodes.

func evaluate(expression: ExpressionTreeNode, context: ExpressionEvaluatorNameContext) throws -> Any {
    switch expression {
    case let .atom(atom):
        return try evaluate(atom: atom, context: context)
    case let .operation(op, n1, n2):
        return try evaluate(operation: op, node1: n1, node2: n2, context: context)
    }
}

For atoms, it can be a number which we can return directly, or a name, in which case we'll use the contextValue functions we defined earlier.

func evaluate(atom: ExpressionTreeAtom, context: ExpressionEvaluatorNameContext) throws -> Any {
    switch atom {
    case let .name(name): return try contextValue(for: name, context: context)
    case let .number(num): return num
    }
}

And for operators we need to find a function that implements the operator and then evaluate both sides so we get the final operand values for the operation.

func evaluate(
    operation: ExpressionOperator,
    node1: ExpressionTreeNode,
    node2: ExpressionTreeNode,
    context: ExpressionEvaluatorNameContext
) throws -> Any {
    let opf = opFunc(operation)
    let val1 = try evaluate(expression: node1, context: context)
    let val2 = try evaluate(expression: node2, context: context)
    return try opf(val1, val2)
}

All right, almost done! Except for the implementation of opFunc which should give us actual operator implementations. We need functions that are ready to deal with two Any values and do math of them. I'm going to present simple versions here that deal only with Doubles for brevity. In reality you may want to support other numeric types too, as well as deal with Optionals. The op function is generic and will happily deal with any types you throw at it, but optionals would require a bit more work to unwrap them.

func opFunc(_ op: ExpressionOperator) -> (Any, Any) throws -> Any {
    switch op {
    case .addition: return opAddition(_:_:)
    case .subtraction: return opSubtraction(_:_:)
    case .multiplication: return opMultiplication(_:_:)
    case .division: return opDivision(_:_:)
    }
}

func opAddition(_ lhs: Any, _ rhs: Any) throws -> Any {
    if let result = op(Double.self, +, lhs, rhs) { return result }
    throw EvaluationError.invalidOperation(.addition, lhs, rhs)
}

func opSubtraction(_ lhs: Any, _ rhs: Any) throws -> Any {
    if let result = op(Double.self, -, lhs, rhs) { return result }
    throw EvaluationError.invalidOperation(.subtraction, lhs, rhs)
}

func opMultiplication(_ lhs: Any, _ rhs: Any) throws -> Any {
    if let result = op(Double.self, *, lhs, rhs) { return result }
    throw EvaluationError.invalidOperation(.multiplication, lhs, rhs)
}

func opDivision(_ lhs: Any, _ rhs: Any) throws -> Any {
    if let result = op(Double.self, /, lhs, rhs) { return result }
    throw EvaluationError.invalidOperation(.division, lhs, rhs)
}

func op<T>(_ type: T.Type, _ operation: (T, T) -> T, _ lhs: Any, _ rhs: Any) -> T? {
    if let lhsValue = lhs as? T, let rhsValue = rhs as? T {
        return operation(lhsValue, rhsValue)
    }

    return nil
}

Conclusion

That's it! That'll take you from the expression as a string to a final double value, as long as you have suitable values in the context. If we give it a spin, the following results in 42:

let context: [String: Any] =  [
    "b": ["field1": 4.0] as [String: Any],
    "a": ["field1": 10.0, "field2": 20.0] as [String: Any]
]

let tokenized = try tokenizeExpression(input: "a.field1 + (a.field2 - b.field1) * 2")
let node = try parseTokenizedExpression(expr: tokenized)
try evaluate(expression: node, context: context)

This was a fun problem to work on. I got to use a very functional approach to a problem I hadn't had to tackle before. I'm sure a lot of this is the kind of thing the first chapters of a compilers course would cover, but it was unfamiliar territory for me.

There's lots of room to expand. In the part about parser combinators I left lots of details uncovered. The evaluator only supports non-optional Doubles. And the expression language is very limited with only four operators and no function calls (spoiler: they're easy to add.) But if you ever need to work on something like this and don't have a modern compiler text book handy, this might get you off to a decent start.

Parsing and evaluating mathematical expressions in Swift / Part 2: Building a syntax tree

Ever needed to interpret mathematical expressions with variables, like a.field1 + (a.field2 - b.field1) * 2, in Swift? I did. This series of blog posts will walk you through my solution. This is part 2 of the series:

  1. Tokenization
  2. Building a syntax tree
  3. Evaluating the syntax tree

Parsing a list into a syntax tree

A refresher: we're looking at this expression: a.field1 + (a.field2 - b.field1) * 2. We've now chopped it up into a list of tokens.

What you want next: something that lets you interpret it easily. One option would be a stack, like in Forth or Reverse Polish notation. That would mean transforming the expression to into a Swift array like this:

[.name(["a", "field1"]), .name(["a", "field2"]), .name(["b", "field1"]), .subtraction, .number(2), .multiplication, .addition]

That would allow you interpret it with a stack machine, popping operands and operators off the stack one by one.

Another option, and the option I'm going for, is a syntax tree. It's a tree you can walk to evaluate your expression. Make a tree where each operator is a node and its operands subnodes, then when evaluating first evaluate the subnodes before evaluating the node itself. A diagram of the tree would look like this:

          ┌───┐           
      ┌───│ + │───┐       
      ▼   └───┘   ▼       
┌──────────┐    ┌───┐     
│ a.field1 │ ┌──│ * │──┐  
└──────────┘ │  └───┘  │  
             ▼         ▼  
           ┌───┐     ┌───┐
      ┌────│ - │────┐│ 2 │
      │    └───┘    │└───┘
      ▼             ▼     
┌──────────┐  ┌──────────┐
│ a.field2 │  │ b.field1 │
└──────────┘  └──────────┘

To model that tree in Swift, you'll want a few types. Indirect enums are great for tree nodes:

enum ExpressionOperator {
    case addition
    case subtraction
    case multiplication
    case division
}

enum ExpressionTreeAtom {
    case number(Double)
    case name([String])
}

indirect enum ExpressionTreeNode {
    case atom(ExpressionTreeAtom)
    case operation(ExpressionOperator, ExpressionTreeNode, ExpressionTreeNode)
}

To get from the token list to a data structure built with those types, the thing we have to figure out is operator precedence. There are a bunch of algorithms for solving that. I picked Eli Bendersky's article on precedence climbing as my guide and was happy with the result. The idea is to define precedence levels (* is solved before +) and associativity (in what order is a list of equal-precedence operators solved) for the operators we support, then recursively handle the tokens.

Let's start by defining precedences and associativities:

enum Associativity: Equatable {
    case left
    case right
}

struct OperatorInfo {
    let precedence: Int
    let associativity: Associativity
}

extension ExpressionOperator {
    var operatorInfo: OperatorInfo {
        switch self {
        case .addition: return OperatorInfo(precedence: 1, associativity: .left)
        case .subtraction: return OperatorInfo(precedence: 1, associativity: .left)
        case .multiplication: return OperatorInfo(precedence: 2, associativity: .left)
        case .division: return OperatorInfo(precedence: 2, associativity: .left)
        }
    }
} 

With those definitions ready, we're ready for the algorithm itself. It's two mutually recursive functions, walking an ArraySlice of the tokens with the space tokens filtered out. computeAtom handles individual atoms, computeExpr compound expressions. When computeAtom runs into an parentheses it calls computeExpr, and computeExpr calls computeAtom to get a value for the first operand and itself to solve the second one. It's a pretty direct translation of Bendersky's example to Swift, with the difference that he computed values while walking the token list, whereas my code creates the syntax tree for evaluation later.

The precedence values define how expressions are divided into subexpressions, and so control how the recursive computeExpr calls consume parts of the expression.

func computeAtom(tokens: inout ArraySlice<ExpressionToken>) throws -> ExpressionTreeNode {
    guard let tok = tokens.first else {
        throw ExpressionParseError.unexpectedEndOfExpression
    }
    switch tok {
    case .openParen:
        tokens = tokens.dropFirst()
        let subNode = try computeExpr(tokens: &tokens, minimumPrecedence: 1)
        guard case .closeParen = tokens.first else {
            throw ExpressionParseError.unbalancedParentheses
        }
        tokens = tokens.dropFirst()
        return subNode
    case .addition,
         .subtraction,
         .multiplication,
         .division:
        guard let op = ExpressionOperator(token: tok) else {
            preconditionFailure("Failed to convert token \(tok) to Operator")
        }
        throw ExpressionParseError.unexpectedOperator(op)
    case .closeParen:
        throw ExpressionParseError.unbalancedParentheses
    case let .number(num):
        tokens = tokens.dropFirst()
        return .atom(.number(num))
    case let .name(name):
        tokens = tokens.dropFirst()
        return .atom(.name(name))
    case .space:
        preconditionFailure("Unexpected space token")
    }
}

private func computeExpr(
    tokens: inout ArraySlice<ExpressionToken>,
    minimumPrecedence: Int
) throws -> ExpressionTreeNode {
    var node1 = try computeAtom(tokens: &tokens)
    while let tok = tokens.first {
        guard let op = ExpressionOperator(token: tok) else { break }
        let opInfo = op.operatorInfo
        guard opInfo.precedence >= minimumPrecedence else { break }
        let nextMinimumPrecedence = opInfo.precedence + (opInfo.associativity == .left ? 1 : 0)
        tokens = tokens.dropFirst()
        let node2 = try computeExpr(tokens: &tokens, minimumPrecedence: nextMinimumPrecedence)
        node1 = .operation(op, node1, node2)
    }
    return node1
}

We can kick off the computation by filtering out spaces and converting our tokens array into an ArraySlice.

func parseTokenizedExpression(expr: TokenizedExpression) throws -> ExpressionTreeNode {
    var tokens = expr.tokens.filter { $0 != .space }[...]
    return try computeExpr(tokens: &tokens, minimumPrecedence: 1)
}

That'll give us this tree:

ExpressionTreeNode.operation(
    .addition,
    .atom(.name(["a", "field1"])),
    .operation(
        .multiplication,
        .operation(
            .subtraction,
            .atom(.name(["a", "field1"])),
            .atom(.name(["b", "field2"]))
        ),
        .atom(.number(2))
    )
)

Whew. We have now tokenized our string and generated a syntax tree. Only one thing left to do in part 3: evaluating the expression so we can get a value out of it.

© Juri Pakaste 2023