It is my contention that the naked use of regular expressions (regexes) within SRE and DevOps software and tooling leads to:

  • Code that is hard to read
  • Errors that are difficult to reason about
  • Bad matches that are assumed to be good data
  • Outages or unintended consequences in tooling

It is not my contention that regexes are bad in nature, but that their ease of use, combined with their ability to match large and variable amounts of characters (wildcards), allows for problematic use cases. Not all use cases are equal.

This paper concentrates on a single use case with which I have experience, and introduces a framework, called the HalfPike method, to alleviate problems with this use case.

Before going further

It should be noted that like a lexer/parser implementation, this methodology takes longer to develop.

You are trading these:

  • 10x longer to develop a solution
  • Much more verbosity (100x or more)
  • Must teach new people the methodology (about 20 minutes)

For these advantages:

  • Debug time is almost instantaneous
  • Code can be read and deciphered quickly
  • Validation of your data before use
  • Death to stringly typed fields (was it “Enabled”, “enabled”, or “ENablEd”?)

Many of these advantages can be achieved with careful regex use. But like many good programming paradigms, the structure is a guard rail that encourages best practices across large groups of people. This encouragement is powerful in creating good software. Regexes tend to be bad at this encouragement.

There is upfront cost, but I believe over time these costs are paid back with interest in time savings from debugging and operational errors.

Finally, the halfpike package and examples can be found here.

Documentation can be found here.


During my years working on network automation systems at Google, we had a constant problem. We needed to talk with network devices and extract various states from a human-readable string format into concrete types in our native languages. SNMP could not provide complete information, and streaming telemetry was still a pipe dream.

Of our early platforms, only Juniper routers could give us the data in a native format, XML. The XML was quite painful to use, and IOS, IOS XR, EOS, and whatever Brocade calls their OS could not export to a machine-readable format like JSON or Protocol Buffers. This would change in the future, but at that time we could not wait for the vendors to deliver.

For years, the common way we did this was using regular expressions.

This was not a well-thought-out decision on our part. Most of us doing this type of work had never written a lexer, or had done so long ago and were not eager to relive the pain. Regexes were simply something we could get off the shelf.

Our output was often multi-line, so the early versions either used some type of loop with detection of a starting point, where we then regexed a line we understood, or built extraordinarily hard to read regexes that could handle multiple lines of output.

Rarely did we use named matches, choosing positional arguments instead. Worse yet, we assumed the matches were correct (I mean, they matched our one line of test data, didn’t they!?).

One of my co-workers eventually created a framework around regexes. His implementation allowed you to build complex multi-line regexes that would handle getting matches back as a table in which you could then use the positional arguments of the table matches to store into an object’s attributes.

This certainly looked like a massive improvement at first, but uses of the tooling proved hard to debug. As with most regexes we wrote, a new version of an OS or new internal conditions might alter the output. Because the output was produced by third-party software outside our control, there was no dictionary of terms we could use to be 100% sure of what would be returned across every version.

The solution to the debugging problem was to create a debugging tool to help resolve issues that arose. However, I felt that needing a debugger for my regex matching issues was proof that this wasn’t the method we should be using (very similar to how I felt when our Python programs needed us to substitute our own malloc to make them work, #deathtopython).

In addition, the framework would often use overly greedy regexes to match something incorrectly and cause operational problems by providing us with bad data. When 30 engineers might make changes, things slip through if no guard rails exist.

This caused configuration problems and tooling issues in different automation software when it would act incorrectly.

I began researching various lexing software such as ANTLR. I was sure I was on the right track, but I just couldn’t seem to make it through more than 2 pages of the ANTLR book without falling asleep. The only other technical book that had this effect on me was the Sendmail book where I found if I was having insomnia, I would just try to read a chapter.

Eventually, because of my work using Golang (#deathtopython), I came across a talk by Rob Pike on lexers, and I was intrigued by Pike’s twist on the standard state machine for lexing. It is not an easy talk to get through, but once you get the concept down (on my 3rd viewing), you appreciate the methodology.

However, the lexer/parser he described was not a perfect fit for us. It was for a well-described language where the syntax was completely known ahead of time. It is also word-oriented, which makes for more verbosity.

Writing lexer/parsers like this was also too far of a leap for my SRE colleagues coming from “simple regexes”. Lexer/parser code is much more verbose, and until you reason about it for a bit, it looks harder to understand, not easier.

But I had hard requirements for a new system:

  • Do not want to decipher long regexes
  • No special debuggers, when it breaks I want to know exactly why
  • Bad data can never sneak through, so everything must validate
  • Avoid stringly typed fields, instead use enumerations

What emerged was a technique I called the HalfPike. The proving ground was to be a command/configuration abstraction service I was writing. Up to this point, different automations each wrote and read commands via a connection abstraction service to different device types. The new service created a common way to command/configure a device with common data structures, regardless of the device type. This allowed all automation services to use a common way to ask about BGP state, have a device ping a neighbor address, or configure an interface (similar to what OpenConfig aims to do with configuration).

The service was written to extract and transform data from devices that had machine readable formats (JSON, XML, Protocol Buffers) into our common data types when available. When it wasn’t, the human readable format would be converted via a HalfPike lexer/parser into the common data type.

In both cases, we would transform string data into enumerated types when possible.

Finally, we required that each HalfPike implement a Validate() method that checked each attribute for the correct values or combination of values required. By codifying this as a requirement, we avoided misconfiguration of devices with bad data, instead causing the read to fail.

Fixes went from days to minutes, and updates to systems could be made in a few hours.

What Is A HalfPike

The HalfPike method derives its name from borrowing half of its implementation from Rob Pike’s lexer talk. Mr. Pike certainly has not endorsed this.

The HalfPike differs from Pike’s lexer/parser in the following ways:

  • Lexed Item(s) fall into predefined categories and there is a common lexer for all use cases
  • Uses line boundaries over word boundaries in the Parser
  • May use regexes for complex decomposition of an Item or Line
  • Provides methods for skipping lines or finding a particular line in output

The HalfPike methodology differs from standard regex based approaches by:

  • Preference on enumerations over string values
  • Requiring validation of data fields
  • Encouraging numeric conversion from strings
  • Line based approach (regexes can be multi-line)

None of this is 100% foolproof. But we found that, using this methodology, we were never surprised by our field data. We would still need to adjust our parser when we encountered command output we had not seen before, or when the format changed in a new OS version being tested.

This technique was used until we could deprecate older devices unable to do structured output and/or get vendors to implement structured output for their devices. However, many shops still have these problems (not everyone can update hardware with new OS capabilities so quickly), and the technique is usable for other system output that needs to be parsed from human readable to machine readable structured data.

The original version that I used at Google was simply a codified state machine, a few interfaces and enforcement by code review.

Below I will provide a framework to make this as simple as possible for a new user to get into.

HalfPike Lexer

To start, if you have not seen Rob Pike’s talk, “Lexical Scanning in Go”, it provides a good introduction to the lexing engine used for Go’s templates.

Our HalfPike lexer emits items similar to the item type in Pike’s talk. Here’s a look:

// Item represents a token created by the Lexer.
type Item struct {
	// Type is the type of item that is stored in .Val.
	Type ItemType
	// Val is the value of the item that was in the text output.
	Val string
}

We support a few ItemType(s) that the Parser will have to deal with:

const (
	// ItemUnknown indicates that the Item is an unknown. This should only happen on
	// an Item that is the zero value.
	ItemUnknown ItemType = iota
	// ItemEOF indicates that the end of input is reached. No further tokens will be sent.
	ItemEOF
	// ItemText indicates a block of text separated by some type of space (including tabs).
	// This may contain numbers, but if it is not a pure number it is contained in here.
	ItemText
	// ItemInteger indicates that an integer was found.
	ItemInteger
	// ItemFloat indicates that a float was found.
	ItemFloat
	// ItemEOL indicates the end of a line was reached.
	ItemEOL
)

The lexer will only emit these tokens. The most common is ItemText, but if the text contains a pure integer or pure float, ItemInteger or ItemFloat is emitted instead.

Once an ItemEOF is reached, the lexer is done emitting tokens. An ItemUnknown should never be emitted; if the parser sees one, it always indicates an internal error in the framework.

Spaces are never emitted nor are blank lines.
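To make the categories concrete, here is a toy classifier, not the real halfpike lexer (which is hidden from you anyway), showing how whitespace-separated tokens from a sample line would fall into the categories above:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// classify mimics the lexer's categories for a single whitespace-separated
// token: a pure integer, a pure float, or plain text.
func classify(tok string) string {
	if _, err := strconv.Atoi(tok); err == nil {
		return "ItemInteger"
	}
	if _, err := strconv.ParseFloat(tok, 64); err == nil {
		return "ItemFloat"
	}
	return "ItemText"
}

func main() {
	line := "MTU: 1522 Speed: 1000mbps Loss: 0.5"
	for _, tok := range strings.Fields(line) {
		fmt.Printf("%-12s %q\n", classify(tok), tok)
	}
	// "1522" is a pure integer, "0.5" is a pure float; "MTU:" and
	// "1000mbps" are not pure numbers, so they are ItemText.
}
```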

In our HalfPike framework, the lexer is hidden. The user simply has to deal with items output by our lexer and use the Parser framework to parse the output.

HalfPike Parser

The Parser is where all the magic comes in for the user. Here we have our structure for storing the data we parse, helpers to skip through input, etc.

But to talk about the Parser correctly, we need to talk about a few other constructs.

Line Objects

// Line represents a line in the input.
type Line struct {
	// Items are the Item(s) that make up a line.
	Items []Item
	// LineNum is the line number in the content this
	// represents, starting at 1.
	LineNum int
	// Raw is the actual raw string that made up the line.
	Raw string
}

The Line object details the content of a line. .Items gives us the list of lexed Items that make up the line. No spaces are included, and each Line ends with an ItemEOL (or, on the final line, an ItemEOL followed by an ItemEOF).


ParseFn

type ParseFn func(ctx context.Context, p *Parser) ParseFn

The ParseFn is where you write the meat of the program. You receive our Parser object and use it to loop through Line objects until an ItemEOF is reached.

Once you have finished with a line or set of lines, you return the next ParseFn that will handle the content that follows. This makes up a basic state machine.

If you return nil, then parsing stops.

Within your ParseFn, you need to move through content. This is where the Parser comes in.


The Parser has a few methods worth noting:

// Errorf records an error in parsing and returns a nil ParseFn.
func (p *Parser) Errorf(str string, args ...interface{}) ParseFn {...}

You will use return p.Errorf() whenever you want to return an error and stop parsing. It returns a nil ParseFn so you don’t have to do a separate “return nil”.

// Next moves to the next Line sent from the Lexer. That Line is returned. If we haven't
// received the next Line, the Parser will block until that Line has been received.
func (p *Parser) Next() Line {...}

Next() is our basic method of getting content. You call Next() to receive the next Line object in the content.

// Backup undoes a Next() call and returns the items in the previous line.
func (p *Parser) Backup() Line {...}

Backup() goes back one Line of content and returns that Line. It is often used after an initial ParseFn is used to detect the start of input but another ParseFn will do the parsing.

// EOF returns true if the last Item in []Item is a ItemEOF.
func (p *Parser) EOF(line Line) bool {...}

EOF() is used to detect if a Line is the end of the file. Next() will continue to return the last Line once the end of content is reached. EOF() allows you to detect and break out of any loop.

// Peek returns the Line in the next position, but does not change the current position.
func (p *Parser) Peek() Line {...}

Peek() is used to see the next Line of content without moving to that Line.

// FindStart looks for an exact match of starting items in a line represented by Line
// continuing to call .Next() until a match is found or EOF is reached.
// Once this is found, Line is returned. This is done from the current position.
func (p *Parser) FindStart(find []string) (Line, error) {...}

FindStart() takes a list of strings that represent Item.Val at the beginning of a line and calls Next() until it finds a line that has a match.

A special constant called Skip can be used to match any content. An error is returned if the end of content is reached and we have not found a Line with a match.
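The Skip semantics can be illustrated with a toy version of the matching rule (the real logic lives inside the halfpike package; this is just a sketch):

```go
package main

import "fmt"

// Skip is a sentinel that matches any single item (a toy stand-in for the
// halfpike constant).
const Skip = "<skip>"

// isAtStart sketches the rule behind FindStart(): every entry in find must
// equal the item in the same position, except Skip, which matches anything.
func isAtStart(items, find []string) bool {
	if len(items) < len(find) {
		return false
	}
	for i, f := range find {
		if f != Skip && items[i] != f {
			return false
		}
	}
	return true
}

func main() {
	line := []string{"Physical", "interface:", "ge-3/0/2,", "Enabled,", "Physical", "link", "is", "Up"}
	find := []string{"Physical", "interface:", Skip, Skip, "Physical", "link", "is", Skip}
	fmt.Println(isAtStart(line, find)) // true: Skip matched the dynamic values
}
```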

// FindUntil searches a Line until it matches "find", matches "until" or reaches the EOF. If "find" is
// matched, we return the Line. If "until" is matched, we call .Backup() and return true. This
// is useful when you wish to discover a line that represent a sub-entry of a record (find) but wish to
// stop searching if you find the beginning of the next record (until).
func (p *Parser) FindUntil(find []string, until []string) (matchFound Line, untilFound bool, err error) {

FindUntil() is similar to FindStart(), except it stops searching when either "find" or "until" is matched. If "find" is found, the Line is returned. If "until" is found, .Backup() is called and untilFound is true.

This allows searching through entries that belong to a record for "find" but stopping if we find the beginning of the next record denoted by "until".

// IsAtStart checks to see that "find" is at the beginning of "line".
func (p *Parser) IsAtStart(line Line, find []string) bool {...}

IsAtStart() is the basis for FindStart() and FindUntil(). This can be used to make your own searching method if the others don’t fit.

func (p *Parser) FindREStart(find []*regexp.Regexp) (Line, error) {...}
func (p *Parser) IsREStart(line Line, find []*regexp.Regexp) bool {...}

These are the same as FindStart() and IsAtStart(), except that instead of exact string matches they use regular expressions.

Let’s Parse

Let’s have a really simple example for parsing:

Physical interface: ge-3/0/2, Enabled, Physical link is Up
  Link-level type: 52, MTU: 1522, Speed: 1000mbps, Loopback: Disabled,
Physical interface: ge-3/0/3, Enabled, Physical link is Up
  Link-level type: 52, MTU: 1522, Speed: 1000mbps, Loopback: Disabled,

This is output from a Juniper router using the “show interfaces brief” command. There is actually more output here, but these are the two lines at the start of an entry I care about in our example.

Data types to store in

// Interfaces is a collection of Interface information for a device.
type Interfaces []*Interface

// Validate implements halfpike.Validator.
func (i Interfaces) Validate() error {
	for _, v := range i {
		if err := v.Validate(); err != nil {
			return err
		}
	}
	return nil
}

type LinkLevel int8

const (
	LLUnknown  LinkLevel = 0
	LL52       LinkLevel = 1
	LLPPP      LinkLevel = 2
	LLEthernet LinkLevel = 3
)

type InterState int8

const (
	IStateUnknown  InterState = 0
	IStateEnabled  InterState = 1
	IStateDisabled InterState = 2
)

type InterStatus int8

const (
	IStatUnknown InterStatus = 0
	IStatUp      InterStatus = 1
	IStatDown    InterStatus = 2
)

// Interface is a brief description of a network interface.
type Interface struct {
	// VendorDesc is the name a vendor gives the interface, like ge-10/2/1.
	VendorDesc string
	// Blade is the blade in the routing chassis.
	Blade int
	// Pic is the pic position on the blade.
	Pic int
	// Port is the port in the pic.
	Port int
	// State is the interface's current state.
	State InterState
	// Status is the interface's current status.
	Status InterStatus
	// LinkLevel is the type of encapsulation used on the link.
	LinkLevel LinkLevel
	// MTU is the maximum amount of bytes that can be sent on the frame.
	MTU int
	// Speed is the interface's speed in bits per second.
	Speed int

	initCalled bool
}

// init initializes Interface with sentinel values that distinguish
// "unset" from valid zero values.
func (i *Interface) init() {
	i.Blade = -1
	i.Pic = -1
	i.Port = -1
	i.MTU = -1
	i.Speed = -1
	i.initCalled = true
}

// Validate implements halfpike.Validator.
func (i Interface) Validate() error {
	if !i.initCalled {
		return fmt.Errorf("internal error: Interface.init() was never called")
	}
	if i.VendorDesc == "" {
		return fmt.Errorf("interface did not have a valid VendorDesc")
	}
	switch -1 {
	case i.Blade, i.Pic, i.Port:
		return fmt.Errorf("interface %s did not have a valid Blade/Pic/Port(%d/%d/%d)", i.VendorDesc, i.Blade, i.Pic, i.Port)
	case i.MTU:
		return fmt.Errorf("interface %s did not have a valid MTU", i.VendorDesc)
	case i.Speed:
		return fmt.Errorf("interface %s did not have a valid Speed", i.VendorDesc)
	}

	if i.State == IStateUnknown {
		return fmt.Errorf("interface %s did not have a valid state", i.VendorDesc)
	}
	if i.Status == IStatUnknown {
		return fmt.Errorf("interface %s did not have a valid status", i.VendorDesc)
	}
	if i.LinkLevel == LLUnknown {
		return fmt.Errorf("interface %s did not have a valid link level", i.VendorDesc)
	}
	return nil
}

Note: This is not a particularly good format. First, I’d use protocol buffers or some other cross-language type for storage. Second, not everything has a blade and a pic; you need a better methodology to handle those cases. But this isn’t a lesson on network representation formats. Also, you should be able to get structured output from the Juniper; this is just an example.

Define our parsing states with ParseFn(s)

To make this transformation, we will need to create states in a state machine to handle searching output and turning it into this format.

I’m choosing here to bundle our ParseFn(s) into a type called interBriefParsers. It stores the interfaces we find and a copy of our *Parser object.

type interBriefParsers struct {
	parser *Parser
	inters Interfaces
}

func (i *interBriefParsers) errorf(s string, a ...interface{}) ParseFn {
	if len(i.inters) > 0 {
		v := i.current().VendorDesc
		if v != "" {
			return i.parser.Errorf("interface(%s): %s", v, fmt.Sprintf(s, a...))
		}
	}
	return i.parser.Errorf(s, a...)
}

Here we have a convenience wrapper for writing errors. If we were able to parse the VendorDesc of an interface (like ge-0/1/1), we use that in our error output. If not, we just detail the error.

var phyStart = []string{"Physical", "interface:", Skip, Skip, "Physical", "link", "is", Skip}

// Physical interface: ge-3/0/2, Enabled, Physical link is Up
func (i *interBriefParsers) findInterface(ctx context.Context, p *Parser) ParseFn {
	if i.parser == nil {
		i.parser = p
	}

	// The Skip here says that we need to have an item here, but we don't care what it is.
	// This way we can deal with dynamic values and ensure we
	// have the minimum values we need.
	// p.FindREStart() can be used if you require more
	// complex matching of static values.
	_, err := p.FindStart(phyStart)
	if err != nil {
		if len(i.inters) == 0 {
			return i.errorf("could not find a physical interface in the output")
		}
		return nil
	}

	// Create our new entry.
	inter := &Interface{}
	inter.init() // Set sentinel values so Validate() can detect unset fields.
	i.inters = append(i.inters, inter)

	p.Backup() // I like to start all ParseFn with either Find...() or p.Next() for consistency.
	return i.phyInter
}

Here is our starting ParseFn. Simply we use the FindStart() to locate the first line that matches phyStart. This is the beginning of a record for us to store.

Once found, we create a new Interface{} object and append it to our list of interfaces we find.

Finally, we do a .Backup() and pass the line to a ParseFn called phyInter to break down the line. We could have just done this here, but I find this cleaner, at least for the first line of a record.

var toInterState = map[string]InterState{
	"Enabled,":  IStateEnabled,
	"Disabled,": IStateDisabled,
}

var toStatus = map[string]InterStatus{
	"Up":   IStatUp,
	"Down": IStatDown,
}

// Physical interface: ge-3/0/2, Enabled, Physical link is Up
func (i *interBriefParsers) phyInter(ctx context.Context, p *Parser) ParseFn {
	// These are indexes within the line where our values are.
	const (
		name        = 2
		stateIndex  = 3
		statusIndex = 7
	)

	line := p.Next() // fetches the next line of output.

	i.current().VendorDesc = line.Items[name].Val // this will be ge-3/0/2 in the example above
	if err := i.interNameSplit(line.Items[name].Val); err != nil {
		return i.errorf("error parsing the name into blade/pic/port: %s", err)
	}

	state, ok := toInterState[line.Items[stateIndex].Val]
	if !ok {
		return i.errorf("error parsing the interface state: %s is not a known state", line.Items[stateIndex].Val)
	}
	i.current().State = state

	status, ok := toStatus[line.Items[statusIndex].Val]
	if !ok {
		return i.errorf("error parsing the interface status: %s is not a known status", line.Items[statusIndex].Val)
	}
	i.current().Status = status

	return i.findLinkLevel
}

Note: There is a convenience method called .current() that gives us the current Interface{} that we are working on.

phyInter starts by grabbing the line via the .Next() call. We want to record the Vendor’s description of the interface before we break it down. There are a few constants that record the index in the Line.Items where that entry should be located. We record the VendorDesc attribute by simply referencing the index where it should be stored.

Then we need to break down the vendor’s description and turn it into our Blade/Pic/Port entries. To do this, we pass that string representation to interNameSplit(), which we will detail a little further on.

Next, we want to get our interface state and status. This is very similar to VendorDesc, except that we want to convert to a known enumerator type. We define a few maps, toInterState and toStatus to handle this.

Finally, if we have had no issues, we return our next state, which is findLinkLevel.

Now, let’s go back to interNameSplit().

// ge-3/0/2
var interNameRE = regexp.MustCompile(`(?P<inttype>ge)-(?P<blade>\d+)/(?P<pic>\d+)/(?P<port>\d+),`)

func (i *interBriefParsers) interNameSplit(s string) error {
	matches, err := Match(interNameRE, s)
	if err != nil {
		return fmt.Errorf("error dissecting the interface name(%s): %s", s, err)
	}

	for k, v := range matches {
		if k == "inttype" {
			continue
		}
		in, err := strconv.Atoi(v)
		if err != nil {
			return fmt.Errorf("could not convert value for %s(%s) to an integer", k, v)
		}
		switch k {
		case "blade":
			i.current().Blade = in
		case "pic":
			i.current().Pic = in
		case "port":
			i.current().Port = in
		}
	}
	return nil
}

Here we use a regex in a very limited capacity. We use named matches to break the interface name apart into the interface type, the blade, the pic, and finally the port. We then convert those entries to their numerical representations and store them. We skip the interface type, because it isn’t important here (in real life it might actually matter on a platform once you reach speeds like 40G, 100G, or 400G that can be broken into multiple logical ports via breakouts). We will simply use the Speed of the port to understand what the speed is.

// toLinkLevel is assumed here; the original text references it without
// showing its definition. The key keeps the trailing comma the lexer sees.
var toLinkLevel = map[string]LinkLevel{
	"52,": LL52,
}

// Link-level type: 52, MTU: 1522, Speed: 1000mbps, Loopback: Disabled,
func (i *interBriefParsers) findLinkLevel(ctx context.Context, p *Parser) ParseFn {
	const (
		llTypeIndex = 2
		mtuIndex    = 4
		speedIndex  = 6
	)

	line, until, err := p.FindUntil([]string{"Link-level", "type:", Skip, "MTU:", Skip, "Speed:", Skip}, phyStart)
	if err != nil {
		return i.errorf("did not find Link-level before end of file reached")
	}
	if until {
		return i.errorf("did not find Link-level before finding the next interface")
	}

	ll, ok := toLinkLevel[line.Items[llTypeIndex].Val]
	if !ok {
		return i.errorf("unknown link level type: %s", line.Items[llTypeIndex].Val)
	}
	i.current().LinkLevel = ll

	mtu, err := strconv.Atoi(strings.Split(line.Items[mtuIndex].Val, ",")[0])
	if err != nil {
		return i.errorf("mtu did not seem to be a valid integer: %s", line.Items[mtuIndex].Val)
	}
	i.current().MTU = mtu

	if err := i.speedSplit(line.Items[speedIndex].Val); err != nil {
		return i.errorf("problem interpreting the interface speed: %s", err)
	}

	return i.record
}

Here we search through the record’s entries until we either find the Link-level line or reach the next record.

If we reach the next record without finding a Link-level line, it is an error.

Similar conversion is done for the numeric values. The speedSplit() method converts the measured multiplier (kbps, mbps, gbps) so the value can be recorded as bps, the common denominator for all interfaces.
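speedSplit()'s implementation isn't shown in the article; a minimal sketch of the conversion, assuming vendor suffixes of kbps/mbps/gbps and an optional trailing comma (the real method would store the result into i.current().Speed), might look like:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// speedToBPS converts a vendor speed string such as "1000mbps," into bits
// per second, the common denominator for all interfaces.
func speedToBPS(s string) (int, error) {
	s = strings.TrimSuffix(strings.ToLower(s), ",")
	var mult int
	switch {
	case strings.HasSuffix(s, "kbps"):
		mult, s = 1000, strings.TrimSuffix(s, "kbps")
	case strings.HasSuffix(s, "mbps"):
		mult, s = 1000000, strings.TrimSuffix(s, "mbps")
	case strings.HasSuffix(s, "gbps"):
		mult, s = 1000000000, strings.TrimSuffix(s, "gbps")
	default:
		return 0, fmt.Errorf("speed %q did not have a known suffix", s)
	}
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("speed %q was not a valid integer", s)
	}
	return n * mult, nil
}

func main() {
	bps, err := speedToBPS("1000mbps,")
	if err != nil {
		panic(err)
	}
	fmt.Println(bps) // 1000000000
}
```

Note that an unknown suffix is an error rather than a guess; bad data must never sneak through.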

If everything is successful, we move to a state that records our record.

// record our data back to the parser.
func (i *interBriefParsers) record(ctx context.Context, p *Parser) ParseFn {
	i.parser.Validator = i.inters
	return i.findInterface
}

Here we assign our internal slice back to our parser’s Validator attribute.

And finally, we go searching for more interfaces going back to the start of our state machine (findInterface()).

Now, let’s do some parsing!

func main() {
	// Creates our parser object that our various ParseFn methods will use to move
	// through the input.
	p, err := NewParser(showIntBrief, Interfaces{})
	if err != nil {
		panic(err)
	}

	// An object that contains our various ParseFn methods.
	states := &interBriefParsers{}

	// Parses our content in showIntBrief, beginning with states.findInterface,
	// which is a ParseFn.
	if err := Parse(context.Background(), p, states.findInterface); err != nil {
		panic(err)
	}

	fmt.Println(pretty.Sprint(p.Validator.(Interfaces)))
}

Here we create our Parser via NewParser(). We pass it the content we wish to parse (showIntBrief) and what we will store the data into (Interfaces{}) which must satisfy the Validator interface.

Next we create an instance of our state machine and assign that to states.

Finally, we start our parsing by calling Parse() and pass it the start of our state machine, states.findInterface.

Parse() will run through all the states and call Validate() on the object that is stored in the Parser.Validator attribute. If that fails it will cause Parse() to return an error.

Out of necessity, I needed to distinguish between a zero value for a number and an attribute never being set. I did this by including an init() method on Interface{} objects. This sets numeric fields to values like -1 that are not valid values for the attribute, so Validate() can check for them. To ensure that init() is called, it sets a private field initCalled = true.

When Validate() runs, it automatically fails if !initCalled.


Conclusion

Sometimes human-readable data needs to be converted to concrete representations. While you work with vendors and upstream providers to fix this at the source, you still need to get work done.

In these situations, regexes tend to be problematic in large development environments: they produce data quality problems that can affect operational reliability. Nothing is worse than bad data in this regard. Bad assumptions lead to bad operations that can be catastrophic.

The HalfPike is one way to mitigate those issues with a method that is easy to reason about and diagnose.

Happy coding and may your pager stay silent!

About the Author

John Doak is the manager of Process Engineering for the Azure Fleet Program and the Principal Automation SWE for Azure Fleet at Microsoft.

Previously he was a Google Staff Site Reliability Engineer, a Network Systems Engineer (a now defunct subtype of SRE for Network Systems of which he was Google’s first), and a Network Engineer (among other titles).

In a previous life he worked on movies and games for LucasArts/LucasFilm/ILM as a Network Engineer/Systems Admin.

Website (Golang, SRE):
Website (Photography):