Steve Randich, FINRA’s CIO, Describes the Fun and Forward Thinking Behind the Regulator’s Move to the Public Cloud and the Journey Ahead as CAT Milestones Continue to be Met
What Steve Randich has achieved at the US Financial Industry Regulatory Authority in terms of converting a regulatory agency into a technology company is nothing short of revolutionary and truly visionary. He has been the catalyst for a line of thinking that now flows naturally through that organization and is transforming the profile of the modern regulator. What has happened there sets a perfect example to the members of FINRA (and all other financial services firms) that now benefit from the innovation that has been supercharged by this approach.
When you were thinking about doing this, in terms of jettisoning a data-center approach and harnessing the public cloud, how did you convince everyone, pitch this transformation and make it digestible?
I arrived at FINRA in March 2013 (which was an eternity ago in public cloud years), and the industry was contemplating building the Consolidated Audit Trail (CAT). FINRA has been running OATS and surveillance for many of the stock exchanges for 20 years and you can view CAT as a modernization of that. We were looking at the volume projections and the current operational challenges with what was already being processed; the team was looking at buying bigger hardware kits to support it. That view was troubling as there was not much that was commercially available and had been proven as viable to accommodate this. We were Greenplum’s biggest commercial client and had Greenplum and Netezza data warehouse appliances which were highly scalable for that era, but it was vertical scale rather than horizontal. When you run out of scale, you have to buy a bigger box. These vendors were nearing end of life as they had pushed to the limits of Moore’s Law. So when I first arrived, while there was a core of the team looking at the potential for bigger boxes, there was also a skunkworks team that already believed that was not an option. These folks were allowing developers to literally think outside the data storage box and go play with Amazon’s new toys in late 2012. This skunkworks unit were already believers in an alternative approach when I landed.
Our view was that the only way to handle this scale for what we had, and certainly for CAT, was to have a horizontally scalable architecture. You just can’t do that in your own data center when you are the size of FINRA. We are not a huge government agency or a Fortune 50 company with hundreds of thousands of square feet of data center space. You could build this horizontal structure in a lot of the firms that are building hybrid or internal clouds, but it was not economically viable for us. We opened our eyes and realized we had to go public cloud; it took three months of analysis to convince ourselves we had to do it. It became self-evident after all our research. We had no choice and that argument drove the debate for anyone who had to approve it. By July 2013, all the principals that mattered were on board and we turned to the devils in the details. How secure would it be? We took six months dealing with the security challenges while we were building the architecture. The design and vendor selection ran in parallel with the security resolution. All this was done by the end of 2013, and by early 2014 we were well on our way to starting this journey with Amazon as our partner.
Were there naysayers then? People obsessed with security? How did you overcome those who were big negative influencers pushing against this substantial change?
We still have them. Not as many, but people are scared to death of the data security concerns and of having their data not only with a vendor but in an environment where you are co-resident with other clients. It is literally virtual and ubiquitous. So we created a document comparing what we would do in the cloud to what we were doing in our own data center, and from a logical and physical security standpoint we demonstrated that if we went to the public cloud and did it right (and that is a key point), it would be at least as secure as what we were doing privately. That is the argument we took to senior management, our board, the SEC and other interested parties like exchanges and broker-dealers (BDs). We told this story again and again for two years while we had already begun migrating our crown jewel applications to the cloud. In the fall of 2015, halfway through our journey, the tech industry (the Gartners, the Forresters, bloggers and pundits) suddenly caught on to the potential to go to the public cloud (if done right) and how it could improve security compared to what CIOs were doing in their own data centers. Fast forward to now and I think most smart people in the industry know that.
When you have to balance security with agility, what are the key considerations?
I have run tech at two exchanges, a huge BD/bank and now FINRA. I observed at Citi that complexity increases disproportionately with size. A BD that is twice the size of another BD is far more complex. We are a medium-sized firm and can manage our perimeters, policies and compliance policies fairly well because of our size. We don’t have huge complexity. On top of that we had outsourced all IT and development to EDS in the 90s and it was insourced just before I arrived. Part of that saw us making some smart choices to clean up, commoditize and standardize our architecture, and retire legacy platforms to help further reduce complexity. More importantly, the way we did our cloud and our dev ops architecture and processes relies on extreme automation. Cloud providers say this has to be in your culture and DNA to drive speed, agility, security. This removes the element of human judgment.
Highly automated dev ops allows application of patches automatically as part of our build with no human involvement and limited overhead for doing patches. The security and compliance checks are part of that automation. We have robotized compliance so security is built into our agility. Lots of firms recognize this but find it hard to achieve as you cannot just take a dev ops process that is automated and modern, and apply it to legacy platforms. It needs to be integrated at an architectural level from top to bottom in the stack. So when we migrated our apps to the cloud, we addressed the entire stack to provide the level of automation that our dev ops need. So a big BD that does not have automation in dev ops and has not moved to cloud but has lots of legacy platforms is going to struggle mightily with this question of agility balanced with security. But for us they are naturally not a trade-off.
What is cloud done right?
Whenever we read about a security breach with a cloud provider, our antennae start twitching as we know industry people will question what we are doing at FINRA. Capital One is a classic example, but there have been others; when you look into those you will see they were not doing cloud right. They were going away from cloud providers’ intended use of their services. This might be because IT folks tend to think they are generally smarter than others. We respect Amazon’s expertise, as our primary cloud provider and partner, the design of their architecture and its intended use. One example of that is a challenge today, and that is where we might want to go away from their recommendation.
After the terrorist attacks of 2001 and their impact on the securities industry, one of the regulatory mandates was the need for geographically diverse data center locations, 300 miles apart, and we had this in our on-prem environment for decades. The problem is that when you are in the cloud and you want to benefit from multiple availability zones and not have speed of light issues – so your data and apps are ubiquitously running in different data centers and availability zones – it does not allow you to have this classic geographical diversity of a dual data center approach. So many clients, along with FINRA and Amazon, looked to craft something to comply with this across regions, but it would involve more cost, more complexity and less reliability.
We did manage to convince Amazon to provide that virtual and ubiquitous footprint from a compute (data and processing) standpoint across multiple regions. They are building this for FINRA and it will be generally available for the industry. If we had done this independently without Amazon, it would have been cloud done wrong, and we knew that. There are lots of examples of clients who have specific requirements and go coding around the offering from Amazon in ways that compromise security, reliability and other important parameters.
You are a technologist through and through and always have been. How do you best educate your colleagues and fellow senior management who are not as comfortable and knowledgeable about tech?
I have been fortunate that in three of the four CIO roles I have had, going back to the 90s, I was in the C-suite rather than residing too far down in a big organization. So I went to all the board meetings and I would often have to present to some people who were not technical. You have to know your audience and base your notes, writeups and explanations assuming they know very little about the subject. I also use common knowledge analogies, like automotive and manufacturing, to remove the IT context. This is better for those who are naturally uncomfortable with technology.
Tech fear is twofold: where people don’t have the confidence to understand it, like the fear of math, and also when they don’t want to be exposed to their peers as to their knowledge level. And this is not improving nearly enough. You hear many companies boasting they are really technology companies yet there has been surprisingly little progress or substance behind those claims.
What are your views on machine learning and its potential application in the next three to five years? Are there any boundaries?
I am a big believer in AI, machine learning, deep learning. We have seen trends come and go. The web was a big hype cycle that proved true, the PC was transformative, but I think this one will change the world and it could be bigger than any of them in terms of its impact on labor and the workforce. When I got my degree in the early 80s, there was AI going on. We have been talking about it for 35 years, but what is different now is the ability to bring the compute and data together at scale to offer infinite horizontal scalability. This grants the processing power to do the work that has been limited by capacity.
We had to go through many cycles to complete proper surveillance five years back. We would see something in the market that was interesting so we would want to take a closer look. This required human intervention to stage things. It was like a gourmet chef who has to cook dinner for 22 people on a two-burner stove, constantly moving things around, planning, dealing with mistakes. Very hard to do well. But now we can program machine learning algorithms without any human element in that lifecycle. The machine can do more unencumbered by human intervention. Being able to self-train the algo with some human guidance, with IT operations being disintermediated, opens things up for the technology and Moore’s law to be unconstrained.
The next five to 10 years will see tech advances the like of which we have never experienced. This has been documented recently in The Second Machine Age (MIT), where the professors compare where tech is today to where science and society were at the advent of electricity and the steam engine, which were available for 50 years before factory design could actually take advantage. Computers have been around for about 50 years but they represent the infancy compared to the 50 ahead, and I think machine learning is going to be at the center of that.
What excites you most about CAT?
Right in the here and now, what is most fun is the actual delivery of major milestones, making the dates, to everyone’s surprise because of the long build-up to CAT. Whenever we hit a date there is an internal celebration with feelgood high-fives in the hallways, which is indescribable. We’re doing it. That is the obvious one. Secondly it is the data volumes. When CAT was projected, the spec was for 58bn daily market events and at that time our peak was about 30bn, so twice our norm. It looked a big number. This week (Jan 20) we did 265bn and it keeps going up. These volumes are unbelievable. And this is occurring without any operational blips or exceptions. The scale and the Amazon service are automatically provisioned and can process a record peak and then shut down to nothing, without any human intervention. That’s our architecture and that is fun and exciting. That number could go to a trillion and hopefully our architecture won’t know it. A normal approach would call for 70,000 instances of a server. Our total data center pre-cloud running every application that FINRA needed was 3,000. Do the math – we went from 3,000 to 70,000 but our costs went down on a massive footprint where we could never have provisioned this and had backup in two data centers before. Our aggregate IT infrastructure costs went down; this is all virtualized and commoditized and we have programmed our algo to find spot pricing on Amazon, so we can run our stuff in the middle of the night when demand is lowest and it costs less despite our deploying more applications to the cloud.
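The "do the math" remark can be made concrete with a short sketch. The figures are the ones quoted in the answer above; the variable names are illustrative only, and the reading of "3 to 70" as 3,000 pre-cloud servers against 70,000 instances is an assumption based on the surrounding sentences.

```python
# Figures quoted in the interview. Names are illustrative, not FINRA's.
CAT_SPEC_EVENTS_BN = 58      # projected daily market events in the CAT spec (billions)
PRE_CAT_PEAK_BN = 30         # approximate peak at the time of the projection (billions)
RECENT_PEAK_BN = 265         # daily volume quoted for the week of Jan 20 (billions)

PRE_CLOUD_SERVERS = 3_000        # total pre-cloud data center footprint
CONVENTIONAL_INSTANCES = 70_000  # instances a "normal approach" would need


def ratio(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


# The spec was roughly double the old peak.
spec_vs_old_peak = ratio(CAT_SPEC_EVENTS_BN, PRE_CAT_PEAK_BN)

# The quoted recent volume is several times the original spec.
recent_vs_spec = ratio(RECENT_PEAK_BN, CAT_SPEC_EVENTS_BN)

# The instance count a conventional build would need, relative to the old footprint.
instances_vs_servers = ratio(CONVENTIONAL_INSTANCES, PRE_CLOUD_SERVERS)

print(f"Spec vs. old peak:      {spec_vs_old_peak:.2f}x")
print(f"Recent peak vs. spec:   {recent_vs_spec:.2f}x")
print(f"Instances vs. servers:  {instances_vs_servers:.2f}x")
```

On these numbers, the quoted volume is more than four times the original spec, and the instance count is more than twenty times the old server footprint. That is the gap the automatically provisioned architecture is described as absorbing.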
How much interest is there in what you are doing and a desire to learn from this progress where you have forged ahead? It must be huge.
This may be the most fun part of the story. When we decided to do this we recognised there was no one else doing it in the same area (regulator or financial services firm) or of the same size. We started a program called Raising FINRA’s Technology Profile (RFTP). This meant we were then committed to anything that would raise the profile of what we were doing such as speaking at events, doing articles like this, running hackathons at colleges, applying to win awards, media appearances, etc. Anything that associated FINRA technology with public cloud technology; and it got a lot of attention and also helped our tech recruiting enormously. But the unintended consequence was a stream of something like 200 companies coming to us to find out specifically about our cloud journey and the best practices. So this includes major manufacturers, foreign and US regulators, pharma companies, BDs, fast food chains, and the list goes on.
We tell the story as we have a standard script that nine of us can run in our sleep now. We can listen to these people and know exactly where they are in the lifecycle. In five years it has evolved from disbelief, to suspicion of hype, to needing to know more – they come back and bring their CISOs and other leaders, and so now it feels like people come converted and discover what they need to do, but often they cannot quite get the momentum they need to do that internally. I have views on why that is, but that is for another interview!
Build or buy – what are the major considerations in making that call based on your experiences? Can FS firms really be tech firms?
My bias at FINRA is to build unless it is obvious software like email systems, spreadsheets, CRMs, etc. These are off-the-shelf commodities that are not going to differentiate your competitiveness. But core business applications that drive your revenue and mission need to be built. I have had the opportunity through luck to have been at firms that have great development pedigrees. FINRA has pound for pound the best software development team I have ever managed and that is saying a lot. So if you have that asset and culture and the amount of innovation and capability for automated dev ops to achieve cloud migration, you need to optimize that asset. Buying software needs vendor and contract management and that is not our best skill; we are developers and that is our core competency. But there is a hybrid approach as we look at open source as a route to use thousands of specific components at very little cost, like pieces in a puzzle, to build our custom apps. We use open source on a huge scale so we don’t go building these components to craft the solution we need.
As a leader of 1,200 people at FINRA, what is your approach to this and how do you connect with such a large group?
We have a comprehensive communications strategy which includes newsletters, announcements, emails, memos and “town halls”; and in those “town halls” I tend to be frank, honest and transparent even in front of large crowds. So people come as they know they will hear some news. Usually we do about three a year. They are standing room only with an eager group of people waiting to hear what will be shared – it’s a huge connector. We build a culture of collaboration and innovation where we have lots of events. We are doing our fifth annual Createathon, which has half the employees engaged for three consecutive days with the main focus being innovation. We do fun stuff together.
We are not dispersed but all on open floors in one building in Rockville where we took down all the walls three years ago to open it out so you can walk the floor and pretty much see everyone rather than have people hidden away behind drywall. So lots goes on that is impromptu, like celebrations, and lunch and learns. 1,200 is not that big – compare Citi where there were 18,000 across different countries. So my management approach is in line with that. I walk the floors, I talk to people, I want them to see me, and know that I am here as an active participant in our work. I make sure the management team that is in place approaches it similarly.
In your time at other firms and here, what do you think affects culture most?
Size and footprint first of all, and I like smaller companies. My management style works best when I can see everything boundary to boundary easily. Now with 1,200 people I will not know everyone’s name and I think humans, unless you have special skills or are a politician, have a 200 to 300 person limit to actual relationships. It also needs self-realization of your skillset – what you can manage from a personal perspective in terms of size.
The second piece is the command and control versus collaborative culture and the big firms I have worked at have been collaborative, with matrix reporting like Citi and IBM. Nasdaq and Chicago Stock Exchange were hierarchical management cultures and FINRA is too. FINRA is very similar to Nasdaq as they used to be the same company and were basically parent-child previously. So because of that I tend to do well in that sort of company. Your decision authority is very clear and I am confident in my decisions and when I know what direction I want to go I don’t want to spend six months re-asking for permission, and constantly having to re-evaluate an obvious decision. So it gives you an ability and agility to move forward. We never would have got this cloud migration done had we been a less command and control business. It is a hidden benefit that is part of our secret sauce. I like to show my work because I want to bring people along, but people respect decisions under command and control and it is easier for leaders to make calls and get stuff done.
Your father was at AT&T and you coded from the get-go. In terms of young people today and where they should be headed, would you have done anything differently?
No. There is a progression in your childhood. In the early part you don’t have any limitations so rock star, astronaut, race car driver or even more impractical career aspirations, like basketball star, seem viable. In eighth grade and high school you start to ask what you will do with your life and think more about being a psychologist, doctor, lawyer. For me at that first mature moment of self-awareness it was computers the whole time.
My dad grew up very poor in the Depression and he was really afraid my brothers and I would be in the house in our 20s and 30s! A common concern then and now. So he pushed us into fields where we would be getting multiple job offers when we graduated. He saw I was good at math and science in my early years and liked it, and he was in the computer field himself so he sort of ordained I would go that way. He pushed a bit and I was willing. I still code; I have my own development environments that I use on occasion. The last software I wrote commercially was when I was at IBM on a GE contract in ‘96.
What do you do to decompress from work?
My biggest hobby now is hi-tech aquaria, with several hundred gallon tanks of multiple varieties at home which is extremely challenging. It is very different to my work and is actually very technical from a water chemistry and biological filtration perspective. It gives me the departure from work I need.
Technologists like interesting books – any recommendations?
I look at the world through the logic and reason lens. So I only read non-fiction. In the last year I read Lawrence in Arabia by Scott Anderson, which is a tremendous book. I also re-read The Undoing Project by Michael Lewis, who I think gets some of his books wrong because of limited research as he publishes so much. Flash Boys was not so good, but this one he got right based on research from the 60s and 70s about human judgment. The Second Machine Age is a great book, as is Sapiens. I like the historical view of human development more than where it might go. I just started Bad Blood and I think the whole Elizabeth Holmes story is just incredible – when the company mission is so virtuous and good for humankind, people will lie, cheat and be corrupt to justify the means to achieving the mission. They felt they could overlook that they were defrauding the public. That leads me to Fewer, Richer, Greener, which is the counter argument to the whole climate change and green argument that the world is about to end, but I only just started this in the last few days.