The Trouble With Machines: SEC Big Data Czar on Smarter, Cleaner Regulatory Submissions

Published on January 28, 2019

Radar met Scott Bauguess, deputy chief economist and deputy director of the Division of Economic and Risk Analysis at the Securities and Exchange Commission, to talk big data, machine learning, and AI in compliance and filing.

The Securities and Exchange Commission is on a mission to clean up regulatory reporting, and firms that muddy disclosure data are being put on notice, a senior director at the agency told Radar in an exclusive interview.

Filings riddled with unclean, inaccurate, or entirely unsuitable information remain rife, despite modern techniques that allow for sophisticated data that is both machine-usable and human-readable, said Scott Bauguess, deputy chief economist and deputy director in the SEC’s Division of Economic and Risk Analysis (DERA).

Bauguess told Radar that while artificial intelligence is transforming how machines read decision-relevant information, market participants who demand better-quality data from firms in order to make more informed choices are being short-changed.

“We have been working for a number of years to provide registrants with help in making their information more machine-usable,” said Bauguess.

“One of the issues we face today is the difference between complying with the letter of the rule and the spirit of the requirement; there are a lot of things you can do to make your data machine-readable, but not machine-usable.”

Firms and individuals who file consistently inconsistent data risk further investigation by the team, whose role is to detect fraud and misconduct in support of SEC investigation and examination programs, specifically in the areas of corporate issuers, broker-dealers, and asset managers.

“In some cases, it is just a matter of better understanding the technology; firms often don’t know they are making their disclosures hard to use,” said Bauguess, who also manages the business side of the SEC’s Tips, Complaints, and Referrals (TCR) system, launched in 2010 to further combat market misconduct.

In other cases, the SEC is examining whether deliberately obstructive data has been submitted to make it difficult for rivals to steal a march on one another.

As an example, a filer can comply with reporting standards by entering a valid date, but if the date doesn’t match the event or action being reported, then a machine learning algorithm will be assessing incorrect information.

“No amount of data format validation can fix a reporting error.”
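
To make that distinction concrete, the minimal sketch below (a hypothetical illustration, not the SEC’s actual validation logic; the field names and dates are invented) shows how a reported date can pass every format check and still feed a model the wrong fact:

```python
from datetime import date

# Hypothetical filing record: the date is syntactically valid,
# but it does not correspond to the event actually being reported.
filing = {
    "form_type": "8-K",
    "event": "executive_departure",
    "reported_event_date": "2016-03-01",   # what the filer entered
}
actual_event_date = date(2016, 2, 12)      # what really happened (assumed known)

def passes_format_validation(value: str) -> bool:
    """Format check only: is the field a well-formed ISO date?"""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False

# The format validator is satisfied...
print(passes_format_validation(filing["reported_event_date"]))  # True

# ...but any downstream algorithm consuming the field is still fed a wrong fact.
reported = date.fromisoformat(filing["reported_event_date"])
print(reported == actual_event_date)  # False: no schema check can catch this
```

The schema is satisfied because the field is a well-formed date; only a comparison against what actually happened, something no format rule can encode, reveals the reporting error.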

Bauguess told Radar the agency had been building up to this point in AI-driven regulatory reporting for “quite some time”, and that over the last decade the SEC had tweaked the nature of many financial disclosures to make them more machine-readable, with varying degrees of success.

Embracing big data, and more widespread gathering and cleaning of structured and unstructured data for use inside a business, whether for regulatory filing or for analysis in other matters, requires a cultural shift that will take time, Bauguess recognizes.

A similar evolution has taken place at the SEC, where the DERA team also uses sophisticated data analytics to sniff out potential market misconduct.

The EDGAR filing system contains financial information covering more than $82trn of assets under management by registered investment advisors, and hosts financial statements by publicly-traded companies with an aggregate market cap of approximately $30trn.

Since its inception, there have been more than 11mn filings by more than 600,000 reporting entities using 478 unique form types, and during the calendar year 2016 alone, there were more than 1.5bn unique requests for this information through the SEC.gov website.

It is therefore essential for the regulator to have high-quality, usable data, and technological parity between the SEC and the sector it oversees may be closer than some would think, as the direction of travel is toward further digitization of financial information.

In Europe, the European Securities and Markets Authority (ESMA) is introducing a single electronic reporting format from 2020. In February, the UK Financial Conduct Authority asked the industry for its views on how to make the filing of compliance documents machine-readable in order to simplify the process.

Both regulators take their cues from the US. One recently proposed rule requires SEC-regulated firms to file their periodic reports in Inline XBRL. Currently, filers report a human-readable HTML version of a periodic report and, separately, a machine-readable version in eXtensible Business Reporting Language (XBRL) format. If adopted, the rule would combine the two requirements into a single document designed to be read equally well by humans and machines.
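
For readers unfamiliar with the format, the toy fragment below is an illustrative sketch only: the tag names and namespace follow the public Inline XBRL specification, but this is not a real SEC filing. It shows how the same figure a person reads on the page can also be pulled out programmatically:

```python
import xml.etree.ElementTree as ET

IX_NS = "http://www.xbrl.org/2013/inlineXBRL"  # Inline XBRL 1.1 namespace

# Toy Inline XBRL page: the text a person reads is also a tagged, machine-usable fact.
doc = f"""
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:ix="{IX_NS}">
  <body>
    <p>Total revenue for the period was
      $<ix:nonFraction name="us-gaap:Revenues" contextRef="FY2018"
                       unitRef="USD" decimals="0">1,234,000</ix:nonFraction>.
    </p>
  </body>
</html>
"""

tree = ET.fromstring(doc)
for fact in tree.iter(f"{{{IX_NS}}}nonFraction"):
    # A machine reads the tagged value and its reporting context straight from the page.
    print(fact.get("name"), fact.get("contextRef"), fact.text)
```

Because the tagged fact lives inside the ordinary web page, there is no separate machine-only exhibit that can drift out of sync with the human-readable report.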

From a machine learning perspective, this standardized data combined with other relevant financial information and market participant actions can establish patterns that may warrant further inquiry from regulators.

This can ultimately lead to predictions about potential future registrant behavior, and these are precisely the types of algorithms that DERA staff are currently developing.
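
A rough sketch of that kind of pattern-finding, using a generic off-the-shelf anomaly detector rather than anything DERA has described, might look like the following; the features and data are invented purely for illustration:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Hypothetical per-registrant features derived from structured filings:
# [days a filing was late, number of amendments, year-over-year revenue change]
filings = rng.normal(loc=[1.0, 0.5, 0.05], scale=[2.0, 1.0, 0.15], size=(500, 3))
filings = np.vstack([filings, [[45.0, 9.0, 3.2]]])  # one obviously unusual registrant

# Fit a generic anomaly detector and flag outliers for human follow-up.
model = IsolationForest(contamination=0.01, random_state=0).fit(filings)
flags = model.predict(filings)             # -1 marks potential outliers
suspicious = np.where(flags == -1)[0]
print(f"Registrants flagged for human review: {suspicious}")
```

The output of such a model is a shortlist for human reviewers rather than a verdict, which matches the point Bauguess makes later about keeping a person between the algorithm and any decision.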

It has been an issue Bauguess has kicked around since joining the agency in 2007, and the big data market has also sensed opportunity in the new age for financial reporting.

According to research by Deloitte, at least 25 new regulatory reporting companies offering artificial intelligence solutions have appeared in the last five years, more than double the number of traditional software providers that had served the market over the previous 10 years.

“Something that has happened recently that has been a boon for us is the take-off of machine learning and AI, and greater awareness of the need for more organized data,” Bauguess said.

“Everyone wants to have systems that can process the data and generate new insights, but it’s only going to work if you have good, clean, well-structured data.”

Bauguess said his team is concerned that some emerging regulatory technology solutions bear the hallmarks of Hans Christian Andersen’s weavers supplying new clothes to the Emperor.

“I view the space right now as something like the hybrid electric vehicles on our roads; we weren’t ready to go straight to electric, hybrid exists because the technology wasn’t there,” he said. “It’s the same with machine learning, there still needs to be a human connector; someone to understand a sufficient amount of what an algorithm is doing to make the results actionable.”

In his role helping oversee the Division’s risk assessment and data-driven, predictive analytics development, Bauguess is especially wary of the types of black-box solutions that come with promises of minimal oversight from the compliance officers relying on them.

“What I see happening is that machine learning algorithms will circle towards collaboration with the human compliance officer as opposed to performing the actual compliance function.”

“That is similar to what we are doing at the SEC; we’re giving staff analytical tools to accelerate their existing work processes, but none of them are relying on machine learning algorithms alone to make decisions.”

He said he believes truly machine driven decision making is “years away”. “In general, I think there is a lot of hype around what is possible with this technology today,” he said. “I believe we are still in the ‘proof of concept’ phase with most aspects of supervisory and compliance functions.”

Alongside a PhD in Finance, Bauguess also holds a BS and an MS in Electrical Engineering, and prior to his doctoral studies he spent six years working as a technology engineer.

He is part of a new breed of regulator that understands and is comfortable using the technology he and his SEC colleagues are monitoring, but he believes the next generation coming through will move things along even faster.

“I feel I am becoming part of the old guard, while still being part of the new guard,” he said. “We are finding attorneys that can program, it’s the nature of the next generation. They grew up with phones in their hand and are hungry for this. The older generation will resist, and I’m not sure where the tipping point is. I think in five to 10 years we will see the culture of organizations really change to embrace it.”