Why teaching A.I. to read is a lifelong endeavor

By
Jonathan Vanian
October 27, 2020, 12:34 PM ET

It’s not just tech giants that are using artificial intelligence to understand human language, so that products like digital assistants can respond to basic questions.

More conventional businesses are also increasingly using a subset of A.I. called natural language processing (NLP) to create more powerful software to help answer basic customer call center queries or create summaries of long, complicated documents. 

LexisNexis, for instance, has been using NLP to improve the legal research software that lawyers, journalists, and analysts use to find relevant court documents. It’s light years ahead of the user-unfriendly Boolean search system that I regularly used over a decade ago as a cub reporter.

With A.I., LexisNexis’ search interface is more intuitive. That’s partly because the company used Google’s free, open-source language model BERT as the foundation. The BERT model, trained on a vast amount of web data including Wikipedia pages, helps software better understand how some words mean different things depending on the context in which they appear.  

But LexisNexis can’t use BERT for all of its language needs because the company deals with information that is specific to the legal industry. This particular data can’t be found on the open web, which means the information doesn’t come baked into BERT.

Min Chen, vice president and chief technology officer for the LexisNexis Asia-Pacific and global search team, said that BERT “provides a good base model to start with.” But the company must fine-tune the technology with additional legal data so that it understands legal linguistics.

This fine-tuning is increasingly common for many companies operating in areas like finance or healthcare. Every industry has its own lingo that makes no sense in another context.

Chen said it took LexisNexis 12 months to train a version of BERT that understands case citations and even Latin. If someone wants to find a document showing that a case has been adjudicated, or closed, the technology knows to look for documents with the Latin term res judicata (claim preclusion, or a matter decided). 

As Amanda Stent, an NLP expert for financial news and information service Bloomberg, explained, technologies like BERT are important because they remove a lot of the grunt work required to train a language model from scratch. For a 10-word sentence, Stent said, “the combinations [of words] are astronomical,” and having a powerful language model like BERT as a starting point is very helpful.
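To put a rough number on Stent’s point, here is a minimal back-of-the-envelope sketch. The 30,000-word vocabulary is an assumption chosen for illustration, not a figure from Stent or Bloomberg, and the count covers every ordered string of 10 words, most of which would not be sensible sentences.

```python
# Illustrative only: VOCAB_SIZE is an assumed figure, not one from the article.
VOCAB_SIZE = 30_000
SENTENCE_LENGTH = 10


def count_sequences(vocab_size: int, length: int) -> int:
    """Count ordered word sequences of a given length, with repeats allowed."""
    return vocab_size ** length


total = count_sequences(VOCAB_SIZE, SENTENCE_LENGTH)
print(f"{total:.2e} possible {SENTENCE_LENGTH}-word sequences")
```

Under that assumption the total is on the order of 10 to the 44th power, which suggests why starting from a pretrained model is more practical than learning word patterns from scratch.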

But as other A.I. researchers have pointed out, because language models are typically trained on Internet data, they sometimes parrot back the offensive text they’ve scanned. You’ll be happy to know that companies can take precautions to make this less likely.

Stent and her colleagues recently published a set of best practices that companies can follow when training A.I.-powered language models and other machine learning systems. They recommended using human subject-matter experts to help annotate and label the text used for training (to ensure data is labeled accurately) and ensuring that product managers and engineers coordinate on big projects (to help ensure that problems don’t slip through the cracks).

The goal is to eliminate any problems before companies introduce new products. After all, no user wants to be bombarded with vile language.  

One thing companies should be prepared for is that data training projects are never done. There’s always room for improvement. 

Said Stent, “It never stops.”

Jonathan Vanian 
@JonathanVanian
jonathan.vanian@fortune.com

A.I. IN THE NEWS

Speaking of NLP. Eye on A.I.’s Jeremy Kahn takes a look at AI21 Labs, an NLP-focused startup founded by prominent machine learning researchers that aims “to fundamentally transform how we read and write.” Unlike other language models such as OpenAI’s GPT-3, Kahn writes, the startup’s “system is a fusion between neural network-based language models and an older form of artificial intelligence that seeks to represent human knowledge, like vocabulary and the meaning of words, in a graph structure.”

Enter the A.I. Threat Matrix. The security-focused nonprofit MITRE Corporation, Microsoft, IBM, Nvidia, Bosch, and a host of other companies teamed up to release the Adversarial ML Threat Matrix, which VentureBeat described as “an industry-focused open framework designed to help security analysts to detect, respond to, and remediate threats against machine learning systems.” The goal is to help companies better secure their machine learning systems by thoroughly understanding all of the ways hackers can crack modern A.I. software. The authors of the threat matrix said via GitHub, “Data can be weaponized in new ways which requires an extension of how we model cyber adversary behavior, to reflect emerging threat vectors and the rapidly evolving adversarial machine learning attack lifecycle.”

How to bring “dead languages” back to life. MIT researchers are using machine learning to “automatically decipher lost languages that can no longer be understood,” technology publication CNET reported. The researchers created an algorithm that analyzes the patterns of how languages develop over time to help uncover the forgotten languages.  From the report: “Going forward, the team hopes to expand its work to identify the semantic meaning of words, even if they're not readable yet. It ultimately hopes to be able to resurrect lost languages using just a few thousand words.”

The FDA sounds the A.I. bias alarms. Bakul Patel, the director of the U.S. Food and Drug Administration’s new Digital Health Center of Excellence, explained during an online meeting how biased and unclean data could cause machine learning software to misfire and “negatively impact patient care,” industry publication MedTech Dive reported. “We don't want to set up a system and we would not want to figure out after the product is out in the market that it is missing a certain type of population or demographic or other aspects that we would have accidentally not realized,” Patel said.

EYE ON A.I. TALENT

Censia has picked Deborah Leff to join the enterprise software startup’s board. Leff was previously the global leader and industry chief technology officer for data science and A.I. at IBM.

Nautilus hired Garry Wiseman to be the fitness company’s senior vice president and chief digital officer. Wiseman was previously the senior vice president of digital customer experience for Dell Technologies.

EYE ON A.I. RESEARCH

When auditing A.I. research, look at the conferences. Technology analysis website TechTalks looks into a recent research paper describing the review process that researchers face when attempting to submit their papers to The International Conference on Learning Representations. The authors of the research paper, who are currently anonymous, claim that they have found some problems with the submission process, including “evidence for a gender gap, with female authors receiving lower scores, lower acceptance rates, and fewer citations per paper than their male counterparts.”

As TechTalks notes, the research paper describes several instances of bias, including the conference organizers showing “significant preference for Carnegie Mellon, MIT, and Cornell universities.” Researchers who published their papers on the popular arXiv preprint server prior to submission also did better, especially if they came from those top-tier universities.

From TechTalks:

Interestingly, their research did not find a significant bias toward large tech companies such as Google, Facebook, and Microsoft, which house reputable AI researchers. At first glance, this is a positive finding, because big tech already has a vast influence over commercial AI and, by extension, on AI research.

But as other authors have pointed out, the same academic institutions that are very well represented at AI conferences serve as talent pools for big tech companies and receive much of their funding from those same organizations. So this just creates a feedback loop of a narrow group of people promoting each other’s work and hiring each other at the expense of others.

FORTUNE ON A.I.

Former Facebook employee’s new book exposes Big Tech’s dirty secrets—By Danielle Abril

Startup cofounded by A.I. heavy hitters debuts editing tool it hopes will ‘transform writing’—By Jeremy Kahn

Is it time for a new agency to oversee Big Tech? Many say yes—By Jeff John Roberts

Here’s what Amazon’s new Echo speakers are like—By Jonathan Vanian

How Lyft became the company with nine lives—By Beth Kowitt

A possible semiconductor shortage looms over Huawei’s new smartphone launch—By Naomi Xu Elegant

BRAIN FOOD

A.I. takes to space. Researchers have found machine learning technology to be an excellent tool for analyzing space data. In 2017, for instance, NASA and Google used neural networks to comb through imagery data captured from the Kepler space telescope, and uncovered a couple of planets far outside of our solar system. More recently, researchers from NASA’s Jet Propulsion Laboratory have used machine learning to identify recently formed craters on the surface of Mars. Space.com reports:

Scientists have fed the algorithm more than 112,000 images taken by the Context Camera on NASA's Mars Reconnaissance Orbiter (MRO). The program is designed to scan the photos for changes to Martian surface features that are indicative of new craters. In the case of the algorithm's first batch of finds, scientists think these craters formed from a meteor impact between March 2010 and May 2012. 

About the Author

Jonathan Vanian is a former Fortune reporter. He covered business technology, cybersecurity, artificial intelligence, data privacy, and other topics.
