
America’s DataHub Consortium: Seeing — and Understanding — the Entire Elephant

April 5, 2022
The Buddhist parable of blind monks examining an elephant, here rendered by Hanabusa Itchō (1652-1724), finds each man reaching a different conclusion about the animal based on which part of the elephant he touches. (Image: Library of Congress via Picryl)

The phrase “data-driven” is a modern cliché. It’s generally used to characterize decisions or strategies as being based on some sort of objective data. But was that data actually relevant to the situation at hand, or was it missing something important? Much like the ancient parable of the blind men and the elephant, it’s all too easy to rely on data that is incomplete or lacking in context; one person touches the trunk while another touches a tusk and they come away with two very different conclusions about the animal. Both data-driven, neither correct.

By simultaneously examining the entire “data animal” from every angle — trunk to tail — what could be learned? A new partnership of public and private data-research organizations aims to find out. Led by the U.S. National Science Foundation’s National Center for Science and Engineering Statistics, or NCSES, America’s DataHub Consortium is exploring what is possible when complex data from different sources are linked and analyzed in new ways. 


“The secret sauce of America’s DataHub is bringing people together who normally would be in their siloed places working on important problems independently,” says NCSES Director Emilda Rivers.

That means linking data stored across the federal government, including within its 13 principal statistical agencies, explains Rivers. Each of those agencies collects statistical information for its own area of interest, such as health, economics, labor, agriculture or energy.

“We’re a big country,” says Rivers of the broad challenges in collecting critical data about so many diverse aspects of American society. “Our statistical agencies focus on meeting their individual missions and they do it extremely well,” she says. “But we need more collaboration.”


As the lead agency for America’s DataHub, NCSES is building upon its expertise in linking disparate data sources to understand the progress and trajectory of science and engineering in the U.S. and globally.

“The idea behind America’s DataHub has a long lineage at NCSES because we do it all the time,” says NCSES Deputy Director Vipin Arora. “Getting access to different data sources, bringing them together, linking them to do analysis — this is the kind of work we do on statistical products like the Science and Engineering Indicators and the Women, Minorities, and Persons with Disabilities in Science and Engineering reports.”

NCSES’s ultimate goal is ambitious: Un-stovepipe the nation’s elephant-sized treasure trove of data so that leaders at the federal, state and local levels can use it to understand the issues they face and make informed decisions that help their communities and citizens.

Building evidence from a sea of data

It is hard to overstate the value of reliable, useful information when contemplating a difficult decision. That is true for just about everyone, including business owners, entrepreneurs, educators, healthcare providers and members of Congress. In fact, Congress formally recognized the value of data in 2018 with the Foundations for Evidence-Based Policymaking Act. The act required federal agencies to figure out methods and analytical approaches for developing evidence that supports policymaking and to make their data “accessible and useful to the public.”

This post by Jason Stoughton originally appeared at Science Matters by the U.S. National Science Foundation and is reposted with permission.

“America’s DataHub is about building evidence, writ large,” says Arora.

That evidence can be used to identify the best paths to achieve things of immense value like accelerating technological innovation or expanding job opportunities on a local or even national level.

That grand mission comes with comparably grand technical challenges, such as how to access and link myriad sets of statistical data about everything from automobile ownership rates to the average annual income for farmers to community college enrollment numbers.

“The 2018 evidence-based policymaking act has something called the presumption of accessibility,” explains Arora. “The idea that statistical agencies can go out and get data from any agency to use. But that’s not really been tested. So how do you do that? Where are the choke points? Figuring that out is a big part of the innovation that’s going to happen.”

The really big questions 

To start solving those puzzles, NCSES has identified some complex questions that the consortium partners of America’s DataHub are digging into now. The first task is to analyze the availability of and demand for scientists and engineers on a global scale. That includes building evidence to fully understand the public value of recruiting scientists and engineers from other countries and training them in U.S. universities and labs. The consortium of public and private organizations undertaking this project through America’s DataHub includes Accenture Federal Services, Clarivate, The Coleridge Initiative, NORC at the University of Chicago, and RTI International.

“Those aren’t small questions,” says Arora. For example, if the U.S. funds the education and training of foreign-born scientists and engineers, what is the total benefit to U.S. taxpayers now and into the future? How many jobs and new industries would be generated? How many resulting inventions, medical therapies, must-have gizmos and other innovations would be created in the U.S. versus other countries? How many American kids might be inspired to pursue a career in science and engineering?

“This question about foreign-born scientific talent is far-reaching,” adds Rivers. “The evidence that America’s DataHub is building could have a huge impact, like how we set up our graduate degree programs in the U.S. and the sorts of visa policies we put in place.

“It can also help us broaden participation in STEM and understand where the ‘missing millions’ are,” she says, describing Americans who are not part of the science and engineering workforce because they have not had the necessary educational and professional opportunities.

Ideas + innovation + data geeks 

“What excites me is that we’re connecting NSF to our citizenry,” says Keith Boyea, deputy director of NSF’s Division of Acquisition and Cooperative Support and one of many NSF staff members working behind the scenes on the contractual underpinnings of America’s DataHub. “We are reaching people that we’ve never reached before.”

Some of those people include the staff and leaders of state and local governments.

“State and local governments provide data to federal agencies that go into our official statistics,” says Rivers. “But right now, they can’t get that data back to use for their own state or county. Imagine a scenario in which they can securely link to data and use it for state and local government decision-making.

“That means putting data in the public’s hands. It’s not just people who are data geeks like me that can have access to data. That’s what I see America’s DataHub growing into.”

When more people with different perspectives have access to the best data, they bring ideas and innovation that can lead to original ways to solve problems.

“Ideas plus innovation plus data geeks equals America’s DataHub,” says Rivers. “So, bring your ideas.”

Jason Stoughton is a science writer and communications specialist for social, behavioral and economic sciences at the National Science Foundation. In addition to being a science communicator — and unabashed science fanboy — he is also a photographer and video producer. Before coming to NSF, Stoughton worked in public affairs at the National Institute of Standards and Technology and also in the television and video production industry in a variety of roles from creative director to chief coffee fetcher.
