Sometimes the data an organization needs to make critical business decisions is simply not available, or at least not ready and waiting. Sometimes a dataset needs to be created before a company can begin to get insights that will help inform its strategy. And sometimes teams need to scale their processes around gathering and analyzing data without increasing manual effort — data that may itself be time-sensitive.

The information that companies need to make the right decisions varies widely. Clinical trials, technology transfer offices, information on patents, conferences, and competing companies, and global regulatory, corporate, and organization news all offer data from which insights can be derived.

In this series, we explore the value that web crawling, creation of curated datasets, and delivery of targeted and relevant intelligence can provide to an entire organization, including data scientists, information managers, competitive intelligence professionals, and business development teams seeking information that will help them make the right decisions for their organization.

Finding Crucial Data in the Gaps

Organizations are increasingly seeking to make data-driven decisions, as doing so can: 

  • Contribute to revenue growth 
  • Improve productivity
  • Reduce costs
  • Streamline product development
  • Result in a stronger understanding of the competitive landscape 

 To become data-driven effectively, organizations naturally seek more and better data. 

Responsible organizations must consider where the data that supports their decisions and analyses is sourced. Typical sources include the databases used by many businesses, including those for published literature and grants, as well as regulatory databases.

At CCC, we’ve discovered from our work with clients that desirable data for making better-informed decisions can exist outside these common sources, in “the gaps.” While licensed or structured datasets can be beneficial, they do not cover everything. Sometimes an organization needs to fill the gaps between its data sources to create a single source that is unique and particularly relevant to its needs. When everyone is licensing and subscribing to the same datasets, a competitive advantage can come from using information found in the gaps.

One example of data found in the gaps pertains to monitoring social media — organizations are interested in what others are saying about their products, and they are also interested in what their competitors are saying about themselves in the form of company news and press releases. Beyond social media, “gap data” can also be found in obscure, but critical, areas such as tracking product shipments across borders.

The Limitations of Search Engines

Common search engines such as Google have limitations that make it difficult to find vital data in the gaps. A gap, for example, might be key documents or pages that aren’t indexed by common search engines. Additionally, search engines commonly organize the results they present by what most users are searching for, so an engine such as Google uses its algorithm (“what do most people who are searching on this term want to see?”) to create a common answer.

But what if an organization is seeking an uncommon answer? To find it, a search might need to go well beyond the first few pages of results presented, potentially leading to a significant investment of time. Rather than use search engines that may apply an undesirable context to results, companies are better served controlling how their search results are presented by being able to apply their own context to their search. 

While utilizing data found in the gaps can be hugely beneficial, there are challenges to doing so. This data likely hasn’t been tagged, curated, or normalized, leaving it unstructured and distributed, so collecting it is manually intensive. Further, the information organizations are seeking often sits alongside data not relevant to the context or need, leaving a lot of noise to filter out. Gathering relevant data in the gaps can also introduce human error, as organizations assign workers who might not have deep knowledge of the business to sift through the data.

Crawl, Enrich, and Deliver with Deep Search

Organizations seeking to successfully locate and utilize data found in the gaps have a solution they can turn to: “deep search.” Deep search is a solution encompassing a set of tools and processes that, subject to the terms of access and use of the relevant data source, can crawl targeted sources of data, enrich them, link and curate them, and then deliver them to a focused audience. 
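
To make the crawl, enrich, curate, and deliver stages described above a little more concrete, here is a minimal Python sketch. It is an illustration only, not a description of CCC's tooling: the source names, the `allowed` flag (standing in for a source's terms of access and use), the keyword lists, and every function name are hypothetical, and the "crawl" step reads from an in-memory dictionary rather than fetching real pages.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    source: str
    text: str
    tags: set = field(default_factory=set)


# Hypothetical in-memory stand-in for targeted sources. "allowed" is an
# assumed flag representing whether a source's terms permit crawling.
SOURCES = {
    "press-releases": {
        "allowed": True,
        "pages": [
            "Acme launches new trial for drug X",
            "Acme launches new trial for drug X",  # duplicate page
            "Office party photos",                 # noise
        ],
    },
    "shipping-log": {
        "allowed": True,
        "pages": ["Drug X shipment cleared customs"],
    },
    "private-forum": {
        "allowed": False,
        "pages": ["Rumor about drug X"],
    },
}

# Hypothetical topic vocabulary used to apply the organization's own context.
TOPICS = {"clinical": ["trial"], "logistics": ["shipment", "customs"]}


def crawl(sources):
    """Collect pages only from sources whose terms allow it."""
    return [
        Record(name, page)
        for name, src in sources.items()
        if src["allowed"]
        for page in src["pages"]
    ]


def enrich(records):
    """Tag each record with the topics whose keywords appear in its text."""
    for r in records:
        lowered = r.text.lower()
        r.tags = {t for t, kws in TOPICS.items() if any(k in lowered for k in kws)}
    return records


def curate(records):
    """Drop untagged noise and exact duplicates, keeping first occurrences."""
    seen, kept = set(), []
    for r in records:
        if r.tags and r.text not in seen:
            seen.add(r.text)
            kept.append(r)
    return kept


def deliver(records, audience_topics):
    """Return only the texts relevant to a given audience's topics."""
    return [r.text for r in records if r.tags & audience_topics]


def run_pipeline(audience_topics):
    return deliver(curate(enrich(crawl(SOURCES))), audience_topics)
```

Calling `run_pipeline({"clinical"})` would hand a clinical-focused team only the trial announcement, once, while the noise page and the disallowed source never reach them. A real system would replace keyword matching with far richer enrichment and linking, but the shape of the stages is the same.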

A deep search approach can help organizations get more value from their existing datasets, help them create new datasets from one or more sources, and even assist them in efficiently scaling processes and quickly collecting time-sensitive data with reduced manual effort. At CCC, we have more than 40 years of expertise in licensing and developing related solutions. If you’d like to learn more, we’re here to help.

This is the first in a four-part series on how deeper, automated searching can help your organization more easily find the information needed to make the right business decisions. Subscribe to CCC’s Velocity of Content blog to receive the next installments directly in your inbox. 

Author: Carl Robinson

Carl Robinson is Senior Corporate Solutions Director for CCC. He focuses on helping clients look at business vision, goals and strategies around their content and tooling to enable flexibility and readiness to meet the ever-changing demands of the digital market. Carl has been in publishing since 1995 and has worked for Pearson Education, Macmillan Education and Oxford University Press.

Author: Stephen Howe

Stephen has spent his career working at the intersection of publishing, education, and technology, holding positions in sales, sales management, production, project management, digital publishing, digital editorial, and product management. Trained in the liberal arts tradition, Stephen holds a BA and MA in philosophy, an MBA in management, and a Masters in Analytics. Stephen currently works as the Senior Product Manager - Analytics at CCC and serves on the advisory board at Brandeis University for the Masters in Strategic Analytics program.