
[Ed's TALK] Getting the Most from Unstructured Data

Published at 2022-01-06

If you're analyzing only structured data, you're missing a wealth of insights. Enterprises are struggling to mine the wealth of information contained in unstructured data. What are they doing right and wrong? In this blog post, Edward Cui, founder of Graviti, shares his perspective in a Q&A with TDWI.

What progress has been made in the last 10 years in using unstructured data? Why hasn't more progress been made?

Edward Cui: The world of ten years ago was dominated by structured data. After 2012, though, sensors became cheaper, cell phones gradually became smartphones, and built-in cameras made it easier to capture photos and video. As a result, a large amount of unstructured data was generated, and enterprises entered uncharted territory, which made progress slow. Some of the inhibitors to progress in this area include:

  • Complexity: Unlike structured data, which can be analyzed intuitively, unstructured data must be processed further before it can be analyzed, and this is usually best done with artificial intelligence: machine learning algorithms classify and label its content. However, the sheer volume and complexity of unstructured data make it hard to identify the high-quality data within a data set. This has been painful for development teams and a key challenge for data architectures that are already complex.

  • Cost: Although enterprises recognize the value of unstructured data, cost can be an obstacle to making use of it. The infrastructure, human resources, and time required can hinder the development and deployment of AI and of the data pipelines that feed it.

What recommendations or advice can you give enterprises that want to get started analyzing unstructured data? How should they begin? What best practices will make their job easier?

We recommend that enterprises have a fully prepared plan in place before they start analyzing unstructured data. Because the amount of unstructured data grows rapidly, enterprises must answer several questions clearly before collecting data: where to store the data (in the cloud or locally), how to identify high-quality data, how to train the model and iterate on it with newly collected data, and so on. Additionally, artificial intelligence professionals can help enterprises work out these questions (and their answers) before they begin collecting unstructured data for analysis.

How is Graviti working to make unstructured data easier to use?

Graviti aims to launch the first data platform that enables organizations to work with large volumes of unstructured data to power innovative AI applications. The platform takes the hassle out of this work and helps developers manage large amounts of unstructured data together with their teams.

Because most of the data available for AI development is low quality and unstructured, development teams usually spend over half of their time not on building models but on identifying, augmenting, or cleansing unstructured data, and that's just the beginning of their work. Graviti offers a more expert approach to data management that frees developers and gives them more time to analyze unstructured data and train artificial intelligence models. We help developers in three dimensions: data discovery, data iteration, and workflow automation.

If you want to learn more about the Graviti Data Platform, check out CEO's TALK: What exactly is the Graviti Data Platform?