The answer is probably no one. While you weren’t looking, your company must have planted a lot of data around yours, so now you can’t locate your favorite tree in the forest.
A data catalog might be able to help. It democratizes enterprise data for the uninitiated who have not mastered the topology of their data systems. On their own, these people wouldn’t know where to find the systems that hold the metrics and attributes they need for their reports. But armed with a data catalog, they can do data discovery and self-service BI like pros.
Higher data quality and lower data duplication are the hallmarks of well-maintained data catalogs. When users know where to find their data, they are less likely to recreate it, thus supporting the noble mission of maintaining a single source of truth.
Data catalogs don’t copy source data, only metadata. Since Lore IO handles metadata like a boss, it natively offers a searchable data catalog. Business and IT teams use Lore IO both to discover their data and to run their queries, one-stop-shop style. They also do so collaboratively: everyone, including the uninitiated, is encouraged to contribute as much (or as little) as they can in requesting, documenting, and generating transformed data.
Automated data discovery
Lore IO keeps a watchful eye on the source systems. Like an experienced tracker, it looks for any sign of change in the raw data, recursively crawling the contents of a top-level folder. As soon as it detects new data, for example when an existing file is updated, Lore IO jumps all over it, parses it, and updates its metadata in the catalog.
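To make the tracking idea concrete, here is a minimal sketch of change detection by recursive crawl. It is an illustration only, not Lore IO's implementation: the function names `snapshot` and `detect_changes` are hypothetical, and the sketch assumes that a file's size and modification time are a good enough signature of change.

```python
import os


def snapshot(root):
    """Map each file path under root to its (size, mtime_ns) signature."""
    state = {}
    # os.walk descends into every subfolder below root.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            stat = os.stat(path)
            state[path] = (stat.st_size, stat.st_mtime_ns)
    return state


def detect_changes(previous, current):
    """Compare two snapshots and return (added, updated) path lists."""
    added = sorted(p for p in current if p not in previous)
    updated = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    return added, updated
```

A crawler built this way would take a snapshot on each pass, compare it with the previous one, and hand the added and updated paths to a parser that refreshes the catalog metadata.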
Lore IO’s catalog differentiates between three types of tables: source tables that point to the raw data, target tables that point to the transformed data, and event tables for change data capture.
When Lore IO detects a new file, it automatically creates a tabular view of its columns and metadata. It handles structured and semi-structured file formats, such as Avro, XML, JSON, and EDI. Lore IO flattens nested objects into a relational view for easy data discovery.
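Flattening is easier to picture with a small example. The sketch below turns a nested JSON-like record into a single row with dotted column names. The `flatten` helper and the dotted naming scheme are assumptions made for illustration; the article does not say how Lore IO names flattened columns or how it treats arrays, so lists are left untouched here.

```python
def flatten(record, prefix=""):
    """Flatten nested dicts into one level, joining keys with dots."""
    flat = {}
    for key, value in record.items():
        column = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the path as a prefix.
            flat.update(flatten(value, column))
        else:
            # Scalars (and lists, in this sketch) become column values as-is.
            flat[column] = value
    return flat


order = {"id": 1, "customer": {"name": "Ada", "address": {"city": "Paris"}}}
row = flatten(order)
# row has the columns: id, customer.name, customer.address.city
```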
To establish table relationships, Lore IO periodically scans the output of DDL statements on the raw data, looking for any captured relationships in the original data. To fill the gaps, an innovative recommendation engine scans the tables and suggests additional relationships for consideration. Customers can add relevant relationships via the Lore IO web interface.
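The article does not describe how the recommendation engine works, so the following is only one plausible heuristic, sketched under stated assumptions: a column is a candidate key if its values are unique, and another table's column is suggested as a reference to it when most of its distinct values appear in that key. The `suggest_relationships` function and the `min_overlap` threshold are hypothetical.

```python
def suggest_relationships(tables, min_overlap=0.9):
    """Suggest foreign-key style links between tables.

    tables: {table_name: {column_name: list_of_values}}
    Returns tuples (child_table, child_col, parent_table, parent_col, overlap).
    """
    suggestions = []
    for parent, parent_cols in tables.items():
        for parent_col, parent_vals in parent_cols.items():
            parent_set = set(parent_vals)
            # A candidate key must be non-empty and have unique values.
            if not parent_vals or len(parent_set) != len(parent_vals):
                continue
            for child, child_cols in tables.items():
                if child == parent:
                    continue
                for child_col, child_vals in child_cols.items():
                    child_set = set(child_vals)
                    if not child_set:
                        continue
                    # Share of the child's distinct values found in the key.
                    overlap = len(child_set & parent_set) / len(child_set)
                    if overlap >= min_overlap:
                        suggestions.append(
                            (child, child_col, parent, parent_col, overlap)
                        )
    return suggestions
```

Suggestions like these are only candidates. As the text says, it is the customer who decides which relationships are relevant and adds them.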
Finding data faster
Lore IO customers search the data catalog, discovering relationships, metrics, cohorts, and funnels. Users can find their data faster by filtering on tables, concepts, owners, status, visibility, lineage, or data type.
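As a rough picture of filtered catalog search, the sketch below models catalog entries as simple records and filters them by name and by exact field values. The `CatalogEntry` fields and the `search` function are hypothetical and do not reflect Lore IO's actual schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    name: str
    kind: str    # assumed values: "source", "target", or "event"
    owner: str
    status: str


def search(entries, text="", **filters):
    """Return entries whose name contains text and whose fields match every filter."""
    text = text.lower()
    return [
        entry
        for entry in entries
        if text in entry.name.lower()
        and all(getattr(entry, field) == value for field, value in filters.items())
    ]


catalog = [
    CatalogEntry("raw_orders", "source", "it", "active"),
    CatalogEntry("orders_by_region", "target", "finance", "active"),
    CatalogEntry("raw_customers", "source", "it", "deprecated"),
]
active_order_sources = search(catalog, "order", kind="source", status="active")
```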
Additionally, customers can see all of their tables in a visual form via automatically created entity relationship diagrams that highlight all the relationships as they are created.
To improve understanding of and trust in the data, Lore IO offers data lineage from the top transformed datasets all the way down to the raw data that Lore IO scans. And to ensure data privacy, Lore IO customers can apply various masking techniques.
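Lineage can be thought of as a graph walk from a transformed dataset back to the raw files it depends on. The sketch below assumes lineage is stored as a mapping from each dataset to its direct parents; that representation, the `upstream_sources` function, and the dataset names are illustrative assumptions, not Lore IO's data model.

```python
def upstream_sources(lineage, dataset):
    """Return the raw sources (datasets with no parents) that feed `dataset`.

    lineage: {dataset_name: [names of datasets it is derived from]}
    """
    seen, stack, sources = set(), [dataset], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # skip datasets already visited (shared parents, cycles)
        seen.add(node)
        parents = lineage.get(node, [])
        if not parents:
            sources.add(node)  # nothing upstream, so this is raw data
        stack.extend(parents)
    return sorted(sources)


lineage = {
    "revenue_report": ["orders_clean", "customers_clean"],
    "orders_clean": ["raw/orders.json"],
    "customers_clean": ["raw/customers.csv"],
}
raw_inputs = upstream_sources(lineage, "revenue_report")
```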
Finally, an auto-complete feature, available both in modeling (expressions and functions) and in data definitions, surfaces metadata suggestions as customers create their documentation and transformations.