Zeenea Product Recap: A look back at 2023

2023 was another big year for Zeenea. With more than 50 releases and updates to our platform, these past 12 months were filled with lots of new and improved ways to unlock the value of your enterprise data assets. Indeed, our teams consistently work on features that simplify and enhance the daily lives of your data and business teams.

In this article, we’re thrilled to share with you some of our favorite features from 2023 that enabled our customers to:

  • Decrease data search and discovery time
  • Increase Data Steward productivity & efficiency
  • Deliver trusted, secure, and compliant information across the organization
  • Enable end-to-end connectivity with all their data sources

Decrease data search and discovery time

 

One of Zeenea’s core values is simplicity. We strongly believe that data discovery should be quick and easy to accelerate data-driven initiatives across the entire organization.

In fact, many data teams still struggle to find the information they need for a report or use case: either the data is scattered across various sources, files, and spreadsheets, or they are confronted with such an overwhelming amount of information that they don’t know where to begin their search.

In 2023, we designed our platform with simplicity in mind. By providing easy and quick ways to explore data, Zeenea enabled our customers to find, discover, and understand their assets in seconds.

A fresh new look for the Zeenea Explorer

 

One of the first ways our teams wanted to enhance the discovery experience of our customers was by providing a more user-friendly design to our data exploration application, Zeenea Explorer. This redesign included:

New Homepage

 

Our homepage got a brand-new look and feel for a smoother discovery experience. Indeed, for users who don’t yet know exactly what they are looking for, we added new exploration paths directly accessible from the Zeenea Explorer homepage.

 

  • Browsing by Item Type: If users already know the type of asset they are looking for, such as a dataset, visualization, data process, or custom asset, they can go directly to the catalog pre-filtered on that Item Type.
  • Browsing through the Business Glossary: Users can quickly navigate through the enterprise’s Business Glossary by directly accessing the Glossary assets that were defined or imported by stewards in Zeenea Studio.
  • Browsing by Topic: The app enables users to browse through a list of Items that represent a specific theme, use case, or anything else that is relevant to business (more information below).
New Zeenea Explorer Homepage 2023

New Item Detail Pages

 

To help users understand a catalog Item at a glance, one of the first notable changes was the position of the Item’s tabs. The tabs were originally positioned on the left-hand side of the page, which took up a lot of space. Now, the tabs are at the top of the page, more closely reflecting the layout of the Studio app. This new layout allows data consumers to find the most significant information about an Item, such as:

  • The highlighted properties, defined by the Data Steward in the Catalog Design,
  • Associated Glossary terms, to understand the context of the Item,
  • Key people, to quickly reach the contacts that are linked to the Item.

In addition, our new layout allows users to find all fields, metadata, and other related items instantly. Previously divided into three separate tabs, the Item’s description and all related Items now appear in a single “Details” tab. Indeed, depending on the Item Type you are browsing, all fields, inputs & outputs, parent/child Glossary Items, implementations, and other metadata are in the same section, saving you precious data discovery time.

Lastly, the spaces for our graphical components were made larger – users now have more room to see their Item’s lineage, data model, etc.

New Item Detail Page Zeenea Explorer

New Filtering system

 

Zeenea Explorer offers a smart filtering system to contextualize search results. Searches can be narrowed using Zeenea’s preconfigured filters, such as item type, connection, or contact, as well as the organization’s own custom filters. For even more efficient searches, we redesigned our search results page and filtering system:

 

  • Available filters are always visible, making it easier to narrow down the search,
  • By clicking on a search result, an overview panel with more information is always available without losing the context of the search,
  • The filters most relevant to the search are placed at the top of the page, allowing users to quickly get the results they need for specific use cases.
New Filtering System Explorer

Easily browsing the catalog by Topic

 

One major 2023 release was our Topics feature. Indeed, to enable business users to (even more!) quickly find their data assets for their use cases, Data Stewards can easily define Topics in Zeenea Studio. To do so, they simply select the filters in the Catalog that represent a specific theme, use case, or anything else that is relevant to business.

Data teams using Zeenea Explorer can therefore easily and quickly search through the catalog by Topic to reduce their time searching for the information they need. Topics can be directly accessed via the Explorer homepage and the search bar when browsing the catalog.

Browse By Topic Explorer New

Alternative names for Glossary Items for better discovery

 

In order for users to easily find the data and business terms they need for their use cases, Data Stewards can add synonyms, acronyms, and abbreviations for Glossary Items!

Ex: Customer Relationship Management > CRM

Alternative Names Zeenea Studio

Improved search performance

 

Throughout the year, we implemented a significant number of improvements to enhance the efficiency of search. The addition of stop words, encompassing pronouns, articles, and prepositions, ensures more refined and pertinent query results. Moreover, we added an “INFIELD:” operator, giving users the ability to search for Datasets that contain a specific field.

Search In Fields Explorer

Microsoft Teams integration

 

Zeenea also strengthened our communication and collaboration capabilities. Specifically, when a contact is linked to a Microsoft email address, Zeenea now makes it possible to start a direct conversation with them via Teams. This integration allows Teams users to promptly engage with relevant individuals for additional information on specific Items. Other integrations with various tools are in the works. ⭐️

Microsoft Teams Zeenea Explorer

Increase Data Steward productivity & efficiency

 

Our goal at Zeenea is to simplify the lives of data producers so they can efficiently manage, maintain, and enrich the documentation of their enterprise data assets in just a few clicks. Here are some features and enhancements that help them stay organized, focused, and productive.

Automated Datasets Import

 

When importing new Datasets in the Catalog, administrators can turn on our Automatic Import feature which automatically imports new Items after each scheduled inventory. This time-saving enhancement increases operational efficiency, allowing Data Stewards to focus on more strategic tasks rather than the routine import process.

Auto Import Zeenea Studio 2

Orphan Fields Deletion

 

We’ve also added the ability to manage Orphan Fields more effectively. This includes the option to perform bulk deletions of Orphan Fields, accelerating the process of decluttering and organizing the catalog. Alternatively, Stewards can delete a single Orphan Field directly from its detail page, providing a more granular and precise approach to catalog maintenance.

Orphan Field Details

Building reports based on the content of the catalog

 

We added a new section in Zeenea Studio – The Analytics Dashboard – to easily create and build reports based on the content and usage of the organization’s catalog.

Directly on the Analytics Dashboard page, Stewards can view the completion level of their Item Types, including Custom Items. Each Item Type element is clickable to quickly view the Catalog section filtered by the selected Item Type.

For more detailed information on the completion level of a particular Item Type, Stewards can create their own analyses! They select an Item Type and a Property, and can then consult, for each value of that Property, the completion level of the Item’s template, including its description and linked Glossary Items.

New Analytics Dashboard Gif Without Adoption

New look for the Steward Dashboard

 

Zeenea Explorer isn’t the only application that got a makeover! Indeed, to help Data Stewards stay organized, focused, and productive, we redesigned the Dashboard layout to be more intuitive so they can get work done faster. This includes:

 

  • New Perimeter design: A brand new level of personalization when logging in to the Dashboard. The perimeter now extends beyond Dataset completion – it includes all the Items that one is a Curator for, including Fields, Data Processes, Glossary Items, and Custom Items.
  • Watchlists Widget: Just as Data Stewards create Topics for enhanced organization for Explorer users, they can now create Watchlists to facilitate access to Items requiring specific actions. By filtering the catalog with the criteria of their choice, Data Stewards save these preferences as new Watchlists via the “Save filters as” button, and directly access them via the Watchlist widget when logging on to their Dashboard.
  • The Latest Searches widget: Caters specifically to the Data Steward, focusing on their recent searches to enable them to pick up where they left off.
  • The Most Popular Items widget: The most consulted and widely used Items within the Data Steward’s Perimeter by other users. Each Item is clickable, giving instant access to its contents.

 

View the Feature Note

 

New Steward Dashboard Studio

Deliver trusted, secure, and compliant information across the organization

Data Sampling on Datasets

 

For select connections, Data Sampling is available on Datasets. Our Data Sampling capabilities allow users to obtain representative subsets of existing datasets, offering a more efficient approach to working with large volumes of data. With Data Sampling activated, administrators can configure fields to be obfuscated, mitigating the risk of displaying sensitive personal information.

This feature carries significant importance to our customers, as it enables users to save valuable time and resources by working with smaller, yet representative, portions of extensive datasets. This also allows early identification of data issues, thereby enhancing overall data quality and subsequent analyses. Most notably, the capacity to obfuscate fields addresses critical privacy and security concerns, allowing users to engage with anonymized or pseudonymized subsets of sensitive data, ensuring compliance with privacy regulations, and safeguarding against unauthorized access.
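
To illustrate the general idea, here is a minimal conceptual sketch, not Zeenea’s implementation: it samples a hypothetical customer table and masks a configurable list of sensitive fields before they are shown.

```python
# Minimal sketch of the idea behind Data Sampling with field obfuscation.
# This is NOT Zeenea's implementation; the dataset, sample size, and the
# list of fields to obfuscate are hypothetical examples.
import hashlib
import pandas as pd

def sample_with_obfuscation(df: pd.DataFrame, fields_to_obfuscate: list[str],
                            sample_size: int = 100, seed: int = 42) -> pd.DataFrame:
    """Return a representative subset of `df` with sensitive fields masked."""
    sample = df.sample(n=min(sample_size, len(df)), random_state=seed)
    for field in fields_to_obfuscate:
        if field in sample.columns:
            # Replace values with a short, irreversible hash so the column
            # keeps its shape without exposing personal information.
            sample[field] = sample[field].astype(str).map(
                lambda v: hashlib.sha256(v.encode()).hexdigest()[:10]
            )
    return sample

# Example usage with hypothetical customer data:
customers = pd.DataFrame({
    "customer_id": range(1, 1001),
    "email": [f"user{i}@example.com" for i in range(1, 1001)],
    "country": ["FR", "DE", "US", "UK"] * 250,
})
print(sample_with_obfuscation(customers, ["email"], sample_size=5))
```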

Data Sampling Zeenea Studio

Powerful Lineage capabilities

 

In 2022, we made a lot of improvements to our Lineage graph. Not only did we simplify its design and layout, but we also made it possible for users to display only the first level of lineage, expand and close the lineage on demand, and get a highlighted view of the direct lineage of a selected Item.

This year we made other significant UX changes, including the possibility to expand or collapse all lineage levels in one click, hide data processes that don’t have at least one input and one output, and easily view connection names via a tooltip when they are too long to display in full.

However, the most notable release is Field-level lineage! Indeed, it is now possible to retrieve the input and output Fields of tables and reports and, for more context, add the operation’s description. Users can then view their Field-level transformations over time directly in the Data Lineage graph, in both Zeenea Explorer and Zeenea Studio.

Field Level Lineage Zeenea Studio 2

Data Quality Information on Datasets

 

By leveraging GraphQL and knowledge graph technologies, the Zeenea Data Discovery Platform provides a flexible approach to integrating best-of-breed data quality solutions. Datasets are synchronized via simple query and mutation operations from a third-party DQM tool through our Catalog API capabilities. The DQM tool delivers real-time data quality scan results to the corresponding dataset within Zeenea, enabling users to conveniently review data quality insights directly within the catalog.

This new feature includes:

  • A Data Quality tab in your Dataset’s detail pages, where users can view its Quality checks as well as the type, status, description, last execution date, etc.
  • The possibility to view more information on the Dataset’s quality directly in the DQM tool via the “Open dashboard in [Tool Name]” link.
  • A data quality indicator of Datasets directly displayed in the search results and lineage.
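
As a rough illustration of this integration pattern, the sketch below shows how a third-party DQM tool could push a scan result to a dataset through a GraphQL mutation over HTTP. The endpoint, mutation name, and field names are hypothetical placeholders, not Zeenea’s actual schema.

```python
# Minimal sketch: a third-party DQM tool pushing a quality scan result to a
# dataset via a GraphQL mutation. Endpoint and mutation/field names are
# hypothetical; they only illustrate the integration pattern.
import requests

GRAPHQL_URL = "https://your-instance.example.com/api/catalog/graphql"  # hypothetical
API_KEY = "your-api-key"  # hypothetical

MUTATION = """
mutation PublishQualityResult($datasetKey: String!, $check: QualityCheckInput!) {
  publishQualityResult(datasetKey: $datasetKey, check: $check) {
    status
  }
}
"""

variables = {
    "datasetKey": "warehouse/customers",  # hypothetical dataset reference
    "check": {
        "name": "null_rate_email",
        "type": "completeness",
        "status": "PASSED",
        "description": "Share of NULL values in the email column is below 1%",
        "lastExecution": "2023-12-01T08:30:00Z",
    },
}

response = requests.post(
    GRAPHQL_URL,
    json={"query": MUTATION, "variables": variables},
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
response.raise_for_status()
print(response.json())
```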

 

View the Feature Note

Zeenea Explorer Data Quality Graph

Enable end-to-end connectivity with all their data sources

 

With Zeenea, you can connect to all your data sources in seconds. Our platform’s built-in scanners and APIs enable organizations to automatically collect, consolidate, and link metadata from their data ecosystem. This year, we made significant enhancements to our connectivity to enable our customers to build a platform that truly represents their data ecosystem.

Catalog Management APIs

 

Recognizing the importance of API integration, Zeenea has developed powerful API capabilities that enable organizations to seamlessly connect and leverage their data catalog within their existing ecosystem.

In 2023, Zeenea developed Catalog APIs, which help Data Stewards with their documentation tasks. These Catalog APIs include:

Query operations to retrieve specific catalog assets: Our API query operations include retrieving a specific asset by its unique reference or by its name and type, as well as retrieving a list of assets for a given connection or Item type. Zeenea’s Catalog APIs offer flexibility when querying, letting users narrow results so they are not overwhelmed with a plethora of information.

Mutation operations to create and update catalog assets: To save even more time when documenting and updating company data, Zeenea’s Catalog APIs enable data producers to easily create, modify, and delete catalog assets. They support the creation, update, and deletion of Custom Items and Data Processes as well as their associated metadata, and the update of Datasets and Data Visualizations. The same is possible for Contacts, which is particularly important when users leave the company or change roles: data producers can easily transfer the information that was linked to one person to another.
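
As a minimal, hypothetical sketch of what scripting against such a GraphQL catalog API can look like (the endpoint, query name, and fields below are illustrative placeholders, not Zeenea’s actual schema), a query by name and type might be issued like this:

```python
# Minimal sketch of querying a GraphQL-based catalog API from a script.
# The endpoint, authentication header, and query/field names below are
# hypothetical placeholders, not Zeenea's actual schema.
import requests

GRAPHQL_URL = "https://your-instance.example.com/api/catalog/graphql"  # hypothetical
API_KEY = "your-api-key"  # hypothetical

# Hypothetical query: retrieve assets by name and type.
QUERY = """
query FindAssets($name: String!, $type: String!) {
  assets(name: $name, type: $type) {
    key
    name
    description
  }
}
"""

response = requests.post(
    GRAPHQL_URL,
    json={"query": QUERY, "variables": {"name": "customers", "type": "Dataset"}},
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
response.raise_for_status()
print(response.json())
```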

 

Read the Feature Note

Property & Responsibility Codes management

 

Another feature that was implemented was the ability to add a code to properties & responsibilities so they can easily be used in API scripts for more reliable queries and retrievals.

For all properties and responsibilities that were built in Zeenea (e.g., Personally Identifiable Information) or harvested from connectors, it is possible to modify their name and description to better suit the organization’s context.

Property Responsibility Codes Studio

More than a dozen new connectors added to the list

 

At Zeenea, we develop advanced connectors to automatically synchronize metadata between our data discovery platform and all your sources. This native connectivity saves you the tedious and challenging task of manually finding the data you need for a specific business use case, a task that often requires access to scarce technical resources.

In 2023 alone, we developed over a dozen new connectors! This achievement underscores our agility and proficiency in swiftly integrating with diverse data sources utilized by our customers. By expanding our connectivity options, we aim to empower our customers with greater flexibility and accessibility.

 

View our connectors

The top 5 benefits of data lineage

Do you have the ambition to turn your organization into a data-driven enterprise? You cannot escape the need to accurately map all your data assets, monitor their quality and guarantee their reliability. Data lineage can help you accomplish this mission. Here are some explanations.

To know what data you use, what it means, where it comes from, and how reliable it is throughout its life cycle, you need a holistic view of everything that is likely to transform, modify, or alter it. This is exactly the mission that data lineage fulfills: a data analysis technique that allows you to follow the path of data from its source to its final use. A technique that has many benefits!

Benefit #1: Improved data governance

 

Data governance is a key issue for your business and for ensuring that your data strategy can deliver its full potential. By following the path of data – from its collection to its exploitation – data lineage allows you to understand where it comes from and the transformations it has undergone over time to create a rich and contextualized data ecosystem. This 360° view of your data assets guarantees reliable and quality data governance.

Benefit #2: More reliable, accurate, and quality data

 

As mentioned above, one of the key strengths of data lineage is its ability to trace the origin of data. However, another great benefit is its ability to identify the errors that occur during its transformation and manipulation. Hence, you are able to take measures to not only correct these errors but also ensure that they do not reoccur, ultimately improving the quality of your data assets. A logic of continuous improvement that is particularly effective for the success of your data strategy.

Benefit #3: Quick impact analysis

 

Data lineage accurately identifies data flows, making sure errors never go unnoticed for long. The first phase is based on detailed knowledge of your business processes and your available data sources. Once critical data flows are identified and mapped, it is possible to quickly analyze the potential impacts of a given transformation on data or a business process. With the impacts of each data transformation assessed in real time, you have all the information you need to identify the ways and means to mitigate the consequences. Visibility, traceability, reactivity – data lineage saves you precious time!

Benefit #4: More context to the data

 

As you have probably understood by now, data lineage continuously monitors the course of your data assets. Therefore, beyond the original source of the data, you have full visibility of the transformations that have been applied to the data throughout its journey. This visibility also extends to the use that is made of the data within your various processes or through the applications deployed in your organization. This ultra-precise tracking of the history of interactions with data allows you to give more context to data in order to improve data quality, facilitate analysis and audits, and make more informed decisions based on accurate and complete information.

Benefit #5: Build (even more!) reliable compliance reports

 

The main expectations of successful regulatory compliance are transparency and traceability. This is the core value promise of data lineage. By using data lineage, you have all the cards in your hand to reduce compliance risks, improve data quality, facilitate audits and verifications, and reinforce stakeholders’ confidence in the compliance reports produced.

Breaking down Data Lineage: typologies and granularity

As a concept, Data Lineage seems universal: whatever the sector of activity, any stakeholder in a data-driven organization needs to know the origin (upstream lineage) and the destination (downstream lineage) of the data they are handling or interpreting. And this need has important underlying motives.

For a Data Catalog vendor, the ability to manage Data Lineage is crucial to its offer. As is often the case however, behind a simple and universal question lies a world of complexity that is difficult to grasp. This complexity is partially linked to the heterogeneity of answers that vary from one interlocutor to another in the company.

In this article, we will explain our approach to breaking down data lineage according to the nature of the information sought and its granularity.

 

The typology of Data Lineage: seeking the origin of data

There are many possible answers as to the origin of any given data. Some will want to know the exact formula or semantics of the data. Others will want to know which system(s), application(s), machine(s), or factory it comes from. Some will be interested in the business or operational processes that produced the data. Some will be interested in the entire upstream and downstream technical processing chain. It’s difficult to sort through this maze of considerations!

A layer approach

To structure lineage information, we suggest emulating what is practiced in the field of geo-mapping by distinguishing several superimposable layers. We can identify three:

  • The physical layer, which includes the objects of the information system – applications, systems, databases, data sets, integration or transformation programs, etc.
  • The business layer, which contains the organizational elements – domains, business processes or activities, entities, managers, controls, committees, etc.
  • The semantic layer, which deals with the meaning of the data – calculation formulas, definitions, ontologies, etc.

A focus on the physical layer

The physical layer is the basic canvas on which all the other layers can be anchored. This approach is again similar to what is practiced in geo-mapping: above the physical map, it is possible to superimpose other layers carrying specific information.

The physical layer represents the technical dimension of the lineage; it is materialized by tangible technical artifacts – databases, file systems, integration middleware, BI tools, scripts and programs, etc. In theory, the structure of the physical lineage can be extracted from these systems, and its construction therefore largely automated, which is not generally the case for the other layers.

The following point seems fundamental: for this bottom-up approach to work, the physical lineage must be complete.

This does not mean that the lineage of all physical objects must be available, but for the objects that do have lineage, this lineage must be complete. There are two reasons for this. The first is that a partial (and therefore false) lineage risks misleading the person who consults it, jeopardizing the adoption of the catalog. The second is that the physical layer serves as an anchor for the other layers, which means any shortcomings in its lineage will be propagated.

In addition to this layer-by-layer representation, let’s address another fundamental aspect of lineage: its granularity.

     

Granularity in Data Lineage

When it comes to lineage granularity, we identify 4 distinct levels: values, fields (or columns), datasets, and applications.

The values can be addressed quickly. Their purpose is to track all the steps taken to calculate any particular data (we’re referring to specific values, not the definition of any specific data). For mark-to-model pricing applications, for example, the price lineage must include all raw data (timestamp, vendor, value), the values derived from this raw data, as well as the versions of all algorithms used in the calculation.

Regulatory requirements exist in many fields (banking, finance, insurance, healthcare, pharmaceutical, IoT, etc.), but usually in a very localized way. They are clearly out of the reach of a data catalog, in which it is difficult to imagine managing every data value! Meeting these requirements calls for either a specialized software package or a specific development.

The other three levels deal with metadata and are clearly in the remit of a data catalog. Let’s detail them quickly.

The field level is the most detailed level. It consists of tracing all the steps (at the physical, business, or semantic level) for an item of information in a dataset (table or file), a report, a dashboard, etc., that enable the field in question to be populated.

At the dataset level, the lineage is no longer defined for each field but at the level of the field container, which can be a table in a database, a file in a data lake, an API, etc. On this level, the steps that allow us to populate the dataset as a whole are represented, typically from other datasets (we also find on this level other artifacts such as reports, dashboards, ML models, or even algorithms).

Finally, the application level enables the documentation of the lineage macroscopically, focusing on high-level logical elements in the information system. The term “application” is used here in a generic way to designate a functional grouping of several datasets.

It is of course possible to imagine other levels beyond these three (grouping applications into business domains, for example), but increasing the complexity is more a matter of flow mapping than lineage.

Finally, it is important to keep in mind that each level is intertwined with the level above it. This means the lineage at the higher level can be worked out from the lineage at the lower level (if I know the lineage of all the fields of a dataset, then I can infer the lineage of this dataset).
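
As a minimal illustration of this bottom-up principle (independent of any particular tool), the sketch below derives dataset-level lineage edges from field-level lineage edges, assuming each field is identified by a hypothetical “dataset.field” string:

```python
# Minimal sketch: deriving dataset-level lineage from field-level lineage.
# Assumes field identifiers of the form "dataset.field"; this is an
# illustration of the principle, not any specific catalog's data model.
from collections import defaultdict

# Hypothetical field-level lineage: (source field -> target field) edges.
field_lineage = [
    ("crm_contacts.email", "customers_clean.email"),
    ("crm_contacts.country", "customers_clean.country_code"),
    ("customers_clean.country_code", "sales_report.region"),
]

def dataset_of(field_id: str) -> str:
    """Return the dataset part of a 'dataset.field' identifier."""
    return field_id.split(".", 1)[0]

def infer_dataset_lineage(field_edges):
    """Aggregate field-level edges into dataset-level edges."""
    dataset_edges = defaultdict(set)
    for source_field, target_field in field_edges:
        src, dst = dataset_of(source_field), dataset_of(target_field)
        if src != dst:  # ignore transformations within a single dataset
            dataset_edges[src].add(dst)
    return {src: sorted(dsts) for src, dsts in dataset_edges.items()}

print(infer_dataset_lineage(field_lineage))
# {'crm_contacts': ['customers_clean'], 'customers_clean': ['sales_report']}
```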

We hope that this breakdown of data lineage will help you better understand it for your organization. In a future article, we will share our approach so that each business can derive maximum value from Lineage thanks to our typology / granularity / business matrix.

To learn more about Data Lineage best practices, download our eBook: All you’ve ever wanted to know about Data Lineage!

What is Data Lineage?

In order to access and exploit your data assets on a regular basis, your organization will need to know everything about your data! This includes its origins, its transformations over time, and its overall life cycle. All of this knowledge can be gathered from Data Lineage!

In this article, we will define Data Lineage, give an analogy, and explain its main benefits for data-driven organizations.

After human resources, data has become the most valuable asset for business today.

It is the foundation that links companies, clients, and partners together. Knowing this, data must be preserved and leveraged, as it contains all of an organization’s intelligence.

However, with great information comes great responsibility for those who manage or use this data. On one hand, they must identify the data that reveals strategic insights for the company; on the other, they must apply the right security measures to prevent devastating financial and reputational consequences.

With the arrival of data compliance laws such as BCBS 239 or the GDPR, the person in charge of data compliance (usually the DPO) must put in place transparent conditions to ensure that no data will be exploited to the detriment of a customer.

This is where Data Lineage intervenes. Behind the word lineage lies an essential concept: data traceability. This traceability covers the entire life cycle of the data, from its collection to its use, storage, and preservation over time.

     

How Data Lineage works

As mentioned above, the purpose of Data Lineage is to ensure the absolute traceability of your data assets. This traceability is not limited to knowing the source of a piece of information. It goes much further than that!

To understand the nature of lineage information, let’s use a little analogy.

Imagine that you are dining in a gourmet restaurant. The menu includes dishes with poetic names, composed of many more or less exotic ingredients, some of which are foreign to you. When the waiter brings you your plate, you taste, appreciate, and wonder about the origin of what you are eating.

Depending on your point of view, you will not expect the same answer.

As a fine cuisine enthusiast, you will want to know how the different ingredients were transformed and assembled to obtain the finished product. You will want to know the different steps of preparation, the cooking technique, the duration, the condiments used, the seasoning, etc. In short, you are interested in the most technical aspects of the final preparation: the recipe.

As a controller, you will focus more on the complete supply and processing chain: who the suppliers are, the places and conditions in which the raw products were bred or grown, transport, packaging, cutting and preparation, etc. You will also want to make sure that this supply chain complies with the various labels or appellations that the restaurant owner highlights (origin of ingredients, organic, “home-made”, AOC, AOP, etc.).

Others may focus on the historical and cultural dimensions: from what region or tradition is the dish derived or inspired? When and by whom was it originally created? Others (admittedly rarer) will wonder about the phylogenetic origin of the breed of veal prepared by the chef…

In short, when it comes to gastronomy, the question of origin does not call for a single, homogeneous answer. And the same is true for data.

Indeed, with Data Lineage, you will have access to a real-time data monitoring tool.

Once collected, the data is constantly monitored in order to:

  • detect and monitor any errors in your data processing,
  • manage and continuously monitor all process changes while minimizing the risks of data degradation,
  • manage data migrations,
  • have a 360° view on metadata.

Data Lineage ensures that your data comes from a reliable and controlled source, that the transformations it has undergone are known, monitored, and legitimate, and that it is available in the right place, at the right time, and for the right user.

Acting as a control tool, the main mission of Data Lineage is to validate the accuracy and consistency of your data.

How does it do this? By allowing your employees to conduct research on the entire life cycle of the data, both upstream and downstream, from the source of the data to its final destination, in order to detect and isolate any anomalies and correct them.

     

The main advantages of Data Lineage

The first benefit of Data Lineage has to do with compliance. It helps identify and map all of the data production and exploitation processes and limits your exposure to the risk of non-compliance with personal data regulations.

Data Lineage also facilitates data governance because it provides your company and its employees with a complete repository describing your data flows and metadata. This knowledge is essential to design a 100% operational data architecture.

Data Lineage makes it easier to automate the documentation of your data production flows. So, if you are planning to increase the importance of data in your development strategy, Data Lineage will allow you to save a considerable amount of time in the deployment of projects where data is key.

Finally, the last major benefit of Data Lineage concerns your employees themselves. With data whose origin, quality, and reliability are guaranteed by Data Lineage, they can fully rely on your data flows and base their daily actions on this indispensable asset.

Save time, guarantee the compliance of your data, and make your teams’ work more fluid while taking your company into a new dimension built on an uncompromising data strategy… Don’t wait any longer, get started now!