Data Governance

Data Lineage (Metadata Management):

Metadata Management brings powerful business capabilities to the modern enterprise, enabling it to harvest and govern metadata across all of its data management technologies and to provide data transparency even into third-party technology.

Metadata Management is a must-have technology for any organization looking to seriously tackle governance, productivity improvement, and lifecycle management challenges.

Metadata Management is essential for solving a wide variety of critical business and technical challenges. These include, but are not limited to, explaining how report figures are calculated, understanding the impact of upstream data changes, surfacing data lineage reports in a business-friendly way in the browser, and providing reporting capabilities across an enterprise's entire metadata for analysis and improvement.

Metadata Management is built to solve all these pressing needs for customers in a lightweight, browser-based interface.

Metadata Manager manages change effectively in enterprise data integration environments. It collects metadata from a data integration environment and provides a visual map of the data flows within that environment.

Metadata Manager, along with Business Glossary, provides the visibility and control needed to manage change, reduce errors caused by change, and ensure data integrity.

Metadata Manager shows how data objects will be impacted by a proposed change before it is implemented. By providing full visibility into the potential impact of data changes, Metadata Manager helps your IT organization obtain more accurate cost estimates and accelerate delivery time.
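
The impact analysis described here amounts to a downstream traversal of a lineage graph. The Python sketch below illustrates the idea on a handful of invented table and report names; it is a minimal illustration of the technique, not Metadata Manager's actual implementation.

    # Minimal sketch of lineage-based impact analysis (hypothetical object
    # names, not Metadata Manager's actual implementation).
    from collections import defaultdict, deque

    # Lineage edges: source object -> downstream objects that consume it.
    lineage = defaultdict(list)
    for src, dst in [
        ("crm.customers", "stg.customers"),
        ("stg.customers", "dw.dim_customer"),
        ("dw.dim_customer", "bi.revenue_report"),
        ("dw.dim_customer", "bi.churn_dashboard"),
    ]:
        lineage[src].append(dst)

    def downstream_impact(changed_object):
        """Return every object reachable downstream of a proposed change."""
        impacted, queue = set(), deque([changed_object])
        while queue:
            for nxt in lineage[queue.popleft()]:
                if nxt not in impacted:
                    impacted.add(nxt)
                    queue.append(nxt)
        return impacted

    print(downstream_impact("crm.customers"))
    # {'stg.customers', 'dw.dim_customer', 'bi.revenue_report', 'bi.churn_dashboard'}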

With Metadata Manager, your IT organization can deliver trusted data to key business initiatives quickly.

Metadata Manager gathers technical metadata from a wide variety of sources, including mainframes, applications, databases, data modeling tools, business intelligence tools, and extract-transform-load (ETL) tools.

Metadata is stored in a central repository that is shared with Business Glossary. This technical metadata can be linked to business metadata created in Business Glossary to add meaning and context to data assets.

Metadata Manager automatically creates a graphical view of data as it flows through the data integration environment. This view is what gives developers the visibility and control to understand and manage data integration environments.
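
As a rough picture of how such technical metadata can be harvested into a central repository, the sketch below reads table and column definitions from a database catalog and records them in a simple in-memory structure. SQLite is used only to keep the example self-contained; the tables and the repository layout are invented for illustration.

    # Sketch: harvesting technical metadata (tables and columns) into a simple
    # in-memory "repository". SQLite keeps the example self-contained; a real
    # harvest would read each source system's own catalog.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")

    repository = {}  # central metadata repository: table name -> column metadata
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
        repository[table] = [
            {"name": c[1], "type": c[2], "nullable": not c[3]} for c in columns
        ]

    for table, cols in repository.items():
        print(table, "->", [c["name"] for c in cols])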

With Metadata Manager, all parts of the organization can collaborate more quickly and easily to keep data accurate, consistent, and available when and where it is needed.

Data architects use Metadata Manager’s integration metadata catalog to browse and search for metadata definitions, define new custom models, extend existing metadata, and maintain common interface definitions for data sources, warehouses, business intelligence, and other applications that enterprise data integration projects use.

Metadata Manager gives them the global visibility needed to manage the environment. Data integration developers use Metadata Manager to perform detailed impact analysis when changes are made to metadata used in mappings, sessions, workflows, sources, and targets.

Comprehensive impact analysis capabilities enable developers to quickly identify which design artifacts are affected and need to be modified accordingly.

Business Glossary (business metadata):

We plan to utilize the enhanced capabilities of Business Glossary to connect business and IT and eliminate the data ambiguities that exist across the enterprise.

By giving business context to our data assets in a central integration metadata repository, IT can ensure the business has data it can trust.

Business Glossary establishes ownership and accountability for data. This tool enables data analysts, business analysts, and data stewards to work together to create, manage, and share a common vocabulary of data integration business terms.

This feature fosters cross-functional alignment and helps all parts of the business better understand the context and usage of data. IT organizations can answer the most common questions that business consumers have about data, such as “What does this data mean?”, “Where does the data come from?”, and “Who has responsibility for this data?”

By providing business context to technical artifacts related to data integration, the Business Glossary makes it possible to catalog, govern, and use data consistently and efficiently.

This feature helps establish data ownership and manage accountability through auditable data trails that are critical for successful data governance programs and governance, risk, and compliance initiatives.

Business Glossary supplies business context to technical artifacts related to data integration so that data can be cataloged, governed, and used consistently and efficiently across the enterprise.
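
One way to picture the link between a business term and the technical artifacts it describes is a small record like the sketch below. The field names, the example term, and the linked assets are illustrative assumptions, not Business Glossary's actual schema.

    # Sketch of a business glossary term linked to technical metadata.
    # Field names and values are illustrative, not Business Glossary's schema.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class GlossaryTerm:
        name: str                      # standard business vocabulary entry
        definition: str
        category: str
        steward: str                   # who is accountable for this data
        linked_assets: List[str] = field(default_factory=list)  # technical artifacts

    term = GlossaryTerm(
        name="Active Customer",
        definition="A customer with at least one order in the last 12 months.",
        category="Customer",
        steward="jane.doe@example.com",
        linked_assets=["dw.dim_customer", "bi.revenue_report"],
    )

    # Answering "Who has responsibility for this data?" and "Where does it live?"
    print(term.steward, term.linked_assets)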

Information Steward (Business Term):

Information Steward is a data quality analysis toolset that enables business and IT users to monitor and manage enterprise data quality across multiple data sources and systems.

The Metadata Management module of Information Steward can discover metadata about OLAP cubes, reports, dashboards, folders, systems, server instances, application users, and user groups.

Information Steward (the Analyst tool) is used by business users to collaborate on projects within an organization; for example, business analysts can use it to collaborate on data integration projects.

Information Steward combines data profiling, data lineage, and metadata management to give continuous insight into the integrity of your enterprise data. It helps you understand how the quality of your data affects your business processes so you can enhance operational, analytical, and data governance initiatives.

Impact analysis lets you view the objects that are affected by the data within a particular object; for example, impact analysis for an OLAP cube object lists the objects that are affected by the data within that cube.

You can also take note of the report consumers, both users and user groups, that have permission to access these particular reports. Why is this important? It answers the question of what the downstream impact would be if you changed an OLAP cube or a query, covering not only what is impacted but also who and how many consumers are affected.
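
As an informal illustration of that "who and how many" question, the sketch below joins hypothetical lineage edges with report permission records; the object names and permissions are invented for this example.

    # Sketch: counting downstream consumers affected by a change to an OLAP cube.
    # Lineage edges and permission records are hypothetical illustrations.
    lineage = {
        "sales_cube": ["q1_sales_report", "regional_dashboard"],
    }
    report_permissions = {
        "q1_sales_report": {"alice", "bob", "finance_group"},
        "regional_dashboard": {"bob", "sales_group"},
    }

    def affected_consumers(changed_object):
        """What is affected, who can see it, and how many consumers in total?"""
        affected = lineage.get(changed_object, [])
        consumers = set().union(*(report_permissions[r] for r in affected)) if affected else set()
        return affected, consumers

    reports, users = affected_consumers("sales_cube")
    print(f"{len(reports)} reports affected, {len(users)} users/groups impacted")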

Information Steward Metadata Management information can be accessed directly from within Business Glossary to view the data lineage of already developed or self-service reports, giving report developers and consumers direct access to where the data comes from and how it is transformed.

Not only can you help report developers and consumers understand where the data comes from, you can also instill trust in that data by letting them see how good it really is. The lineage information provided through the Information Steward integration includes and highlights the quality scores of specific data assets and lets the user drill into those scores to see the underlying data quality rules, failed data, and profiling results, where these are available.

Business Glossary terms can also be associated with report metadata objects, and a link within Information Steward allows BI users to directly access the business terms associated with a particular report.

This promotes a common understanding of business concepts and terminology, with Business Glossary serving as the central location for defining standard business vocabulary (words, phrases, or business concepts).

Continuing the data migration example, what if you want to expose your central repository of business terms to additional applications or locations to promote a common understanding across the two newly joined companies? Business Glossary content can also be accessed via web services, which include APIs that support searching Business Glossary repository terms, descriptions, authors, synonyms, categories, and so on.
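
A call against such an API might look roughly like the sketch below. The endpoint URL, parameters, and response shape are assumptions made purely for illustration; the product's web services documentation defines the real interface.

    # Sketch: searching glossary terms over a web services API.
    # The endpoint, parameters, and response shape are assumptions for
    # illustration; consult the product's web services documentation.
    import requests

    BASE_URL = "https://glossary.example.com/api/v1"  # hypothetical endpoint

    def search_terms(keyword):
        resp = requests.get(
            f"{BASE_URL}/terms",
            params={"q": keyword, "fields": "name,description,author,synonyms,category"},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()

    # Example: expose the shared vocabulary to another application after a merger.
    for term in search_terms("customer").get("terms", []):
        print(term["name"], "-", term.get("category"))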

Companies have typically tackled the problems of unreliable data by implementing multiple master data management (MDM) point solutions in specific domains, such as a customer MDM solution for customer data and a product information management (PIM) solution for product data. This is a sensible approach, but some companies are realizing that they have solved one set of problems only to create another:

They now maintain multiple sets of master data established in functional silos, making it impossible to view the data across domains. Compounding the problem, customer data alone is often mastered in multiple systems, such as marketing systems, CRMs, FATCA, credit initiation, and call centers, spread across local MDM solutions, cloud services, and legacy systems.

Without an enterprise view into master data, some companies cannot link product data from their PIM application to customer and channel-partner data from their MDM solution to answer basic questions, such as “Which products did Bob Smith buy and through which channels?” Without a view into the relationships among different types of data, businesses find it extremely challenging to automate profitable business processes such as order-to-cash or procure-to-pay.

Similarly, they cannot generate a list of interactions that a customer engaged in within a given time frame, such as calling tech support, making a purchase, or signing up for new publications.

Universal MDM sits in a layer above a company’s disparate applications such as CRMs, marketing systems, call centers, and MDM solutions, unifying them into a single, all-encompassing MDM solution. With Universal MDM, a company can maintain a single view into all of its master data. A company can gain a view not only into the relationships among entities across different domains but also into the interactions that touch these different domains.
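
That cross-domain view can be pictured as a join across customer, product, and channel master records plus the interactions that connect them, as in the sketch below; all records are invented purely for illustration.

    # Sketch: a cross-domain view over customer, product, and channel master data.
    # The records are invented examples to illustrate linking domains.
    customers = {"C1": {"name": "Bob Smith"}}
    products  = {"P1": {"name": "Laptop"}, "P2": {"name": "Monitor"}}
    channels  = {"CH1": {"name": "Web"}, "CH2": {"name": "Retail"}}

    # Interactions that cross all three domains.
    orders = [
        {"customer": "C1", "product": "P1", "channel": "CH1"},
        {"customer": "C1", "product": "P2", "channel": "CH2"},
    ]

    def purchases_for(customer_name):
        """Which products did this customer buy, and through which channels?"""
        ids = [cid for cid, c in customers.items() if c["name"] == customer_name]
        return [
            (products[o["product"]]["name"], channels[o["channel"]]["name"])
            for o in orders if o["customer"] in ids
        ]

    print(purchases_for("Bob Smith"))  # [('Laptop', 'Web'), ('Monitor', 'Retail')]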

Universal MDM promises a wealth of business benefits. To choose one example, customer service representatives would always know the full history of every customer and could go above and beyond for customer satisfaction; by solving more problems more quickly, they would be well positioned to offer special discounts.

However, Universal MDM has a set of comprehensive requirements. It needs to work with any source and any target, and it needs to work with any domain and any style (Registry, Consolidation, Coexistence, and Centralized). Finally, it needs to be able to work with any client on any platform or device.

Fundamentally, Universal MDM must provide:

  • Universal services. Services that work universally across all master data in the enterprise
  • Universal domains. Universal support for all key data domains
  • Universal governance. Control over all master data through a common interface
  • Universal solutions. Solutions that work universally for companies in key vertical industries

Data quality refers to the condition of a set of values of qualitative or quantitative variables. There are many definitions of data quality, but data is generally considered high quality if it is fit for its intended uses in operations, decision making, and planning.

Alternatively, data is deemed of high quality if it correctly represents the real-world construct to which it refers. Furthermore, apart from these definitions, as data volume increases, the question of internal data consistency becomes significant, regardless of fitness for use for any particular external purpose. People's views on data quality can often be in disagreement, even when discussing the same set of data used for the same purpose. Data cleansing may be required in order to ensure data quality.

Data cleansing is dangerous mainly because data quality problems are usually complex and interrelated. Fixing one problem may create many others in the same or other related data elements.

For decision making, having good data quality means having accurate and timely information to manage products and services from R&D through to the sale. Poor data quality can lead to the wrong insight and therefore the wrong decisions.

The Data Quality Framework (DQF) provides an industry-developed best practices guide for the improvement of data quality and allows companies to better leverage their data quality programs and to ensure a continuously improving cycle for the generation of master data.

The dimensions explored in the DQF include completeness, validity, timeliness, consistency, and integrity. Data quality dimensions are important because they enable people to understand why data is being measured.
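
To make the dimensions concrete, the sketch below scores a small set of invented records against simple rules for completeness, validity, and consistency; the rules and thresholds are illustrative only, not the framework's prescribed checks.

    # Sketch: scoring a few data quality dimensions over sample records.
    # The rules and records are invented for illustration only.
    import re
    from datetime import date

    records = [
        {"id": 1, "email": "a@example.com", "birth": date(1980, 5, 1), "signup": date(2020, 1, 2)},
        {"id": 2, "email": None,            "birth": date(1999, 7, 9), "signup": date(2021, 3, 4)},
        {"id": 3, "email": "not-an-email",  "birth": date(2030, 1, 1), "signup": date(2019, 6, 5)},
    ]

    def pct(hits, total):
        return round(100.0 * hits / total, 1)

    total = len(records)
    completeness = pct(sum(r["email"] is not None for r in records), total)
    validity = pct(sum(bool(r["email"]) and re.match(r"[^@]+@[^@]+\.[^@]+", r["email"]) is not None
                       for r in records), total)
    consistency = pct(sum(r["birth"] <= r["signup"] for r in records), total)  # birth before signup

    print({"completeness": completeness, "validity": validity, "consistency": consistency})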

Improve your data where it exists. Data Quality meets you where you are, addressing your data quality issues without requiring you to move your data. You’ll work faster and more efficiently. With role-based security, you won’t put sensitive data at risk.

Manage the entire data quality life cycle. Data quality isn’t something you do just once; it’s a process. We help you at every stage, making it easy to profile and identify problems, preview data, and set up repeatable processes to maintain a high level of data quality.
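
As an informal illustration of the profiling step, the snippet below computes simple per-column statistics (null counts, distinct values, min/max) over an in-memory sample; real profiling runs against the managed sources, and the sample data here is made up.

    # Sketch: basic column profiling over an in-memory sample.
    rows = [
        {"customer_id": 1, "country": "US", "age": 34},
        {"customer_id": 2, "country": None, "age": 29},
        {"customer_id": 3, "country": "US", "age": None},
        {"customer_id": 4, "country": "DE", "age": 41},
    ]

    def profile(rows):
        stats = {}
        for col in rows[0]:
            values = [r[col] for r in rows]
            non_null = [v for v in values if v is not None]
            stats[col] = {
                "nulls": len(values) - len(non_null),
                "distinct": len(set(non_null)),
                "min": min(non_null) if non_null else None,
                "max": max(non_null) if non_null else None,
            }
        return stats

    for col, s in profile(rows).items():
        print(col, s)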

Promote collaboration by empowering every team. With Data Quality, IT is no longer spread too thin, because we give business users the power to update and tweak data themselves. Out-of-the-box capabilities don’t require extra coding.

Big Data Quality works on Hadoop, supporting up to ten data analysts. It includes basic profiling, identity matching, data domain discovery, and grid computing for Hadoop, helping you put data quality at the core of everything you do.
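
Identity matching, at its simplest, means deciding whether two records refer to the same real-world entity despite small differences in spelling or formatting. The sketch below uses standard-library string similarity as a crude stand-in for the product's matching engine; the fields and threshold are arbitrary assumptions.

    # Sketch: naive identity matching with standard-library string similarity;
    # a real matching engine uses much richer rules and reference data.
    from difflib import SequenceMatcher

    def similarity(a, b):
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    def is_same_identity(rec_a, rec_b, threshold=0.85):
        name_score = similarity(rec_a["name"], rec_b["name"])
        addr_score = similarity(rec_a["address"], rec_b["address"])
        return (name_score + addr_score) / 2 >= threshold

    a = {"name": "Jon Smith",  "address": "12 Main Street, Springfield"}
    b = {"name": "John Smith", "address": "12 Main St, Springfield"}
    print(is_same_identity(a, b))  # True for this pair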

It supports traditional relational databases and emerging big data technologies such as Hadoop with enterprise-grade support and scalability. Whether your data is in-stream, in-database, in-memory, or in-batch, we help you get it right.

Big data transforms the way businesses innovate and improve their operational processes. However, as organizations begin to bring big data into their environments they struggle to make these projects pay off. One of the key challenges is that data quality issues degrade the integrity and trust in big data assets.

Any question of data quality is a serious, if not insurmountable, obstacle to an organization's ability to make smart decisions, reduce costs, generate growth, and promote innovation. Relevant, timely, and trustworthy data is essential for success.

Big Data Quality empowers your company to take a holistic approach to managing data quality and leveraging the power of Hadoop. The software transforms your data quality processes to be a collaborative effort between business users and IT. This creates a true data-driven environment that supports better business decision making and analytics regardless of your data’s size, format, or platform. It delivers authoritative, trusted data to all stakeholders, projects, and business applications—on Hadoop, on-premise, or in the cloud.

Types of Archive Source Data

  • Document Archive: An archive is an accumulation of historical records or the physical place in which they are located. Archives contain primary source documents that have accumulated over the course of an individual's or organization's lifetime and are kept to show the function of that person or organization.
  • Email archiving is the act of preserving and making searchable all email to and from an individual. Email archiving solutions capture email content either directly from the email application itself or during transport. The messages are typically then stored on magnetic disk storage and indexed to simplify future searches.
  • Database offloading: Data archiving, in general, means moving the large volumes of data that are no longer required in the database to a file system or a third-party storage system. Data archiving is the process of moving data that is no longer actively used to a separate storage device for long-term retention. Archive data consists of older data that is still important to the organization (a minimal sketch of this offloading pattern follows this list).
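
Below is a minimal sketch of the database offloading pattern, using SQLite and a CSV file purely to keep the example self-contained; in practice the retention rule and storage target come from the archiving policy.

    # Sketch: offload rows older than a retention cutoff to a flat file,
    # then delete them from the source table.
    import csv, sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, order_date TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
        (1, "2015-03-02", 120.0),
        (2, "2023-11-20", 75.5),
    ])

    cutoff = "2020-01-01"  # archive anything older than this
    old_rows = conn.execute(
        "SELECT id, order_date, amount FROM orders WHERE order_date < ?", (cutoff,)
    ).fetchall()

    with open("orders_archive.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "order_date", "amount"])
        writer.writerows(old_rows)

    conn.execute("DELETE FROM orders WHERE order_date < ?", (cutoff,))
    conn.commit()
    remaining = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    print(f"archived {len(old_rows)} rows; {remaining} remain")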

Data Archive is a scalable solution that can help organizations manage and support their database archiving strategies. It can help to control growing data volumes and associated storage costs while improving application performance and minimizing the risk associated with data retention and compliance. Whether applied to packaged or custom applications or data warehouse environments, Data Archive can provide benefits to both IT groups and business units enabling them to intelligently archive and manage historical data throughout its lifecycle.

Benefits of Data Archiving

  1. Reduces memory, disk, and administration costs.
  2. Ensures cost-efficient system upgrades and migrations.
  3. Improves system performance through shorter response times.
  4. Reduces the cost of maintaining and running a growing application infrastructure.

Smart Data Catalog is an AI-powered data catalog that provides a machine learning-based discovery engine to scan and catalog data assets across the enterprise, whether they reside in the cloud, on premises, or in big data environments.

Smart Data Catalog provides intelligence by leveraging metadata to deliver intelligent recommendations, suggestions, and automation of data management tasks.

This enables IT users to be more productive and business users to become full partners in the management and use of data.
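
The automated classification described above can be pictured as domain discovery over sampled column values. In the sketch below, simple regular expressions stand in for the catalog's machine-learning models; the patterns, domains, and threshold are illustrative assumptions.

    # Sketch: classifying columns into data domains from sampled values.
    # Regular expressions stand in for the catalog's machine-learning models.
    import re

    DOMAIN_PATTERNS = {
        "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
        "date":  re.compile(r"^\d{4}-\d{2}-\d{2}$"),
        "phone": re.compile(r"^\+?[\d\s\-()]{7,}$"),
    }

    def classify_column(values, min_match_ratio=0.8):
        """Assign a domain if most sampled values match its pattern."""
        for domain, pattern in DOMAIN_PATTERNS.items():
            hits = sum(bool(pattern.match(str(v))) for v in values if v is not None)
            if values and hits / len(values) >= min_match_ratio:
                return domain
        return "unknown"

    print(classify_column(["a@example.com", "b@example.org", "c@example.net"]))  # email
    print(classify_column(["2021-05-01", "2022-06-30"]))                          # date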

Benefits:

  • Automatically catalog and classify all types of data across the enterprise using an AI-powered catalog
  • Identify domains and entities with intelligent curation
  • Enrich data assets with governed and crowd-sourced annotations