You can’t manage what you can’t measure. Just as software engineers need a complete picture of application and infrastructure performance, data engineers need a complete picture of data system performance. In other words, data engineers need data observability.
Data observability can help data engineers and their organizations ensure the reliability of their data pipelines, gain visibility into their data stacks (including infrastructure, applications, and users), and identify, investigate, prevent and resolve data issues. Data observability can help solve all kinds of common business data problems.
Data observability can help solve scaling, optimization, and performance issues in data and analytics platforms by identifying operational bottlenecks. It can help avoid cost and resource overruns by providing operational visibility, safeguards, and proactive alerts. And it can help prevent data quality issues and data outages by monitoring the reliability of data across pipelines and frequent transformations.
Acceldata Data Observability Platform
Acceldata Data Observability Platform is an enterprise data observability platform for the modern data stack. The platform provides comprehensive visibility, giving data teams the real-time insights they need to identify and prevent issues and make data stacks more reliable.
Acceldata Data Observability Platform supports data sources such as Snowflake, Databricks, Hadoop, Amazon Athena, Amazon Redshift, Azure Data Lake, Google BigQuery, MySQL, and PostgreSQL. The Acceldata platform provides information on:
- Compute – Optimize the compute, capacity, resources, costs, and performance of your data infrastructure.
- Reliability – Improve data quality and reconciliation, and detect schema drift and data drift.
- Pipelines – Identify issues with transformations, events, and applications, and deliver alerts and insights.
- Users – Real-time insights for Data Engineers, Data Scientists, Data Administrators, Platform Engineers, Data Stewards, and Platform Managers.
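To make the reliability item more concrete, schema drift can be thought of as a difference between two snapshots of a table's columns. The following is a minimal Python sketch of that idea, not Acceldata's implementation. The function name `detect_schema_drift` and the dictionary-based schema snapshots are assumptions made for illustration.

```python
def detect_schema_drift(expected, observed):
    """Compare two {column_name: type_name} snapshots of one table."""
    added = sorted(set(observed) - set(expected))
    removed = sorted(set(expected) - set(observed))
    retyped = sorted(
        col for col in set(expected) & set(observed)
        if expected[col] != observed[col]
    )
    return {"added": added, "removed": removed, "retyped": retyped}


# Hypothetical snapshots of the same table taken on two consecutive days.
yesterday = {"order_id": "bigint", "amount": "decimal", "region": "varchar"}
today = {"order_id": "bigint", "amount": "varchar", "channel": "varchar"}

drift = detect_schema_drift(yesterday, today)
print(drift)
```

A check like this only needs column metadata, never the rows themselves, which is why it is cheap to run on every load.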
The Acceldata Data Observability Platform is built as a set of microservices that work together to manage various business outcomes. It gathers metrics by reading and processing raw data as well as meta information from the underlying data sources. It allows data engineers and data scientists to monitor compute performance and validate the data quality policies defined in the system.
Acceldata’s data reliability monitoring platform allows you to define different types of policies to ensure that the data in your pipelines and databases meets required quality levels and is reliable. Acceldata’s compute performance platform displays all compute costs incurred on the customer’s infrastructure and allows you to set budgets and configure alerts when spending hits budget.
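The budget-and-alert behavior described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the platform's API: the `Budget` class, the `budget_status` function, and the 80% early-warning ratio are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    name: str
    limit_usd: float
    warn_ratio: float = 0.8  # assumed early-warning threshold


def budget_status(budget, spend_usd):
    """Return 'ok', 'warning', or 'exceeded' for the current spend."""
    if spend_usd >= budget.limit_usd:
        return "exceeded"
    if spend_usd >= budget.limit_usd * budget.warn_ratio:
        return "warning"
    return "ok"


# Hypothetical budget and three spend readings.
warehouse = Budget(name="analytics-warehouse", limit_usd=10_000.0)
for spend in (4_200.0, 8_500.0, 10_000.0):
    print(spend, budget_status(warehouse, spend))
```

Treating spend that equals the limit as "exceeded" matches the idea of alerting when spending hits the budget rather than only after it is overshot.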
The architecture of the Acceldata Data Observability Platform is divided into a data plane and a control plane.
Data plane
The data plane of the Acceldata platform connects to the underlying databases or data sources. It never stores data; it returns metadata and results to the control plane, which receives and stores the results of executions. The data analyzer, query analyzer, crawlers, and Spark infrastructure are part of the data plane.
Each data source integration comes with a microservice that crawls the data source’s metadata from its underlying metastore. Any profiling, policy execution, or data sampling job is converted to a Spark job by the analyzer. Job execution is managed by Spark clusters.
Control plane
The control plane is the orchestrator of the platform and is accessible through the UI and API interfaces. The control plane stores all metadata, profiling data, task results, and other data in the database layer. It manages the data plane and, if necessary, sends requests to run jobs and other tasks.
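The split between the two planes can be illustrated with a toy Python model. It is a sketch under simplifying assumptions: the class names, the `null_count` job type, and the in-memory list standing in for a data source are hypothetical, and plain Python replaces the Spark jobs the platform actually runs.

```python
class DataPlane:
    """Sits next to the data source; returns results, never the rows."""

    def __init__(self, source_rows):
        self._rows = source_rows  # stands in for a real database or file

    def run_job(self, job):
        if job["type"] == "null_count":
            column = job["column"]
            nulls = sum(1 for row in self._rows if row.get(column) is None)
            return {"job_id": job["id"], "rows_scanned": len(self._rows),
                    "null_count": nulls}
        raise ValueError(f"unknown job type: {job['type']}")


class ControlPlane:
    """Orchestrates jobs and stores only metadata and results."""

    def __init__(self, data_plane):
        self._data_plane = data_plane
        self.results = {}  # stands in for the database layer

    def submit(self, job):
        result = self._data_plane.run_job(job)
        self.results[job["id"]] = result
        return result


rows = [{"email": "a@example.com"}, {"email": None}, {"email": "c@example.com"}]
control = ControlPlane(DataPlane(rows))
print(control.submit({"id": "job-1", "type": "null_count", "column": "email"}))
```

Only the summary dictionary crosses from the data plane to the control plane, which mirrors the statement that the data plane returns metadata and results rather than the data itself.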
The compute monitoring part of the platform obtains metadata from external sources via REST APIs, collects it on the data collector server, and then publishes it to the data ingestion module. Agents deployed near the data sources periodically collect metrics and publish them to the data ingestion module as well.
The database layer, which includes databases such as Postgres, Elasticsearch, and VictoriaMetrics, stores the data collected from the agents and the data collector server. The data processing server correlates the data collected by the agents with the data collected by the data collector service. The dashboard server, agent control server, and management server are the infrastructure services for compute monitoring.
When a major event (an error or a warning) occurs in the system or in the subsystems monitored by the platform, the platform’s alert and notification server either displays it in the UI or sends it to the user through notification channels such as Slack or email.
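As a rough illustration of this routing step, the Python sketch below decides where one event should go. The severity ordering, the `notify_at` threshold, and the rule that only events at or above that threshold leave the UI are assumptions for the example, not documented platform behavior. No messages are actually sent.

```python
SEVERITY_ORDER = {"info": 0, "warning": 1, "error": 2}


def route_event(event, notify_at="error", channels=("slack", "email")):
    """Pick destinations: notification channels, or the UI only."""
    severity = SEVERITY_ORDER[event["severity"]]
    if severity >= SEVERITY_ORDER[notify_at]:
        return list(channels)
    return ["ui"]


# Hypothetical events raised by monitored subsystems.
events = [
    {"source": "pipeline-a", "severity": "warning", "message": "load is slow"},
    {"source": "pipeline-b", "severity": "error", "message": "load failed"},
]
for event in events:
    print(event["source"], route_event(event))
```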
Key Capabilities
Detect issues early in data pipelines to isolate them before they reach the warehouse and affect downstream analytics:
- Shift left to files and streams: Run reliability analysis in the “raw landing zone” and “enriched zone” before data reaches the “consumption zone” to avoid wasting expensive cloud credits and making bad decisions because of bad data.
- Data reliability powered by Spark: Fully inspect and identify issues at petabyte scale, with the power of open-source Apache Spark.
- Reconciliation between data sources: Run reliability checks that join disparate streams, databases, and files to ensure the accuracy of migrations and complex pipelines.
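A reconciliation check of the kind listed above can be sketched as a keyed comparison of two datasets. The Python below is a simplified, in-memory illustration; the `reconcile` function and the sample rows are hypothetical, and a check at real scale would run as a distributed job rather than over Python lists.

```python
def reconcile(source_rows, target_rows, key):
    """Compare two datasets row by row, matched on a shared key column."""
    source = {row[key]: row for row in source_rows}
    target = {row[key]: row for row in target_rows}
    shared = set(source) & set(target)
    return {
        "missing_in_target": sorted(set(source) - set(target)),
        "unexpected_in_target": sorted(set(target) - set(source)),
        "mismatched": sorted(k for k in shared if source[k] != target[k]),
    }


# Hypothetical rows before and after a migration.
before = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}, {"id": 3, "amount": 30}]
after = [{"id": 1, "amount": 10}, {"id": 2, "amount": 25}, {"id": 4, "amount": 40}]

report = reconcile(before, after, key="id")
print(report)
```

An empty list in all three fields would indicate that the two sides agree on every keyed row.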
Get multi-layered operational insights to quickly resolve data issues:
- Know why, not just when: Debug data delays at their root by correlating data and compute spikes.
- Discover the true cost of bad data: Identify the compute spend wasted on unreliable data.
- Optimize data pipelines: Whether drag-and-drop or code-based, single-platform or polyglot, you can diagnose data pipeline failures in one place, at every layer of the stack.
Maintain a constant, comprehensive view of workloads and quickly identify and remediate issues through the operational control center:
- Built by data experts for data teams: Alerts, audits, and reports tailored for today’s leading cloud data platforms.
- Accurate Spend Intelligence: Predict costs and control usage to maximize ROI, even as platforms and prices change.
- Single pane of glass: Budget and monitor all your cloud data platforms in a single view.
Complete data coverage with flexible automation:
- Fully automated reliability checks: Immediately discover missing, late, or erroneous data on thousands of tables. Add an advanced data drift alert with a single click.
- Reusable SQL and user-defined functions (UDFs): Express reusable, domain-centric reliability checks in five programming languages. Apply segmentation to understand reliability across dimensions.
- Extensive data source coverage: Apply enterprise data reliability standards across your enterprise, from modern cloud data platforms to traditional databases to complex files.
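Two of the checks named in this list, late data and reliability broken down by segment, can be sketched in plain Python. This is a minimal illustration rather than the platform's own check definitions; the function names, the four-hour delay threshold, and the sample rows are assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def is_late(last_loaded_at, now, max_delay):
    """True when the newest load is older than the allowed delay."""
    return now - last_loaded_at > max_delay


def null_rate_by_segment(rows, column, segment_by):
    """Fraction of rows with a missing value in `column`, per segment."""
    totals = defaultdict(int)
    nulls = defaultdict(int)
    for row in rows:
        segment = row[segment_by]
        totals[segment] += 1
        if row[column] is None:
            nulls[segment] += 1
    return {segment: nulls[segment] / totals[segment] for segment in totals}


# Hypothetical sample data and load timestamps.
orders = [
    {"region": "EU", "amount": 10.0},
    {"region": "EU", "amount": None},
    {"region": "US", "amount": 7.5},
    {"region": "US", "amount": 12.0},
]
now = datetime(2023, 1, 2, 12, 0)
last_load = datetime(2023, 1, 2, 6, 0)

print(is_late(last_load, now, max_delay=timedelta(hours=4)))
print(null_rate_by_segment(orders, "amount", "region"))
```

Segmenting the null rate shows that a table that looks acceptable overall can still hide one dimension value where the data is unreliable.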
Acceldata’s Data Observability Platform works across a range of technologies and environments and provides enterprise data observability for modern data stacks. For Snowflake and Databricks, Acceldata can help maximize ROI by providing insights into performance, data quality, cost, and more. For more information, visit www.acceldata.io.
Ashwin Rajeeva is co-founder and CTO at Acceldata.
—
The New Tech Forum provides a venue to explore and discuss emerging enterprise technologies in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing materials for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.
Copyright © 2023 IDG Communications, Inc.