Microservices architectures and stream processing systems have become key technologies for modern distributed applications that require scalability and real-time data processing. In this context, asynchronous event-based communication enables loosely coupled service interactions but also introduces challenges related to event management, service dependencies, and data quality.
This research project investigates practical challenges in the development of event-driven systems and real-time data pipelines, focusing on event modeling and management, observability of event flows, and anomalies in streaming data such as data duplication. The goal is to analyze current practices and tools while exploring mechanisms to improve reliability, performance, and data quality in distributed architectures.
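As an illustration of the data duplication anomaly, one common mitigation is a consumer that drops events whose identifier it has already seen. The following is a minimal sketch under stated assumptions, not a mechanism proposed by the project: the event shape (a dictionary with an `event_id` key), the function name `deduplicate`, and the bounded window of remembered identifiers are all hypothetical.

```python
from collections import OrderedDict
from typing import Dict, Iterable, Iterator


def deduplicate(events: Iterable[Dict], window_size: int = 10_000) -> Iterator[Dict]:
    """Yield each event once, remembering at most `window_size` recent IDs."""
    # Insertion-ordered map used as a bounded "recently seen" set.
    seen: "OrderedDict[str, None]" = OrderedDict()
    for event in events:
        event_id = event["event_id"]  # hypothetical field name (assumption)
        if event_id in seen:
            seen.move_to_end(event_id)  # refresh recency, then drop the duplicate
            continue
        seen[event_id] = None
        if len(seen) > window_size:
            seen.popitem(last=False)  # evict the oldest remembered ID
        yield event
```

Bounding the window keeps memory constant on an unbounded stream, at the cost of letting through a duplicate that arrives after its identifier has been evicted.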
There is strong demand for automating Database Management System (DBMS) tasks, such as those related to self-management and self-tuning. However, whenever decisions are made automatically, there is also a lack of clarity about which decisions and actions were considered. This thesis proposes a framework to support the DBA (and possibly other database users) in making choices about tuning activities. The work includes an ontology for database tuning (autonomous or not) that enables a formal approach to decisions and inferences. The goals are to offer transparency and confidence in the available tuning alternatives for the possible DBMS scenarios, through a concrete justification of the decisions that are made. Moreover, new tuning practices may be derived automatically as soon as new rules, concepts, and practices become known. Finally, our approach enables distinct database tuning strategies to be combined at a high level of abstraction.
Although most ontology development methodologies emphasize the reuse of existing ontologies, it is often still too difficult for people to understand the available ontologies well enough to minimize redundancies through ontological analysis. In this context, we developed an Ontology Pattern Language to facilitate the construction of a well-founded ontology in the configuration management domain.
DBX is a tool for the automatic maintenance of the physical design of relational databases. It is, in fact, a framework, since it can be instantiated for different database tuning activities, such as the maintenance of materialized views and indexes. DBX is DBMS-independent and does not require any DBA intervention. It is based on heuristics that run continuously, which allows it to react dynamically to workload changes by modifying the current physical design and, eventually, to improve the database system’s performance. The self-tuning activities are chosen by analyzing the captured query execution plans and possible modifications to hypothetical plans. DBX was developed in Java, and its source code is fully available for download on GitHub.
Hypothetical indexes are simulated index structures that exist solely in the database catalog. An index of this type has no physical extension and therefore cannot be used to answer queries. Its main benefit is to provide a means of simulating how query execution plans would change if the hypothetical indexes were actually created in the database, which makes the feature useful for database tuners and DBAs. Index selection tools, such as Microsoft's SQL Server Index Tuning Wizard, use hypothetical indexes in the database server to evaluate candidate index configurations. We have extended the servers of PostgreSQL 7.4 beta 3, PostgreSQL 8, and PostgreSQL 9.0.1 to include the notion of hypothetical indexes in the system, introducing three new commands: create hypothetical index, drop hypothetical index, and explain hypothetical.
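The idea of an index that exists only in the catalog can be illustrated with a toy model. The sketch below is not the PostgreSQL extension and does not reproduce its command syntax or its planner: the `Catalog` class, its method names, and the cost formulas (a full scan costs one unit per row, an index lookup costs the base-2 logarithm of the row count) are assumptions chosen only to show how a hypothetical index changes the estimated plan without affecting normal query planning.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Catalog:
    """Toy catalog: row counts plus real and hypothetical index entries."""
    row_counts: Dict[str, int]
    real_indexes: Set[Tuple[str, str]] = field(default_factory=set)
    hypothetical_indexes: Set[Tuple[str, str]] = field(default_factory=set)

    def create_hypothetical_index(self, table: str, column: str) -> None:
        # Only a catalog entry is recorded; no index structure is built.
        self.hypothetical_indexes.add((table, column))

    def drop_hypothetical_index(self, table: str, column: str) -> None:
        self.hypothetical_indexes.discard((table, column))

    def explain(self, table: str, column: str,
                hypothetical: bool = False) -> Tuple[str, float]:
        """Estimate the plan for an equality predicate on table.column."""
        rows = self.row_counts[table]
        usable = set(self.real_indexes)
        if hypothetical:
            # What-if mode: also consider indexes that exist only in the catalog.
            usable |= self.hypothetical_indexes
        if (table, column) in usable:
            return ("index scan", math.log2(max(rows, 1)))  # assumed cost model
        return ("sequential scan", float(rows))  # assumed cost model


catalog = Catalog(row_counts={"orders": 1024})
catalog.create_hypothetical_index("orders", "customer_id")
normal_plan = catalog.explain("orders", "customer_id")
what_if_plan = catalog.explain("orders", "customer_id", hypothetical=True)
```

Here `normal_plan` still reports a sequential scan, because the hypothetical index cannot answer queries, while `what_if_plan` reports the index scan that would be chosen if the index were actually created.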