Know Your Numbers. A Quick Analysis with AWR and Characteristic Numbers. Part II.
In Part II we put our knowledge of characteristic time ranges into service to find a bottleneck in one particular SQL statement that was part of one particular initial ETL run. The session was moving data from a set of source tables into a target table.
Know Your Numbers. A Quick Analysis with AWR and Characteristic Numbers. Part I.
SQL statements fired by ODI for ELT jobs can sometimes be extremely complicated. There are huge tables, dozens of joins, sorts, complex predicates with CASE expressions, aggregate and analytic functions, and all that jazz. The same applies (occasionally to an even larger extent) to physical queries generated by a BI server.
OBIEE PERF_PREFER_INTERNAL_STITCH_JOIN: To stitch or not to stitch
Modern BI reporting tools allow end users to put together reports, ad-hoc queries, and dashboards with a few mouse clicks, hiding the complexity of the physical data model, design, and architecture of their implementations. It is quite naive to believe that the masked complexity would not impact performance, scalability, or worse, both at once on large-volume databases. The tools have not reached a state of perfection in generating efficient database queries: most of them produce SQL overloaded with redundant subqueries, unnecessary functions, and repeating SQL patterns. It is also true that database engines have been improving and can parse ever more complex SQL, but so far BI tools grow the complexity of their generated SQL at a faster pace than database engines add SQL optimizations.
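To illustrate the kind of generated SQL in question, here is a hedged sketch of the repeating WITH-subquery pattern a BI server typically emits. All table and column names are invented for illustration; only the shape of the query reflects the point being made:

```sql
-- Hypothetical example of BI-server-generated SQL (names are made up).
-- The same fact-to-dimension join is repeated in two subqueries that differ
-- only in the measure being aggregated -- a pattern a human author would
-- normally collapse into a single GROUP BY over both measures.
WITH SAWITH0 AS (
  SELECT d.region, SUM(f.revenue) AS rev
  FROM   sales_fact f
  JOIN   dim_geo d ON f.geo_id = d.geo_id
  GROUP BY d.region
),
SAWITH1 AS (
  SELECT d.region, SUM(f.quantity) AS qty
  FROM   sales_fact f
  JOIN   dim_geo d ON f.geo_id = d.geo_id
  GROUP BY d.region
)
SELECT s0.region, s0.rev, s1.qty
FROM   SAWITH0 s0
JOIN   SAWITH1 s1 ON s0.region = s1.region;
```

Each redundant subquery forces the optimizer either to rescan the fact table or to recognize and merge the duplicated work itself, which is exactly where the stitch-versus-push-down trade-off starts to matter.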
Analytics Time Machine
A typical BI Analytics implementation (and here I am talking about a classic Data Warehouse, populated via ETL / ELT and queried with reporting tools) generates and logs a plethora of day-to-day operational information: the database produces workload statistics (Oracle AWR, for example), ETL tools generate tons of logged data in proprietary log file formats as well as in their repositories, and reporting tools produce their own logging information. If you were diligent enough to keep all of it, you would take backups of Informatica session logs (otherwise they would be overwritten by the next daily ETL) and of Oracle BI Server Enterprise Edition logs, and increase log retention in ETL repositories, as some data may get purged. You would then end up with an impressive pile of logs to parse and mine for bad symptoms, and / or to match manually against repository data. In my experience, most BI Admins keep only the latest snapshot. And I agree: what is the use, if you don't have good tools to digest all this information?
Sizing considerations for BI Analytic Applications Implementation
Every time I hear people talking about sizing for their BI implementations, I am tempted to ask what exactly they mean by 'sizing' and what kind of recommendations and suggestions they would like to get back. Why can't someone (BI solution providers, for example) come up with a magic formula or an Excel spreadsheet that produces the most accurate estimates? Well, I can assure you that various consulting businesses do actively use such spreadsheets and templates, and they do make a big deal of providing sizing recommendations. I agree that sizing is a serious consulting project, though I have my own opinion about templates and spreadsheets. So, let's look into BI sizing in more detail.