Cost-Based Oracle Fundamentals (Expert's Voice in Oracle) (v. 1)
Product Description
The insights that Jonathan provides into the workings of the cost-based optimizer will make a DBA a better designer, and a Developer a better SQL coder. Both groups will become better troubleshooters.
— Thomas Kyte, VP (Public Sector), Oracle Corporation
The question, "Why isn't Oracle using my index?" must be one of the most popular (or perhaps unpopular) questions ever asked on the Oracle help forums. You've picked exactly the right columns, you've got them in the ideal order, you've computed statistics, you've checked for null columns, and the optimizer flatly refuses to use your index unless you hint it. What could possibly be going wrong?
If you've suffered the frustration of watching the optimizer do something completely bizarre when the best execution plan is totally obvious, or spent hours or days trying to make the optimizer do what you want it to do, then this is the book you need. You'll come to know how the optimizer "thinks," understand why it makes mistakes, and recognize the data patterns that make it go awry. With this information at your fingertips, you will save an enormous amount of time on designing and trouble-shooting your SQL.
The cost-based optimizer is simply a piece of code that contains a model of how Oracle databases work. By applying this model to the statistics about your data, the optimizer tries to efficiently convert your query into an executable plan. Unfortunately, the model can't be perfect, your statistics can't be perfect, and the resulting execution plan may be far from perfect.
In Cost-Based Oracle Fundamentals, the first book in a series of three, Jonathan Lewis, one of the foremost authorities in this field, describes the most commonly used parts of the model, what the optimizer does with your statistics, and why things go wrong. With this information, you'll be in a position to fix entire problem areas, not just single SQL statements, by adjusting the model or creating more truthful statistics.
Product Details
- Amazon Sales Rank: #104156 in Books
- Published on: 2005-10-31
- Original language: English
- Number of items: 1
- Binding: Paperback
- 536 pages
Features
- ISBN13: 9781590596364
Editorial Reviews
About the Author
Jonathan Lewis has been involved in database work for more than 19 years, specializing in Oracle for the last 16 years and working as a consultant for the last 12 years. Jonathan is currently a director of the UK Oracle User Group (UKOUG) and is well known for his many presentations at the UKOUG conferences and SIGs. He is also renowned for his tutorials and seminars about the Oracle database engine, which he has held in various countries around the world.
Jonathan authored the acclaimed book Practical Oracle 8i (Addison-Wesley, 2001), and he writes regularly for the UKOUG magazine and occasionally for other publications, including OTN and DBAZine.
Customer Reviews
If you have ever wanted to understand what Oracle is doing...
This is the book for you.
This book is, well, in a word, amazing. If you have ever been baffled or bemused by "why the heck did the optimizer do that?", or, as Jonathan wrote on page 299:
"I am reluctant to call something a bug unless I can work out what Oracle is doing and can prove that it's doing something irrational. Too many people say, 'It's a bug' when they really mean 'I don't know why this happened.'"
You will absolutely love this book. In it you will discover the hows and whys of the optimizer. Why statistics matter, how they matter. What's up with histograms: when and where do we need them, and what effect do they have?
Sprinkled throughout the book are random insights like this one:
"There are many ways to implement Oracle systems badly, and as a general rule, anything that hides useful information from the optimizer is a bad idea. One of the simple, and highly popular, strategies for doing this is to stick all of your reference data into a single table with a type column. The results can be catastrophic as far as the optimizer is concerned."
And then it goes on to say why. That is what I really, really like: it goes on to say why. I hate it when statements are made and no reasoning is given. You will find none of that in this book.
Jonathan did one thing in this book that I'll definitely be stealing myself. One neat thing is that every chapter ends with a list of script names and descriptions. In the text, he references these script names as well. That way, when you download the code you have a straight reference to the sample you should be running. I've used the (extremely poor) naming convention of demo001.sql, demo002.sql and so on. Next book they'll all have names and I'll be referencing them exactly like he did. Very nice.
The attention to detail, the simplicity of presentation (I don't care what level of Oracle user you are, you will be able to read this book and get it). If you are advanced (ok, I'll put myself into that category), you'll learn things you did not know before. If you are a beginner, you'll know lots more than some advanced people after reading it. The surprising thing? It isn't that hard. Well, it wasn't to me anyway; maybe the math background I have helped. You do not need 10 years of experience with Oracle to get this stuff, and if you have 10 years of experience with Oracle you will get new knowledge you never had.
I'm on my second scan of it, re-reading things that I didn't fully absorb. What I'll be doing lots in the future is referring to it. I got the gist of everything, and I know where to go when I need to explain why. Or maybe I'll just post the link to the book.
And remember, this is I of III, two more to come.
The Real Cost of Oracle
The beauty of reading a book by a publisher not sanctioned by Oracle and by an author who doesn't work for Oracle is that they can openly mention bugs. And there are oh-so-many! This book is a superb introduction to the Cost Based Optimizer, and is not afraid to discuss its many shortcomings. In so doing it also explains how to patch up those shortcomings by giving the CBO more information, either by creating a histogram here and there, or by using the DBMS_STATS package to insert your own statistics in those specific cases where you need to.
Another interesting thing is how this book illustrates, though accidentally, the challenges of proprietary software systems. Much of this book, and of the author's time, is spent reverse engineering the CBO, Oracle's bread-and-butter optimizing engine. Source code and details about its inner workings are not published or available, and of course that's intentional. But what's clear page after page in this book is that DBAs and system tuners, going about their day-to-day tasks, really need inside information about what the optimizer is doing, and so this book goes on a long journey to illuminate much of what the CBO is doing, or in some cases to provide very educated guesses and some speculation. In contrast, as we know and hear about often, the Open Source alternative provides free access to source code, though not necessarily to the goods themselves. What this means in a very real way is that a book like this would not need to be written for an open source alternative, because the internal code would be a proverbial open book. That said, it remains difficult to imagine how a company like Oracle might pursue a more open strategy, given that their bread and butter really is the secrets hidden inside their Cost Based Optimizing engine. At any rate, let's get back to Jonathan's book.
Reading this book was like reading a scientist's notebook. I found that it:
o is of inestimable value, but sometimes difficult to sift through
o is very anecdotal in nature, constantly debugging and demonstrating that the CBO is much more faulty and prone to errors than you might imagine
o may not make it easy to say "I have a query of type X, and it is behaving funny; how do I look up information on this?"
o discusses the evolution of the product so well that I'll quote it:
"A common evolutionary path in the optimizer code seems to be the following: hidden by undocumented parameter and disabled in first release; silently enabled but not costed in second release; enabled and costed in third release."
o has excellent chapter summaries, which were particularly good for sifting and boiling down the previous pages into a few conclusions
o will probably be of particular value to Oracle's own CBO development teams
Some chapter highlights
-------------------
CH2 - Tablescans
explains how to gather system stats, how to use dbms_stats to set ind. stats manually, bind variables can make the CBO blind, bind variable peeking may not help, partition exchange may break global stats for table, use CPU costing when possible
CH3 - Selectivity
big problem with IN lists in 8i, fixed in 9i/10g, but still prob. with NOT IN, uses very good example of astrological signs overlapping birth months, and associated CBO cardinality problems, reminds us that the optimizer isn't actually intelligent per se, but merely a piece of software
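A rough sketch of the star-sign problem described above, in Python. All numbers are made up for illustration (1,200 people spread evenly over months and signs, each sign straddling two months equally); the point is only that multiplying selectivities as if the columns were independent gives the same estimate for every month/sign pair, while the real answer is either much higher or zero.

```python
from fractions import Fraction

# Hypothetical table: 1,200 people, evenly spread over 12 birth months and 12 star signs.
NUM_ROWS = 1200
MONTH_SELECTIVITY = Fraction(1, 12)
SIGN_SELECTIVITY = Fraction(1, 12)


def independent_estimate(num_rows, *selectivities):
    """Estimated rows when predicate selectivities are simply multiplied together."""
    estimate = Fraction(num_rows)
    for selectivity in selectivities:
        estimate *= selectivity
    return estimate


# "birth month = X and star sign = Y" under the independence assumption.
estimated_rows = independent_estimate(NUM_ROWS, MONTH_SELECTIVITY, SIGN_SELECTIVITY)

# Illustrative assumption: each sign straddles two months, half its people in each.
rows_when_sign_overlaps_month = NUM_ROWS * SIGN_SELECTIVITY * Fraction(1, 2)
rows_when_sign_misses_month = 0  # e.g. a sign whose dates never fall in that month

print("independence estimate:", float(estimated_rows))
print("actual, overlapping pair:", int(rows_when_sign_overlaps_month))
print("actual, impossible pair:", rows_when_sign_misses_month)
```

Whichever pair is queried, the multiplied estimate is wrong in one direction or the other, which is the kind of cardinality error the chapter is about.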
CH4 - BTree Access
cost based on depth, #leaf blocks, and clustering factor, try to use CPU costing (system statistics)
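As a rough illustration of how those three inputs combine, here is a minimal Python sketch of the commonly quoted I/O-only cost formula for an index range scan followed by table access. The function name and the sample statistics are hypothetical, and the real optimizer applies further adjustments (CPU costing among them), so treat this as an approximation rather than Oracle's actual code.

```python
import math


def index_range_scan_cost(blevel, leaf_blocks, clustering_factor,
                          index_selectivity, table_selectivity):
    """Approximate I/O cost: walk down the index, scan leaf blocks, visit table blocks."""
    descend = blevel                                          # index depth
    leaf_scan = math.ceil(leaf_blocks * index_selectivity)    # leaf blocks read
    table_visits = math.ceil(clustering_factor * table_selectivity)  # table block visits
    return descend + leaf_scan + table_visits


# Hypothetical statistics: same index shape, two very different clustering factors.
scattered_cost = index_range_scan_cost(2, 1000, 50000, 0.01, 0.01)
clustered_cost = index_range_scan_cost(2, 1000, 2000, 0.01, 0.01)

print(scattered_cost, clustered_cost)
```

With identical depth and leaf-block counts, the clustering factor dominates the result, which is why the next chapter of the book is devoted to it.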
CH5 - Clustering Factor
mainly a measure of the degree of random distribution of your data, very important for costing indx scans, use dbms_stats to correct when necessary, just giving CBO better information, freelists (procID problem) + freelist groups discussion with RAC
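To make the idea concrete, here is a small Python sketch of how a clustering factor is usually described as being computed: walk the index in key order and count how often the table block changes from one entry to the next. The data and the function name are invented for the example.

```python
def clustering_factor(index_entries):
    """Count table-block changes while walking (key, block_id) entries in key order."""
    changes = 0
    previous_block = None
    for _key, block_id in sorted(index_entries, key=lambda entry: entry[0]):
        if block_id != previous_block:
            changes += 1
            previous_block = block_id
    return changes


# Hypothetical table of 12 rows stored in 3 blocks.
well_clustered = [(key, key // 4) for key in range(12)]  # neighbouring keys share a block
scattered = [(key, key % 3) for key in range(12)]        # neighbouring keys never share a block

print(clustering_factor(well_clustered), clustering_factor(scattered))
```

A value close to the number of blocks means the data is well ordered relative to the index; a value close to the number of rows means nearly every index entry points to a different block from the one before it.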
CH6 - Selectivity Issues
there is a big problem with string selectivity, Oracle uses only first seven characters, will be even more trouble for URLs all starting with "http://", and multibyte character sets, trouble when you have db ind. apps which use string for date, use histograms when you have problems, can use the tuning advisor for "offline optimization", Oracle uses transitive closure to transform queries to more easily optimized versions, moves predicates around, sometimes runs astray
CH7 - Histograms
height balanced > 255 buckets (outside Oracle called equi-depth), otherwise frequency histograms, don't use cursor sharing as it forces bind variables, blinds CBO, bind var peeking is only first call, Oracle doesn't use histograms much, expensive to create, use sparingly, dist queries don't pull hist from remote site, don't work well with joins, no impact if you're using bind vars, if using dbms_stats to hack certain stats be careful of rare codepaths
CH8 - Bitmap Indexes
don't stop at just one, avoid updates like the plague as can cause deadlocking, opt assumes 80% data tightly packed, 20% widely scattered
CH9 - Query Transformation
partly rule based, peeling the onion w views to understand complex queries, natural language queries often not the most efficient, therefore this transformation process has huge potential upside for Oracle in overall optimization of app code behind the scenes by db engine, always remember Oracle may rewrite your query, sometimes want to block with hints, tell CBO about uniqueness, not NULL if you know this
CH10 - Join Cardinality
makes sensible guess at best first table, continues from there, don't hide useful information from the CBO, histograms may help with some difficult queries
CH11 - Nested Loops
fairly straightforward costing based on cardinality of each returned set multiplied together
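The usual way of writing this down, sketched in Python: the cost of reading the outer row source, plus one visit to the inner row source for every outer row. The function name and the numbers are hypothetical, and the sketch ignores refinements such as caching effects.

```python
def nested_loop_cost(outer_cost, outer_cardinality, inner_cost_per_visit):
    """Cost of fetching the outer rows plus one inner visit per outer row."""
    return outer_cost + outer_cardinality * inner_cost_per_visit


# Hypothetical figures: 100 outer rows, each driving a 3-unit index lookup.
cost = nested_loop_cost(outer_cost=50, outer_cardinality=100, inner_cost_per_visit=3)
print(cost)
```

Because the outer cardinality multiplies the inner cost, an error in the cardinality estimate for the first table feeds straight through to the join cost.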
CH12 - Hash Joins
Oracle executes as optimal (all in memory), onepass (doesn't quite fit so dumped to disk for one pass) and multipass (least attractive sort to disk), avoid scripts writing scripts in prod, best option is to use workarea_size_policy=AUTO, set pga_aggregate_target & use CPU costing
CH13 - Sorting + Merge Joins
also uses optimal, onepass, & multipass algorithms, need more than 4x dataset size for in memory sort, 8x on 64bit system, increasing sort_area_size will incr. CPU util so on CPU bottlenecked machines sorting to disk (onepass) may improve performance, must always use ORDER BY to guarantee sorted output, Oracle may not need to sort behind the scenes, Oracle very good at avoiding sorts, again try to use workarea_size_policy=AUTO
CH14 - 10053 Trace
reviews various ways to enable, detailed rundown of trace with comments inline, and highlights; even mentions that VOL 2 + 3 of the book are coming!
Appendix A
be careful when switching from analyze to dbms_stats, in 10g some new hist will appear w/default dbms_stats options, 10g creates job to gather stats
Conclusion
----------
I found this book to be full of gems of information that you won't find anywhere else. If you're at the more technical end of the spectrum, this is a one-of-a-kind Oracle book and a must-have for your collection. Keep in mind something Jonathan mentions in appendix A: "New features that improve 99% of all known queries may cripple your database because you fall into the remaining 1% of special cases". If these cases are your concern, then this book will surely prove to be one-of-a-kind for you!
No Competition
There is little point in writing how good this book is, since there is no other book devoted exclusively to SQL optimization. Dan Tow's book comes close, but he is focused more on a method of join graph analysis that he developed than on the details of how the optimizer arrived at a certain access path. The lack of competition on the market is really surprising, given that SQL optimization is the only part of an RDBMS that is justifiably complex and will remain complex for the foreseeable future.
Compared to SQL optimization, all the other issues that a DBA deals with today look ridiculous. There is no reason why, for example, export and import should be more complex than copying an image file from your camera. Likewise, managing extents and segments is totally automated these days. The whole manageability trend just proves a simple idea: an RDBMS is nothing more than a query execution engine.
Now, unlike in any other RDBMS implementation area, the flow of poorly executed SQL never seems to cease. SQL optimization is well known to be a difficult problem. Statistical information is incomplete, robust cost metrics are elusive, and the search space is explosive. The optimization goals are often conflicting. The very first idea that every SQL performance analyst discovers is: "The optimization is only as good as its cost estimates". Those issues are fundamental rather than specific to any SQL DBMS vendor, of course. Given the scope and complexity of the problem, one citation comes to mind: "There is no emperor's way to SQL optimization".






