Oakies Blog Aggregator

Franck Pachot's picture

Amazon DynamoDB: a r(el)ational Glossary

By Franck Pachot

There are many NoSQL databases. And, because SQL is an ISO standard, “No SQL” also means “No Standard”. Many have a similar API and similar objects, but with completely different names. Today, NoSQL databases are used as an additional datastore for some well-defined use cases, for which a hashed key-value store fits better than a relational table. The term quickly became “Not Only SQL” because these stores are complementary to RDBMS databases using SQL. But at the origin, the idea was to replace the RDBMS, refusing the SQL API and inventing a “No SQL” data store. When you want to replace something rather than propose something new, you often adopt the same vocabulary to make it look familiar. And this is why, in my opinion, we find some relational database terms like “Table” and “Index”. But they have a different meaning. Here is a glossary where I try to explain the DynamoDB artifacts and differentiate them from their relational and SQL meaning.

Attribute

This is taken from the relational semantics. An “Attribute” (a “Column” in SQL) is an element of a “Relation” (a “Table” in SQL) for which a “Tuple” (a “Row” in SQL) can have a value. Basically, when you store a list of your friends, they have a first name, a last name, a birthdate,… and many other properties. “First Name” and “Last Name” are attributes of the relation “Friend”. The combination of those properties is a fact, like a sentence saying “I have a friend whose first name is Donald, whose last name is Chamberlin, and who was born on December 21, 1944”. In DynamoDB an attribute is a JSON member. The previous tuple could be written as: {“first name”:”Donald”, “last name”:”Chamberlin”, “birthdate”:”1944-12-21”}.

But there are many differences. A relational attribute has a “Domain” (a “Data Type” in SQL) which defines the set of possible values. A “First Name” or a “Last Name” is a combination of characters from a character set, with a specific maximum length (because the application that manipulates it has memory buffers, screen widths, paper lengths,… physical limits). A “Birthdate” is a day within a specific calendar,… They are constrained so that the developer who reads an attribute knows that she will not encounter any edge case at runtime. If the domain says that the first name holds a maximum of 30 characters from [A-Za-z-‘], then she knows that she will not overflow a buffer when she allocates 30 bytes (or 60 in UTF-16, or 120 in UTF-8), and that she doesn’t have to code additional assertions to be sure there are no non-alphanumeric characters there. In JSON you can encounter any character string in a value. And DynamoDB attributes have only a few data type constraints: Null, Number, String, Boolean, Binary. And they are defined per attribute value, as there is no schema. The format of a date is just a character string with a convention. Nothing will tell you that “1752-09-13” is not a valid date.
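As a contrast, here is a minimal sketch of how a SQL schema declares the domain once so that every reader can rely on it (the FRIEND table and its constraints are hypothetical, just for illustration):

create table friend (
  first_name varchar2(30 char) check ( regexp_like(first_name, '^[A-Za-z'' -]+$') ),
  last_name  varchar2(30 char),
  birthdate  date
);

-- accepted: the value fits the declared domain
insert into friend values ('Donald', 'Chamberlin', date '1944-12-21');

-- rejected by the check constraint: no need to re-validate in the application code
insert into friend values ('D0n@ld', 'Chamberlin', date '1944-12-21');

-- rejected by the DATE data type (ORA-01843: not a valid month)
insert into friend values ('Donald', 'Chamberlin', to_date('1944-21-12','YYYY-MM-DD'));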

If you come from Oracle, you can see a DynamoDB attribute like the ANYDATA data type:

DEMO@atp> create table demo (att anydata);
Table DEMO created.
DEMO@atp> insert into demo values (anydata.convertNumber(42));
1 row inserted.
DEMO@atp> insert into demo values (anydata.convertVarchar2(q'{Don't panic}'));
1 row inserted.
DEMO@atp> select * from demo;
                                                                            ATT
_______________________________________________________________________________
OPAQUE(ANYDATA TypeCode: "TYPECODE_NUMBER" - ANYDATA Value: "42")
OPAQUE(ANYDATA TypeCode: "TYPECODE_VARCHAR2" - ANYDATA Value: "Don't panic")

Another difference is that a relational attribute should be scalar (hold only one value). DynamoDB can store a whole hierarchy in a JSON attribute. The 1st Normal Form says that the value must be atomic, even if Chris Date accepts nested tables in 1NF. And SQL allows this. Actually, it is not NoSQL vs. SQL here. Usually, in SQL, you store hierarchies in another table, but you can store them as a hierarchy with nested tables, XML, or JSON. Usually, in a NoSQL document database, you store a whole hierarchy as one value, but the “single-table” design splits it into multiple items of a collection.
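To illustrate both options on the SQL side, here is a minimal sketch (the CUSTOMER, CUSTOMER_ORDER and CUSTOMER_DOC tables are hypothetical): the hierarchy can go into a child table, or the whole hierarchy can be kept as one JSON value, as a document store would do:

-- the classic relational way: the hierarchy goes into a child table
create table customer (
  customer_id number primary key,
  name        varchar2(100)
);

create table customer_order (
  order_id    number primary key,
  customer_id number references customer,
  amount      number
);

-- the document way: the whole hierarchy is one value
create table customer_doc (
  customer_id number primary key,
  doc         varchar2(4000) check (doc is json)
);

insert into customer_doc values (
  1, '{"name":"Donald","orders":[{"order_id":10,"amount":42},{"order_id":11,"amount":7}]}'
);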

If you compare with MongoDB, a DynamoDB attribute is similar to a MongoDB field.

Item

Even if an Item in DynamoDB looks like a relational “Tuple” (or “Row” in SQL) there’s no ambiguity. We are in a key-value store and the item is the value. This would be stored in a SQL BLOB (Binary Large Object) in a relational database. As JSON, it could be the OSON datatype in Oracle or JSONB in PostgreSQL. It is a column, not a row.

The equivalent of a DynamoDB item would be a BLOB storing JSON in a table with a key and this BLOB only. This is, for example, how SODA, the NoSQL API for Oracle RDBMS, stores a collection:


DEMO@atp1_tp> soda create MyDocumentStore
Successfully created collection: MyDocumentStore

DEMO@atp1_tp> ddl "MyDocumentStore"

  CREATE TABLE "DEMO"."MyDocumentStore"
   (    "ID" VARCHAR2(255) COLLATE "USING_NLS_COMP" NOT NULL ENABLE,
        "CREATED_ON" TIMESTAMP (6) DEFAULT sys_extract_utc(SYSTIMESTAMP) NOT NULL ENABLE,
        "LAST_MODIFIED" TIMESTAMP (6) DEFAULT sys_extract_utc(SYSTIMESTAMP) NOT NULL ENABLE,
        "VERSION" VARCHAR2(255) COLLATE "USING_NLS_COMP" NOT NULL ENABLE,
        "JSON_DOCUMENT" BLOB,
         CHECK ("JSON_DOCUMENT" is json format oson (size limit 32m)) ENABLE,
         PRIMARY KEY ("ID") 
   ) ;

Here JSON_DOCUMENT is in format OSON, the Oracle binary implementation for JSON, stored in a BLOB which can be any size. The “soda” and “ddl” are standard SQLcl commands.

If you compare with MongoDB, a DynamoDB item is similar to a MongoDB document, except that it is a JSON string value, whereas MongoDB can store larger documents (up to 16MB) in a binary format (BSON). But we will see that the idea is to split a document into multiple items in DynamoDB, with keys starting with the same prefix so that they are hashed and range partitioned together.

Collection

This is a NoSQL term. A “Collection” is a set of items. But in DynamoDB this has a specific meaning, and the SQL equivalent would be a table “Partition”. When you HASH partition an RDBMS table, you do the same thing as a DynamoDB collection: you apply a hash function to the partition key value, which determines the physical segment of the table where the row is stored. This is how you scale any database, NoSQL or RDBMS: partition by hash and distribute those partitions over the storage, the processes, and maybe the nodes.
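For example, here is what that looks like on the SQL side: a minimal sketch with a hypothetical ORDERS table, hash partitioned on the attribute a DynamoDB design would use as its partition key:

create table orders (
  customer_id number,
  order_id    number,
  amount      number
)
partition by hash (customer_id) partitions 8;

-- all rows for one customer hash to the same partition,
-- just as all items with the same partition key land in the same DynamoDB item collection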

A DynamoDB collection is not the same as a MongoDB collection. It is more like a chunk. What is called a collection in MongoDB is called a table in DynamoDB.

Table

This is where using the same name in DynamoDB as in a SQL database can mislead you into a bad data model. A “Relation” (“Table” in SQL) is a set of tuples (“rows”) that belong to the same business entity. In DynamoDB you can store anything in the same table, as it is just a key-value store. For example, you store the customer information with their orders within the same table, and within the same collection. And that’s the goal: the data model is optimized for one use case, and if this use case is to display one customer’s orders, then you want to retrieve all of that with one ‘get’. A SQL table has the opposite goal: centralize the data for many use cases. You have one table with your customers. And one table with your orders. The CRM can query the customers only. The Delivery department will query only the orders. The Marketing department wants some customer information for specific orders. The Billing department wants all orders for a customer…

Again, it is not NoSQL vs. SQL, and you can have some nuances of both. In an RDBMS, when you want to pre-build a hierarchy to optimize a specific use case, you get the RDBMS to transparently maintain a redundant physical structure, like an index or a materialized view. And in DynamoDB the “single-table” design splits a document into multiple items so that it can be accessed from multiple use cases.

Actually, a Table in DynamoDB could be seen as a Materialized View in SQL rather than as a table. It is pre-built by the application and not by the RDBMS, but it is like a view (purpose-built for specific queries) and materialized (the physical colocation of data is there by design). The worst mistake a developer can make is to treat it as a SQL table and store related items in multiple DynamoDB tables. This would be bad for performance and cost. What you store in a DynamoDB table is probably the same scope as what you store in a SQL schema for a small application.
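As an illustration of this analogy, here is a minimal sketch (reusing the hypothetical CUSTOMER and CUSTOMER_ORDER tables from above) of a materialized view that pre-joins a customer with her orders for one access pattern, roughly what a DynamoDB single-table design pre-builds in the application:

create materialized view customer_orders_mv
  build immediate
  refresh complete on demand
as
select c.customer_id, c.name, o.order_id, o.amount
from   customer c
join   customer_order o on o.customer_id = c.customer_id;

-- the join is materialized once, so reading "one customer and her orders" is a single scan,
-- at the price of maintaining the redundancy (here on demand; in DynamoDB, by the application code)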

Index

An index in an RDBMS is a physical structure that provides fast access to a small number of rows because it is ordered on a value. That’s the same idea in DynamoDB: access through another attribute than the key. Another similarity is that the index is transparently maintained. The developer declares the index and doesn’t have to take care of the maintenance in his code. Of course, in NoSQL the maintenance of the index can be eventually consistent, but it is still automatically maintained. However, there is a huge difference for your coding effort. In an RDBMS you never query the index. The developer doesn’t have to know the physical design. An index is a physical structure created for performance reasons. SQL provides a logical view of the database: you query the table and the RDBMS optimizer decides the most efficient way to get the rows required for the result. In DynamoDB you see the indexes as tables: you need to ‘get’ on them if you want to use them.

In NoSQL, you have no optimizer. You do it in the code. If you add a new index for fast access, you need to change the code to use it. In an RDBMS you have the agility to add an index (even online, without application interruption, in some RDBMS) and queries will use it when it is appropriate. It is all about having a simple API with calls to embed in your procedural code, versus calling a database service that does the procedural access for you.

In DynamoDB you have local indexes and global indexes. That’s the same idea as in an RDBMS, where an index can be local, partitioned in the same way as the table, or global, partitioned on another value. And you find the same optimization ideas. I have a few blog posts related to covering indexes in DynamoDB.
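Here is what this looks like on the SQL side, a minimal sketch on the hypothetical hash-partitioned ORDERS table used above (the index names are made up for this example):

-- local index: equipartitioned with the table, like a DynamoDB Local Secondary Index
create index orders_local_ix on orders (customer_id, order_id) local;

-- global index: partitioned on its own key, independently of the table,
-- like a DynamoDB Global Secondary Index
create index orders_global_ix on orders (order_id)
  global partition by hash (order_id) partitions 8;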

Primary Key

In DynamoDB, as in SQL, a Primary Key is a unique identifier for an item or row. It is enforced to be unique: if you put an item that already exists, it will replace it (the default in DynamoDB; you can do the same in SQL with the MERGE statement) or reject it (the default in SQL; you can do the same in DynamoDB with an attribute_not_exists condition). An RDBMS usually uses the primary key for the physical organization as well. Oracle is very agile on that, and only specific table organizations (like IOT) require that the physical order follows the primary key. Other RDBMSs (like MySQL InnoDB) store the table clustered on the primary key. In DynamoDB, like in any key-value store, the primary key determines the physical access through physical structures. A DynamoDB table is hash partitioned on the first attribute of the primary key and may be sorted on a second attribute. If you want to hash on multiple attributes, your code must concatenate the values into the first attribute. Again, NoSQL provides a simple API and the developer does the additional logic in the procedural code. In an RDBMS, a primary key is a declared constraint that may, or may not, be related to the physical organization. And it can be compound, with multiple columns. An RDBMS can also have additional keys for which uniqueness is enforced and for which referential integrity can be validated.
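To make the put/insert distinction concrete, here is a minimal sketch against the hypothetical CUSTOMER table from above: the MERGE mimics the DynamoDB PutItem default (replace if the key exists), while the plain INSERT mimics a conditional put with attribute_not_exists:

-- "put" semantics: replace the row if the primary key already exists
merge into customer c
using (select 1 customer_id, 'Donald' name from dual) n
on (c.customer_id = n.customer_id)
when matched     then update set c.name = n.name
when not matched then insert (customer_id, name) values (n.customer_id, n.name);

-- default SQL semantics: a duplicate key is rejected (ORA-00001)
insert into customer (customer_id, name) values (1, 'Donald');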

In summary

This glossary is aimed at people working with SQL and NoSQL, to avoid confusion between the different meanings of the same terms used in RDBMS and DynamoDB. The most important thing to remember is that a Table in DynamoDB is different from a Table in SQL. You don’t need to separate data into different tables because of the size: tables are automatically partitioned. You don’t need to separate data into different tables because of the structure: all are key-value. And you will quickly end up with a generic key name where you put different entities, with the key value built to control the clustering of data. Think of a DynamoDB table as a key-value store. You can have multiple stores to isolate multiple applications with different performance requirements (and cost). But an application should store its data in one store only. I’m not giving examples here because you can follow the two links on “single-table”, which will bring you to the excellent examples from Rick Houlihan and Alex DeBrie. I wrote this blog post as an expansion of my following tweet:

The article Amazon DynamoDB: a r(el)ational Glossary appeared first on Blog dbi services.

Richard Foote's picture

Oracle 19c Automatic Indexing: Data Skew Part III (The Good Son)

  I’m going to expand just a tad on my previous posts on data skew and run a simple query that returns a few rows based on a column predicate AND another query on the same column that returns many rows. The following table has a CODE column as with previous posts with the data […]

oraclebase's picture

Video : Instance Caging to Manage CPU Usage

In today’s video we’ll discuss instance caging to manage CPU usage. This can be useful when we are trying to consolidate multiple instances on a single server.

This video is based on the following article.

The star of today’s video is the beard belonging to Victor Torres. I feel totally inadequate with my patchy stubble…

Richard Foote's picture

Oracle 19c Automatic Indexing: Data Skew Part II (Everything’s Alright)

In my previous post, I discussed an example with data skew, in which the Automatic Indexing process created a new index, but somehow the CBO, when using the index, arrived at the correct cardinality estimate even though no histograms were explicitly calculated. In this post I’ll answer HOW this is achieved by the CBO. Get some idea […]

connor_mc_d's picture

When can I use SQL Macros?

One of the very cool features we’ve been talking about for 20c is SQL Macros. But you no longer need to wait for a future release of the database to get access to all the goodness of SQL Macros.

Why? Because much of the functionality has now been backported to 19c, and is also now officially in the documentation so there’s no ambiguity as to whether you are supported to use them or not. They’ll be coming soon to an RU near you.

connor_mc_d's picture

Age calculations – Just how old ARE you?

Any database application that stores information about people will very likely store a person’s date of birth as part of that data. Whether it be for marketing purposes, or part of the security checks for a password reset, or part of the authentication when the inevitable call centre calls you with their “latest, greatest offer”, the date of birth is common.

But when it comes to reporting, typically it is not the date of birth that we present on our screen or hard copy, but the person’s age. Clearly we cannot store a person’s age in the database because it changes every day, month, or year depending on the granularity with which you present it. Which leads to the ponderance: surely it cannot be that hard to calculate someone’s age from their date of birth?

As it turns out, it is not as simple as it looks, and often developers have come up with “close but not close enough” methods to calculate the age, and by “not close enough” I mean the methods break down as we hit boundary cases.

Let’s create some test data, where the columns in the table T below are

  • birth – the date of birth someone provides to our application
  • sys_date – a representation of the standard “sysdate” function which reflects the date at which we run our age calculation
  • true_age – what the correct age is given the “birth” and “sys_date” column values.

SQL> create table t
  2    ( birth      date,
  3      sys_date   date,
  4      true_age   int );

Table created.

SQL>
SQL> insert into t values ( date '2000-05-12',date '2016-03-30',15 );

1 row created.

SQL> insert into t values ( date '2000-05-12',date '2016-05-30',16 );

1 row created.

SQL> insert into t values ( date '2000-05-12',date '2016-05-12',16 );

1 row created.

SQL> insert into t values ( date '2000-05-12',date '2076-06-12',76 );

1 row created.

SQL> insert into t values ( date '2000-02-29',date '2013-03-01',13 );

1 row created.

SQL> insert into t values ( date '2000-02-29',date '2016-02-29',16 );

1 row created.

SQL> insert into t values ( date '2000-02-29',date '2017-02-28',16 );

1 row created.

SQL> insert into t values ( date '2000-02-29',date '2087-02-28',86 );

1 row created.

SQL> insert into t values ( date '2000-01-01',date '2087-12-31',87 );

1 row created.

Using this data, let’s look at a common yet flawed mechanism for calculating age. This first one I’ve seen many times over the course of my career, namely a simple date subtraction. Sometimes people divide by 365, or 366, or even 365.25, but no matter what divisor is chosen, you’ll always hit some errors in the calculation.


SQL> select
  2  x.*,
  3   case when true_age = calculated_age then 'OK' else 'ERROR' end tag
  4  from
  5  (
  6  select
  7    t.*,
  8    trunc((sys_date-birth)/365) calculated_age
  9  from t
 10  ) x;

BIRTH     SYS_DATE    TRUE_AGE CALCULATED_AGE TAG
--------- --------- ---------- -------------- -----
12-MAY-00 30-MAR-16         15             15 OK
12-MAY-00 30-MAY-16         16             16 OK
12-MAY-00 12-MAY-16         16             16 OK
12-MAY-00 12-JUN-76         76             76 OK
29-FEB-00 01-MAR-13         13             13 OK
29-FEB-00 29-FEB-16         16             16 OK
29-FEB-00 28-FEB-17         16             17 ERROR
29-FEB-00 28-FEB-87         86             87 ERROR
01-JAN-00 31-DEC-87         87             88 ERROR

9 rows selected.

Side note: Date subtraction in itself needs care. See my recent post for more details on this.

A better method is to take advantage of the MONTHS_BETWEEN function in the Oracle Database because, unlike the number of days in a year which is variable, there are always 12 months in a year, so this sounds more promising as a divisor. However, as you can see below, even MONTHS_BETWEEN struggles when someone has a date of birth on a “leap day”. How annoying of them!


SQL> select
  2  x.*,
  3   case when true_age = calculated_age then 'OK' else 'ERROR' end tag
  4  from
  5  (
  6  select
  7    t.*,
  8    trunc(months_between(sys_date,birth)/12) calculated_age
  9  from t
 10  ) x;

BIRTH     SYS_DATE    TRUE_AGE CALCULATED_AGE TAG
--------- --------- ---------- -------------- -----
12-MAY-00 30-MAR-16         15             15 OK
12-MAY-00 30-MAY-16         16             16 OK
12-MAY-00 12-MAY-16         16             16 OK
12-MAY-00 12-JUN-76         76             76 OK
29-FEB-00 01-MAR-13         13             13 OK
29-FEB-00 29-FEB-16         16             16 OK
29-FEB-00 28-FEB-17         16             17 ERROR
29-FEB-00 28-FEB-87         86             87 ERROR
01-JAN-00 31-DEC-87         87             87 OK

9 rows selected.

Somewhat counter-intuitively, a reliable formula for deriving a person’s age is to dispense with the date arithmetic altogether and convert the dates to numbers.


SQL> select
  2  x.*,
  3   case when true_age = calculated_age then 'OK' else 'ERROR' end tag
  4  from
  5  (
  6  select
  7    t.*,
  8    trunc((to_number(to_char(sys_date,'YYYYMMDD'))-
  9          to_number(to_char(birth,'YYYYMMDD')))/10000) calculated_age
 10  from t
 11  ) x;

BIRTH     SYS_DATE    TRUE_AGE CALCULATED_AGE TAG
--------- --------- ---------- -------------- -----
12-MAY-00 30-MAR-16         15             15 OK
12-MAY-00 30-MAY-16         16             16 OK
12-MAY-00 12-MAY-16         16             16 OK
12-MAY-00 12-JUN-76         76             76 OK
29-FEB-00 01-MAR-13         13             13 OK
29-FEB-00 29-FEB-16         16             16 OK
29-FEB-00 28-FEB-17         16             16 OK
29-FEB-00 28-FEB-87         86             86 OK
01-JAN-00 31-DEC-87         87             87 OK

9 rows selected.

You can see how this neat little trick works by looking at the components of the expression


SQL> select
  2    to_number(to_char(sys_date,'YYYYMMDD')) d1,
  3    to_number(to_char(birth,'YYYYMMDD')) d2,
  4    to_number(to_char(sys_date,'YYYYMMDD')) -
  5      to_number(to_char(birth,'YYYYMMDD')) delt
  6  from t;

        D1         D2       DELT
---------- ---------- ----------
  20160330   20000512     159818
  20160530   20000512     160018
  20160512   20000512     160000
  20760612   20000512     760100
  20130301   20000229     130072
  20160229   20000229     160000
  20170228   20000229     169999
  20870228   20000229     869999
  20871231   20000101     871130

9 rows selected.

The conversion to numbers means that elements like Feb 28th and Feb 29th are still distinct, as opposed to both being seen as “the last day of February” by internal date functions, and (for example on the second last line) the age calculation stays just below 87 as opposed to crossing over the boundary to 87.

Now that you have a reliable formula, you can easily add the AGE as a column in a view on your table and not have to worry about the internals.
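For example, a view along these lines (a sketch using the demo table T and its SYS_DATE column from above; in a real application you would use SYSDATE against the stored date of birth):

create or replace view t_with_age as
select t.*,
       trunc((to_number(to_char(sys_date,'YYYYMMDD')) -
              to_number(to_char(birth,'YYYYMMDD')))/10000) as age
from t;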

harald's picture

Start Taking Donations, Tips, and Contributions for Your Creative and Professional Pursuits

With our growing suite of payment features, we want to make it easier for you to earn money on WordPress.com. With the Donations block, you can now accept credit and debit card payments for all types of donations, earning revenue and growing your base of supporters. Collect donations, tips, and contributions on your website to fuel your creative and professional projects or to support and grow your business or organization.

Donations block example for an arts organization


What can you accept donations for?

You can collect financial contributions on your website for just about anything — the sky really is the limit. Here are examples of things people support through donations:

  • Creative pursuits for musicians, artists, designers, writers, and more
  • Concrete creations like podcasts, video games, music clips, and photography
  • Bloggers and content creators of all shapes and sizes
  • Everyday passions like news summaries and mindfulness exercises
  • Professional endeavors including civic engagement and professional development
  • Nonprofits and community, religious, and political organizations

Donations block examples for a musician and radio station


Continue to build your community by engaging with your supporters in a unique and authentic way. People can opt to support you through one-time, monthly, or yearly contributions, and the Donations block lets you engage with each level for a more custom experience. For example, you might send your monthly supporters additional content and information on top of what you send your one-time supporters.

As you ask for support, we’ll handle the rest — the credit and debit card payment processing, sending receipts, reporting, and more.

Ask for your first donation

Above all, the first step in earning money on your website is to ask for it. You can add a Donations block to your website in a matter of minutes; watch this short video to learn how. Alternatively, a step-by-step guide follows below.


How to use Donations block to earn money on your WordPress.com website

  • To use the Donations block, you’ll need a WordPress.com website with any paid plan — Personal, Premium, Business, or eCommerce.
  • On any page or post, add the Donations block.
  • To set up your first donation request, create a Stripe account if you don’t have one already. Stripe is the company we’ve partnered with to process credit and debit card payments in a safe, secure, and speedy way.
  • After you’ve connected to Stripe, configure the block’s settings, like how often you’re asking for donations. It can be any combination of single (one-time), monthly recurring, or yearly recurring donations.
  • Set three donation amounts that you’d like visitors to choose from for any of the payment intervals. These are fully customizable. Be sure to set your currency as well.
  • You can also allow visitors to donate what they want — essentially a blank box for them to fill out how much money they would like to give.
  • Review all of the text in your Donations block — you can edit every single letter, so be sure to provide enough information for your visitors about their donation, why you’re asking for it, etc.
  • Publish your block!
  • You can manage your supporters, see earnings, and keep an eye on other metrics in the Earn dashboard.
  • Last but not least, tell others about what you’re doing! Share on social media, email, and however you best communicate with people who might donate to your cause.

A suite of payment features to fit your needs

Looking to accept payments for something else? There are several other payment features on WordPress.com to suit your needs and help you make money with your website. In addition to the new Donations block, here are other features:

  • Payments block: Accept one-time or recurring payments on your website for physical items, digital downloads, services, memberships, subscriptions, and more.
  • Premium Content block: Create one-time, monthly, or yearly subscription options to share select content with those who pay for it — text, images, videos, or any kind of content. Exclusive content can be sent to email inboxes or viewed on your website.
  • Paid newsletters: Using the Premium Content block, you can share your site’s latest premium content via email newsletters in a fully automated way.
  • eCommerce Store: Turn your website into an eCommerce store and sell products and services seamlessly.

If you’re interested in setting up a membership- or subscription-based website, learn more about getting started with memberships and subscriptions.


Add the Donations block and start earning money with your website today!

Richard Foote's picture

Oracle 19c Automatic Indexing: Data Skew Part I (A Saucerful of Secrets)

When it comes to Automatic Indexes, things can become particularly interesting when dealing with data skew (meaning that some columns values are much less common than other column values). The next series of blog posts will look at a number of different scenarios in relation to how Automatic Indexing works with data that is skewed […]

Jonathan Lewis's picture

Bloom Upgrade

It’s a common pattern of Oracle features that they start with various restrictions or limitations that disappear over time. This note is about an enhancement to Bloom filter processing that appeared in the 18.1 optimizer and, for some people, may be a good enough reason for upgrading to a newer version of Oracle. This enhancement came to my attention through a question on the Oracle Developer forum asking how to get a Bloom filter pushed inside a UNION ALL view. The original requirement wasn’t a trivial one so I’ll demonstrate the problem with a very simple example – first the data set:

rem
rem     Script:         bloom_pushdown.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Sep 2020
rem
rem     Last tested:
rem             19.3.0.0
rem

create table t1 as select * from all_objects where rownum <= 50000;
create table t2 as select t1.* from t1, (select rownum n1 from dual connect by level <= 4);
create table t3 as select t1.* from t1, (select rownum n1 from dual connect by level <= 4); -- > comment to avoid wordpress format issue

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1 for columns object_type size 254'
        );
end;
/

I’ve been a bit lazy here, copying data from view all_objects. I’ve gathered stats on t1 so that I can generate a histogram on the object_type column because I’m going to query for a rare object_type and I want the optimizer to get a reasonable estimate of rows. I’m going to hint a parallel query to join t1 to t2 (aliased, trivially, as v1 for reasons that will become apparent soon):

select
        /*+ 
                parallel(2) 
        */
        t1.object_name, v1.object_name
from
        t1,
        t2 v1
where
        t1.object_type = 'SCHEDULE'
and     v1.object_id = t1.object_id
/

In my case the optimizer chooses to do a hash join between these two tables, and creates a Bloom filter to try and minimise the data passing through the data flow operation. The result set in my 12.2.0.1 database is only 16 rows, so it would be nice if the parallel scan could eliminate most of the 200,000 rows in t2 early – here’s the execution plan pulled from memory after running the query with rowsource execution stats enabled:


----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation              | Name     | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem |  O/1/M   |
----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |      1 |        |   371 (100)|     16 |00:00:00.06 |      20 |      0 |       |       |          |
|   1 |  PX COORDINATOR        |          |      1 |        |            |     16 |00:00:00.06 |      20 |      0 | 73728 | 73728 |          |
|   2 |   PX SEND QC (RANDOM)  | :TQ10000 |      0 |     16 |   371   (5)|      0 |00:00:00.01 |       0 |      0 |       |       |          |
|*  3 |    HASH JOIN           |          |      2 |     16 |   371   (5)|     16 |00:00:00.05 |    6278 |   3988 |  1250K|  1250K|     2/0/0|
|   4 |     JOIN FILTER CREATE | :BF0000  |      2 |      4 |    75   (4)|      8 |00:00:00.01 |    2034 |      0 |       |       |          |
|*  5 |      TABLE ACCESS FULL | T1       |      2 |      4 |    75   (4)|      8 |00:00:00.01 |    2034 |      0 |       |       |          |
|   6 |     JOIN FILTER USE    | :BF0000  |      2 |    200K|   292   (4)|     16 |00:00:00.04 |    4244 |   3988 |       |       |          |
|   7 |      PX BLOCK ITERATOR |          |      2 |    200K|   292   (4)|     16 |00:00:00.04 |    4244 |   3988 |       |       |          |
|*  8 |       TABLE ACCESS FULL| T2       |     32 |    200K|   292   (4)|     16 |00:00:00.03 |    4244 |   3988 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("V1"."OBJECT_ID"="T1"."OBJECT_ID")
   5 - filter("T1"."OBJECT_TYPE"='SCHEDULE')
   8 - access(:Z>=:Z AND :Z<=:Z)
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"V1"."OBJECT_ID"))

We see that Oracle has generated a Bloom filter at operation 4 from the data returned from t1 at operation 5, and then used that Bloom filter at operation 6 to eliminate most of the data from t2 before passing the remaining few rows up to the hash join.

Let’s make the query more interesting – what if you want to use a UNION ALL of t2 and t3 in the query (for example one might be “current data” while the other is “historic data”)? Here’s the query and plan from 12.2.0.1:

select
        /*+ 
                parallel(2) 
        */
        t1.object_name, v1.object_name
from
        t1,
        (select * from t2 union all select * from t3) v1
where
        t1.object_type = 'SCHEDULE'
and     v1.object_id = t1.object_id
/

-----------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation               | Name     | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem |  O/1/M   |
-----------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |          |      1 |        |   667 (100)|     32 |00:00:00.37 |      40 |      0 |       |       |          |
|   1 |  PX COORDINATOR         |          |      1 |        |            |     32 |00:00:00.37 |      40 |      0 | 73728 | 73728 |          |
|   2 |   PX SEND QC (RANDOM)   | :TQ10000 |      0 |     32 |   667   (5)|      0 |00:00:00.01 |       0 |      0 |       |       |          |
|*  3 |    HASH JOIN            |          |      1 |     32 |   667   (5)|     32 |00:00:00.34 |    5125 |   3860 |  1250K|  1250K|     2/0/0|
|*  4 |     TABLE ACCESS FULL   | T1       |      2 |      4 |    75   (4)|      8 |00:00:00.01 |    2034 |      0 |       |       |          |
|   5 |     VIEW                |          |      2 |    400K|   584   (4)|    400K|00:00:00.52 |    8488 |   7976 |       |       |          |
|   6 |      UNION-ALL          |          |      2 |        |            |    400K|00:00:00.24 |    8488 |   7976 |       |       |          |
|   7 |       PX BLOCK ITERATOR |          |      2 |    200K|   292   (4)|    200K|00:00:00.11 |    4244 |   3988 |       |       |          |
|*  8 |        TABLE ACCESS FULL| T2       |     32 |    200K|   292   (4)|    200K|00:00:00.07 |    4244 |   3988 |       |       |          |
|   9 |       PX BLOCK ITERATOR |          |      2 |    200K|   292   (4)|    200K|00:00:00.11 |    4244 |   3988 |       |       |          |
|* 10 |        TABLE ACCESS FULL| T3       |     32 |    200K|   292   (4)|    200K|00:00:00.03 |    4244 |   3988 |       |       |          |
-----------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("V1"."OBJECT_ID"="T1"."OBJECT_ID")
   4 - filter("T1"."OBJECT_TYPE"='SCHEDULE')
   8 - access(:Z>=:Z AND :Z<=:Z)      -- > edit to avoid wordpress format issue
  10 - access(:Z>=:Z AND :Z<=:Z)      -- > edit to avoid wordpress format issue

No Bloom filter – so all 400,000 rows feed up the plan and through the hash join. This won’t matter too much for my sub-second tiny data set but on a pair of 50GB tables, with the potential to offload the Bloom filter to storage in Exadata and, perhaps, eliminate 99% of the data at the cell servers, this could make a huge difference to performance.

Since Bloom filters are all about hashing data (in Oracle the standard Bloom filter is the bitmap summarising the build table in a hash join) let’s try pushing the optimizer into a hash distribution for the parallel join to see if that has any effect:


select
        /*+ 
                parallel(2) 
                gather_plan_statistics
                leading(@sel$1 t1@sel$1 v1@sel$1)
                use_hash(@sel$1 v1@sel$1)
                pq_distribute(@sel$1 v1@sel$1 hash hash)
        */
        t1.object_name, v1.object_name
from
        t1,
        (select * from t2 union all select * from t3) v1
where
        t1.object_type = 'SCHEDULE'
and     v1.object_id = t1.object_id
/

---------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem |  O/1/M   |
---------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |      1 |        |   667 (100)|     32 |00:00:00.43 |      60 |      0 |       |       |          |
|   1 |  PX COORDINATOR             |          |      1 |        |            |     32 |00:00:00.43 |      60 |      0 | 73728 | 73728 |          |
|   2 |   PX SEND QC (RANDOM)       | :TQ10002 |      0 |     32 |   667   (5)|      0 |00:00:00.01 |       0 |      0 |       |       |          |
|*  3 |    HASH JOIN BUFFERED       |          |      1 |     32 |   667   (5)|     32 |00:00:00.38 |    4000 |   3752 |  2290K|  2082K|     2/0/0|
|   4 |     JOIN FILTER CREATE      | :BF0000  |      2 |      4 |    75   (4)|      8 |00:00:00.01 |       0 |      0 |       |       |          |
|   5 |      PX RECEIVE             |          |      2 |      4 |    75   (4)|      8 |00:00:00.01 |       0 |      0 |       |       |          |
|   6 |       PX SEND HYBRID HASH   | :TQ10000 |      0 |      4 |    75   (4)|      0 |00:00:00.01 |       0 |      0 |       |       |          |
|   7 |        STATISTICS COLLECTOR |          |      2 |        |            |      4 |00:00:00.01 |    1517 |      0 |       |       |          |
|   8 |         PX BLOCK ITERATOR   |          |      2 |      4 |    75   (4)|      4 |00:00:00.01 |    1517 |      0 |       |       |          |
|*  9 |          TABLE ACCESS FULL  | T1       |     26 |      4 |    75   (4)|      4 |00:00:00.01 |    1517 |      0 |       |       |          |
|  10 |     PX RECEIVE              |          |      2 |    400K|   584   (4)|     66 |00:00:00.77 |    8488 |   7976 |       |       |          |
|  11 |      PX SEND HYBRID HASH    | :TQ10001 |      2 |    400K|   584   (4)|     66 |00:00:00.77 |    8488 |   7976 |       |       |          |
|  12 |       JOIN FILTER USE       | :BF0000  |      2 |    400K|   584   (4)|     66 |00:00:00.77 |    8488 |   7976 |       |       |          |
|  13 |        VIEW                 |          |      2 |    400K|   584   (4)|    400K|00:00:00.68 |    8488 |   7976 |       |       |          |
|  14 |         UNION-ALL           |          |      2 |        |            |    400K|00:00:00.59 |    8488 |   7976 |       |       |          |
|  15 |          PX BLOCK ITERATOR  |          |      2 |    200K|   292   (4)|    200K|00:00:00.18 |    4244 |   3988 |       |       |          |
|* 16 |           TABLE ACCESS FULL | T2       |     32 |    200K|   292   (4)|    200K|00:00:00.06 |    4244 |   3988 |       |       |          |
|  17 |          PX BLOCK ITERATOR  |          |      2 |    200K|   292   (4)|    200K|00:00:00.12 |    4244 |   3988 |       |       |          |
|* 18 |           TABLE ACCESS FULL | T3       |     32 |    200K|   292   (4)|    200K|00:00:00.08 |    4244 |   3988 |       |       |          |
---------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("V1"."OBJECT_ID"="T1"."OBJECT_ID")
   9 - access(:Z>=:Z AND :Z<=:Z)   -- > edit to avoid wordpress format issue 
       filter("T1"."OBJECT_TYPE"='SCHEDULE') 
  16 - access(:Z>=:Z AND :Z<=:Z)   -- > edit to avoid wordpress format issue
  18 - access(:Z>=:Z AND :Z<=:Z)   -- > edit to avoid wordpress format issue

We’ve managed to introduce a Bloom filter (which is visible as :BF0000 in the plan, even though there’s no reference to sys_op_bloom_filter() in the predicate information) but there’s a problem: we’re still passing 400,000 rows up the plan and the Bloom filter is only being applied at (or just after) the VIEW operator, discarding all but 66 rows before doing the hash join. It’s an improvement but not ideal; we’d like to see the Bloom filter applied to each of the two tables separately to eliminate rows as early as possible.

This can’t be done in 12.2, and you’d have to rewrite the query, changing a “join with union” into a “union of joins”, and that’s not really a desirable strategy.
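For illustration, here is a sketch of what that rewrite would look like (each branch can then get its own Bloom filter against t1, but t1 is now scanned twice and there are two query blocks to maintain):

select
        /*+ parallel(2) */
        t1.object_name, t2.object_name
from
        t1, t2
where
        t1.object_type = 'SCHEDULE'
and     t2.object_id = t1.object_id
union all
select
        t1.object_name, t3.object_name
from
        t1, t3
where
        t1.object_type = 'SCHEDULE'
and     t3.object_id = t1.object_id
/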

Next Steps

Searching MOS, though, you will be able to find the following note:

Doc ID 18849313.8 – ENH : bloom filters/pruning are pushed through union-all view

There’s an enhancement request to do what we want in 18.1, and the enhancement has got into the software. Here’s the (unhinted) plan from 19.3 (the plan stays the same when optimizer_features_enable is set back to 18.1.0, but drops back to the 12.2 plan when OFE is set to 12.2.0.1):

------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem |  O/1/M   |
------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |      1 |        |   666 (100)|     32 |00:00:00.11 |      10 |      0 |       |       |          |
|   1 |  PX COORDINATOR          |          |      1 |        |            |     32 |00:00:00.11 |      10 |      0 | 73728 | 73728 |          |
|   2 |   PX SEND QC (RANDOM)    | :TQ10000 |      0 |     32 |   666   (5)|      0 |00:00:00.01 |       0 |      0 |       |       |          |
|*  3 |    HASH JOIN             |          |      2 |     32 |   666   (5)|     32 |00:00:00.05 |   10020 |   7958 |  1250K|  1250K|     2/0/0|
|   4 |     JOIN FILTER CREATE   | :BF0000  |      2 |      4 |    75   (4)|      8 |00:00:00.01 |    1998 |      0 |       |       |          |
|*  5 |      TABLE ACCESS FULL   | T1       |      2 |      4 |    75   (4)|      8 |00:00:00.01 |    1998 |      0 |       |       |          |
|   6 |     VIEW                 |          |      2 |    400K|   583   (4)|     32 |00:00:00.04 |    8022 |   7958 |       |       |          |
|   7 |      UNION-ALL           |          |      1 |        |            |     12 |00:00:00.02 |    4011 |   3979 |       |       |          |
|   8 |       JOIN FILTER USE    | :BF0000  |      2 |    200K|   292   (4)|     16 |00:00:00.02 |    4011 |   3979 |       |       |          |
|   9 |        PX BLOCK ITERATOR |          |      2 |    200K|   292   (4)|     16 |00:00:00.02 |    4011 |   3979 |       |       |          |
|* 10 |         TABLE ACCESS FULL| T2       |     32 |    200K|   292   (4)|     16 |00:00:00.02 |    4011 |   3979 |       |       |          |
|  11 |       JOIN FILTER USE    | :BF0000  |      2 |    200K|   292   (4)|     16 |00:00:00.02 |    4011 |   3979 |       |       |          |
|  12 |        PX BLOCK ITERATOR |          |      2 |    200K|   292   (4)|     16 |00:00:00.02 |    4011 |   3979 |       |       |          |
|* 13 |         TABLE ACCESS FULL| T3       |     32 |    200K|   292   (4)|     16 |00:00:00.02 |    4011 |   3979 |       |       |          |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("V1"."OBJECT_ID"="T1"."OBJECT_ID")
   5 - filter("T1"."OBJECT_TYPE"='SCHEDULE')
  10 - access(:Z>=:Z AND :Z<=:Z)      -- > edit to avoid wordpress format issue
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"T2"."OBJECT_ID")) 
  13 - access(:Z>=:Z AND :Z<=:Z)      -- > edit to avoid wordpress format issue
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"T3"."OBJECT_ID"))

As you can see, we create a Bloom filter at operation 4, and use it twice at operations 8 and 11 – with the sys_op_bloom_filter() functions clearly visible in the predicate information showing us that the Bloom filter is applied to the object_id column in both cases.

If you want to disable this enhancement for some reason there are two hidden parameters available (which you might set for a single query using the opt_param() hint):

  • _bloom_filter_setops_enabled = true
  • _bloom_pruning_setops_enabled = true

The first is for Bloom filters in the situation shown; I assume the second deals with Bloom filters for partition pruning.
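For example, a single-query override might look like this (an untested sketch against the demo query above; the parameters default to true, so you would set one to false to disable the corresponding feature):

select
        /*+
                parallel(2)
                opt_param('_bloom_filter_setops_enabled' 'false')
        */
        t1.object_name, v1.object_name
from
        t1,
        (select * from t2 union all select * from t3) v1
where
        t1.object_type = 'SCHEDULE'
and     v1.object_id = t1.object_id
/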

Summary

In versions prior to 18.1 the optimizer is unable to push Bloom filters down to the individual tables in a UNION ALL view, but this limitation was removed in the 18.1 code set.

 

oraclebase's picture

Packer by HashiCorp : First Steps

A few days ago I wrote about some Vagrant Box Drama I was having. Martin Bach replied saying I should build my own Vagrant boxes. I’ve built Vagrant boxes manually before, as shown here.

The manual process is just boring, so I’ve tended to use other people’s Vagrant boxes, like “bento/oracle-8”, but then you are at the mercy of what they decide to include/exclude in their box. Martin replied again saying,

“Actually I thought the same until I finally managed to get around automating the whole lots with Packer and Ansible. Works like a dream now and with minimum effort”

Martin Bach

So that kind-of shamed me into taking a look at Packer.
