Oakies Blog Aggregator

dbakevlar's picture

This is GDPR Reality

How many updated policies for the use of personal data have you signed off on in the last couple of days? Having watched the EU procrastinate on compliance with the General Data Protection Regulation (GDPR) until December of 2016 (it was adopted in April 2016, with a compliance deadline in the EU of December 2016), I wasn't surprised to see a flurry of requests to sign off on data usage this week. The deadline for the rest of us outside the EU is tomorrow, May 25th, and, as expected, here we are.

I've been talking about GDPR for over two years now, and I'm far from the only one. I'm still surprised how few treated it as anything more than a buzzword. A buzzword it has become, and like other buzzwords such as "machine learning" and "AI", it can fade into white noise, a slight hum behind our daily technical demands. The news has been warning us about the looming deadline ever since the regulation was adopted, yet few were actually proactive about it. Even with all the recent policy updates, no matter the website, company or application, a policy shouldn't be confused with compliance.

The What

GDPR compliance requires you not only to notify users that you have their data and what you'll be using it for. You also have to maintain an ongoing way of auditing usage, locating the data, updating the data and, if requested by the user, REMOVING the data from your system. Yes, there are caveats around historical and ongoing compliance for those dealing with credit, banking and government data. It's not as if you can send an email to the credit card company you didn't pay off and demand they remove you from their system to escape responsibility. The company simply has to justify how and why it uses your personal information. You can refuse to let them sell your data to another company for marketing purposes and the like, but data about your credit history and similar records will remain.

The Who

As I've discussed in previous posts, GDPR affects just about every business and type of business on the planet. You can claim you don't have any European customers, but if you have a retail website, you have to be GDPR compliant because a user in the EU can access it. If you have a website that tracks data about the users who view it or log into it, or that collects data such as IP addresses and cookies, you are subject to GDPR. If you hold any data on an EU citizen, and keep in mind I said CITIZEN, not EU address or EU IP address, you are expected to be compliant. Unless you're collecting passport data that shows which country a customer is a citizen of, you won't know until someone complains and it's too late.

The When

I have a promise from Brent Ozar that a year from now, we'll revisit the idea that companies and entrepreneurs will see opportunities with GDPR in place. We both agree that the EU will most likely make examples of a few large companies, charging them the 4% of annual revenue fine for GDPR non-compliance, but I foresee this growing into a real business sector. I expect third parties to take advantage of the opportunity to partner with the EU, working hand-in-hand to identify, investigate, audit and fine companies for non-compliance, taking a percentage of the fine off the top as revenue in return for serving as the resources the governments won't have to pursue individual businesses. This would be a rather American, capitalism-style venture, but the latest issue of Time just named France's Emmanuel Macron the leader of the world's free market, so sit back and watch as EU companies spring up and American companies follow suit. I do see a change coming, with the EU becoming more aggressive in the way it does business.

The How

In the meantime, I actually created a filter that sends all of these emails to spam except those that match a list of applications and sites I use regularly. Yep, if I didn't even know I had an account with you, then most likely I don't want you to have access to my data. To do this, I created a global rule that matches emails on filtered search terms, and then built a second rule that looks for keywords from a list and pushes those messages back from spam to the inbox.

Automation is the only way I'm going to survive this life, and if I miss something, well, I guess I'll sign off on the data usage policy like a newbie when it happens and enjoy reaping the benefits already in place for EU citizens. I think we're all sick of hearing about GDPR, but no matter what, it's here to stay.



Copyright © DBA Kevlar [This is GDPR Reality], All Rights Reserved. 2018.

fritshoogland's picture

A re-introduction to the vagrant-builder suite for database installation

In a blogpost introducing the vagrant-builder suite I explained what the suite can do and its principal use: automating the installation of the Oracle database software and the creation of a database on a virtual machine, using vagrant together with ansible and VirtualBox.

This blogpost shows how to use that suite to automate the installation of the Oracle database software and the creation of a database directly on a Linux server, using only ansible, without vagrant and VirtualBox.

The suite has been updated with all the PSUs up to the current one (180417; April 2018) for 11.2.0.4, 12.1.0.2 and 12.2.0.1, and now includes Oracle 18. Please mind that for Oracle versions 12.2 and 18 you need to provide the installation media yourself, because these are not downloadable as a patch. I really dislike that Oracle does not make the installation media for these versions downloadable in an automated way. This is a good opportunity for Oracle to show it's listening and provide a solution, especially because it wants to be a player in the developer/devops field.

The regular use of this suite with vagrant/ansible/VirtualBox is to clone the vagrant-builder repository, set the variables in the Vagrantfile, and issue 'vagrant up', which will fetch an O/S image from the vagrant cloud, create an extra disk for Oracle, create a VM specification in VirtualBox, boot up the VM, and then run ansible to do the full configuration of Linux for the Oracle database, install and patch the Oracle database software, and create a database, all without any human intervention.

Currently, the local/direct install scripts support Red Hat (more precisely, a distribution that manifests itself as 'Red Hat Enterprise Linux' or 'RedHat' in the ansible fact 'ansible_distribution') or Oracle Linux ('OracleLinux'), version 7 only. You can get all the facts that ansible gathers on the current host by executing 'ansible localhost -m setup'.

To use it natively on a Linux system, you must first make sure the operating system provides a /u01 directory with enough space for the Oracle software install and the database. A rough indication would be more than 20G; I regularly use 40G, which allows me to use the database and create tablespaces. It doesn't matter whether /u01 is just a directory or a mount point: the only thing local_install.yml does with it is change the ownership of the /u01 directory to oracle once that user is created.
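For a lab machine, a plain directory is enough, as long as the filesystem behind it has sufficient free space; for example:

$ sudo mkdir -p /u01
$ df -h /u01

The df output should show more than roughly 20G (preferably around 40G) available.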

The first thing to do is install ansible and git. Ansible uses python, and python comes installed with all recent RHEL-compatible distributions ('yum' itself uses python, for example). In my opinion, installing additional python packages should not be a problem. If your environment is highly standardised and these kinds of installs are not allowed, you should use the scripts in the "regular" ansible way, which I will detail at the end.

1. Install git
Oracle Linux 7 has git in the base repository, and I assume the same is true for RHEL 7. That means that as long as the server has access to the base repository, the latest repository or a company repository (for version-controlling the packages), installation is simply a matter of yum install:

# yum install git

2. Install ansible
One way of installing ansible is via easy_install and pip, which will get you a recent ansible version. The simplest way would be to use the EPEL repository, but in my experience packages in the Linux distribution repositories are not updated very frequently.

# yum install python-setuptools
# easy_install pip
# pip install ansible

3. Clone the vagrant-builder suite
This CLI example clones the vagrant-builder suite into the builder directory; you can name the directory any way you want. The best way of doing this is as a normal (meaning non-root) user that has password-less, unlimited sudo rights. Cloud environments come with such a user by default: for the Oracle cloud this is 'opc', for Amazon this is 'ec2-user' (and for vagrant boxes this is 'vagrant'). The ansible script uses sudo to execute as root, creates the oracle user and then uses sudo to execute as oracle.

$ git clone https://gitlab.com/FritsHoogland/vagrant-builder.git builder

4. Run local_install.yml
Now all the components for the automated installation are set up. The next thing to do is go into the builder/ansible directory and edit the local_install.yml file:

---
- hosts: localhost
  become: true
  vars:
  - mosuser:
  - mospass:
  - oracle_base: /u01/app/oracle
  - database_name:
  - global_password: oracle
  - db_create_file_dest: /u01/app/oracle/oradata
  - database_version:
  - asm_version: ""
  - stage_directory: /u01/stage
...

Line 5/6: fill out your MOS details. These are needed to download patches, or installation media that is delivered as patches.
Line 7: this is the default value for the ORACLE_BASE.
Line 8: you need a database name if you want to create a database. If you set it to empty ("") no database will be created.
Line 9: this sets the password for all oracle database accounts.
Line 10: this variable sets the location for all the database files, db_create_file_dest.
Line 11: this sets the database software version to install; ansible will determine what files to download. The Oracle 12.2 (and 18) installation media must be placed in the builder/ansible/files directory. Look in the Vagrantfile for the specifications and versions available.
Line 12: the grid software is currently not installed, but the variable needs to be specified in order for the facts (ansible variables) to be set up.
Line 13: this is the directory in which all files are staged. It will be created at the beginning of a role and removed at the end.
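To illustrate, here is a hypothetically filled-out version (the MOS credentials, database name and version string are placeholders; check the Vagrantfile for the exact version strings the suite accepts):

---
- hosts: localhost
  become: true
  vars:
  - mosuser: someone@example.com
  - mospass: my_mos_password
  - oracle_base: /u01/app/oracle
  - database_name: testdb
  - global_password: oracle
  - db_create_file_dest: /u01/app/oracle/oradata
  - database_version: 12.2.0.1.180417
  - asm_version: ""
  - stage_directory: /u01/stage
...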

After this has been filled out, run it with a user that is allowed to execute sudo without specifying a password:

$ ansible-playbook local_install.yml

Ansible will read the playbook, see that no inventory (list of hosts) is specified and that the hosts specifier is set to localhost, and then run locally.

Just like with its original use with vagrant, this will download all the necessary software directly from Oracle using the MOS credentials, with the exception of the installation media of Oracle 12.2.0.1 and Oracle 18.0.0.0, because these can not be downloaded via the CLI as far as I know, and therefore have to be provided in the ansible/files directory.

If you want to speed up the build, or do not have a good enough network connection, you can put the installation media in the files directory anyway; the ansible script will look there for it. The patches used for patching (all MOS downloads are patches, some of which are actually installation media) are always downloaded from MOS.

Using ansible non-local.
If you are not allowed to install additional software on the Oracle server, there is another way. Keep in mind, though, that if installing ansible is not allowed, you are essentially being asked to install the Oracle software manually; regardless of what is allowed, you have to make changes to the operating system to facilitate running Oracle anyway, and an environment where that is all done by hand is clearly not that automated, so the restriction is a bit silly.

The non-local way is running ansible the way it is normally used: a server with ansible installed uses ssh to run the playbooks on one or more remote servers, which only need python installed (EL6 and EL7 have it by default). The local_install.yml script can be modified very easily for this: the hosts specification must be changed to 'all', or to the name given to a group of servers in the inventory file, as sketched below. A full treatment, however, is beyond the scope of this article.
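As a rough, untested sketch (the group and host names are made up), the inventory file could look like this:

[dbservers]
dbserver01.example.com
dbserver02.example.com

and the play header in local_install.yml would then reference that group instead of localhost:

---
- hosts: dbservers
  become: true

after which the playbook is run against the inventory: $ ansible-playbook -i inventory local_install.yml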

However, if you need to perform database software installations and database creations regularly in an environment, it would make sense to use a centralised server to perform these actions, instead of setting it up on every distinct server.

Bonus material.
I added another script in the ansible directory, db_management.yml, which performs database creation and removal (and the installation of slob). It follows the same pattern as local_install.yml: fill out the variables and, unique to this script, uncomment the action you want it to perform, then run it.

Jonathan Lewis's picture

Show parameter

Just a quick little tip for Friday afternoon.

If you use the “show parameter” or “show spparameter” commands from SQL*Plus you’ve probably noticed that the parameter value may be too long for the defined output column, and even the parameter name may occasionally be too long. For example (from 12.2.0.1):

SQL> show spparameter optimizer

SID	 NAME			       TYPE	   VALUE
-------- ----------------------------- ----------- ----------------------------
*	 optimizer_adaptive_plans      boolean
*	 optimizer_adaptive_reporting_ boolean
	 only
*	 optimizer_adaptive_statistics boolean
*	 optimizer_capture_sql_plan_ba boolean
	 selines
*	 optimizer_dynamic_sampling    integer
*	 optimizer_features_enable     string
*	 optimizer_index_caching       integer
*	 optimizer_index_cost_adj      integer
*	 optimizer_inmemory_aware      boolean
*	 optimizer_mode                string
*	 optimizer_secure_view_merging boolean
*	 optimizer_use_invisible_index boolean
	 es
*	 optimizer_use_pending_statist boolean
	 ics
*	 optimizer_use_sql_plan_baseli boolean
	 nes

SQL> show parameter audit

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
audit_file_dest                      string      /u01/app/oracle/admin/orcl12c/
                                                 adump
audit_sys_operations                 boolean     TRUE
audit_syslog_level                   string
audit_trail                          string      NONE
unified_audit_sga_queue_size         integer     1048576

The column definitions for these displays are embedded in the SQL*Plus library code ($ORACLE_HOME/lib/libsqlplus.so), and the effects can be seen by spooling the output from the basic "column" command from SQL*Plus and searching for the word 'param' (ignoring case). The relevant column definitions are:


COLUMN   SID_COL_PLUS_SHOW_SPPARAM      ON      FORMAT    a8   HEADING SID
COLUMN   VALUE_COL_PLUS_SHOW_SPPARAM    ON      FORMAT   a28   HEADING VALUE
COLUMN   NAME_COL_PLUS_SHOW_SPPARAM     ON      FORMAT   a29   HEADING NAME

COLUMN   value_col_plus_show_param      ON      FORMAT   a30   HEADING VALUE
COLUMN   name_col_plus_show_param       ON      FORMAT   a36   HEADING NAME

If you want to change a few of the lengths (or even the column headings) you can simply add modified copies of these commands to the glogin script ($ORACLE_HOME/sqlplus/admin/glogin.sql) or to a login.sql script that’s referenced in your defined SQL path (and make sure you take a look at Franck Pachot’s blog to see how that has changed in 12.2).
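For example, a few lines like these in glogin.sql (or your login.sql) would widen the columns; the widths shown here are just a suggestion, so pick values that suit your terminal:

column name_col_plus_show_param      format a45
column value_col_plus_show_param     format a75
column name_col_plus_show_spparam    format a45
column value_col_plus_show_spparam   format a60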

 

martin.bach's picture

Hybrid Columnar Compression in 12.2 – nice new feature

Oracle 12.2 introduced an interesting optimisation for Hybrid Columnar Compression (HCC). Until 12.2 you had to use direct path inserts into HCC compressed segments for data to be actually compressed. If you didn’t use a direct path insert you would still succeed in entering data into the segment, however your newly inserted data was not HCC compressed. There is no error message or other warning telling you about that, which can lead to surprises for the unaware.

My friend and colleague Nick has pointed out that the official HCC white paper states – somewhat hidden – that this requirement is no longer as strict in 12.2. I haven’t managed to find the document Nick pointed out, but a quick search using my favourite retrieval engine unearthed the updated version for 18c.

The new feature is easy to miss: you read on page 3 that data must be loaded using data warehouse bulk loading (direct path) techniques. Examples of said technologies follow before the author continues with a discussion about querying HCC data. For me that was the end of the subject… Turns out it wasn’t: the nugget Nick unearthed was on page 4, in the last paragraph before the next section on “HCC Warehouse (Query) Compression”. Quoting literally from said white paper:

Starting with Oracle Database 12.2, HCC automatically compresses new data from SQL INSERT … SELECT statements, without the APPEND hint and array inserts from programmatic interfaces such as PL/SQL and the Oracle Call Interface (OCI).

Aha! Once you know what to look for you find the same information in the 12.2 new features guide, too. Sometimes it’s hard to see the wood for all those trees.

So there is no more need for the APPEND hint for insert-select in 12.2? That would be great news, and should alleviate some of the issues people run into by unknowingly using HCC without direct path inserts. And it's good for concurrency! Although I do like white papers, there are times when a white paper has to be checked against the lab to see if it is actually true.

I did exactly this for this blog post.

12.2 Test Case

I am using the SOE.ORDERS table for that purpose, as it holds a fair bit of data. To see whether the new algorithm works I decided to create a new empty table ORDERS_HCC with the same table structure as SOE.ORDERS. In the next step I issue an insert-select statement. If the white paper is correct it’ll compress the data using Query High.

SQL (12.2)> show user    
USER is "SOE"
SQL (12.2)> create table orders_hcc
  2  column store compress for query high
  3  as select * from orders where 1 = 0;

Table created.

SQL (12.2)> insert into orders_hcc
  2  select * from orders where rownum <= 1000000;

1000000 rows created.

SQL (12.2)> commit;

Commit complete.

SQL (12.2)>

Note that I specifically omitted the /*+ append */ hint in the insert statement. Also note that the preceding CTAS statement didn’t select any rows from the source. In Oracle releases up to and including 12.1, data in ORDERS_HCC would not be compressed at the end of this little code snippet.

But how can you prove the white paper is right? Thankfully Oracle is well instrumented, and even more thankfully that instrumentation is not limited to performance. The Oracle-provided dbms_compression package features a function called get_compression_type(), which allows you to pass it a ROWID and some other information, and it will tell you the block's compression algorithm.

Remember that you can change the compression algorithm for a given segment many times over. A partition in a range partitioned table can start uncompressed while being actively used, and as data gets colder, compression levels can be increased. Technically speaking the “alter table … column store” command on its own does not change the way data is stored in the segment. Data currently stored will remain in whatever state it was before changing the compression attribute. Only newly inserted data will be compressed according to the changed segment attribute.

Back to the example: using DBMS_COMPRESSION I would like to find out if my table data is indeed compressed for Query High after my earlier insert command. Let’s see if it is, using the first 10 rows as a rough indication.

SQL (12.2)> select dbms_compression.get_compression_type(user, 'ORDERS_HCC', rowid) ctype
  2  from ORDERS_HCC where rownum < 11;

     CTYPE
----------
	 4
	 4
	 4
	 4
	 4
	 4
	 4
	 4
	 4
	 4

10 rows selected.

Well it would appear as if these are all compressed for Query High (QH). Looking at the package definition I can see that a compression type of 4 indicates Query High.
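If you prefer a label instead of the raw number, something along these lines works (only an illustration; I'm mapping just the constants relevant here, with 1 being "no compression" in the same constant list, and printing anything else verbatim):

select decode(dbms_compression.get_compression_type(user, 'ORDERS_HCC', rowid),
              1, 'NONE',
              4, 'QUERY HIGH',
              'OTHER: ' || to_char(dbms_compression.get_compression_type(user, 'ORDERS_HCC', rowid))
       ) compression_label
from ORDERS_HCC where rownum < 11;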

So far so good, but I merely checked 10 out of 1 million rows. I'm fairly sure the rest of the table is also HCC compressed for QH, but I want proof. To remove any doubt, I can query the whole table. I'm doing this so you don't have to. The next query will take forever (certainly more than 1 minute) to execute, and it is CPU bound, so please don't do this at work. If you really feel you have to run a query like this, don't do it outside the lab. You have been warned :)

SQL (12.2)> with comptypes as (
  2   select rownum rn, dbms_compression.get_compression_type(user, 'ORDERS_HCC', rowid) ctype
  3     from ORDERS_HCC
  4  )
  5      select count(*), ctype from comptypes group by ctype;

  COUNT(*)      CTYPE
---------- ----------
   1000000          4

I always start these types of queries in a screen(1) session to prevent network connection issues from interrupting my long-running task. After some time, the query returns with the results shown above: the entire table is compressed with Query High.

Summary

Array-inserts into HCC segments can compress data in Oracle 12.2 even if you don’t specify the append hint. The behaviour for conventional inserts did not change. I am going to post another piece of research containing some more data later this week or next.

Franck Pachot's picture

Postgres, the fsync() issue, and ‘pgio’ (the SLOB method for PostgreSQL)

That's a long blog post title, which is actually just a good pretext to play with Kevin Closson's SLOB method for PostgreSQL: pgio
I use the beta version of pgio here. If you want to read more about it, you can start on https://kevinclosson.net/2018/05/22/sneak-preview-of-pgio-the-slob-method-for-postgressql-part-i-the-beta-pgio-readme-file/. If you are used to the SLOB for Oracle (https://kevinclosson.net/slob/) you will quickly understand the ‘why’ and ‘how’ of pgio.

PostgreSQL’s fsync() surprise

You may have read about the fsync() issue. Postgres has, from the beginning, relied a lot on filesystem buffering to optimize I/O: it write()s to the data files but fsync()s only at checkpoints. This is fine when everything goes well, because the writes since the last checkpoint are protected by the Write Ahead Log, where an fsync() occurs for each write at commit (if you didn't change the default WAL parameters). But when a problem occurs, such as a power outage, some writes may be lost, or partially lost, and that's not easy to detect at checkpoint time with fsync().

So, basically, there’s a risk of corruption and there are no easy ways to detect it.

You can read the details from https://lwn.net/Articles/752063/ and that’s not the subject of this post.

Most other databases open their files with O_DSYNC, which means that the write() call will detect any error immediately. And the major ones do direct I/O anyway, because they have their own buffer cache and do not need the performance overhead and corruption risk of double buffering.

Why is this so hard to fix?

So why is it so hard to do the same with Postgres? Simply because it was not initially designed to optimize I/O itself; Postgres relies heavily on the OS filesystem for that. The database systems which sync at each write, and which can do direct I/O, have implemented many optimizations to reduce the overhead of disk latency on each write: they have their own buffer cache, with a background database writer which re-orders the writes in the optimal way, and they have multiblock writes for large contiguous writes which bypass the buffer cache.

However, you may have a storage system where write latency is minimal, and you may have an application where the overhead here is not a big problem. This means that you should measure it in order to balance between performance and prevention of corruption. And this is where the SLOB method is awesome: reliable and predictable metrics to measure IOPS.

pgio to the rescue

This is my first trial of pgio, in the beta version. It could not be easier. I just un-tarred it:

tar -xvf pgio-0.9.tar
cd pgio

I’ve setup the pgio.conf with 4 schemas and 2 threads per schema:

UPDATE_PCT=10
RUN_TIME=300
NUM_SCHEMAS=4
NUM_THREADS=2
WORK_UNIT=255
UPDATE_WORK_UNIT=8
SCALE=200M
DBNAME=pgio
CONNECT_STRING="pgio"
CREATE_BASE_TABLE=TRUE

Because I want to test writes, I've set UPDATE_PCT so that 10% of calls will do an UPDATE. And I kept the default work unit to read 255 blocks and, for those 10% of updates, to update 8 blocks only. I'll run that with 2 threads per schema, which means 8 concurrent sessions, and they will run for 300 seconds.

In this test I didn't want to try different values; I just want to see what happens to IOPS for a common workload of lots of reads and small changes. The scale is 200M per schema here, so my workload sessions will find their buffers in memory.

On each test, I’ve created the pgio database:

create tablespace pgio location '/u01/pgdata';
CREATE TABLESPACE
create database pgio tablespace pgio;
CREATE DATABASE

Then run the setup.sh to load data in those schemas:

Job info: Loading 200M scale into 4 schemas as per pgio.conf->NUM_SCHEMAS.
Batching info: Loading 2 schemas per batch as per pgio.conf->NUM_THREADS.
Base table loading time: 15 seconds.
Waiting for batch. Global schema count: 2. Elapsed: 0 seconds.
Waiting for batch. Global schema count: 3. Elapsed: 103 seconds.
Waiting for batch. Global schema count: 4. Elapsed: 177 seconds.
Waiting for batch. Global schema count: 4. Elapsed: 249 seconds.
 
Group data loading phase complete. Elapsed: 249 seconds.
List of relations
Schema | Name | Type | Owner | Size | Description
--------+-----------+-------+----------+--------+-------------
public | pgio1 | table | postgres | 200 MB |
public | pgio2 | table | postgres | 200 MB |
public | pgio3 | table | postgres | 200 MB |
public | pgio4 | table | postgres | 200 MB |
public | pgio_base | table | postgres | 29 MB |
(5 rows)

And then I’m ready to run the runit.sh

ext4 mount option

My tablespace is on an ext4 filesystem:

-bash-4.2$ df -HT /u01/pgdata
Filesystem Type Size Used Avail Use% Mounted on
/dev/sdb ext4 32G 1.2G 29G 4% /u01

I'll run the same workload several times, changing only one mount option:

async All I/O to the filesystem should be done asynchronously. (See also the sync option.)
sync All I/O to the filesystem should be done synchronously. In case of media with limited number of write cycles (e.g. some flash drives) "sync" may cause life-cycle shortening.

Which means that some runs will run with /u01 mounted as:

/dev/sdb on /u01 type ext4 (rw,nosuid,nodev,relatime,sync,seclabel,data=ordered)

and some others will run with the default (async):

/dev/sdb on /u01 type ext4 (rw,nosuid,nodev,relatime,seclabel,data=ordered)
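Switching between the two configurations between runs is just a remount (shown here for the /dev/sdb-on-/u01 layout above; do it while PostgreSQL is not writing to the tablespace):

mount -o remount,sync /u01     # synchronous writes for the 'sync' runs
mount -o remount,async /u01    # back to the default for the 'async' runs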

I did multiple runs and checked that the result is consistent among them. I’ll show only one result for each configuration.

Run it with async

Here is the output of one ‘runit.sh’ when /u01 was in async:

Date: Thu May 24 10:56:57 CEST 2018
Database connect string: "pgio".
Shared buffers: 128MB.
Testing 4 schemas with 2 thread(s) accessing 200M (25600 blocks) of each schema.
Running iostat, vmstat and mpstat on current host--in background.
Launching sessions. 4 schema(s) will be accessed by 2 thread(s) each.
pg_stat_database stats:
datname| blks_hit| blks_read|tup_returned|tup_fetched|tup_updated
BEFORE: pgio | 252209 | 118112 | 110420 | 6788 | 18
AFTER: pgio | 25189171 | 136972696 | 159128092 | 147250205 | 573216
DBNAME: pgio. 4 schemas, 2 threads(each). Run time: 300 seconds. RIOPS >456181< CACHE_HITS/s >83123<

This shows that, within those 5 minutes, I’ve fetched 147243417 tuples and updated 573198 ones.

pgio takes snapshots of iostat, vmstat and mpstat. Here is a sample after 1 minute of the run, showing that all CPUs are busy in user or kernel mode, but not waiting on I/O latency:

10:57:51 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
10:57:54 AM all 52.82 0.00 42.22 0.09 0.00 1.11 0.00 0.00 0.00 3.76
10:57:54 AM 0 54.11 0.00 40.75 0.00 0.00 1.37 0.00 0.00 0.00 3.77
10:57:54 AM 1 54.42 0.00 40.14 0.34 0.00 1.02 0.00 0.00 0.00 4.08
10:57:54 AM 2 51.19 0.00 43.34 0.34 0.00 0.68 0.00 0.00 0.00 4.44
10:57:54 AM 3 51.02 0.00 44.22 0.34 0.00 1.36 0.00 0.00 0.00 3.06
10:57:54 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
10:57:57 AM all 53.33 0.00 42.15 0.00 0.00 1.02 0.00 0.00 0.00 3.50
10:57:57 AM 0 53.95 0.00 42.27 0.00 0.00 0.69 0.00 0.00 0.00 3.09
10:57:57 AM 1 52.56 0.00 42.66 0.00 0.00 0.68 0.00 0.00 0.00 4.10
10:57:57 AM 2 54.27 0.00 40.27 0.00 0.00 1.37 0.00 0.00 0.00 4.10
10:57:57 AM 3 52.72 0.00 43.54 0.00 0.00 1.36 0.00 0.00 0.00 2.38
10:57:57 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
10:58:00 AM all 54.10 0.00 41.54 0.09 0.00 0.77 0.00 0.00 0.00 3.50
10:58:00 AM 0 55.14 0.00 39.38 0.34 0.00 1.03 0.00 0.00 0.00 4.11
10:58:00 AM 1 54.95 0.00 40.96 0.00 0.00 0.68 0.00 0.00 0.00 3.41
10:58:00 AM 2 54.11 0.00 41.10 0.00 0.00 0.68 0.00 0.00 0.00 4.11
10:58:00 AM 3 52.05 0.00 44.86 0.00 0.00 0.68 0.00 0.00 0.00 2.40
10:58:00 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle

Run it with sync

Here is the output of one ‘runit.sh’ when /u01 was in sync:

Date: Thu May 24 12:18:54 CEST 2018
Database connect string: "pgio".
Shared buffers: 128MB.
Testing 4 schemas with 2 thread(s) accessing 200M (25600 blocks) of each schema.
Running iostat, vmstat and mpstat on current host--in background.
Launching sessions. 4 schema(s) will be accessed by 2 thread(s) each.
pg_stat_database stats:
datname| blks_hit| blks_read|tup_returned|tup_fetched|tup_updated
BEFORE: pgio | 255169 | 119163 | 112734 | 6945 | 18
AFTER: pgio | 15040938 | 74598977 | 87775490 | 86742056 | 337889
DBNAME: pgio. 4 schemas, 2 threads(each). Run time: 300 seconds. RIOPS >248266< CACHE_HITS/s >49285<

This shows that, within those 5 minutes, I've fetched 86735111 tuples and updated 337871 ones. So, basically, the IOPS have been divided by two in this example when waiting for each write to be synced to disk.

pgio takes snapshots of iostat, vmstat and mpstat. Here is a sample after 1 minute of the run, showing that the CPUs are now about 30% idle, waiting on I/O completion:

12:19:51 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
12:19:54 PM all 29.95 0.00 27.79 10.88 0.00 1.26 0.00 0.00 0.00 30.13
12:19:54 PM 0 30.63 0.00 27.46 11.27 0.00 0.70 0.00 0.00 0.00 29.93
12:19:54 PM 1 30.07 0.00 27.62 12.24 0.00 0.70 0.00 0.00 0.00 29.37
12:19:54 PM 2 30.28 0.00 27.82 10.92 0.00 0.35 0.00 0.00 0.00 30.63
12:19:54 PM 3 28.02 0.00 28.02 8.56 0.39 3.89 0.00 0.00 0.00 31.13
12:19:54 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
12:19:57 PM all 30.10 0.00 27.92 11.24 0.00 1.00 0.00 0.00 0.00 29.74
12:19:57 PM 0 29.29 0.00 28.57 10.71 0.00 0.36 0.00 0.00 0.00 31.07
12:19:57 PM 1 30.88 0.00 28.07 11.93 0.00 0.35 0.00 0.00 0.00 28.77
12:19:57 PM 2 30.31 0.00 27.18 12.54 0.00 0.70 0.00 0.00 0.00 29.27
12:19:57 PM 3 30.43 0.00 27.67 9.88 0.00 2.77 0.00 0.00 0.00 29.25
12:19:57 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
12:20:00 PM all 29.51 0.00 27.00 10.76 0.00 1.08 0.00 0.00 0.00 31.66
12:20:00 PM 0 29.58 0.00 28.17 10.56 0.00 0.35 0.00 0.00 0.00 31.34
12:20:00 PM 1 29.72 0.00 26.22 12.24 0.00 0.70 0.00 0.00 0.00 31.12
12:20:00 PM 2 29.12 0.00 26.32 10.88 0.00 0.35 0.00 0.00 0.00 33.33
12:20:00 PM 3 29.34 0.00 27.80 8.88 0.00 3.09 0.00 0.00 0.00 30.89
12:20:00 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle

So what?

Currently, Postgres relies on the filesystem to optimize I/O, but there's a risk of corruption in case of failure. We can force waits for I/O completion with the 'sync' mount option of the filesystem, or even with file attributes (chattr -R +S) on ext4 or xfs, but there's a performance penalty. The important thing is to measure this penalty, and this is where pgio is great: it measures the penalty with a workload that is customizable (amount of changes, amount of data, …) but also predictable (it does not depend on other factors the way an application benchmark does). When you know how running in 'sync' impacts your system, you can choose. And we can bet that future versions of Postgres will improve and offer ways to stay efficient without compromising the data at the first power outage.

 

The article Postgres, the fsync() issue, and 'pgio' (the SLOB method for PostgreSQL) appeared first on Blog dbi services.

Franck Pachot's picture

ADWC – System and session settings (DWCS lockdown profile)

The Autonomous Data Warehouse Cloud service is a PaaS managed service where we have a PDB and an ADMIN user which has most of the system privileges. For example, we have the privilege to change initialization parameters:
SQL> select * from dba_sys_privs where grantee=user and privilege like 'ALTER S%';
 
GRANTEE PRIVILEGE ADMIN_OPTION COMMON INHERITED
------- --------- ------------ ------ ---------
ADMIN ALTER SESSION YES NO NO
ADMIN ALTER SYSTEM YES NO NO

Still, not everything is allowed, for two reasons: to ensure that we cannot break the Oracle-managed CDB, and to force us to use only the features allowed in the 'autonomous' service. This is enforced with a lockdown profile:
SQL> show parameter pdb_lockdown
 
NAME TYPE VALUE
------------ ------ -----
pdb_lockdown string DWCS

DWCS means Data Warehouse Cloud Service, which was the name of the Autonomous Data Warehouse Cloud service until Larry Ellison announced this self-driving, no-human trend under the marketing umbrella of 'autonomous'.

The limitations are all documented, but I like to see them for myself, and in 18c we have a means to see the lockdown rules from the PDB itself, through V$LOCKDOWN_RULES.

ALTER SYSTEM

Basically, in this ADWC all ALTER SYSTEM statements are disallowed, and then a few exceptions are added for what we are allowed to do:

SQL> select * from v$lockdown_rules where rule in ('ALTER SYSTEM') and clause_option is null;
RULE_TYPE RULE CLAUSE CLAUSE_OPTION STATUS USERS CON_ID
--------- ---- ------ ------------- ------ ----- ------
STATEMENT ALTER SYSTEM DISABLE ALL 73
STATEMENT ALTER SYSTEM SET ENABLE COMMON 73
STATEMENT ALTER SYSTEM KILL SESSION ENABLE ALL 73

You can ignore what is enabled for COMMON users because we have no common user with which to connect to our PDB. We will see below which ALTER SYSTEM SET clauses are allowed; in addition to those, only 'KILL SESSION' is allowed for ALTER SYSTEM.

Here is the detail about the parameters we can set:

SQL> select * from v$lockdown_rules left outer join (select upper(name) clause_option,display_value,description from v$parameter) using (clause_option) where rule in ('ALTER SYSTEM') and clause_option is not null and status='ENABLE';
 
CLAUSE_OPTION RULE_TYPE RULE CLAUSE STATUS USERS CON_ID DISPLAY_VALUE DESCRIPTION
------------- --------- ---- ------ ------ ----- ------ ------------- -----------
APPROX_FOR_AGGREGATION STATEMENT ALTER SYSTEM SET ENABLE ALL 73 FALSE Replace exact aggregation with approximate aggregation
APPROX_FOR_COUNT_DISTINCT STATEMENT ALTER SYSTEM SET ENABLE ALL 73 FALSE Replace count distinct with approx_count_distinct
APPROX_FOR_PERCENTILE STATEMENT ALTER SYSTEM SET ENABLE ALL 73 none Replace percentile_* with approx_percentile
AWR_PDB_AUTOFLUSH_ENABLED STATEMENT ALTER SYSTEM SET ENABLE ALL 73 TRUE Enable/Disable AWR automatic PDB flushing
NLS_LANGUAGE STATEMENT ALTER SYSTEM SET ENABLE ALL 73 AMERICAN NLS language name
NLS_SORT STATEMENT ALTER SYSTEM SET ENABLE ALL 73 NLS linguistic definition name
NLS_TERRITORY STATEMENT ALTER SYSTEM SET ENABLE ALL 73 AMERICA NLS territory name
NLS_CALENDAR STATEMENT ALTER SYSTEM SET ENABLE ALL 73 NLS calendar system name
NLS_COMP STATEMENT ALTER SYSTEM SET ENABLE ALL 73 BINARY NLS comparison
NLS_CURRENCY STATEMENT ALTER SYSTEM SET ENABLE ALL 73 NLS local currency symbol
NLS_DATE_FORMAT STATEMENT ALTER SYSTEM SET ENABLE ALL 73 DD-MON-YYYY HH24:MI:ss NLS Oracle date format
NLS_DATE_LANGUAGE STATEMENT ALTER SYSTEM SET ENABLE ALL 73 NLS date language name
NLS_DUAL_CURRENCY STATEMENT ALTER SYSTEM SET ENABLE ALL 73 Dual currency symbol
NLS_ISO_CURRENCY STATEMENT ALTER SYSTEM SET ENABLE ALL 73 NLS ISO currency territory name
NLS_LENGTH_SEMANTICS STATEMENT ALTER SYSTEM SET ENABLE ALL 73 BYTE create columns using byte or char semantics by default
NLS_NCHAR_CONV_EXCP STATEMENT ALTER SYSTEM SET ENABLE ALL 73 FALSE NLS raise an exception instead of allowing implicit conversion
NLS_NUMERIC_CHARACTERS STATEMENT ALTER SYSTEM SET ENABLE ALL 73 NLS numeric characters
NLS_TIMESTAMP_FORMAT STATEMENT ALTER SYSTEM SET ENABLE ALL 73 time stamp format
NLS_TIMESTAMP_TZ_FORMAT STATEMENT ALTER SYSTEM SET ENABLE ALL 73 timestamp with timezone format
NLS_TIME_FORMAT STATEMENT ALTER SYSTEM SET ENABLE ALL 73 time format
NLS_TIME_TZ_FORMAT STATEMENT ALTER SYSTEM SET ENABLE ALL 73 time with timezone format
OPTIMIZER_IGNORE_HINTS STATEMENT ALTER SYSTEM SET ENABLE ALL 73 TRUE enables the embedded hints to be ignored
OPTIMIZER_IGNORE_PARALLEL_HINTS STATEMENT ALTER SYSTEM SET ENABLE ALL 73 TRUE enables embedded parallel hints to be ignored
PLSCOPE_SETTINGS STATEMENT ALTER SYSTEM SET ENABLE ALL 73 identifiers:all plscope_settings controls the compile time collection, cross reference, and storage of PL/SQL source code identifier and SQL statement data
PLSQL_CCFLAGS STATEMENT ALTER SYSTEM SET ENABLE ALL 73 PL/SQL ccflags
PLSQL_DEBUG STATEMENT ALTER SYSTEM SET ENABLE ALL 73 FALSE PL/SQL debug
PLSQL_OPTIMIZE_LEVEL STATEMENT ALTER SYSTEM SET ENABLE ALL 73 1 PL/SQL optimize level
PLSQL_WARNINGS STATEMENT ALTER SYSTEM SET ENABLE ALL 73 DISABLE:ALL PL/SQL compiler warnings settings

The APPROX_ ones, disabled by default, can be used to transparently replace exact results with approximations for faster queries.
The NLS_ ones can be used to set NLS defaults for our sessions.
The OPTIMIZER_IGNORE_ ones are new in 18c and are set to TRUE here by default, so embedded hints are ignored. However, we can set them to FALSE.
The PLSQL_ ones are the session defaults, and I don't understand why warnings are not enabled by default. Fortunately, we are able to change that at PDB level, as in the example below.
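For instance (a sketch of what I would run as the ADMIN user; both parameters are in the enabled list above):

alter system set plsql_warnings='ENABLE:ALL';
alter system set nls_date_format='YYYY-MM-DD HH24:MI:SS';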

There are also rules which disable some ALTER SYSTEM SET clauses. They apply to common users only (who do have ALTER SYSTEM SET enabled), but they are interesting because they show what Oracle chose to fix in the ADWC service, settings which cannot be changed in the PDB even by their common users:

SQL> select * from v$lockdown_rules left outer join (select upper(name) clause_option,display_value,description from v$parameter) using (clause_option) where rule in ('ALTER SYSTEM') and clause_option is not null and status='DISABLE';
CLAUSE_OPTION RULE_TYPE RULE CLAUSE STATUS USERS CON_ID DISPLAY_VALUE DESCRIPTION
------------- --------- ---- ------ ------ ----- ------ ------------- -----------
DB_FILES STATEMENT ALTER SYSTEM SET DISABLE ALL 73 25 max allowable # db files
"_PDB_INHERIT_CFD" STATEMENT ALTER SYSTEM SET DISABLE ALL 73
"_PDB_MAX_AUDIT_SIZE" STATEMENT ALTER SYSTEM SET DISABLE ALL 73
"_PDB_MAX_DIAG_SIZE" STATEMENT ALTER SYSTEM SET DISABLE ALL 73
MAX_IDLE_TIME STATEMENT ALTER SYSTEM SET DISABLE ALL 73 60 maximum session idle time in minutes
PARALLEL_DEGREE_POLICY STATEMENT ALTER SYSTEM SET DISABLE ALL 73 AUTO policy used to compute the degree of parallelism (MANUAL/LIMITED/AUTO/ADAPTIVE)
_PARALLEL_CLUSTER_CACHE_POLICY STATEMENT ALTER SYSTEM SET DISABLE ALL 73 ADAPTIVE policy used for parallel execution on cluster(ADAPTIVE/CACHED)
_ENABLE_PARALLEL_DML STATEMENT ALTER SYSTEM SET DISABLE ALL 73 TRUE enables or disables parallel dml
RESULT_CACHE_MODE STATEMENT ALTER SYSTEM SET DISABLE ALL 73 FORCE result cache operator usage mode
RESULT_CACHE_MAX_RESULT STATEMENT ALTER SYSTEM SET DISABLE ALL 73 1 maximum result size as percent of cache size
RESOURCE_MANAGER_PLAN STATEMENT ALTER SYSTEM SET DISABLE ALL 73 FORCE:DWCS_PLAN resource mgr top plan
_CELL_OFFLOAD_VECTOR_GROUPBY STATEMENT ALTER SYSTEM SET DISABLE ALL 73 FALSE enable SQL processing offload of vector group by
PARALLEL_MIN_DEGREE STATEMENT ALTER SYSTEM SET DISABLE ALL 73 CPU controls the minimum DOP computed by Auto DOP
_MAX_IO_SIZE STATEMENT ALTER SYSTEM SET DISABLE ALL 73 33554432 Maximum I/O size in bytes for sequential file accesses
_LDR_IO_SIZE STATEMENT ALTER SYSTEM SET DISABLE ALL 73 33554432 size of write IOs used during a load operation
_LDR_IO_SIZE2 STATEMENT ALTER SYSTEM SET DISABLE ALL 73 33554432 size of write IOs used during a load operation of EHCC with HWMB
_OPTIMIZER_GATHER_STATS_ON_LOAD_ALL STATEMENT ALTER SYSTEM SET DISABLE ALL 73 TRUE enable/disable online statistics gathering for nonempty segments
_OPTIMIZER_GATHER_STATS_ON_LOAD_HIST STATEMENT ALTER SYSTEM SET DISABLE ALL 73 TRUE enable/disable online histogram gathering for loads
_DATAPUMP_GATHER_STATS_ON_LOAD STATEMENT ALTER SYSTEM SET DISABLE ALL 73 TRUE Gather table statistics during Data Pump load rather thanimporting statistics from the dump file. This should be set to TRUE in the lockdown profile in a DWCS environment.
_OPTIMIZER_ANSWERING_QUERY_USING_STATS STATEMENT ALTER SYSTEM SET DISABLE ALL 73 TRUE enable statistics-based query transformation
_PX_XTGRANULE_SIZE STATEMENT ALTER SYSTEM SET DISABLE ALL 73 128000 default size of a external table granule (in KB)
_OPTIMIZER_ALLOW_ALL_ACCESS_PATHS STATEMENT ALTER SYSTEM SET DISABLE ALL 73 FALSE allow all access paths
_DATAPUMP_INHERIT_SVCNAME STATEMENT ALTER SYSTEM SET DISABLE ALL 73 TRUE Inherit and propagate service name throughout job
_DEFAULT_PCT_FREE STATEMENT ALTER SYSTEM SET DISABLE ALL 73 1 Default value of PCT_FREE enforced by DWCS lockdown

So, among the interesting ones: the Result Cache is forced for all results (RESULT_CACHE_MODE=FORCE), Parallel DML is enabled for all sessions (but, as shown below, we can disable it at session level), PCTFREE will always be 1 (_DEFAULT_PCT_FREE=1), and statistics are gathered during load (an 18c feature). And we cannot change any of that.
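For example, to run one statement serially in a session and then go back to the default (plain ALTER SESSION syntax, which the rules further down allow):

alter session disable parallel dml;
-- run the DML that should not be parallelized
alter session enable parallel dml;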

There are only a few additional SET clauses which are allowed with ALTER SESSION but not with ALTER SYSTEM:

SQL> select * from v$lockdown_rules where rule in ('ALTER SESSION') and clause is not null and clause_option is not null
and (clause_option,status,users) not in (select clause_option,status,users from v$lockdown_rules where rule in ('ALTER SYSTEM') and clause is not null and clause_option is not null)
;
RULE_TYPE RULE CLAUSE CLAUSE_OPTION STATUS USERS CON_ID
--------- ---- ------ ------------- ------ ----- ------
STATEMENT ALTER SESSION SET CONTAINER ENABLE ALL 73
STATEMENT ALTER SESSION SET CURRENT_SCHEMA ENABLE ALL 73
STATEMENT ALTER SESSION SET EDITION ENABLE ALL 73
STATEMENT ALTER SESSION SET OPTIMIZER_CAPTURE_SQL_PLAN_BASELINES ENABLE ALL 73
STATEMENT ALTER SESSION SET DEFAULT_COLLATION ENABLE ALL 73
STATEMENT ALTER SESSION SET ROW ARCHIVAL VISIBILITY = ACTIVE ENABLE ALL 73
STATEMENT ALTER SESSION SET ROW ARCHIVAL VISIBILITY = ALL ENABLE ALL 73
STATEMENT ALTER SESSION SET TIME_ZONE ENABLE ALL 73

Besides setting parameters, here is what we can do with ALTER SESSION:

SQL> select * from v$lockdown_rules where rule='ALTER SESSION' and clause_option is null;
 
RULE_TYPE RULE CLAUSE CLAUSE_OPTION STATUS USERS CON_ID
--------- ---- ------ ------------- ------ ----- ------
STATEMENT ALTER SESSION DISABLE ALL 73
STATEMENT ALTER SESSION SET ENABLE COMMON 73
STATEMENT ALTER SESSION ADVISE COMMIT ENABLE ALL 73
STATEMENT ALTER SESSION CLOSE DATABASE LINK ENABLE ALL 73
STATEMENT ALTER SESSION DISABLE COMMIT IN PROCEDURE ENABLE ALL 73
STATEMENT ALTER SESSION DISABLE PARALLEL DDL ENABLE ALL 73
STATEMENT ALTER SESSION DISABLE PARALLEL DML ENABLE ALL 73
STATEMENT ALTER SESSION DISABLE PARALLEL QUERY ENABLE ALL 73
STATEMENT ALTER SESSION DISABLE RESUMABLE ENABLE ALL 73
STATEMENT ALTER SESSION ENABLE COMMIT IN PROCEDURE ENABLE ALL 73
STATEMENT ALTER SESSION ENABLE PARALLEL DDL ENABLE ALL 73
STATEMENT ALTER SESSION ENABLE PARALLEL DML ENABLE ALL 73
STATEMENT ALTER SESSION ENABLE PARALLEL QUERY ENABLE ALL 73
STATEMENT ALTER SESSION ENABLE RESUMABLE ENABLE ALL 73
STATEMENT ALTER SESSION FORCE PARALLEL DDL ENABLE ALL 73
STATEMENT ALTER SESSION FORCE PARALLEL DML ENABLE ALL 73
STATEMENT ALTER SESSION FORCE PARALLEL QUERY ENABLE ALL 73

I'll show the other rules (beyond the ALTER SYSTEM and ALTER SESSION statements) in a future post. Lockdown profiles are a great feature: they have very fine granularity and make it easy to document what is allowed and what is not. Oracle introduced them for its own usage in the public cloud, but you can use the same approach on-premises for your private cloud. This requires the multitenant architecture, but the multitenant option is not mandatory.
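As a rough sketch of that on-premises idea (all names here are made up; this is run as a common user in the CDB and then the PDB is pointed at the profile):

create lockdown profile my_lockdown;
alter lockdown profile my_lockdown disable statement=('ALTER SYSTEM');
alter lockdown profile my_lockdown enable statement=('ALTER SYSTEM') clause=('KILL SESSION');
alter session set container=mypdb;
alter system set pdb_lockdown='MY_LOCKDOWN';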

 

The article ADWC – System and session settings (DWCS lockdown profile) appeared first on Blog dbi services.

Jonathan Lewis's picture

Missing Audit

Here’s a detail I discovered a long time ago – and rediscovered very recently: it’s possible to delete data from a table which is subject to audit without the delete being audited. I think the circumstances where it would matter are a little peculiar, and I’ve rarely seen systems that use the basic Oracle audit feature anyway, but this may solve a puzzle for someone, somewhere, someday.

The anomaly appears if you create a referential integrity constraint as “on delete cascade”. A delete from the parent table will automatically (pre-)delete matching rows from the child table but the delete on the child table will not be audited. Here’s a demonstration script – note that you will need to have set the parameter audit_trail to ‘DB’ to prove the point.


rem
rem     Script:         del_cascade_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Mar 2004
rem
rem     Last tested
rem             12.1.0.2
rem             11.2.0.4
rem

drop table t2 purge;
drop table t1 purge;

create table t1 (
        id              number(6),
        v1              varchar2(10),
        padding         varchar2(100),
        constraint t1_pk 
                primary key (id)
);


create table t2 (
        id_par          number(6),
        id_ch           number(6),
        v1              varchar2(10),
        padding         varchar2(100),
        constraint t2_pk 
                primary key (id_par,id_ch),
        constraint t2_fk_t1 
                foreign key (id_par) references t1 
                on delete cascade
);


insert into t1
select
        rownum,
        rownum,
        rpad('x',100)
from
        all_objects
where
        rownum <= 100 -- > comment to avoid wordpress format issue
;


insert into t2
select
        1+trunc((rownum-1)/5),
        rownum,
        rownum,
        rpad('x',100)
from
        all_objects
where
        rownum <= 500 -- > comment to avoid wordpress format issue
;

commit;

prompt  =================================
prompt  Parent/Child rowcounts for id = 1
prompt  =================================

select count(*) from t1 where id = 1;
select count(*) from t2 where id_par = 1;

column now new_value m_now
select to_char(sysdate,'dd-mon-yyyy hh24:mi:ss') now from dual;

audit delete on t2 by access; 
audit delete on t1 by access; 

prompt  =======================================================
prompt  If you allow the cascade (keep the t2 delete commented)
prompt  then the cascade deletion is not audited.
prompt  =======================================================

-- delete from t2 where id_par = 1;
delete from t1 where id = 1;

noaudit delete on t1; 
noaudit delete on t2; 

column obj_name format a32

select  action_name, obj_name 
from    user_audit_trail
where   timestamp >= to_date('&m_now','dd-mon-yyyy hh24:mi:ss')
;

The script has an optional delete from the child table (t2) just before the delete from the parent table (t1). When you run the script you should see that before the delete t1 reports one row while t2 reports 5 rows. After the delete(s) both tables will report zero rows.

If you leave the t2 delete commented out then the delete from t2 will have been the recursive delete due to the cascade and the number of rows returned from user_audit_trail will be one (the delete from t1). If you allow the explicit delete from t2 to take place then user_audit_trail will report two rows, one each for t1 and t2.

Sample output (with a little cosmetic editing) – when the delete from t2 is commented out:

=================================
Parent/Child rowcounts for id = 1
=================================

  COUNT(*)
----------
         1


  COUNT(*)
----------
         5

Audit succeeded.
Audit succeeded.

=======================================================
If you allow the cascade (keep the t2 delete commented)
then the cascade deletion is not audited.
=======================================================

1 row deleted.


Noaudit succeeded.
Noaudit succeeded.


  COUNT(*)
----------
         0


  COUNT(*)
----------
         0


ACTION_NAME                  OBJ_NAME
---------------------------- --------------------------------
DELETE                       T1

1 row selected.

As you can see, I've deleted just one row from one table (the t1 delete) and the final query against t2 shows that the child rows have also been deleted, yet the only audit record reported is the one for the parent – despite the fact that if you enable sql_trace before the delete from t1 you will find the actual recursive statement ' delete from "TEST_USER"."T2" where "ID_PAR" = :1' in the trace file.

The “recursive” nature of the delete in the trace file might be a bit of a clue – it is reported as being operated by SYS (lid = 0), not by the real end-user, and the parsing_user_id and parsing_schema_id in v$sql are both SYS (i.e. 0). Checking dba_audit_trail and the audit_file_dest for audited SYS operations, though, there was still no sign of any audit record for the child delete.
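For reference, this is roughly the query I'd use to check the parsing identity of that recursive statement in the shared pool (standard v$sql columns):

select sql_id, parsing_user_id, parsing_schema_id, executions, sql_text
from   v$sql
where  sql_text like '%delete from "TEST_USER"."T2"%'
;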

 

kevinclosson's picture

Sneak Preview of pgio (The SLOB Method for PostgreSQL) Part IV: How To Reduce The Amount of Memory In The Linux Page Cache For Testing Purposes.

I hope these sneak peeks are of interest…

PostgreSQL and Buffered I/O

PostgreSQL uses buffered I/O. If you want to test your storage subsystem capabilities with database physical I/O you have to get the OS page cache “out of the way”–unless you want to load really large test data sets.

Although pgio (the SLOB Method for PostgreSQL) is still in Beta, I’d like to show this example of the tool I provide for users to make a really large RAM system into an effectively smaller RAM system.

Linux Huge Pages

Memory allocated to huge pages is completely cordoned off unless a process allocates some IPC shared memory (shmget(2)). The pgio kit comes with a simple tool called pgio_reduce_free_memory.sh which leverages this quality of huge pages in order to draw down available memory, so that one can test physical I/O with a database size that is quite a bit smaller than the amount of physical memory in the database host.

The following picture shows an example of using pgio_reduce_free_memory.sh to set aside 443 gigabytes of available memory so as to leave only 32 gigabytes for the OS page cache. As such, one can test a pgio database of, say, 64 gigabytes and generate a tremendous amount of physical I/O.
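The underlying mechanism (shown here only as an illustration of the idea, not the script itself) is simply a huge pages reservation, which removes that memory from what the page cache can use:

sysctl -w vm.nr_hugepages=226816    # 226816 x 2MB pages is roughly 443GB; size this for your host
grep HugePages /proc/meminfo        # verify the reservation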

 

[Screenshot: pgio_reduce_free_memory.sh output]

 

I should think this little tool could be helpful for a lot of testing purposes beyond pgio.

Uwe Hesse's picture

Users, schemas & privileges in #Exasol


In Exasol, a database user may own multiple schemas – or even none at all. I connect to my Community Edition to show that:

C:\Users\uh>cd \Program Files (x86)\EXASOL\EXASolution-6.0\EXAplus

C:\Program Files (x86)\EXASOL\EXASolution-6.0\EXAplus>exaplusx64 -c 192.168.56.101:8563 -u sys -p exasol -lang EN

EXAplus 6.0.8 (c) EXASOL AG

Wednesday, May 23, 2018 3:28:29 PM CEST
Connected to database EXAone as user sys.
EXASolution 6.0.8 (c) EXASOL AG

SQL_EXA> create user adam identified by adam;
EXA: create user adam identified by adam;

Rows affected: 0

SQL_EXA> grant dba to adam;
EXA: grant dba to adam;

Rows affected: 0

SQL_EXA> select user_name from exa_dba_users;
EXA: select user_name from exa_dba_users;

USER_NAME
------------------------------------------------------------
SYS
ADAM

2 rows in resultset.

SQL_EXA> select schema_owner,schema_name from exa_schemas;
EXA: select schema_owner,schema_name from exa_schemas;

SCHEMA_OWNER
-------------------------------------------------------------
SCHEMA_NAME
-------------------------------------------------------------
SYS
RETAIL

1 row in resultset.

SQL_EXA> connect adam/ADAM;

Wednesday, May 23, 2018 3:34:42 PM CEST
Connected to database EXAone as user adam.
EXASolution 6.0.8 (c) EXASOL AG

SQL_EXA> create table t1 (n number);
EXA: create table t1 (n number);
Error: [42000] no schema specified or opened or current schema has been dropped [line 1, column 27] (Session: 1601269589413551548)
SQL_EXA> open schema adam;
EXA: open schema adam;
Error: [42000] schema ADAM not found [line 1, column 13] (Session: 1601269589413551548)

Demo user adam has the DBA role granted but there is no adam schema yet. I need to create it first:

SQL_EXA> create schema adam;
EXA: create schema adam;

Rows affected: 0

SQL_EXA> open schema adam;
EXA: open schema adam;

Rows affected: 0

SQL_EXA> create table t1 (n number);
EXA: create table t1 (n number);

Rows affected: 0

SQL_EXA> create schema adam2;
EXA: create schema adam2;

Rows affected: 0

SQL_EXA> create table adam2.t2 (n number);
EXA: create table adam2.t2 (n number);

Rows affected: 0

SQL_EXA> select table_schema,table_name from exa_user_tables;
EXA: select table_schema,table_name from exa_user_tables;

TABLE_SCHEMA
--------------------------------------------------------
TABLE_NAME
--------------------------------------------------------
ADAM
T1
ADAM2
T2

2 rows in resultset.

As you can see, user adam now owns two schemas with different tables in them. Now, briefly, on to privileges:

SQL_EXA> create user fred identified by fred;
EXA: create user fred identified by fred;

Rows affected: 0

SQL_EXA> grant create session to fred;
EXA: grant create session to fred;

Rows affected: 0

SQL_EXA> grant select on adam.t1 to fred;
EXA: grant select on adam.t1 to fred;

Rows affected: 0

SQL_EXA> connect fred/FRED;

Wednesday, May 23, 2018 3:53:34 PM CEST
Connected to database EXAone as user fred.
EXASolution 6.0.8 (c) EXASOL AG

SQL_EXA> select * from adam.t1;
EXA: select * from adam.t1;

N
-----------------

0 rows in resultset.

SQL_EXA> select * from adam2.t2;
EXA: select * from adam2.t2;
Error: [42500] insufficient privileges: SELECT on table T2 (Session: 1601270776421928841)
SQL_EXA> connect adam/ADAM;

Wednesday, May 23, 2018 3:54:33 PM CEST
Connected to database EXAone as user adam.
EXASolution 6.0.8 (c) EXASOL AG

SQL_EXA> create role allonadam2;
EXA: create role allonadam2;

Rows affected: 0

SQL_EXA> grant all on adam2 to allonadam2;
EXA: grant all on adam2 to allonadam2;

Rows affected: 0

SQL_EXA> grant allonadam2 to fred;
EXA: grant allonadam2 to fred;

Rows affected: 0

SQL_EXA> connect fred/FRED;

Wednesday, May 23, 2018 3:55:54 PM CEST
Connected to database EXAone as user fred.
EXASolution 6.0.8 (c) EXASOL AG

SQL_EXA> select * from adam2.t2;
EXA: select * from adam2.t2;

N
-----------------

0 rows in resultset.

SQL_EXA> drop table adam2.t2;
EXA: drop table adam2.t2;
Error: [42500] insufficient privileges for dropping table (Session: 1601270923042332982)

That's because ALL contains ALTER, DELETE, EXECUTE, INSERT, SELECT and UPDATE, but not DROP, which can be confirmed using EXA_DBA_OBJ_PRIVS.
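A query along these lines would show it (assuming the usual GRANTEE, OBJECT_NAME and PRIVILEGE columns of that system view):

select grantee, object_name, privilege
from exa_dba_obj_privs
where grantee = 'ALLONADAM2';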

kevinclosson's picture

Sneak Preview of pgio (The SLOB Method for PostgreSQL) Part III: Link To The Full README file for Beta pgio v0.9.

If you are interested in a head start on pgio, the following is a link to the full README file which has some loading and testing how-to:

The pgio text README file version 0.9 Beta

To prevent automated spam submissions leave this field empty.