Home » SQL & PL/SQL » SQL & PL/SQL » VSIZE() function's meaning and physical storage size (18)
VSIZE() function's meaning and physical storage size [message #689469] Wed, 17 January 2024 07:57 Go to next message
Darth Waiter
Messages: 78
Registered: October 2020
Member
Coming from MS SQL background, I am a bit confused about the physical storage of data types in Oracle.
In MS SQL, all I need to know to determine how many bytes a column occupies on disk is to look in MSDN at the table with data types, their length parameters, and the physical bytes.
In Oracle, I realize that some types such as NUMBER act more like VARBINARY rather than BINARY, i.e. they store as many bytes as needed instead of always storing the fixed number of bytes.

We have several databases of considerable size, and it used to cost a certain amount in storage space, until the regulator required that we have DR copies of them.
So, I was asked to try to reduce their sizes by downscaling any overstated data type. Many DBs are MS SQL, and for them I have a simple script that outputs the column width reduction SQL script for data owners to review and make a decision, but some are Oracle, and for them I need to come up with something. So, I need to figure out how many bytes data types actually store.

I looked into VSIZE() and DUMP() functions that seem to work in a similar fashion: they return bytes used for 'internal representation' of the value.
But it is not yet clear to me whether 'internal representation' is representative of the actual physical storage size for the datatype of the column or of the number of bytes that each value occupies regardless of the actual column size.

Appreciate your help!
Re: VSIZE() function's meaning and physical storage size [message #689476 is a reply to message #689469] Thu, 18 January 2024 09:49 Go to previous messageGo to next message
Frank
Messages: 7901
Registered: March 2000
Senior Member
Looking at this from a technical/DBA side it is merely impossible to predict the size of rows. Apart from number columns, "text" columns typically are stored as varchar2, which (only) define a max size.
If you have no functional knowledge of the application, it is very hard to say what the average length of such a column will be (what is the average length of a last name, a city, etc?)
You can however find the actual numbers for these for your existing data in metadata (e.g. DBA_TABLES#AVG_ROW_LEN). These are calculated when you gather statistics

HTH,
Frank
Re: VSIZE() function's meaning and physical storage size [message #689477 is a reply to message #689476] Thu, 18 January 2024 09:56 Go to previous messageGo to next message
Darth Waiter
Messages: 78
Registered: October 2020
Member
That sounds odd to me. I do it for a living, on MS SQL, and I routinely reduce 150 GB databases to 80-100 GB, to the applause of happy customers who also realize that their apps now perform much quicker. Needless to say that I do not need any functional knowledge of the app to come up with the reduction script, and the customer only needs to take a look at it with me and approve/reject/adjust specific column reductions.
I am aware of AVG_ROW_LEN but it does not help as it is not granular enough.

[Updated on: Thu, 18 January 2024 09:57]

Report message to a moderator

Re: VSIZE() function's meaning and physical storage size [message #689478 is a reply to message #689477] Thu, 18 January 2024 10:06 Go to previous messageGo to next message
Frank
Messages: 7901
Registered: March 2000
Senior Member
Not sure if I understand you correctly, not knowing MS SQL server's architecture. However, the way I read it is that its database can hold a lot of empty space in each row.
Oracle (roughly) reserves space for data, so its size (typically) cannot be reduced easily.
There are a lot of exceptions (space is reserved in blocks to account for future updates that would grow rows, segments (tables/indexes) with lots of deleted rows, etc) but squeezing these often leads to new problems. On the careful side, let's say that tweaking these is situational. (e.g. room, caused by deleted rows will be filled with new rows, so immediately after shrinking your db it will start growing again. This can be different if you did a one time only huge delete)
Re: VSIZE() function's meaning and physical storage size [message #689479 is a reply to message #689478] Thu, 18 January 2024 10:10 Go to previous messageGo to next message
Darth Waiter
Messages: 78
Registered: October 2020
Member
Example: the original dev/designer creates an INT field that only ever stores values within the min/max range of SMALLINT. The column may be altered to SMALLINT and save 2 bytes of space per row. At 1,000,000 rows it's 2 MB for the table data plus some for its index. Something along these lines.
Re: VSIZE() function's meaning and physical storage size [message #689480 is a reply to message #689479] Thu, 18 January 2024 10:15 Go to previous messageGo to next message
Frank
Messages: 7901
Registered: March 2000
Senior Member
This would only yield results for char columns (not varchar2, but char). Those do actually take up the max size.
Re: VSIZE() function's meaning and physical storage size [message #689481 is a reply to message #689480] Thu, 18 January 2024 10:22 Go to previous messageGo to next message
Darth Waiter
Messages: 78
Registered: October 2020
Member
Actually, that's not correct. That yields results for many more data types. Also, I was not interested in varchar2 to begin with, so this is kind offtopic.
Re: VSIZE() function's meaning and physical storage size [message #689482 is a reply to message #689479] Thu, 18 January 2024 11:32 Go to previous messageGo to next message
Michel Cadot
Messages: 68647
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Darth Waiter wrote on Thu, 18 January 2024 17:10
Example: the original dev/designer creates an INT field that only ever stores values within the min/max range of SMALLINT. The column may be altered to SMALLINT and save 2 bytes of space per row. At 1,000,000 rows it's 2 MB for the table data plus some for its index. Something along these lines.

Oracle does not care of how the column is defined to store the data.
The type only defines what validation controls have to be done when data are inserted but not how Oracle will store these data.

In your example, a number is stored in the same way if the column is defined as INTEGER or SMALLINT (which is in fact the same thing for Oracle: NUMBER(38,0)).

So, you cannot change the size of the stored data by changing the data type but for CLOB or LONG to VARCHAR2 or the like and CHAR to VARCHAR2 as Frank mentioned).

[Updated on: Thu, 18 January 2024 11:35]

Report message to a moderator

Re: VSIZE() function's meaning and physical storage size [message #689483 is a reply to message #689482] Thu, 18 January 2024 12:01 Go to previous messageGo to next message
Michel Cadot
Messages: 68647
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

However there is a case where being more specific on the data type can save space: when you store result of operations.
If you don't specify a scale, Oracle will store the result to the maximum precision and so consume more space than needed.

For example, if you store prices there is no need to store more than 2 digits after the decimal point:
SQL> create table t (v1 number, v2 number(10,2));

Table created.

SQL> insert into t values (1000/3, 1000/3);

1 row created.

SQL> set numwidth 50
SQL> select * from t;
                                                V1                                                 V2
-------------------------------------------------- --------------------------------------------------
          333.333333333333333333333333333333333333                                             333.33

1 row selected.

SQL> select vsize(v1), vsize(v2) from t;
 VSIZE(V1)  VSIZE(V2)
---------- ----------
        21          4

1 row selected.

So this is the case you can save space: always explicitly specify the number of digits you need after the decimal point.

[Updated on: Thu, 18 January 2024 12:04]

Report message to a moderator

Re: VSIZE() function's meaning and physical storage size [message #689484 is a reply to message #689483] Thu, 18 January 2024 12:12 Go to previous messageGo to next message
Darth Waiter
Messages: 78
Registered: October 2020
Member
Michel Cadot wrote on Thu, 18 January 2024 12:01

So this is the case you can save space: always explicitly specify the number of digits you need after the decimal point.
Yes, it is obvious, and that is why I ask the original question: does VSIZE() give me the actual physical storage size for such data types?
Re: VSIZE() function's meaning and physical storage size [message #689485 is a reply to message #689484] Thu, 18 January 2024 15:06 Go to previous messageGo to next message
Michel Cadot
Messages: 68647
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

The answer depends on the data type.
DATE, TIMESTAMP, CHAR... have a fixed storage size.
VARCHAR2, NUMBER (and its subtypes)... have a storage size that depends on the value.

If I add 3 rows in my previous table:
SQL> insert into t values (99, 99);

1 row created.

SQL> insert into t values (101,101);

1 row created.

SQL> commit;

Commit complete.

SQL> set numwidth 45
SQL> select * from t;
                                           V1                                            V2
--------------------------------------------- ---------------------------------------------
     333.333333333333333333333333333333333333                                        333.33
                                           99                                            99
                                          101                                           101

3 rows selected.

SQL> set numwidth 10
SQL> select v2, vsize(v1), vsize(v2) from t;
        V2  VSIZE(V1)  VSIZE(V2)
---------- ---------- ----------
    333.33         21          4
        99          2          2
       101          3          3

3 rows selected.

DUMP gives the exact bytes that are stored (in the part after ":"):
SQL> select v2, dump(v1) d, dump(v2) d from t;
        V2 D                                                                            D
---------- ---------------------------------------------------------------------------- ------------------------
    333.33 Typ=2 Len=21: 194,4,34,34,34,34,34,34,34,34,34,34,34,34,34,34,34,34,34,34,34 Typ=2 Len=4: 194,4,34,34
        99 Typ=2 Len=2: 193,100                                                         Typ=2 Len=2: 193,100
       101 Typ=2 Len=3: 194,2,2                                                         Typ=2 Len=3: 194,2,2

3 rows selected.

The "len" part is from the header of the column value and is what VSIZE returns.

[Updated on: Thu, 18 January 2024 16:10]

Report message to a moderator

Re: VSIZE() function's meaning and physical storage size [message #689486 is a reply to message #689481] Fri, 19 January 2024 01:08 Go to previous messageGo to next message
Frank
Messages: 7901
Registered: March 2000
Senior Member
Darth Waiter wrote on Thu, 18 January 2024 17:22
Actually, that's not correct. That yields results for many more data types. Also, I was not interested in varchar2 to begin with, so this is kind offtopic.
I meant that this behaviour only works for chars _in an Oracle database_
icon14.gif  Re: VSIZE() function's meaning and physical storage size [message #689499 is a reply to message #689485] Fri, 19 January 2024 11:04 Go to previous message
Darth Waiter
Messages: 78
Registered: October 2020
Member
Michel Cadot wrote on Thu, 18 January 2024 15:06

The answer depends on the data type.
DATE, TIMESTAMP, CHAR... have a fixed storage size.
VARCHAR2, NUMBER (and its subtypes)... have a storage size that depends on the value.
Great, thanks! I can experiment further once I get access to Oracle.
Previous Topic: Need help to build a query
Next Topic: DBMS_OUTPUT how to include a white space
Goto Forum:
  


Current Time: Sat Apr 27 01:57:12 CDT 2024