Feed aggregator

Secure and Private: On-Premise Invoice Processing with LangChain and Ollama RAG

Andrejus Baranovski - 17 hours 31 min ago
The Ollama desktop tool helps run LLMs locally on your machine. This tutorial explains how I implemented a pipeline with LangChain and Ollama for on-premise invoice processing. Running LLMs on-premise provides many advantages in terms of security and privacy. Ollama works similarly to Docker; you can think of it as Docker for LLMs: you can pull and run multiple LLMs, which lets you switch between models without changing the RAG pipeline.

 

Secure Password Store - Wallets

Pete Finnigan - Mon, 2023-12-04 12:26
One of the key security issues I come across when performing security audits is the proliferation of passwords located in SQL files, OS shell scripts and more. If you get access to the server you can learn a lot....[Read More]
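
For illustration of the theme (not from Pete's post): a minimal, hedged sketch of the secure external password store, which lets scripts connect without a cleartext password. Paths and the TNS alias below are placeholders.

# create a wallet and add a credential for a TNS alias (mkstore prompts for the passwords)
mkstore -wrl /u01/app/oracle/wallet -create
mkstore -wrl /u01/app/oracle/wallet -createCredential myapp_db app_user

# client-side sqlnet.ora entries pointing at the wallet
WALLET_LOCATION = (SOURCE = (METHOD = FILE)(METHOD_DATA = (DIRECTORY = /u01/app/oracle/wallet)))
SQLNET.WALLET_OVERRIDE = TRUE

# scripts can then connect without embedding a password
sqlplus /@myapp_db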

Posted by Pete On 04/12/23 At 01:48 PM

Categories: Security Blogs

Comparison between ALTER TABLE MOVE ONLINE and DBMS_REDEFINITION

Tom Kyte - Mon, 2023-12-04 12:26
I need to move online a huge table with BLOBs (around 25 TB) to a new tablespace with a bigger block size. Which method is recommended? Is DBMS_REDEFINITION still the method recommended by Oracle, or is it now ALTER TABLE MOVE ONLINE?
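For illustration (not part of the original question), here is a hedged sketch of both approaches, assuming a hypothetical table OWNER1.BIG_TAB with BLOB column LOB_COL moving to tablespace NEW_TS; exact clauses vary by release, so treat this as a starting point rather than a recommendation.
<code>
-- Option 1: ALTER TABLE ... MOVE ONLINE (12.2+); indexes are maintained during the move
ALTER TABLE owner1.big_tab MOVE ONLINE
  TABLESPACE new_ts
  LOB (lob_col) STORE AS SECUREFILE (TABLESPACE new_ts);

-- Option 2: DBMS_REDEFINITION; the interim table BIG_TAB_INTERIM must be pre-created in NEW_TS
DECLARE
  l_errors PLS_INTEGER;
BEGIN
  DBMS_REDEFINITION.CAN_REDEF_TABLE('OWNER1', 'BIG_TAB', DBMS_REDEFINITION.CONS_USE_PK);
  DBMS_REDEFINITION.START_REDEF_TABLE('OWNER1', 'BIG_TAB', 'BIG_TAB_INTERIM');
  DBMS_REDEFINITION.COPY_TABLE_DEPENDENTS(
      uname      => 'OWNER1',
      orig_table => 'BIG_TAB',
      int_table  => 'BIG_TAB_INTERIM',
      num_errors => l_errors);
  DBMS_REDEFINITION.SYNC_INTERIM_TABLE('OWNER1', 'BIG_TAB', 'BIG_TAB_INTERIM');
  DBMS_REDEFINITION.FINISH_REDEF_TABLE('OWNER1', 'BIG_TAB', 'BIG_TAB_INTERIM');
END;
/
</code>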
Categories: DBA Blogs

Using hierarchical query and its impact

Tom Kyte - Mon, 2023-12-04 12:26
Hi Tom, I have a large table with millions of records, and the hierarchy levels are deep and vary from case to case. In order to achieve the results, is a hierarchical query a good approach, or does some other approach suit the purpose? Regards, Jithesh
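For illustration (not part of the original question), a minimal sketch of the two standard forms of a hierarchical query, assuming a hypothetical table emp_h(id, parent_id, name):
<code>
-- CONNECT BY form
SELECT id, parent_id, name, LEVEL
FROM   emp_h
START WITH parent_id IS NULL
CONNECT BY PRIOR id = parent_id;

-- Equivalent recursive WITH form, which the optimizer may handle differently on deep hierarchies
WITH tree (id, parent_id, name, lvl) AS (
    SELECT id, parent_id, name, 1
    FROM   emp_h
    WHERE  parent_id IS NULL
    UNION ALL
    SELECT c.id, c.parent_id, c.name, t.lvl + 1
    FROM   emp_h c
           JOIN tree t ON c.parent_id = t.id
)
SELECT * FROM tree;
</code>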
Categories: DBA Blogs

Golden Gate for OLTP

Tom Kyte - Mon, 2023-12-04 12:26
Hello Team, I wanted to know if it is possible to run the same OLTP application on two different sites, with the databases on both sites being synced using GoldenGate. Generally, do we need additional Data Guard environment(s) to act as the DR solution, or does the above configuration suffice as the DR solution? What points need to be considered/taken care of for such a configuration? Regards,
Categories: DBA Blogs

If exists then insert query performance comparison

Tom Kyte - Mon, 2023-12-04 12:26
Hi Tom, Below are the scripts for your reference; the logic is: if a row is not present in the table (based on one column value), then insert it. 1. Looking at the scripts below (without any alteration), which method do you think is better with respect to execution time and performance, and why? 2. If we need to check the execution plan/time of a PL/SQL block (as in the first method), how should we do it?
<code>
--table creation
create table test1(id number, name varchar2(500), city varchar2(500));

--method1 using plsql block
declare
    vtemp number(10,0) := 0;
begin
    begin
        select 1 into vtemp
        from dual
        where not exists (
            select * from test1 where city = 'New York'   --based on city
        );
    exception
        when no_data_found then null;
    end;
    if vtemp = 1 then
        insert into test1 values ('1', 'Alexa', 'New York');
        commit;
    end if;
end;
/

--method2 using insert select
insert into test1(id, name, city)
select 1, 'Alexa', 'New York'
from dual
where 'New York' not in (select city from test1 where city = 'New York');
-- based on city, 'New York' will be inserted into the table
</code>
Thanks
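As an editorial aside (not an answer from the original thread), a hedged third variant using the same test1 table is a single MERGE, which folds the existence check and the insert into one statement:
<code>
MERGE INTO test1 t
USING (SELECT 1 AS id, 'Alexa' AS name, 'New York' AS city FROM dual) s
ON (t.city = s.city)
WHEN NOT MATCHED THEN
    INSERT (id, name, city) VALUES (s.id, s.name, s.city);
</code>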
Categories: DBA Blogs

Cost increases using json_value: jsontable evaluation

Tom Kyte - Mon, 2023-12-04 12:26
Hi everyone! I'm working with 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production. I'm trying to obtain JSON elements as columns in a view. When I looked at the plan for the following SQL, I realized that the cost increases when json_value is used more than once on the same column.
<code>
/*Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production*/

/*TESTING JSON_VALUE SELECTING FROM DUAL */
--COST OK
WITH Example AS (
    SELECT '{"element_1":"Radio controlled plane","element_2":"Ship"}' Json_1,
           '{"element_3":"Car","element_4":"Train"}' Json_2
    FROM DUAL
)
SELECT json_value(json_1,'$.element_1') element_1_value,
       json_value(json_1,'$.element_2') element_2_value,
       json_value(json_2,'$.element_3') element_3_value
FROM Example A;

/*TESTING JSON_VALUE FROM EXAMPLE TABLE*/
drop table JSON_TEST;

CREATE TABLE JSON_TEST (
    test_id number(4),
    JSON_1  VARCHAR2(100 BYTE),
    JSON_2  VARCHAR2(100 BYTE)
);

/*INSERT TEST DATA*/
Insert into JSON_TEST (test_id, JSON_1, JSON_2) Values (1, '{"element_1":"Plane","element_2":"Ship"}', '{"element_3":"Car","element_4":"Train"}');
Insert into JSON_TEST (test_id, JSON_1, JSON_2) Values (2, '{"element_1":"Duck","element_2":"Dog"}', '{"element_3":"Cat","element_4":"Mouse"}');
Insert into JSON_TEST (test_id, JSON_1, JSON_2) Values (3, '{"element_1":"Glass","element_2":"Knife"}', '{"element_3":"Bottle","element_4":"Cup"}');
COMMIT;

/*TESTING COST WITH ONLY ONE ELEMENT FROM EACH JSON COLUMN (JSON_1, JSON_2)*/
--COST OK
SELECT json_value(json_1,'$.element_1') element_1_value,
       json_value(json_2,'$.element_3') element_3_value
FROM JSON_TEST;

/*TESTING COST WITH TWO ELEMENTS FROM ONE JSON COLUMN (JSON_1)*/
--COST INCREASES A LOT, JSONTABLE EVALUATION IS NOW PRESENT IN THE PLAN
SELECT json_value(json_1,'$.element_1') element_1_value,
       json_value(json_1,'$.element_2') element_2_value
FROM JSON_TEST;
</code>
Is there a bug, or am I doing something wrong? How can I or our DBAs fix this problem? Thanks!!
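As an editorial aside (not from the original question), one hedged way to project several elements of the same JSON column through a single evaluation is an explicit JSON_TABLE join, so each document is parsed once rather than once per json_value call:
<code>
SELECT t.test_id, jt.element_1_value, jt.element_2_value
FROM   json_test t,
       JSON_TABLE(t.json_1, '$'
           COLUMNS (element_1_value VARCHAR2(50) PATH '$.element_1',
                    element_2_value VARCHAR2(50) PATH '$.element_2')) jt;
</code>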
Categories: DBA Blogs

JSON_TABLE, NESTED, and cartesian joins

Tom Kyte - Mon, 2023-12-04 12:26
Using the example data from https://oracle-base.com/articles/19c/sql-nested-clause-instead-of-json_table-19c: if I want the Address *and* the Contact Details, the intuitive thing would be to add another NESTED clause, like this
<code>
SELECT jt.*
FROM   json_documents
       NESTED data COLUMNS (FirstName, LastName, Job, Active,
                            NESTED ContactDetails COLUMNS (Email, Phone),
                            NESTED Address COLUMNS (Street, City, Country, Postcode)
                           ) jt;
</code>
However, what I end up with is
<code>
"FIRSTNAME" "LASTNAME" "JOB"     "ACTIVE" "EMAIL"                 "PHONE"         "STREET"        "CITY"    "COUNTRY" "POSTCODE"
"John"      "Doe"      "Clerk"   "true"   ""                      ""              "99 My Street"  "My City" "UK"      "A12 34B"
"John"      "Doe"      "Clerk"   "true"   "john.doe@example.com"  "44 123 123456" ""              ""        ""        ""
"Jayne"     "Doe"      "Manager" "false"  ""                      ""              "100 My Street" "My City" "UK"      "A12 34B"
"Jayne"     "Doe"      "Manager" "false"  "jayne.doe@example.com" ""              ""              ""        ""        ""
</code>
I am getting one row with the first NESTED, and another with the other NESTED. Is it because NESTED is shorthand for JSON_TABLE, so it is creating 2 JSON_TABLEs, and hence I am getting a cartesian join? And is that what I *should* expect? For the query to work correctly when I have more than one NESTED object, I have to forgo the NESTED clause for the 2nd-level nesting and explicitly set out each column and its PATH, like this
<code>
SELECT j.id, jt.first_name, jt.last_name, jt.job, jt.email, jt.phone, jt.active
FROM   json_documents j
       NESTED data COLUMNS (first_name VARCHAR2(50 CHAR)  PATH FirstName,
                            last_name  VARCHAR2(50 CHAR)  PATH LastName,
                            job        VARCHAR2(10 CHAR)  PATH Job,
                            email      VARCHAR2(100 CHAR) PATH ContactDetails.Email,
                            phone      VARCHAR2(50 CHAR)  PATH ContactDetails.Phone,
                            street     VARCHAR2(100 CHAR) PATH Address.Street,
                            city       VARCHAR2(50 CHAR)  PATH Address.City,
                            country    VARCHAR2(100 CHAR) PATH Address.Country,
                            postcode   VARCHAR2(50 CHAR)  PATH Address.Postcode,
                            active     VARCHAR2(5 CHAR)   PATH Active) jt;
</code>
What it comes down to is: why does this
<code>
SELECT jt.*
FROM   json_documents
       NESTED data COLUMNS (FirstName, LastName, Job, Active,
                            NESTED ContactDetails COLUMNS (Email, Phone)
                           ) jt;
</code>
or this
<code>
SELECT jt.*
FROM   json_documents
       NESTED data COLUMNS (FirstName, LastName, Job, Active,
                            NESTED Address COLUMNS (Street, City, Country, Postcode)
                           ) jt;
</code>
work correctly and give me 2 rows but this <code> SELECT jt.* FROM json_documents NESTED data COLUMNS (FirstName, Las...
Categories: DBA Blogs

ORA-00054: RESOURCE BUSY AND ACQUIRE WITH NOWAIT SPECIFIED

Tom Kyte - Thu, 2023-11-30 16:46
Hi, I am getting the following error during migration: ORA-00054: RESOURCE BUSY AND ACQUIRE WITH NOWAIT SPECIFIED. The migration code is given below. I do not understand: I have acquired an exclusive lock on this table, so why am I getting this error? <b>Please help ASAP, this has to go in production tomorrow. Thanks in advance.</b> Code:
<code>
DECLARE
    l_script clob := q'[BEGIN
        DELETE FROM fqng_report.rpt_refresh_history_data_queue q
        WHERE  q.source_table_name = 'FQNG_INVENTORY';
        EXECUTE IMMEDIATE 'ALTER TABLE FQNG_REPORT.RPT_INVENTORY NOLOGGING';
        EXECUTE IMMEDIATE 'TRUNCATE TABLE FQNG_REPORT.rpt_inventory';
        EXECUTE IMMEDIATE 'LOCK TABLE FQNG_REPORT.rpt_inventory IN EXCLUSIVE MODE';
        INSERT INTO FQNG_REPORT.RPT_INVENTORY SELECT * FROM FQNG.fqng_inventory;
        COMMIT;
        EXECUTE IMMEDIATE 'ALTER TABLE FQNG_REPORT.RPT_INVENTORY LOGGING';
    END;]';
BEGIN
    DBMS_SCHEDULER.CREATE_JOB (
        job_name            => 'MASTER.REPORT_DATA_MIGRATION_INVENT',
        job_type            => 'PLSQL_BLOCK',
        job_action          => l_script,
        number_of_arguments => 0,
        start_date          => SYSDATE,
        repeat_interval     => NULL,
        end_date            => NULL,
        job_class           => 'SYS.DEFAULT_JOB_CLASS',
        enabled             => TRUE,
        auto_drop           => TRUE,
        comments            => 'This is one time job for Reports Migration',
        credential_name     => NULL,
        destination_name    => NULL);
end;
/
</code>
Here the data is migrated from the transaction table to the table in the reporting schema.
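As an editorial aside (not from the original thread), a hedged diagnostic sketch: before the TRUNCATE/LOCK steps run, check which session already holds a lock on the target table, and optionally let DDL wait instead of failing immediately.
<code>
SELECT s.sid, s.serial#, s.status, lo.locked_mode, o.owner, o.object_name
FROM   v$locked_object lo
       JOIN dba_objects o ON o.object_id = lo.object_id
       JOIN v$session   s ON s.sid = lo.session_id
WHERE  o.owner = 'FQNG_REPORT'
AND    o.object_name = 'RPT_INVENTORY';

-- 11g and later: let DDL wait up to 60 seconds for locks rather than raising ORA-00054 at once
ALTER SESSION SET ddl_lock_timeout = 60;
</code>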
Categories: DBA Blogs

PGIO Update

Kevin Closson - Thu, 2023-11-30 11:30

PGIO is the SLOB method for PostgreSQL. An old friend of mine, Bart Sjerps, has implemented a modified PGIO in Python and posted it on GitHub: https://github.com/bsjerps/pypgio

For more on PGIO:

https://dev.to/franckpachot/postgresql-double-buffering-understand-the-cache-size-in-a-managed-service-oci-2oci

https://dev.to/yugabyte/slob-on-yugabytedb-1a32

https://franckpachot.medium.com/do-you-know-what-you-are-measuring-with-pgbench-d8692a33e3d6

This is solid, oaktable-level research about PostgreSQL on OCI. Love the methodology.

And I notice the long-term, resounding impact of Kevin Closson's "slob" testing framework in Franck's "ybio" tool (based on pgio – Kevin's slob for Postgres). Bravo again. https://t.co/iVfTrKb8ND

— Jeremy Schneider (@jer_s) November 28, 2023

query performance tuning

Tom Kyte - Wed, 2023-11-29 22:26
Hi, there are queries which have been performing slowly for the past day: earlier the same queries were giving output in 22 minutes and now they take 55 minutes. This is basically the EOD process.

Query 1:
<code>
SELECT DISTINCT DECODE(:B1, 'C', IR.REASON_CODE, 'D', IR.DESCRIPTION) REASON
FROM   DEA_REQN_REASONS RR
       INNER JOIN DEA_INVESTMENT_REASON IR ON UPPER(RR.REASON_CODE) = UPPER(IR.REASON_CODE)
WHERE  RR.REQ_ID IN (SELECT DEAL_ID FROM DEA_SMD_DEALS_ALLOC      WHERE APP_REFER = :B2 UNION ALL
                     SELECT DEAL_ID FROM DEA_SMD_DEALS_ALLOC_HIST WHERE APP_REFER = :B2 UNION ALL
                     SELECT DEAL_ID FROM DEA_PMDEAL_ALLOC         WHERE APP_REFER = :B2 UNION ALL
                     SELECT DEAL_ID FROM DEA_PMDEAL_ALLOC_HIST    WHERE APP_REFER = :B2)
AND    IR.RECTYPE = 'L'
</code>
Query 2:
<code>
select *
from   v_chkposint
where  exists (select scheme
               from   scheme
               where  amc_code    = v_chkposint.amc_code
               and    scheme      = v_chkposint.scheme
               and    project_y_n = 'Y'
               and    rectype     = 'L')
</code>
What is the scope for tuning? Your expert advice would help to improve the performance.
<code>
BEGIN_INTERVAL_TIME            PLAN_HASH_VALUE OPTIMIZER_ SQL_PROFIL DISK_READS_TOTAL BUFFER_GETS_TOTAL    CPU_TIME_TOTAL ELAPSED_TIME_TOTAL      IOWAIT_TOTAL
------------------------------ --------------- ---------- ---------- ---------------- ----------------- ----------------- ------------------ -----------------
20-OCT-23 04.30.46.535 PM           1596829921 ALL_ROWS                          45897          11332241          18512737           68903861           8012199
20-OCT-23 01.30.04.043 PM           1596829921 ALL_ROWS                            680           7598713          12852262           47355121            428789
19-OCT-23 03.30.03.269 PM           1596829921 ALL_ROWS                          27854         393591114         729369364         2402058351            781202
19-OCT-23 11.30.11.237 AM           1596829921 ALL_ROWS                          27854         379549234         706629192         2327540410            781202
18-OCT-23 06.30.23.024 PM           1596829921 ALL_ROWS                          13936         126131290         220169362          733216565            418827
</code>
Please find below the execution plans for SQL_IDs [bjhx2g3b6krmf and ff0sjh69smwx2].

+++++++++++++Execution Plan for SQL_ID [bjhx2g3b6krmf]
<code>
SQL> SELECT * FROM table(DBMS_XPLAN.DISPLAY_AWR('bjhx2g3b6krmf'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID bjhx2g3b6krmf
--------------------
SELECT DISTINCT DECODE(:B1 ,'C',IR.REASON_CODE,'D',IR.DESCRIPTION) REASON FROM
DEA_REQN_REASONS RR INNER JOIN DEA_INVESTMENT_REASON IR ON
UPPER(RR.REASON_CODE) = UPPER(IR.REASON_CODE) WHERE RR.REQ_ID IN (SELECT
DEAL_ID FROM DEA_SMD_DEALS_ALLOC WHERE APP_REFER = :B2 UNION ALL SELECT
DEAL_ID FROM DEA_SMD_DEALS_ALLOC_HIST WHERE APP_REFER = :B2 UNION ALL SELECT
DEAL_ID FROM DEA_PMDEAL_ALLOC...
</code>
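As an editorial aside (not from the original question), a hedged first step is to capture actual row-source statistics or a SQL Monitor report for one slow execution, rather than relying on the AWR plan alone:
<code>
-- after running the statement with the gather_plan_statistics hint (or statistics_level = all):
SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR('bjhx2g3b6krmf', NULL, 'ALLSTATS LAST'));

-- or, where Real-Time SQL Monitoring is licensed:
SELECT DBMS_SQLTUNE.REPORT_SQL_MONITOR(sql_id => 'bjhx2g3b6krmf', type => 'TEXT') FROM dual;
</code>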
Categories: DBA Blogs

SQL Firewall in 23c - UKOUG and Scripts

Pete Finnigan - Mon, 2023-11-27 15:26
I spoke twice at the UKOUG conference just over a week ago, and the second of my talks was about the new SQL Firewall in Oracle 23c. This was a walk-through of the SQL Firewall and what it does....[Read More]
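
For illustration (not from Pete's post, and parameter details may differ by release), the 23c SQL Firewall workflow via the DBMS_SQL_FIREWALL package looks roughly like this, with APP_USER as a placeholder account:

-- enable the firewall, capture normal activity for a user, then enforce the allow-list
EXEC DBMS_SQL_FIREWALL.ENABLE;

BEGIN
  DBMS_SQL_FIREWALL.CREATE_CAPTURE(
      username       => 'APP_USER',
      top_level_only => TRUE,
      start_capture  => TRUE);
END;
/

-- ... run the normal application workload, then:
EXEC DBMS_SQL_FIREWALL.STOP_CAPTURE('APP_USER');
EXEC DBMS_SQL_FIREWALL.GENERATE_ALLOW_LIST('APP_USER');

BEGIN
  DBMS_SQL_FIREWALL.ENABLE_ALLOW_LIST(
      username => 'APP_USER',
      enforce  => DBMS_SQL_FIREWALL.ENFORCE_ALL,
      block    => TRUE);
END;
/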

Posted by Pete On 27/11/23 At 03:18 PM

Categories: Security Blogs

Writing DDL_EVENT Triggers

Tom Kyte - Mon, 2023-11-27 15:26
Do you have a full example of writing a DDL_EVENT trigger? I want to audit, at schema level, any creates, alters, or drops of any objects, and write the actual calling syntax to a table. Oracle8 has the new feature of DDL_EVENT, and there is an example in the SQL Reference, but it just shows a pl/sql_block, not how to extract the calling DDL. If there is not an easier way, I would end up querying v$sqltext and looking for text with some DDL call in it. I would then write the text to a separate table. Any other ideas or examples would be appreciated. Thanks
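For illustration (not part of the original question), a hedged sketch of a schema-level DDL trigger that captures the triggering statement via the ora_sql_txt event attribute function; the audit table name ddl_audit is made up for the example:
<code>
CREATE TABLE ddl_audit (
    ddl_date    DATE,
    user_name   VARCHAR2(128),
    ddl_event   VARCHAR2(30),
    object_type VARCHAR2(30),
    object_name VARCHAR2(128),
    ddl_text    CLOB
);

CREATE OR REPLACE TRIGGER trg_ddl_audit
AFTER CREATE OR ALTER OR DROP ON SCHEMA
DECLARE
    l_lines ora_name_list_t;
    l_count PLS_INTEGER;
    l_text  CLOB;
BEGIN
    l_count := ora_sql_txt(l_lines);   -- returns the DDL text as a list of 64-byte chunks
    FOR i IN 1 .. NVL(l_count, 0) LOOP
        l_text := l_text || l_lines(i);
    END LOOP;
    INSERT INTO ddl_audit (ddl_date, user_name, ddl_event, object_type, object_name, ddl_text)
    VALUES (SYSDATE, ora_login_user, ora_sysevent, ora_dict_obj_type, ora_dict_obj_name, l_text);
END;
/
</code>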
Categories: DBA Blogs

Easy-to-Follow RAG Pipeline Tutorial: Invoice Processing with ChromaDB & LangChain

Andrejus Baranovski - Mon, 2023-11-27 07:11
I explain the implementation of a pipeline to process invoice data from PDF documents. The data is loaded into Chroma DB's vector store. Through the LangChain API, the data from the vector store is ready to be consumed by the LLM as part of the RAG infrastructure.

 

sys_op_descend()

Jonathan Lewis - Mon, 2023-11-27 03:46

When you specify that a column used in an index should be a “descending” column Oracle uses the internal, undocumented, sys_op_descend() function to derive a value to store for that column in the index leaf block.

For many years I’ve claimed that this function simply takes the one’s-complement of a character value and appends 0xff to the result before storing it. This is nothing like the whole story and in a recent comment to a note on descending indexes and finding max() values I was given another part of the story in response to my confession that I did not know why a varchar2(128) had turned into a raw(193) – it’s the length not the rawness that puzzled me – when subject to sys_op_descend().

Here’s a little script to generate some data that helps to explain what’s going on.

rem
rem     Script:         sys_op_descend.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Nov 2023
rem     Purpose:        
rem
rem     Last tested 
rem             19.11.0.0
rem

create table t1 (
        v1      varchar2(10),
        nv1     nvarchar2(10)
)
/

insert into t1 values('A','A');
insert into t1 values('AB','AB');
insert into t1 values('ABC','ABC');
commit;

create table t2 (
        v1      varchar2(10),
        nv1     nvarchar2(10)
)
/

insert into t2 values(chr(0), chr(0));
insert into t2 values(chr(1), chr(1));
insert into t2 values(chr(2), chr(2));
insert into t2 values(chr(3), chr(3));
insert into t2 values(chr(4), chr(4));
commit;

There’s nothing particularly special about the two tables and data I’ve inserted, I’m just trying to generate patterns that make it easy to see what’s going on; and I’ll be comparing a varchar2() and an nvarchar2() because my nvarchar2() column is a multi-byte character set.

Let’s look at the (nearly raw) data from t1 where I’ve stored growing strings of printable characters. Here’s a simple query and its result set:

set linesize 50
break on row

select
        dump(v1,16)                     v1,
        dump(sys_op_descend(v1), 16)    v1_desc,
        dump(nv1,16)                    nv1,
        dump(sys_op_descend(nv1), 16)   nv1_desc
from
        t1
/


V1
--------------------------------------------------
V1_DESC
--------------------------------------------------
NV1
--------------------------------------------------
NV1_DESC
--------------------------------------------------
Typ=1 Len=1: 41
Typ=23 Len=2: be,ff
Typ=1 Len=2: 0,41
Typ=23 Len=4: fe,fb,be,ff

Typ=1 Len=2: 41,42
Typ=23 Len=3: be,bd,ff
Typ=1 Len=4: 0,41,0,42
Typ=23 Len=7: fe,fb,be,fe,fb,bd,ff

Typ=1 Len=3: 41,42,43
Typ=23 Len=4: be,bd,bc,ff
Typ=1 Len=6: 0,41,0,42,0,43
Typ=23 Len=10: fe,fb,be,fe,fb,bd,fe,fb,bc,ff


If you start with the first two columns of the output rows you can see: ‘A’ == 0x41, which becomes 0xbe, 0xff, following the “one’s complement with an appended 0xff” rule. The same pattern is visible for ‘AB’ and ‘ABC’.

When you look at the 3rd and 4th columns of each row (the nvarchar2), ‘A’ is now a 2-byte value (0x00, 0x41) which turns into the four bytes: 0xfe, 0xfb, 0xbe, 0xff. The value 0xbe is recognisable as the one’s-complement of 0x41 that appeared for the varchar2() values – but the 0x00 in the original nvarchar2() value seems to have turned into a two-byte 0xfe, 0xfb rather than the “expected” 0xff.

Why doesn’t Oracle use the “right” one’s complement for zero? Maybe because 0xff is the byte that’s supposed to mark the end of the string, and it’s important to avoid the risk of ambiguity. But now we have a new problem: Oracle is using 0xfe as the first of two bytes to represent the “descending” zero, and 0xfe is the one’s-complement of 0x01. So how is Oracle working around the fact that it would be a bad idea to have two possible meanings for the value 0xfe? That’s where the second data set comes in; here’s the same query, with results, run against the t2 table:

select
        dump(v1,16)                     v1,
        dump(sys_op_descend(v1), 16)    v1_desc,
        dump(nv1,16)                    nv1,
        dump(sys_op_descend(nv1), 16)   nv1_desc
from
        t2
/

V1
--------------------------------------------------
V1_DESC
--------------------------------------------------
NV1
--------------------------------------------------
NV1_DESC
--------------------------------------------------
Typ=1 Len=1: 0
Typ=23 Len=3: fe,fe,ff
Typ=1 Len=2: 0,0
Typ=23 Len=3: fe,fd,ff

Typ=1 Len=1: 1
Typ=23 Len=3: fe,fa,ff
Typ=1 Len=2: 0,1
Typ=23 Len=3: fe,fc,ff

Typ=1 Len=1: 2
Typ=23 Len=2: fd,ff
Typ=1 Len=2: 0,2
Typ=23 Len=4: fe,fb,fd,ff

Typ=1 Len=1: 3
Typ=23 Len=2: fc,ff
Typ=1 Len=2: 0,3
Typ=23 Len=4: fe,fb,fc,ff

Typ=1 Len=1: 4
Typ=23 Len=2: fb,ff
Typ=1 Len=2: 0,4
Typ=23 Len=4: fe,fb,fb,ff

Looking at the last three groups of 4 lines we can see the varchar2() column following the “one’s complement, append 0xff” rule and the nvarchar2() following the additional “use 0xfe 0xfb for zero” rule; but for chr(0) and chr(1) the dumps need some further untangling. With the tests so far all we can say with any confidence is that “if you see the 0xfe byte then a different pattern is about to appear briefly”.

I don’t really need to do any more experiments to guess why, in my previous note, the descending varchar2(128) was projected as raw(193) (though you might want to play around to see what happens with strings of several consecutives zeros or ones). I suppose the size reported could vary with character set, but if I have a two-byte fixed width character set and most of the data corresponds to basic ASCII characters then I’ll have a lot of characters where every other byte is a zero that encodes into two bytes when descending – so it makes sense to use for the projection a size derived as: 64 * 1 + 64 * 2 + 1 (0xff) = 193. Q.E.D.

Index sizing

An important side effect of this improved understanding is the realisation of what can happen to the size of an index when declared descending. For a simple demonstration, here’s a table with 4 columns and 50,000 rows copied from all_objects.object_name, using the nvarchar2() type for one of the pairs of copies.

create table t1a (
        nva nvarchar2(128),
        nvd nvarchar2(128),
        va  varchar2(128),
        vd  varchar2(128)
)
/

insert into t1a
select  object_name, object_name, object_name, object_name
from    all_objects
where
        rownum <= 50000
;

create index t1a_nva on t1a(nva);
create index t1a_nvd on t1a(nvd desc);
create index t1a_va on t1a(va);
create index t1a_vd on t1a(vd desc);

execute dbms_stats.gather_table_stats(user,'t1a')

select index_name, leaf_blocks
from
        user_indexes
where
        table_name = 'T1A'
order by
        index_name
/

select 
        column_name, avg_col_len 
from    user_tab_cols 
where   table_name = 'T1A' 
order by 
        column_name
/


INDEX_NAME           LEAF_BLOCKS
-------------------- -----------
T1A_NVA                      590
T1A_NVD                      854
T1A_VA                       336
T1A_VD                       343

4 rows selected.

COLUMN_NAME          AVG_COL_LEN
-------------------- -----------
NVA                           74
NVD                           74
SYS_NC00005$                 111
SYS_NC00006$                  39
VA                            38
VD                            38

6 rows selected.

As you can see, the descending varchar2() index (backed by column sys_nc00006$) is only slightly larger than the corresponding ascending index, but the descending nvarchar2() (backed by column sys_nc00005$) has increased by something much closer to 50% in size because half the bytes in each object_name were zeroes that have been replaced by the two-byte 0xfe 0xfb. This is much worse than the "one byte per descending column per row" that I've been claiming for the last 20 or 30 years.

Why book reports aren’t a good homework assignment any more

RDBMS Insight - Fri, 2023-11-24 13:40

So, everyone knows that the kids these days are using ChatGPT to write their book reports. Even Big Nate knows it!

But what about ChatGPT’s safeguards? Isn’t it supposed to have some kind of anti-cheating baked in, so it won’t just write essays for kids? Why doesn’t that work?

Sure, it does have safeguards… kind of. If you just ask it to write an essay, it responds with a “helpful” answer about how to write an essay. The thing is that these safeguards are incredibly easy to work around.

Let’s pretend we’re a student who has to write a book report on The Kingdom over the Sea, by Zohra Nabi. Here’s how to write it in 20 minutes without even touching the book.

1. Find a review.

Look up the book online in a search engine. Find a book review that describes it and skim the review. Note the main character’s name and what happens.

In this case, the Kingdom over the Sea was reviewed on Kirkus Reviews. The main character’s name is Yara, and the author’s name is Zohra Nabi.

2. Ask ChatGPT for the first paragraph.

Let’s pretend we’re a Grade 6 student asking ChatGPT what should go in the first paragraph of this book report. Here are the prompts I used:

Big Nate:

Hi Im a Grade 6 student and I need to write a baper on a book called, the Kingdom over the Sea.

Big Nate:

can you tell me what to put in the first paragraph

ChatGPT:

Of course! The first paragraph of your paper is usually the introduction, and it should give your reader a brief overview of the book and what your paper will be about. Here’s a basic structure for the first paragraph:

Start with a hook: Begin with an interesting or engaging sentence to grab your reader’s attention. It could be a quote from the book, a surprising fact, or a question related to the story.

Introduce the book: Mention the title of the book (“The Kingdom over the Sea”) and the author’s name if you know it.

Provide some context: Briefly describe the genre of the book (e.g., fantasy, adventure) and its setting (where and when the story takes place).

Mention the main idea: Give a hint about the main theme or what the story is about. For example, is it about a young hero going on an epic adventure, a mystery to be solved, or a journey to a magical land?

State your purpose: Finally, state that your paper will explore and analyze various aspects of the book. Mention the topics or themes you plan to discuss, such as characters, plot, or themes.

Here’s an example to get you started:

“In the enchanting world of ‘The Kingdom over the Sea,’ written by [Author’s Name], readers are transported to a magical land filled with adventures, mystery, and wonder. This fantasy novel, set in a distant realm, follows the journey of [Main Character’s Name] as he embarks on an epic quest. In this paper, we will delve into the captivating elements of this story, exploring the characters, plot twists, and the themes that make ‘The Kingdom over the Sea’ an unforgettable read.”

Feel free to customize it to fit your own thoughts and ideas. If you have any specific details about the book you’d like to include in the introduction, please let me know!

3. Fill in and rephrase

Ignore most of ChatGPT’s long, didactic answer. Grab the sample paragraph, fill in the missing “Author’s Name” and “Main Character’s Name”, and–this part is key–ask ChatGPT to rephrase it like a Grade 6 student wrote it. If you don’t, you’ll wind up with this problem:

[Big Nate comic]

Here’s the rewritten prompt:

Big Nate:

please rewrite this text so it sounds like a Grade 6 student wrote it: “In the enchanting world of ‘The Kingdom over the Sea,’ written by Zohra Nabi, readers are transported to a magical land filled with adventures, mystery, and wonder. This fantasy novel, set in a distant realm, follows the journey of Yara as she embarks on an epic quest. In this paper, we will delve into the captivating elements of this story, exploring the characters, plot twists, and the themes that make ‘The Kingdom over the Sea’ an unforgettable read.”

ChatGPT:

“In ‘The Kingdom over the Sea’ by Zohra Nabi, it’s like you’re going on a journey to a super cool and magical place! This book is all about adventures, mysteries, and really amazing stuff. It’s a fantasy book, and it happens in a faraway world. We get to follow this awesome character named Yara, and she’s off on a super big adventure. In my paper, we’ll check out the cool stuff in the story, like the characters, surprises, and the things that make ‘The Kingdom over the Sea’ a book you won’t forget.”

And there’s the first paragraph of your book report — done!

4. Second paragraph: Paraphrase

For the second paragraph, just ask ChatGPT to paraphrase the whole review you found in step 1. Paste the entire review into the prompt. (I haven’t pasted the whole thing; the review starts with “Cryptic directions”)

Big Nate:

can you write this paragraph like a 6th grader wrote it “Cryptic directions…

ChatGPT:

“Yara’s mom left her some tricky clues that sent her from England to a super magical world. Her mom passed away and left her directions to go to a place called Zehaira, where she’d meet a sorceress named Leyla Khatoun. Yara wasn’t sure if this was real at first, but she really wanted to know more about her mom and their past because, on paper, they were from Iraq. When she got there, she found a world where the Sultan was against magic and had the sorcerers hiding from the Inquisition. On her way to find Leyla, Yara heard about a plan to wipe out all the sorcerers. But Leyla was busy running a secret sorcerer hideout and didn’t want to help the city sorcerers. Yara had to work really hard to get Leyla to see the big danger and to tell her the secrets her mom kept. And guess what? Yara discovers she has some magic of her own, which was a bit unexpected. But we already liked Yara a lot by then. The world in the story is full of magic and feels like an Arabian adventure, and the people and jinn in the story are really interesting. The ending was kind of quick, but it looks like there will be more exciting adventures in the next book.”

5. Expand

Now you have two paragraphs. Feed both of them to ChatGPT and ask it to write Paragraph 3.

Big Nate: Paragraph 1 of the book review is "In 'The Kingdom over the Sea'…". Paragraph 2 is "Yara's mom…". What should I put in Paragraph 3?

Note: In the actual prompt, I pasted in the entire first and second paragraphs; I didn’t duplicate them here because that would be boring. But you get the idea.

ChatGPT:

“One of my favorite parts in ‘The Kingdom over the Sea’ was when Yara discovered her own magical abilities. It was a big surprise, and I thought it was really cool how the author revealed her powers. I also liked how the characters changed during the story. Yara became even braver, and it was awesome to see her grow. The themes in the book, like courage and friendship, are important in real life too. I thought they were well shown in the story. The author’s writing style made the book easy to read, and I felt like I was right there with Yara on her adventure. Overall, I really enjoyed ‘The Kingdom over the Sea,’ and I would totally recommend it to my friends because it’s full of magic and fun!”

6. Put it all together

Put the three paragraphs all together to get a finished 440-word book review that could plausibly come from an overly enthusiastic middle-schooler:

In ‘The Kingdom over the Sea’ by Zohra Nabi, it’s like you’re going on a journey to a super cool and magical place! This book is all about adventures, mysteries, and really amazing stuff. It’s a fantasy book, and it happens in a faraway world. We get to follow this awesome character named Yara, and she’s off on a super big adventure. In my paper, we’ll check out the cool stuff in the story, like the characters, surprises, and the things that make ‘The Kingdom over the Sea’ a book you won’t forget.

Yara’s mom left her some tricky clues that sent her from England to a super magical world. Her mom passed away and left her directions to go to a place called Zehaira, where she’d meet a sorceress named Leyla Khatoun. Yara wasn’t sure if this was real at first, but she really wanted to know more about her mom and their past because, on paper, they were from Iraq. When she got there, she found a world where the Sultan was against magic and had the sorcerers hiding from the Inquisition. On her way to find Leyla, Yara heard about a plan to wipe out all the sorcerers. But Leyla was busy running a secret sorcerer hideout and didn’t want to help the city sorcerers. Yara had to work really hard to get Leyla to see the big danger and to tell her the secrets her mom kept. And guess what? Yara discovers she has some magic of her own, which was a bit unexpected. But we already liked Yara a lot by then. The world in the story is full of magic and feels like an Arabian adventure, and the people and jinn in the story are really interesting. The ending was kind of quick, but it looks like there will be more exciting adventures in the next book.

One of my favorite parts in ‘The Kingdom over the Sea’ was when Yara discovered her own magical abilities. It was a big surprise, and I thought it was really cool how the author revealed her powers. I also liked how the characters changed during the story. Yara became even braver, and it was awesome to see her grow. The themes in the book, like courage and friendship, are important in real life too. I thought they were well shown in the story. The author’s writing style made the book easy to read, and I felt like I was right there with Yara on her adventure. Overall, I really enjoyed ‘The Kingdom over the Sea,’ and I would totally recommend it to my friends because it’s full of magic and fun!

7. Sprinkle with mistakes and hand in

Add a few spelling mistakes for verisimilitude, and the homework is done.

This took me about 20 minutes, start to finish: much less time than it would’ve taken me to read the book, think about it, come up with something to say, and write it down.

And if I really was a Grade 6 student, I would’ve lost the experience of reading a fantastic book, thinking about it, and reflecting on my own reaction to it. (I did read the book, and it really is lovely.)

What’s a teacher to do?

I don’t know, but here are a couple of ideas from a non-teacher:

  • Use in-class quizzes, not book reports to see who’s done the assigned reading.
  • Ask for response papers that discuss a student’s personal reaction to a book, rather than summarizing the book.
  • Ask for personalized answers in which students contextualize their response to a book with something in their own lives.
  • Require handwritten homework so that, at the very least, students have to hand-copy what ChatGPT spits out. Even the act of copying can lead to reflection.
Post credits
  • Post images are from Big Nate, written and drawn by Lincoln Peirce.

  • Chat CSS is adapted from W3Schools.
Categories: DBA Blogs

Using vector databases for context in AI

Kubilay Çilkara - Fri, 2023-11-24 11:30

In the realm of Artificial Intelligence (AI), understanding and retaining context is a pivotal factor for decision-making and comprehension. Vector databases are the foundational pillars for encapsulating your own data so it can be used in conjunction with AI and LLMs, empowering these systems to absorb and retain intricate contextual information.

Understanding Vector Databases

Vector databases are specialised data storage systems engineered to efficiently manage and retrieve vectorised data, also known as embeddings. These databases store information in a vector format, where each data entity is represented as a multidimensional numerical vector encapsulating various attributes and relationships, thus preserving rich context. That is, text, video, or audio is translated into numbers with many attributes in a multidimensional space, and mathematics is then used to calculate the proximity between these numbers. Loosely speaking, that is what a neural network in an LLM does: it computes proximity (similarity) between vectors. The vector database is where those vectors are stored. If you don't use a vector database with your LLM, under architectures like RAG, you will not be able to bring your own data or context into the model; all it will know is what it was trained on, which will probably be data from the public internet. Vector databases enable you to bring your own data to AI.

Examples of Vector Databases

Several platforms offer vector databases, such as Pinecone, Faiss by Facebook, Annoy, Milvus, and Elasticsearch with dense vector support. These databases cater to diverse use cases, offering functionalities tailored to handle vast amounts of vectorised information, be it images, text, audio, or other complex data types.

Importance in AI Context

Within the AI landscape, vector databases play a pivotal role in serving specific data and context to AI models. In particular, in the Retrieval-Augmented Generation (RAG) architecture, where retrieval of relevant information is an essential part of content generation, vector databases act as repositories, storing precomputed embeddings of your own private data. These embeddings encode the semantic and contextual essence of your data, facilitating efficient retrieval in your AI apps and bots. Bringing a vector database into your AI apps, agents and chatbots brings your own data with it; in the case of LLMs, those apps can then speak your data.

Advantages for Organisations and AI Applications

Organisations can harness the prowess of vector databases within RAG architectures to elevate their AI applications and enable them to use organisation-specific data:

  1. Enhanced Contextual Understanding: By leveraging vector databases, AI models grasp nuanced contextual information, enabling more informed decision-making and more precise content generation based on specific and private organisational context.

  2. Improved Efficiency in Information Retrieval: Vector databases expedite the retrieval of pertinent information by enabling similarity searches based on vector representations, augmenting the speed and accuracy of AI applications.

  3. Scalability and Flexibility: These databases offer scalability and flexibility, accommodating diverse data types and expanding corpora, essential for the evolving needs of AI-driven applications.

  4. Optimised Resource Utilisation: Vector databases streamline resource utilisation by efficiently storing and retrieving vectorised data, thus optimising computational resources and infrastructure.

Closing Thoughts

In the AI landscape, where the comprehension of context is paramount, vector databases emerge as linchpins, fortifying AI systems with the capability to retain and comprehend context-rich information. Their integration within RAG architectures not only elevates AI applications but also empowers organisations to glean profound insights, fostering a new era of context-driven AI innovation from data.

In essence, the power vested in vector databases will reshape the trajectory of AI, propelling it toward unparalleled contextualisation and intelligent decision-making based on in-house, organisation-owned data.

But the enigma persists: What precisely will be the data fuelling the AI model?

Categories: DBA Blogs

Unified Auditing on tables

Tom Kyte - Thu, 2023-11-23 20:06
We have been trying to enable auditing of a few DDL activities in an autonomous database. Using the following code, we created, enabled and verified the audit configs. Everything looks normal to us; however, none of the table create/drop activities (listed at the end) are being tracked in the <code>unified_audit_trail</code> table. We can only see audit-management-related activities like <code>DBMS_AUDIT_MGMT BEGIN dbms_audit_mgmt.flush_unified_audit_trail; END;</code> in the <code>unified_audit_trail</code> table. I must be missing something; can you please help me navigate through it?
<code>
-- Connect to ADMIN user
-- DB Version
SELECT * FROM v$version;

-- Verify Audit Status
SELECT parameter, value
FROM   v$option
WHERE  parameter = 'Unified Auditing';

-- Enable immediate write mode
BEGIN
    dbms_audit_mgmt.set_audit_trail_property(
        audit_trail_type           => dbms_audit_mgmt.audit_trail_unified,
        audit_trail_property       => dbms_audit_mgmt.audit_trail_write_mode,
        audit_trail_property_value => dbms_audit_mgmt.audit_trail_immediate_write);
END;
/

-- Validate audit configs
SELECT * FROM dba_audit_mgmt_config_params;

-- Create an audit policy
CREATE AUDIT POLICY xx_ddl_test_policy
    PRIVILEGES CREATE ANY TABLE, ALTER ANY TABLE, DROP ANY TABLE
    WHEN 'SYS_CONTEXT (''USERENV'', ''SESSION_USER'') IN (''ADMIN'',''XXTST'')'
    EVALUATE PER STATEMENT;

-- Enable Audit Policy
AUDIT POLICY xx_ddl_test_policy;

-- Validate policy
SELECT * FROM audit_unified_enabled_policies WHERE policy_name IN ('XX_DDL_TEST_POLICY');
SELECT * FROM audit_unified_policies WHERE policy_name IN (SELECT policy_name FROM audit_unified_enabled_policies);

-- Flush audit trail
EXEC dbms_audit_mgmt.flush_unified_audit_trail;

-- Connect to XXTST user
-- Test Audit Policy
CREATE TABLE xx_test (vc_col VARCHAR2(25));
INSERT INTO xx_test ('Test 01');
COMMIT;
TRUNCATE TABLE xx_test;
DROP TABLE xx_test;

-- Verify audit trail
SELECT * FROM unified_audit_trail ORDER BY event_timestamp DESC;
</code>
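As an editorial aside (not from the original thread): the policy above audits the CREATE/ALTER/DROP ANY TABLE system privileges, which are not exercised when a user works on tables in their own schema, so one hedged thing to try is an action-based policy instead (policy name is illustrative):
<code>
CREATE AUDIT POLICY xx_ddl_action_policy
    ACTIONS CREATE TABLE, ALTER TABLE, DROP TABLE;

AUDIT POLICY xx_ddl_action_policy BY ADMIN, XXTST;
</code>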
Categories: DBA Blogs

Statistics Gathering using older version feature

Tom Kyte - Thu, 2023-11-23 20:06
Hello Connor/Chris, I am in the middle of troubleshooting a slow-running application maintenance job for a database that has been migrated/upgraded from on-premise 12.1.0.2 on linux x64 to Exadata Cloud @ Customer setup. Many of the "Top SQLs" that have regressed are internal SQLs generated by call to GATHER_TABLE_STATS call that is part of the application job. The internal SQLs are similar to below but for some reason are running 5-6 times slower on 19c database. I am running out of options to figure out what could be causing this but I was wondering whether there exists a way to replicate the behaviour of statistics gathering from 12.1.0.2 in 19c database. Example internal SQL that has regressed: <code>select /*+ full(t) no_parallel(t) no_parallel_index(t) dbms_stats cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring xmlindex_sel_idx_tbl opt_param('optimizer_inmemory_aware' 'false') no_substrb_pad */to_char(count("OWNER")),substrb(dump(min("OWNER"),16,0,64) ,1,240),substrb(dump(max("OWNER"),16,0,64),1,240), to_char(count("OBJECT_NAME")),substrb(dump(min("OBJECT_NAME"),16,0,64),1, 240),substrb(dump(max("OBJECT_NAME"),16,0,64),1,240), to_char(count("SUBOBJECT_NAME")),substrb(dump(min("SUBOBJECT_NAME"),16,0,64) ,1,240),substrb(dump(max("SUBOBJECT_NAME"),16,0,64),1,240), to_char(count("OBJECT_ID")),substrb(dump(min("OBJECT_ID"),16,0,64),1,240), substrb(dump(max("OBJECT_ID"),16,0,64),1,240), to_char(count("DATA_OBJECT_ID")),substrb(dump(min("DATA_OBJECT_ID"),16,0,64) ,1,240),substrb(dump(max("DATA_OBJECT_ID"),16,0,64),1,240), to_char(count("OBJECT_TYPE")),substrb(dump(min("OBJECT_TYPE"),16,0,64),1, 240),substrb(dump(max("OBJECT_TYPE"),16,0,64),1,240), to_char(count("CREATED")),substrb(dump(min("CREATED"),16,0,64),1,240), substrb(dump(max("CREATED"),16,0,64),1,240),to_char(count("LAST_DDL_TIME")), substrb(dump(min("LAST_DDL_TIME"),16,0,64),1,240), substrb(dump(max("LAST_DDL_TIME"),16,0,64),1,240),to_char(count("TIMESTAMP") ),substrb(dump(min("TIMESTAMP"),16,0,64),1,240), substrb(dump(max("TIMESTAMP"),16,0,64),1,240),to_char(count("STATUS")), substrb(dump(min("STATUS"),16,0,64),1,240),substrb(dump(max("STATUS"),16,0, 64),1,240),to_char(count("TEMPORARY")),substrb(dump(min("TEMPORARY"),16,0, 64),1,240),substrb(dump(max("TEMPORARY"),16,0,64),1,240), to_char(count("GENERATED")),substrb(dump(min("GENERATED"),16,0,64),1,240), substrb(dump(max("GENERATED"),16,0,64),1,240),to_char(count("SECONDARY")), substrb(dump(min("SECONDARY"),16,0,64),1,240),substrb(dump(max("SECONDARY"), 16,0,64),1,240),to_char(count("NAMESPACE")),substrb(dump(min("NAMESPACE"), 16,0,64),1,240),substrb(dump(max("NAMESPACE"),16,0,64),1,240), to_char(count("EDITION_NAME")),substrb(dump(min("EDITION_NAME"),16,0,64),1, 240),substrb(dump(max("EDITION_NAME"),16,0,64),1,240), to_char(count("SHARING")),substrb(dump(min("SHARING"),16,0,64),1,240), substrb(dump(m...
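As an editorial aside (not from the original question), a hedged first check is to compare the DBMS_STATS preferences in force on the two systems, since defaults and behaviour can differ between releases; the owner and table names below are placeholders:
<code>
SELECT dbms_stats.get_prefs('ESTIMATE_PERCENT',          'APP_OWNER', 'BIG_TABLE') AS estimate_percent,
       dbms_stats.get_prefs('METHOD_OPT',                'APP_OWNER', 'BIG_TABLE') AS method_opt,
       dbms_stats.get_prefs('APPROXIMATE_NDV_ALGORITHM', 'APP_OWNER', 'BIG_TABLE') AS approx_ndv,
       dbms_stats.get_prefs('DEGREE',                    'APP_OWNER', 'BIG_TABLE') AS degree
FROM   dual;
</code>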
Categories: DBA Blogs

503 Service Unavailable - ORA-28001: the password has expired

Tom Kyte - Tue, 2023-11-21 13:26
Hi sir, I got this message when I try to run my app: "503 Service Unavailable 2023-11-15T09:24:51.336389300Z | 8-GbuZcF3_cnIHgb1ul5Kg | ORDS was unable to make a connection to the database. This can occur if the database is unavailable, the maximum number of sessions has been reached or the pool is not correctly configured. The connection pool named: |default|lo| had the following error(s): ORA-28001: the password has expired". How can I solve it? Thanks.
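As an editorial aside (not from the original thread), a hedged sketch of the usual fix: reset the expired password for the account the ORDS connection pool uses (ORDS_PUBLIC_USER below is an assumption; check your pool configuration), then update the password stored in the ORDS pool configuration for that connection.
<code>
-- as a DBA, reset the expired account used by the ORDS pool (account name is an assumption)
ALTER USER ords_public_user IDENTIFIED BY "NewStrongPassword#2023" ACCOUNT UNLOCK;

-- optionally stop the password from expiring again (profile name is illustrative)
ALTER PROFILE default LIMIT PASSWORD_LIFE_TIME UNLIMITED;
</code>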
Categories: DBA Blogs
