Home » Server Options » Text & interMedia » extract information from an unstructured email into tables (RDBMS 11R2)
extract information from an unstructured email into tables [message #620976] |
Thu, 07 August 2014 10:28 |
|
ajnewbs
Messages: 8 Registered: June 2014 Location: UK
|
Junior Member |
|
|
I am looking for help on something that may not even be possible but it is something I am sure someone must have tried before.
We want to save some manual data entry time on our support system, we want to be able to pick various items of information out of our customer emails and save the data in specific database fields which make up our support case - these emails are not structured in any way and therefore it seems an impossible task, but someone must be doing this somewhere and I don't know where to start looking - are there utilities already available or?
|
|
|
Re: extract information from an unstructured email into tables [message #620980 is a reply to message #620976] |
Thu, 07 August 2014 10:37 |
Lalit Kumar B
Messages: 3174 Registered: May 2013 Location: World Wide on the Web
|
Senior Member |
|
|
You posted two consecutive messages.
Anyway, with the information you posted, it is not clear about how do you handle the emails to read it's body content. Where are these emails residing? DB or email client inbox or web based email inbox? What process do you use for data entry? And finally, when you say "utilities", are you looking for a software for this?
|
|
|
|
|
Re: extract information from an unstructured email into tables [message #620986 is a reply to message #620983] |
Thu, 07 August 2014 10:44 |
|
Michel Cadot
Messages: 68712 Registered: March 2007 Location: Saint-Maur, France, https...
|
Senior Member Account Moderator |
|
|
ajnewbs wrote on Thu, 07 August 2014 17:43Thank you for replying - I am just looking for general guidance on whether it is possible. Assume that we have already extracted the text from an email into a varchar2 field in the database - my use of email as an example was to illustrate the source of the text.
Basically we want to scan a load of text and identify the individual bits of information that we can load into our specific data fields, for example product , country, version, problem description, severity etc., etc....
Good luck to write your mail analyzer but this has nothing to do with the database.
|
|
|
|
|
|
|
|
|
|
Re: extract information from an unstructured email into tables [message #621038 is a reply to message #621037] |
Thu, 07 August 2014 17:00 |
|
Barbara Boehmer
Messages: 9100 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
Here is an extraction example to get you started.
SCOTT@orcl12c> -- table containing emails:
SCOTT@orcl12c> CREATE TABLE customer_emails
2 (emailid NUMBER,
3 email VARCHAR2(4000))
4 /
Table created.
SCOTT@orcl12c> INSERT ALL
2 INTO customer_emails VALUES
3 (1, 'I have a minor problem with Oracle 12c in the USA.')
4 INTO customer_emails VALUES
5 (2, 'I am in the UK and my Microsoft smartscreen filter is causing a major problem.')
6 SELECT * FROM DUAL
7 /
2 rows created.
SCOTT@orcl12c> -- table of categories to load email data into:
SCOTT@orcl12c> CREATE TABLE email_categories
2 (emailid NUMBER,
3 product VARCHAR2(15),
4 country VARCHAR2(15),
5 version VARCHAR2(15),
6 severity VARCHAR2(15))
7 /
Table created.
SCOTT@orcl12c> -- policy and rules you create to determine what data goes into what categories:
SCOTT@orcl12c> BEGIN
2 CTX_ENTITY.CREATE_EXTRACT_POLICY ('email_pol');
3 CTX_ENTITY.ADD_EXTRACT_RULE
4 ('email_pol',
5 1,
6 '<rule>
7 <expression>(Oracle|Microsoft)</expression>
8 <type refid="1">product</type>
9 </rule>');
10 CTX_ENTITY.ADD_EXTRACT_RULE
11 ('email_pol',
12 2,
13 '<rule>
14 <expression>(UK|USA)</expression>
15 <type refid="1">country</type>
16 </rule>');
17 CTX_ENTITY.ADD_EXTRACT_RULE
18 ('email_pol',
19 3,
20 '<rule>
21 <expression>(11g|12c)</expression>
22 <type refid="1">xversion</type>
23 </rule>');
24 CTX_ENTITY.ADD_EXTRACT_RULE
25 ('email_pol',
26 4,
27 '<rule>
28 <expression>(major|minor)</expression>
29 <type refid="1">xseverity</type>
30 </rule>');
31 CTX_ENTITY.COMPILE ('email_pol');
32 END;
33 /
PL/SQL procedure successfully completed.
SCOTT@orcl12c> -- parse email data and insert into categories:
SCOTT@orcl12c> DECLARE
2 v_entities CLOB;
3 BEGIN
4 FOR r IN (SELECT * FROM customer_emails) LOOP
5 CTX_ENTITY.EXTRACT ('email_pol', r.email, NULL, v_entities);
6 INSERT INTO email_categories (emailid, product, country, version, severity)
7 SELECT r.emailid,
8 MAX (DECODE (foo.type, 'product', foo.text)),
9 MAX (DECODE (foo.type, 'country', foo.text)),
10 MAX (DECODE (foo.type, 'xversion', foo.text)),
11 MAX (DECODE (foo.type, 'xseverity', foo.text))
12 FROM XMLTABLE
13 ('/entities/entity'
14 PASSING XMLTYPE (v_entities)
15 COLUMNS
16 offset NUMBER PATH '@offset',
17 lngth NUMBER PATH '@length',
18 text VARCHAR2(50) PATH 'text/text()',
19 type VARCHAR2(50) PATH 'type/text()',
20 source VARCHAR2(50) PATH '@source') AS foo
21 GROUP BY r.emailid;
22 END LOOP;
23 END;
24 /
PL/SQL procedure successfully completed.
SCOTT@orcl12c> -- results:
SCOTT@orcl12c> SELECT * FROM email_categories
2 /
EMAILID PRODUCT COUNTRY VERSION SEVERITY
---------- --------------- --------------- --------------- ---------------
1 Oracle USA 12c minor
2 Microsoft UK major
2 rows selected.
|
|
|
Goto Forum:
Current Time: Tue Nov 12 19:12:36 CST 2024
|