Tuesday, June 30, 2009

Marketing Server Optimization on Teradata

The purpose of this blog is to help and understand the techniques that may be implemented for optimizing Marketing Server

Let us understand some marketing metadata terms before we get into steps for implementing the optimizing techniques

1. Target Level A Target Level is the entity that a marketer is interested in counting. Target Levels are usually customer types such as individuals, businesses, or households

2. Segmentation Catalog A segmentation catalog is a Oracle BI subject area (presentation catalog) that is enabled for segmentation. The segmentation catalog provides a set of dimensions and facts that can be used to create segment criteria

3. Sampling Factor Sampling is a subset table for a target level dimension table that contains the same set of columns, but only a percentage of the data

4. List Catalog A list catalog is a Oracle BI subject area (presentation catalog) that is enabled for List Format design (list generation)

5. Qualified List Item
A Qualified List Item is entity that is evaluated against segment criteria so that the information related to that entity can be exported in a list file

6. Caching
Segmentation criteria blocks that count target level identifiers may be used by a user frequently

7. Saved Result Sets
The resulting set of target level identifiers of complex segmentation criteria may be saved permanently

Lets us look at the steps to enable sampling for a Target Level

The purpose to enable sampling is to improve user response time of running Marketing queries for segments and segment trees. While creating and debugging segments and segment trees, running the queries would take a long time to complete if the entire data set is used. Use sampling for these purposes to decrease the query completion time because a smaller subset of the data set is used.

1. Create physical sampling tables
2. Setup session variable
3. Map sampling tables to Target Level

Create Physical Sampling Tables

Sampling must be enabled at a Target level

• First identify the desired target level and sampling factoro For this example the desired Target Level to be sampled is Contacts
• Generate DDLs for all physical dimension and fact tables to be sampled Example : CREATE TABLE AS ( SELECT * FROM SAMPLE RANDOMIZED ALLOCATION ) WITH DATA
• Rename the physical table and index names to include the target level and sampling factor. For example, if the original table name was W_PERSON_D. A sampled table name for a 10% sample of Contacts could be M_10C_PERSON_D
• Populate M_10C_PERSON_D with 10% of W_PERSON_D.
• Cascade the 10% Contacts join relationship to the remaining sampled tables. For example, to populate a campaign history sample table, you would include all fact records that join to the 10% sample of Contact records

Map the sampling tables into the physical layer of metadata

Setup Session Variables

Select Manage > Variables > Initialization Block > New Initialization Block

1. Only one initialization block needs to be setup to enable sampling eg: Marketing Sampling Tables with session variables associated. The session variables must be created with Default Initializer as the name of target level table







2. Click Manage--> Marketing and then Double Click on the Targeting Level “Customers”. Click on the “Sampling Tables” tab
3. Click on the “Add” button.Fill in the name Sampled Physical table in the “Sample Physical Table Name” field. In the “Repository Table Object” field, fill in the name of the Target Level Table. In the “Factor” field, fill in the number that represents the following calculation: (# of rows in the sample table *100)/# of rows in the Target Level Table.Repeat this step for every sampling table for this Target Level table








DDL for Marketing Cache tables in Teradata

Create the Marketing cache tables for each Target Level using the following DDL to improve poor array insert performance into Teradata and its ODBC driver. The schema scripts are in Root Drive:\OracleBI\server\Schema path
CREATE MULTISET TABLE M_C_CONTACT ( GUID varchar(50) NOT NULL, QUALIFIED_ID varchar(30) NOT NULL ) PRIMARY INDEX (GUID, QUALIFIED_ID);

Model Cache Tables in RPD

Import the created cache physical tables into the physical layer. This table would not have any physical joins







Create the logical model of the cache table as Fact - Marketing - Segmentation Person Cache with M_C_CONTACT source table renamed as MKTG_CACHE_PERSON and QUALIFIED_ID column renamed as Person ID







Create complex join for this table to Dim-Contact








Create a presentation catalog Marketing Segmentation Cache and Saved Results and expose these columns into - Contact Cache sub folder








Create Save Result Set Table DDL for Teradata

If the saved result set feature is to be used, create the Marketing saved result set tables for each Target Level using the following DDL for improved performance in Teradata. The script is under same path as specified above
CREATE MULTISET TABLE M_SR_CONTACT ( GUID varchar(50) NOT NULL, TARGET_LEVEL_ID varchar(30) NOT NULL ) PRIMARY INDEX (GUID, TARGET_LEVEL_ID);

Model Save Result Set Tables in RPD

Follow similar steps as above to model the M_SR_CONTACT table aliased as M_SR_HEADER(Contact)





Associate the above mapping to the target level in RPD

1. Click Manage--> Marketing

2. Double click on Contacts QLI







3. Click on Cache Information Tab













4. Click on Cache Catalog ellipsis button and map to the Marketing Segmentation Cache and Saved Results presentation catalog

5. Similarly map GUID, Qualified ID presentation columns

6. Click on Default SQL button to populate the Physical SQL query

7. When a criteria block is cached, the populate stored procedure is used to write the cache/saved result set to the database. All segmentation users/groups need to be given this privilege via Manage Security->Users/Groups->Permissions->Query Limits->Populate Privilege, set to ‘Allow’
8. Under Manage--> Marketing. Click on Target Levels in the left pane and double click on Contacts Target Level








9. Select Saved Results Tab and add the names for Saved Results catalog,GUID Column, Target ID Column , Physical table Name, Connection Pool Name as in below figure















10. Click on Default SQL’s button to populate SQLs

Enable Caching for a Target Level

The purpose of these steps is to correctly setup caching to improve poor array insert performance into Teradata and its ODBC driver.

Set Teradata Database Physical Repository Object Properties

- Double-click the Teradata database physical repository object
- Select the Features tab
- Add the following value for MAX_PARAMETERS_PER_BULK_OPERATION = 128














Set Connection Pool Object for Selects from Teradata

- Each database physical repository object has 1 connection pool object defined by default
- This default connection pool object will be used for selects
- Set the Isolation Level to Dirty Read
- Select the Write Back tab- Buffer Size (KB) = 64
- Transaction Boundary = 20 – 50

The Transaction Boundary range stated above provides a general estimate of the value that Will yield the best array insertion throughput. Setting the Transaction Boundary within the range should generate about 650 records per second. The best Transaction Boundary value will be a function of the Teradata system used and its configuration










Following the same steps above create Connection Pool Object used for Inserting into Cache Tables in Teradata

- Add a new connection pool object to the Teradata database physical repository object
- Set the new connection pool object’s data source to the same data source used in the select

Connection pool

- Set the Isolation Level to Default

Enable Marketing Server to use Temp Table Caching

Once the caching is setup for Target Levels according to the previous section, enable Marketing Server to use temp table caching to avoid potential Teradata deadlock errors while inserting into the cache tables.

1. Edit /Web/config/instanceconfig.xml

2. Add the following to the instanceconfig.xml within the ServerInstance tag:

CREATE MULTISET TABLE @{TempTableName} (GUID varchar(50) NOT NULL, QUALIFIED_ID varchar(30) NOT NULL) PRIMARY INDEX (GUID, QUALIFIED_ID) INSERT INTO @{PhysicalTableName} (GUID,QUALIFIED_ID) Select GUID, QUALIFIED_ID from @{TempTableName} DROP TABLE @{TempTableName}
Important Note:

• The SQL used must be valid SQL for the database the Marketing Server is using which is Teradata in this case
• Use @{TempTableName} in the MktgCacheTempCreateSQL tag as a generic temp cache table variable name for the temp cache table DDL
• Use @{TempTableName} in the MktgCacheTempDropSQL tag as a generic temp cache table variable name for the drop temp table SQL
• Use @{TempTableName} in the MktgCacheTempInsertSQL tag as a generic temp cache table variable name for the insert-into-select SQL• Use @{PhysicalTableName} in the MktgCacheTempInsertSQL tag as a generic physical cache table variable name for the insert-into-select SQL
• The DDL specified in the MktgCacheTempCreateSQL tag must match the Marketing cache table DDL
• The column names specified in the insert-into-select SQL in the MktgCacheTempInsertSQL tag must be consistent with the columns in the cache table and temp tables

3. Save instanceconfig.xml

4. Start OracleBI Presentation Services

2 comments: