SAP BODS Tutorial for Beginners

What is SAP BODS?

SAP BODS (SAP BusinessObjects Data Services) is an ETL tool that delivers a single enterprise-class solution for data integration, data quality, and data profiling. It lets you integrate, transform, improve, and deliver trusted data that supports important business processes and enables sound decisions.

SAP BODS combines data quality and data integration into one platform. It provides a single environment for development, run-time, management, security, and data connectivity.

The objects and functions within BODS are designed to manipulate and transform large and complex data sets efficiently.

  • Data Integration- Data integration is the procedure of retrieving data from different sources and combining it in such a way that it can produce consistent, comprehensive and correct information for business reporting and analysis.
  • Data Quality- An interactive method of measuring data from different perspectives is called Data Quality. Data quality components include consistency, integrity, and accuracy.
  • Data Profiling- This is the process of reviewing source data, understanding its structure, content, and interrelationships, and identifying its potential for data projects. It is also known as data assessment or data quality analysis.
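
To make the idea of data profiling concrete, here is a minimal Python sketch. It is not BODS code and does not use any SAP API; the function name `profile` and the sample rows are made up for illustration. It reviews a small set of source rows and reports, per column, how many values are present, missing, and distinct.

```python
def profile(rows):
    """Return per-column row count, null count, and distinct non-null count."""
    # Collect every column name that appears in any row.
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None and empty strings as missing values.
        non_null = [v for v in values if v is not None and v != ""]
        report[col] = {
            "rows": len(values),
            "nulls": len(values) - len(non_null),
            "distinct": len(set(non_null)),
        }
    return report


# Hypothetical source rows; the second row has a missing country.
sample = [
    {"id": 1, "country": "DE"},
    {"id": 2, "country": ""},
    {"id": 3, "country": "DE"},
]
country_profile = profile(sample)["country"]
```

A report like this is typically the starting point for deciding which columns need cleansing before they are loaded.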

The tool is used for tasks such as:

  • ETL- ETL is short for Extraction, Transformation, and Loading. ETL pulls data from a database, table, or system, applies changes or programming logic to modify and enhance the extracted data, and loads the result into another system, database, or table. For example, extract, transform, and load data from SQL Server to Oracle.
  • Data Warehousing- A data warehouse is a database designed and developed in a particular format to enable easy data analysis and reporting.
  • Data Migration- Moving data from one place to another is called Data Migration. It is a subset of ETL, and it can also involve modification and alteration of the data.
  • Business Intelligence- This is applied to analyze an organization's data in order to improve business performance. The concept combines data warehousing and reporting.
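
The three ETL steps can be sketched in a few lines of Python. This is only an illustration of the concept, not how BODS implements it: two in-memory SQLite databases stand in for the source and target systems, and the table names `orders_src` and `orders_tgt` are hypothetical.

```python
import sqlite3


def extract(source):
    """Extract: read the raw rows from the source table."""
    return source.execute("SELECT id, name, amount FROM orders_src").fetchall()


def transform(rows):
    """Transform: trim and upper-case names, round amounts to 2 decimals."""
    return [(row_id, name.strip().upper(), round(amount, 2))
            for row_id, name, amount in rows]


def load(target, rows):
    """Load: write the transformed rows into the target table."""
    target.executemany("INSERT INTO orders_tgt VALUES (?, ?, ?)", rows)
    target.commit()


def run_etl():
    # Two in-memory databases stand in for the source and target systems.
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE orders_src (id INTEGER, name TEXT, amount REAL)")
    source.executemany("INSERT INTO orders_src VALUES (?, ?, ?)",
                       [(1, " alice ", 10.456), (2, "bob", 20.0)])
    target.execute("CREATE TABLE orders_tgt (id INTEGER, name TEXT, amount REAL)")

    load(target, transform(extract(source)))
    return target.execute(
        "SELECT id, name, amount FROM orders_tgt ORDER BY id").fetchall()


loaded_rows = run_etl()
```

Keeping extract, transform, and load as separate functions mirrors the way an ETL tool separates sources, transforms, and targets.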

What is the use of BODS?

It provides a GUI that lets us efficiently build jobs that extract data from various sources, transform that data to meet the business requirements of an organization, and load it into a single place.

Data Services Components

  • Designer
  • Repository
  • Job Server
  • Engine
  • Access Server
  • Adapters
  • Real-time services
  • Central Management Console (CMC)
  • Address Server
  • Cleansing Packages, Dictionaries, and Directories

Designer

Designer is a development tool that provides a graphical interface. It allows you to create, test, execute, and debug BODS jobs, and to define data mappings, transformations, and control logic. Developers create objects, then drag, drop, and configure them by selecting icons in a source-to-target flow diagram.

Repository

A repository is a database-backed store that holds Designer's predefined system objects and user-defined objects, including source and target metadata and transformation rules. The local repository is required for BODS to function. There are three types of repository:

  • Local Repository
  • Central Repository
  • Profiler Repository

Job Server

The Job Server launches the Data Services processing engine and serves as an interface between the engine and the other components in the Data Services suite. It executes the real-time and batch jobs that you create, and it uses multi-threading and parallel processing to optimize performance.

Engine

The BODS engine executes the individual jobs created in the Designer. When you execute a job, the Data Services Job Server launches enough engines to accomplish the defined tasks effectively.

Access Server

The Access Server passes messages between web applications and the Data Services Job Server and engines. In other words, it takes message requests, routes them to a real-time service, and returns a response within a specified time frame. It is also known as a real-time message broker system.

Central Management Console

This is a web-based administration tool for BODS, used for basic functions such as repository registration and user management.

Address Server

The Address Server must be started before processing data flows that contain the Global Address Cleanse or Global Suggestion List transform with the EMEA engine enabled. It provides address validation and correction.

Adapters

An adapter allows you to import application metadata into a repository. SAP provides an Adapter Software Development Kit (SDK) that can be used to develop custom adapters. These adapters are exposed in the Data Services Designer through adapter datastores.

Real-time Services

Real-time services extract data from the body of the real-time message received and from any secondary sources used in the job.

Cleansing Packages

Cleansing packages are used by the Data Cleanse transform in Data Services to parse data. They improve the ability of Data Cleanse to process various forms of global data accurately.

SAP BODS Architecture

The BODS architecture has the following three layers:

  • Web Application Layer
  • Database Server Layer
  • Data Services Layer

Source Layer

The source layer contains different data sources, such as SAP applications and non-SAP RDBMS systems, and data integration takes place in the staging area.

SAP BODS includes different components such as the Data Services Server Manager and Workbench. The target system can be a data warehouse system such as SAP HANA or SAP BW, or a non-SAP data warehouse system.

BODS Objects

Everything that is used in BODS Designer is called an object. Objects such as projects, jobs, metadata, and system functions are kept in the local object library. Every object has the following:

  • Properties- These describe an object and do not affect its operation.
  • Options- These control the operation of an object.

Types of objects

There are two types of objects in the system- Single Use Objects and Reusable Objects.

  • Single Use Objects- Objects that are defined specifically for one job or data flow are known as Single Use Objects.
  • Reusable Objects- You reuse an object by creating calls to its definition. Each reusable object has only one definition, and all calls to the object refer to that definition. If the definition of an object is changed in one place, the change applies everywhere that object appears. The object library holds the object definitions. If you drag and drop an object from the library, a new reference to the existing object is created. Jobs, workflows, and dataflows are reusable objects in Data Services.
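
The "one definition, many references" behavior of reusable objects can be illustrated with a short Python sketch. The classes `DataflowDefinition` and `ObjectLibrary` and the method `drag_and_drop` are hypothetical names invented for this example; they only model the idea and are not part of any SAP API.

```python
class DataflowDefinition:
    """Hypothetical stand-in for the single definition of a reusable object."""

    def __init__(self, name, description):
        self.name = name
        self.description = description


class ObjectLibrary:
    """Holds exactly one definition per object name."""

    def __init__(self):
        self._definitions = {}

    def add(self, definition):
        self._definitions[definition.name] = definition

    def drag_and_drop(self, name):
        # Returns a reference to the existing definition, never a copy.
        return self._definitions[name]


library = ObjectLibrary()
library.add(DataflowDefinition("DF_Load_Customers", "initial version"))

# Two different jobs place a call to the same dataflow.
call_in_job_a = library.drag_and_drop("DF_Load_Customers")
call_in_job_b = library.drag_and_drop("DF_Load_Customers")

# Changing the definition through one call is visible through the other.
call_in_job_a.description = "adds address validation"
```

After the last line, `call_in_job_b.description` also reads "adds address validation", because both calls point at the same definition.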

Job- A job is the smallest unit of work that you can schedule independently for execution.

Workflow- A workflow defines the order in which dataflows and other steps are executed within a job. By automating this sequence, it reduces manual processing, shortens processing time, and improves the quality of the work.

Dataflow- A dataflow extracts, transforms, and loads data from the source to the target system. All transformations occur in the dataflow.
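
The relationship between jobs, workflows, and dataflows can be modeled as a simple containment hierarchy. The following Python sketch is an assumption-laden illustration, not BODS code: the classes and the object names (`JOB_Sales`, `WF_Stage`, and so on) are hypothetical, and "running" a dataflow here just records its name.

```python
from dataclasses import dataclass, field


@dataclass
class Dataflow:
    name: str

    def run(self, log):
        # A real dataflow would extract, transform, and load data here.
        log.append(self.name)


@dataclass
class Workflow:
    name: str
    steps: list = field(default_factory=list)  # dataflows or nested workflows

    def run(self, log):
        # A workflow runs its steps in the order they were defined.
        for step in self.steps:
            step.run(log)


@dataclass
class Job:
    name: str
    steps: list = field(default_factory=list)

    def run(self):
        log = []
        for step in self.steps:
            step.run(log)
        return log


job = Job("JOB_Sales", [
    Workflow("WF_Stage", [Dataflow("DF_Extract"), Dataflow("DF_Clean")]),
    Dataflow("DF_Load"),
])
execution_order = job.run()
```

The job is the unit that gets executed; the workflow only controls ordering, and the dataflows are where the data actually moves.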

SAP BODS Object Hierarchy

Transform

Transforms are built-in system objects stored in the repository, which are used whenever we want to transform data from source to target. A transform enables you to control how datasets change in the data flow. Transforms are available in the object library, which provides access to all repository objects (built-in or user-built). Transforms are classified into four categories:

  • Data Integrator
  • Data Quality
  • Platform
  • Text Data Processing

Script

A script is used to call functions and assign values to variables in a workflow. It is a single-use object.

Conditionals

Conditionals are optional objects that can be added to a workflow. They allow us to apply If/Then/Else logic to workflows.
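
As a rough illustration of If/Then/Else logic in a workflow, the Python sketch below runs one of two branches depending on a condition. The helper `run_conditional`, the step functions, and the `is_initial_run` flag are all hypothetical; in BODS the same decision would be configured graphically in a conditional object.

```python
def run_conditional(condition, then_steps, else_steps=()):
    """Run only the branch that matches the condition, in order."""
    branch = then_steps if condition else else_steps
    return [step() for step in branch]


def full_load():
    return "full load"


def delta_load():
    return "delta load"


# Hypothetical flag: True only on the very first run of the job.
is_initial_run = False
result = run_conditional(is_initial_run, [full_load], [delta_load])
```

Because the Else branch is optional, calling `run_conditional` with only a Then branch and a false condition simply runs nothing.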

Types of Datastore

There are three types of datastore in Data Services:

  • Database Datastore: It is used to import metadata directly from RDBMS in a simple way.
  • Application Datastore: It is used to import metadata from most Enterprise Resource Planning (ERP) systems.
  • Adapter Datastore: Provides access to an application’s data and metadata.

What is Data Cleansing?

Data cleansing is the process of identifying unwanted or inaccurate records in a data set, table, or database, and then correcting, remodeling, or removing the dirty data. It improves data consistency.
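
A minimal Python sketch of the idea follows. The cleansing rules used here (normalize whitespace and case, drop rows without an email address, remove duplicates by email) are assumptions chosen for the example and are far simpler than what the BODS Data Quality transforms do.

```python
def cleanse(records):
    """Normalize names and emails, drop rows without an email, de-duplicate."""
    seen = set()
    clean = []
    for rec in records:
        # Correct: collapse repeated spaces and fix capitalization.
        name = " ".join(rec.get("name", "").split()).title()
        email = rec.get("email", "").strip().lower()
        # Remove: skip unwanted records (no email, or already seen).
        if not email or email in seen:
            continue
        seen.add(email)
        clean.append({"name": name, "email": email})
    return clean


# Hypothetical dirty input: inconsistent case, a duplicate, a missing email.
dirty = [
    {"name": "  john   SMITH", "email": "John@Example.com "},
    {"name": "John Smith", "email": "john@example.com"},
    {"name": "No Mail", "email": ""},
]
cleaned = cleanse(dirty)
```

Of the three input rows, only one survives: the duplicate and the row without an email are removed, and the remaining row is normalized.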

BODS Naming Standards

Naming standards make it easy to recognize objects in the repository. It is good practice to use the same naming standard for all objects in all systems. The table below lists the recommended naming standards for jobs and other objects.

Prefix    Suffix        Object
DF_       n/a           Data flow
EDF_      _Input        Embedded data flow
EDF_      _Output       Embedded data flow
RTJob_    n/a           Real-time job
WF_       n/a           Workflow
JOB_      n/a           Job
n/a       _DS           Datastore
DC_       n/a           Datastore configuration
SC_       n/a           System configuration
n/a       _Memory_DS    Memory datastore
PROC_     n/a           Stored procedure
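
A naming standard like this is easy to check automatically. The Python sketch below encodes a subset of the rows above as prefix and suffix rules and tests object names against them. The function `follows_standard`, the rule keys, and the example object names are hypothetical, and the embedded data flow prefix is assumed to be `EDF_` in upper case for both rows.

```python
# Object type -> (required prefix, required suffix); "" means no requirement.
NAMING_RULES = {
    "data flow": ("DF_", ""),
    "embedded data flow input": ("EDF_", "_Input"),
    "embedded data flow output": ("EDF_", "_Output"),
    "real-time job": ("RTJob_", ""),
    "workflow": ("WF_", ""),
    "job": ("JOB_", ""),
    "datastore": ("", "_DS"),
    "system configuration": ("SC_", ""),
    "memory datastore": ("", "_Memory_DS"),
    "stored procedure": ("PROC_", ""),
}


def follows_standard(object_type, name):
    """Return True if name has the prefix and suffix required for its type."""
    prefix, suffix = NAMING_RULES[object_type]
    return name.startswith(prefix) and name.endswith(suffix)


checks = {
    "DF_Load_Customers": follows_standard("data flow", "DF_Load_Customers"),
    "Sales_DS": follows_standard("datastore", "Sales_DS"),
    "Load_Sales": follows_standard("job", "Load_Sales"),
}
```

Here `Load_Sales` fails the check because a job name is expected to start with `JOB_`.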

Embedded dataflow

An embedded dataflow is a dataflow that is called from another dataflow in the design. It can contain any number of sources and targets, but only one input or one output passes data to the main dataflow.

Types of Embedded Dataflow

  • One Input- The embedded dataflow is added at the end of a dataflow.
  • One Output- The embedded dataflow is added at the beginning of a dataflow.
  • No input or output- The embedded dataflow is a copy of an existing dataflow.
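
The first two types can be pictured as ordinary functions called from a main dataflow. This Python sketch is only an analogy under that assumption: all function names and the sample rows are hypothetical, and a list stands in for the target table.

```python
def edf_read_customers_output():
    """One Output: runs at the beginning and passes rows to the main dataflow."""
    return [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]


def edf_write_customers_input(rows, target):
    """One Input: runs at the end and receives rows from the main dataflow."""
    target.extend(rows)
    return len(rows)


def df_main(target):
    # Embedded dataflow with one output feeds the main dataflow.
    rows = edf_read_customers_output()
    # Transformation performed by the main dataflow itself.
    rows = [{**row, "name": row["name"].title()} for row in rows]
    # Embedded dataflow with one input consumes the result.
    return edf_write_customers_input(rows, target)


target_table = []
rows_written = df_main(target_table)
```

Each embedded dataflow exchanges data with the main dataflow through exactly one connection, which matches the single input or single output restriction described above.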