In the first part of this series I described the integration of the Microsoft reporting layer directly on SAP BW. The second part shows how to integrate with Microsoft OLAP and SAP BW. The following pictures highlights the different layers.
Ok, so let’s have a look into the details.
The data extraction barrier
Connecting a Analysis Services Cube directly on a SAP BW as data source is not a good idea. We would suffer much on performance. So the data has to be extracted out of SAP BW to a SQL Server first, but how? There are different possibilities we have with SQL Server Integration Services:
- In general SAP doesn’t make our life easy if we want to extract data from BW. There is no good extraction support in general for other destinations than SAP and Customer need an Open Hub License in order to store SAP BW data outside of SAP not matter how it is extracted.
SQL Server Feature Pack and Theobald Software
So let’s have a closer look on the extraction possibilities. No matter if you choose the SQL Server Feature Pack functionality or another 3rd party software they all have the same problem and this is the mechanism how SAP supports data extraction. The following picture show an overview about the possibilities you have with Theobald Software:
I want to focus on the Open Hub Service (OHS) functionality this is also the way how the SQL Server Feature Pack supports data extraction. Open Hub Services in SAP are a way to model extraction tables (OHS tables) and fill them with data for relevant for extraction. All BW data objects are supported for extraction and the extraction process supports initial and delta loads. There is a very good whitepaper on how to use OHS written by Hermann Däubler which everybody should read before to start working with OHS.
If you are now using the SQL Server Feature Pack connectors or Theobald Software to connect to the OHS tables SAP data is always extracted via the application layer of SAP. What does it mean? When you start the extraction process via OHS the data is read from the SAP BW database, loaded into the application memory of SAP and is sent out in small data packages (like 50.000 rows per package). Then the next package is read from the database, loaded into memory and sent out to Integration Service until the whole data is extracted from the OHS table. I started a test case on a SAP test system and want to give you some numbers on performance:
- extraction of 23 million records (22 GB of data) took about 30 hours!!!
- 50% of overall execution time was waiting for data providing by SAP
This approach doesn’t really scale for large data extractions like for a Data Warehouse, where you would really suffer on performance. So what are the options?
Direct SAP BW database access
The use of the Open Hub Services within SAP to provide SAP data in extraction tables is fine and works well. The only problem is how to connect to the OHS tables in a faster way? It is allowed by SAP to directly connect to the underlying SAP database (like Oracle or SQL Server) via a normal OleDB provider and access the OHS tables directly. Accessing internal SAP BW tables is not allowed.
So we use Integration Services connecting to the OHS table via an OleDB provider and loading data directly from the database which gives us a really great performance. It is also possible to secure the access only to the OHS tables for the database user which access the SAP BW database. Here are the numbers for OleDB access of my test case:
- extraction of 23 million records (22 GB of data) took about 2 hours
- The overall throughput on my test system is still not the best but I want you to compare the numbers instead (30 hours compared to 2 hours).
- There is one important thing to add when using this approach. Because there is no coordination mechanism by default that tells you, that the SAP extract is finished into the OHS table and that the data can be used for extraction. So you need a communication service on top. This service can be implemented in SAP BW tracking some metadata like which export is available in which OHS tables, how many rows are provided by that export, is it a full or delta export, what is the RequestID to identify the extract … This service can be implemented via a BW ABAP function exposed as a web service. It can be also used to call SAP back and mark data ready for deletion after loading it into the Microsoft BI environment.
Feeding the Data Warehouse
After the data extraction barrier has been covered we can now start modeling the Data Warehouse (as I described here), implement it in Analysis Services and feed the cube with data from SAP BW.
What are your experiences with extracting data out of SAP BW? Are there any other solutions out there? If yes, please let me know.