Many SAP Data Services (DS) applications use SAP ECC data as a source. DS supports several mechanisms for extracting data from SAP ECC. Which one should you choose? There is often a preferred method, depending on the functionality required and the features available in the actual context of the source system. In this blog I discuss all the different options.
Start by defining a datastore for the SAP source system:
Figure 1: SAP datastore definition
The mandatory settings are the username and password, the name (or IP address) of the SAP application server, the client and the system (instance) number.
1/. "normal" data flow
When extracting from a single table, use a regular data flow. Import the table definition into the datastore (its metadata appears in the datastore browser) and use it as a source in a data flow.
An excerpt from the customer master data table KNA1 looks like this:
Figure 2: Data flow: KNA1 extraction
Note the little red triangle on the source table icon. It is an indication of a direct pull from the SAP data layer.
This is the easiest method. No special measures are required at the source system level. The extraction runs as an SAP dialog process. Note that there is a timeout setting (configurable in the system, typically 10, 30, or 60 minutes) for dialog processes. The DS job will fail if extracting the data takes longer.
Where-conditions are pushed down to the underlying database, which benefits job performance. Also make sure you extract only the columns you really need. The duration of the extraction process is linearly related to the amount of data transferred (number of records * average record size): the less data is transferred, the faster the DS job will run.
Figure 3: Query transformation: KNA1 extract, excluding obsolete records
Figure 4: Generated SQL code - where clause pushed down
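As a rough sketch, the pushed-down statement has this shape (the exact SQL depends on the underlying database and DS release; the deletion-flag filter is just an assumed example of a where condition):

```sql
-- Only the columns mapped in the query transformation are selected,
-- and the where clause travels down to the database with the SELECT.
SELECT MANDT,   -- client
       KUNNR,   -- customer number
       NAME1,   -- name
       LAND1,   -- country key
       ORT01    -- city
FROM   KNA1
WHERE  LOEVM = ' '   -- assumed example: skip records flagged for deletion
```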
This approach is particularly beneficial when implementing incremental loads. If your source table contains a column with a last-modified timestamp, you can easily implement source-based Change Data Capture (CDC). Keep track of the timestamps you used in the previous incremental extraction (use a control table for this), initialize the global variables with these values, and use them in the where clause of your query transformation.
Figure 5: Query transformation: pull newly created or modified records from MARA
Figure 6: Generated SQL Code - Source Based CDC
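A minimal sketch of this CDC pattern, assuming a hypothetical control table CDC_CONTROL and global variables $G_LAST_RUN and $G_THIS_RUN (all of these names are illustrative; ERSDA and LAEDA are MARA's creation and last-change date columns):

```sql
-- 1) Read the watermark of the previous run into a global variable
--    (e.g. via a sql() call in a DS script).
SELECT LAST_RUN_DATE
FROM   CDC_CONTROL
WHERE  TABLE_NAME = 'MARA';

-- 2) Use the variable in the where clause of the query transformation,
--    so only records created or changed since the last run are extracted.
SELECT MATNR, MTART, MATKL, ERSDA, LAEDA
FROM   MARA
WHERE  ERSDA > '$G_LAST_RUN'
   OR  LAEDA > '$G_LAST_RUN';

-- 3) After a successful load, move the watermark forward.
UPDATE CDC_CONTROL
SET    LAST_RUN_DATE = '$G_THIS_RUN'
WHERE  TABLE_NAME = 'MARA';
```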
2/. ABAP data flow
While where-conditions from a normal data flow are pushed down to the underlying database, joins are not (nor are sort and group operations!), often resulting in abominable performance, especially when dealing with large amounts of data.
If you want to extract material master data from MARA and supplement each record with the English material description from MAKT, you can create a data flow like this:
Figure 7: Data flow - MARA and MAKT extract
Figure 8: Query Transformation: Join MARA and MAKT with MANDT and MATNR Columns
Figure 9: Generated SQL code - join not pushed down
DS generates two SQL statements. The first extracts all current MARA records. Then, for each individual record, it retrieves the corresponding English description (MATNR = AIVariable_1 and MANDT = AIVariable_2). This approach results in as many round trips to the underlying database as there are records in the MARA table! It is only viable when dealing with smaller data sets.
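Roughly, the two generated statements look like this (a sketch; AIVariable_1/2 stand for the per-row bind parameters DS generates):

```sql
-- Statement 1: stream all MARA records through the data flow.
SELECT MANDT, MATNR, MTART
FROM   MARA;

-- Statement 2: executed once per MARA row, with that row's key values
-- bound to the parameters.
SELECT MAKTX
FROM   MAKT
WHERE  SPRAS = 'E'              -- English descriptions only
  AND  MATNR = :AIVariable_1    -- material number from the current MARA row
  AND  MANDT = :AIVariable_2;   -- client from the current MARA row
```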
You can improve performance by modifying the properties of the source table.
The default settings are:
Figure 10: Default source table properties
Making MARA the driving table (give it a higher join rank than MAKT) and caching MAKT in memory results in completely different generated SQL code, with no round trips to the database. The MARA table streams through the data flow, the MAKT table is cached, and the join is resolved in DS memory.
Figure 11: Changed source table properties
Figure 12: Generated SQL code: MAKT cache
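With these settings, the generated SQL reduces to two independent statements, roughly like this sketch:

```sql
-- Statement 1: fill the in-memory cache with the MAKT rows needed for
-- the join (executed once).
SELECT MANDT, MATNR, MAKTX
FROM   MAKT
WHERE  SPRAS = 'E';

-- Statement 2: stream the driving table; the join itself is resolved
-- against the cache inside the DS engine, not in the database.
SELECT MANDT, MATNR, MTART
FROM   MARA;
```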
The feasibility of this approach depends on two factors:
- The amount of memory available for caching
- The time it takes to cache the MAKT table
This works perfectly for smaller tables. But it's also not a good solution if MAKT and MARA are too big.
The recommended solution to extract a join from SAP tables is to use an ABAP dataflow.
Figure 13: ABAP data flow - MARA and MAKT extract
DS generates ABAP code that conforms to the source table properties and the data flow logic. The table with the highest join rank becomes the driving table. In this case, too, the duration of the extraction process depends linearly on the amount of data: the less data is transferred, the faster the DS job will run.
Figure 14: Generated ABAP code
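DS emits ABAP here, not SQL, but the logic the generated program resolves on the SAP side corresponds to this single join (an illustrative equivalent, not the generated code itself):

```sql
SELECT m.MATNR,          -- material number
       m.MTART,          -- material type
       t.MAKTX           -- material description
FROM   MARA m
JOIN   MAKT t
  ON   t.MANDT = m.MANDT
 AND   t.MATNR = m.MATNR
WHERE  t.SPRAS = 'E';    -- English descriptions only
```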
The ABAP code is sent to the SAP system and is executed there. Only the results of the program are returned to DS for further processing. This approach only works if the SAP system is open for development! Also make sure of the following:
- The ABAP execution option is set to generate_and_execute.
- The Execute in background (batch) property is set to Yes, to avoid timeouts in SAP dialog processes.
- The data transfer method is set to RFC. The RFC destination must be defined in the SAP system. The other data transfer methods exist for compatibility reasons only and should no longer be used; they all result in lower performance.
Figure 15: SAP datastore definition
To run the same DS job on the non-development tiers of your landscape, first transport the ABAP program from DEV to TST, PRD... Set the ABAP execution option to execute_preloaded and run the DS job. It will not regenerate the ABAP code but execute the transported program.
ABAP data flows are also a convenient solution for implementing incremental loads for ECC tables that do not contain a last-modified timestamp column. Insertions and modifications in KNA1 are recorded in the change-log tables CDHDR and CDPOS. Use an ABAP data flow to join KNA1 to these tables. Make sure CDHDR gets the highest join rank and KNA1 the lowest, to get the most efficient generated code. And include where clauses to (see the sketch after this list):
- Filter on current customers only
- Pick up recently modified records only
- Select the correct entries from the log tables
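In SQL terms, the join and filters correspond roughly to this sketch (object class 'DEBI' is the change-document class for customer masters; the OBJECTID-to-KUNNR mapping and the variable name are simplifying assumptions):

```sql
SELECT k.KUNNR,
       k.NAME1,
       h.UDATE,            -- change date from the change-document header
       h.UTIME             -- change time
FROM   CDHDR h
JOIN   CDPOS p
  ON   p.OBJECTCLAS = h.OBJECTCLAS
 AND   p.OBJECTID   = h.OBJECTID
 AND   p.CHANGENR   = h.CHANGENR
JOIN   KNA1 k
  ON   k.KUNNR = p.OBJECTID          -- assumed: OBJECTID carries the customer number
WHERE  h.OBJECTCLAS = 'DEBI'         -- customer master change documents
  AND  h.UDATE     >= '$G_LAST_RUN'  -- recently modified records only
  AND  p.TABNAME    = 'KNA1'         -- log entries for the KNA1 table
  AND  k.LOEVM      = ' ';           -- current (non-deleted) customers only
```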
Figure 16: ABAP data flow - KNA1 extract
Figure 17: Query Transformation: Joining KNA1 with CDHDR and CDPOS
Figure 18: Query Transformation - Extract newly created or modified records from KNA1
3/. SAP extractor
SAP extractors were built to feed BW Business Content. They contain all the logic for common business transformations, possible aggregations, and also for detecting changes, which makes them well-suited for implementing incremental loads.
The DS Extractor feature is based on the Operational Data Provisioning (ODP) API. DS supports all ODP source types supported by the ODP API, including CDC functionality.
In a normal data flow, DS uses RFC to call the ODP Data Replication API. Where-conditions are not passed to the extractor, though. This means that all data is pulled from the extractor and any additional filtering is done in DS. Import the ODP object definition into the datastore through the datastore browser.
Figure 19: Import extractor 0CUSTOMER_ATTR
Make sure the extraction mode is set to Query when importing the ODP object. Then use it as a source in a data flow. An extraction from the ODP object 0CUSTOMER_ATTR looks like this:
Figure 20: Data flow - extract from 0CUSTOMER_ATTR
If you only want to extract a small subset of the data, use the ODP object as a source in an ABAP data flow.
Figure 21: ABAP data flow - extract from 0CUSTOMER_ATTR
Add the where clause to the query transformation.
Figure 22: Query transformation: extract current US customers from 0CUSTOMER_ATTR
DS generates the ABAP that calls the ODP data replication API. The generated code contains logic to filter out unnecessary records.
Figure 23: Generated ABAP code
Implementing CDC is quite easy for extractors that are "delta-enabled". Make sure the extraction mode is set to Changed-data capture (CDC) when importing the ODP object.
Figure 24: Import extractor 0PROJECT_ATTR
Then use it as a source in a data flow. There is no need to add a time-based condition in the where clause. The extractor logic ensures that only new and changed records are passed on to DS. Just make sure the Initial load property of the ODP object is set to No. Only set it to Yes if you want the target table to be re-initialized.
Figure 25: ODP object properties
Add a Map_CDC_Operation transformation to automatically synchronize the target table with the source object. The transformation translates the value of the row operation column into the corresponding DS row type:
- I > insert
- B & U > before- and after-image of an update
- D > delete
Figure 26: Data flow: delta extraction of 0PROJECT_ATTR
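Conceptually, the synchronized target then receives statements like these per row type (a sketch; table, key, and values are purely illustrative):

```sql
-- I > insert: a new record arrives from the extractor.
INSERT INTO PROJECT_DIM (PROJECT_ID, DESCRIPTION)
VALUES ('P-0001', 'New project');

-- B & U > before- and after-image of an update: applied as one UPDATE.
UPDATE PROJECT_DIM
SET    DESCRIPTION = 'Renamed project'
WHERE  PROJECT_ID  = 'P-0001';

-- D > delete: the record is removed from the target.
DELETE FROM PROJECT_DIM
WHERE  PROJECT_ID = 'P-0001';
```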
4/. ECC function
DS can also call RFC-enabled ECC functions that return tables as data flow sources. If a standard ECC function is not RFC-enabled, you need an RFC-enabled wrapper function that passes the parameters to the standard function, calls it, and forwards the results to DS.
You can only import a function's metadata by name. Call the function from a query transformation by selecting New Function Call... from the pop-up menu in its output schema. Select the function from the ECC datastore. Define the input parameters and select the output parameters. The function call is added to the output schema.
Figure 27: Query transformation: Calling an ECC function
Then, in a next query transformation, unnest the returned results before writing them to a target table.
Figure 28: Query transformation: unnesting the function output schema
5/. SAP LT (SLT) Replication Server
DS can use SLT Replication Server as a source. This is a very clean and elegant way to create CDC jobs in DS. Working with SLT objects is similar to how DS works with SAP extractors.
Define the SLT datastore like any other SAP datastore. Just make sure you select the correct ODP context in the definition.
Figure 29: SLT datastore definition
You can import the tables from which you need to extract data.
Figure 30: SLT table metadata
Use the ODP object as a source in a data flow. When the data flow runs for the first time, a full extraction of the underlying table is performed. All subsequent executions automatically run in incremental mode and forward only the delta.
Figure 31: Data flow: delta MARA extraction by SLT
6/. IDocs
DS real-time jobs can read IDoc messages and IDoc files. DS batch jobs can only read from IDoc files.
Import the IDoc metadata definition by name from the SAP datastore. Use the IDoc as a file source in a data flow of a batch job.
Figure 32: Data Flow - IDoc File Source Delta Extraction
Double-click the IDoc icon to open its definition and specify the name of the IDoc file. You can use wildcards (? and *) or list multiple file names separated by commas if you want to process multiple files in a single stream.
Figure 33: IDoc Properties
Generate the SAP IDoc files and run your DS job.