Hi Ann...
I will be writing a blog on exactly your questions :-)
First question from my side: are your source system, SLT and HANA in one location/data center?
As for tracing the issue, in our case we identified a process that causes the latency (this almost made me cry). Our example is a specific COPA transaction (KE27) where a bulk update happens, sometimes containing 3 million records, in a matter of seconds. The process we used to identify it was:
- Check the statistics on SLT and drill down to the timeframe of the mass updates.
- Check the Load graph (Performance tab) in HANA and look for spikes on the different components, such as mass writes/reads/CPU. This showed that Delta Merges were also running on certain large tables.
- Check the HANA tables using timestamps and identify what was updated.
- Use the table contents to identify the source of the data flow (record types, transaction codes, etc.).
- Check with the source system/functional teams what was executed during the time of latency.
The above helped us identify the root process, which we then investigated further. We now actively monitor during peak times when certain processes are executed, and we make use of the email notifications with latency thresholds. We also use the 'safety buffer' in the COPA accelerator and will implement additional notes where latency 'sniffing' is performed within the source system process, with automatic 'wait' times and additional email notifications.
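To make the "drill down to the timeframe" and "latency threshold" ideas a bit more concrete, here is a minimal sketch in Python. It assumes you have exported the latency statistics as (timestamp, latency in seconds) pairs. The function name, the 30-second threshold, the 5-minute gap and the sample values are all made up for illustration; this is not how SLT or HANA implement their own notifications.

```python
from datetime import datetime, timedelta


def latency_windows(samples, threshold_s, max_gap=timedelta(minutes=5)):
    """Group threshold breaches into contiguous time windows.

    samples: iterable of (datetime, latency_seconds), sorted by time.
    Returns a list of (start, end, peak_latency) tuples.
    """
    windows = []
    current = None  # [start, end, peak] of the window being built
    for ts, latency in samples:
        if latency <= threshold_s:
            continue  # below the threshold, not part of any window
        if current is not None and ts - current[1] <= max_gap:
            # Close enough to the previous breach: extend the window.
            current[1] = ts
            current[2] = max(current[2], latency)
        else:
            # Start a new window, storing the previous one if any.
            if current is not None:
                windows.append(tuple(current))
            current = [ts, ts, latency]
    if current is not None:
        windows.append(tuple(current))
    return windows


# Hypothetical one-minute samples; values are invented for the example.
base = datetime(2024, 1, 1, 9, 0)
latencies = [2, 3, 45, 120, 90, 4, 2, 2, 2, 2, 2, 2, 70, 3]
samples = [(base + timedelta(minutes=i), s) for i, s in enumerate(latencies)]

for start, end, peak in latency_windows(samples, threshold_s=30):
    print(f"{start:%H:%M} to {end:%H:%M}, peak {peak}s")
```

Each window it returns is a timeframe you can then take to the HANA Load graph, the table timestamps and the functional teams, as in the steps above.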
However, I am still trying to formulate a best practice around latency monitoring...happy to share once I am done.
Thanks
Kris