Design Considerations for Audit Trail Implementation ~ VedantaTree ~ The Tree of Knowledge

Auditing the operations of an application is a very common requirement for security and compliance purposes. Auditing can be done at various levels and with varying levels of detail. Overall, the prime requirements for an audit trail are:

Collect enough information to determine:

Who made the changes,
What the changes are,
When the changes were made, along with locale, etc.,
What user scenario was in action, i.e. the operation context

Capture all user actions, even those that never reach the DB or that interact with external services directly (rare, but it happens in some cases)
Capture successful and unsuccessful logons and other security related operations
Manage the collected audit data, which could be very large
Support search operations on the collected data
Display the collected data to users in a presentable format as and when demanded
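As a rough illustration of the information listed above, a single audit entry could be modelled as a small immutable class. This is only a sketch: the class name AuditEvent, its fields and the Outcome enum are assumptions made for this example and do not come from any framework.

```java
import java.time.Instant;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** One audit trail entry (hypothetical model): who, what, when and in which context. */
final class AuditEvent {
    enum Outcome { SUCCESS, FAILURE }

    final String userId;                // who made the change
    final String useCase;               // operation context, i.e. the user scenario
    final Instant timestamp;            // when it happened
    final Locale locale;                // locale of the request
    final Outcome outcome;              // also covers failed logons and similar events
    final Map<String, String> changes;  // what changed, e.g. field -> "old -> new"

    AuditEvent(String userId, String useCase, Instant timestamp, Locale locale,
               Outcome outcome, Map<String, String> changes) {
        this.userId = userId;
        this.useCase = useCase;
        this.timestamp = timestamp;
        this.locale = locale;
        this.outcome = outcome;
        // Defensive copy so that a stored entry cannot be altered afterwards.
        this.changes = Collections.unmodifiableMap(new LinkedHashMap<>(changes));
    }

    @Override
    public String toString() {
        return timestamp + " " + userId + " [" + useCase + "] " + outcome + " " + changes;
    }
}
```

Keeping the entry immutable is a design choice for this sketch; an audit record that can be modified after creation would be of limited value as evidence.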

Considering the above requirements, there are various design approaches to implementing an audit trail in a system. A few of these are given below (assuming a JEE application, although many of them are generic):

Implement DB triggers to capture any change in data state and log it in audit tables
Log audit data in business services, which means collecting the required data in each business service operation and logging it to some data store through an abstract Audit service
Use a generic logging framework like Log4J to write the auditing information in the desired format to the desired data store, and later parse these logs to extract the required data
Use an interceptor based approach to intercept all operations, collect the data and log it through the Audit service. The interceptors can be DAO layer interceptors, like Hibernate Interceptors, or service level interceptors using AOP.

Each of these approaches has its own pros and cons. Obviously, no single approach fits every case. The application state, project requirements and scope determine the right implementation. So let's discuss the pros and cons of each approach to understand them better.

Database triggers

Database triggers do this job well because no data change can bypass them. Once they are set carefully on all the relevant tables, every change is bound to pass through them and hence will be captured in the audit tables.

Pros:

Whether it is a simple user operation or a change made directly by a DB admin, all kinds of data changes are captured by triggers.
A comparatively simple approach, in the sense that no code change is required; it can be implemented directly at the database level without disturbing the application code.

Cons:

There is no application use-case context at the database level, so it can be difficult to reconstruct a complete scenario from the auditing data, especially when the data is hierarchical and composite
Scenarios that never reach the database are missed, such as an operation that failed input validation or an operation that goes directly to an external service
Performance hit due to very fine grained auditing, which could otherwise be coarse grained at the business layer

Business Service Auditing

Another approach is to identify the audit data in the business service and log it using some abstract Audit Service. This is a comparatively coarse grained implementation in the code itself.

Pros:

The benefit of logging from the business service is that the code has the complete business context to log with the audit trail data
Logging is coarse grained and based on the user operation. Depending on the operation context, the business service can decide what data needs to be logged, along with any suitable attributes from the client request or database
Being coarse grained in nature, it is better for performance
It also captures those scenarios which may not hit the DB

So practically all the cons specific to the trigger based approach are handled here. However, this approach is not without weak points.

Cons:

It needs a lot of code in all the business services and operations. Any new implementation has to follow the same pattern to implement the auditing logic
A change in auditing requirements means a change in all the services. Hence maintenance and improvement can be time consuming.
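To make the trade-off concrete, here is a minimal sketch of a business service that audits by hand through an abstract audit service. All names (AuditService, InMemoryAuditService, AccountService, withdraw) are hypothetical and chosen only for this illustration; the in-memory list stands in for a real data store.

```java
import java.util.ArrayList;
import java.util.List;

/** Abstract audit sink; the backing store is an implementation detail. */
interface AuditService {
    void log(String userId, String useCase, String details);
}

/** In-memory implementation, for illustration only. */
final class InMemoryAuditService implements AuditService {
    final List<String> entries = new ArrayList<>();

    @Override
    public void log(String userId, String useCase, String details) {
        entries.add(userId + " | " + useCase + " | " + details);
    }
}

/** A business service that writes its own audit entries in every operation. */
final class AccountService {
    private final AuditService audit;
    private long balance;

    AccountService(AuditService audit, long openingBalance) {
        this.audit = audit;
        this.balance = openingBalance;
    }

    void withdraw(String userId, long amount) {
        if (amount <= 0 || amount > balance) {
            // A validation failure never reaches the DB, yet it is still audited.
            audit.log(userId, "Withdraw", "REJECTED amount=" + amount);
            throw new IllegalArgumentException("invalid amount: " + amount);
        }
        long before = balance;
        balance -= amount;
        // The service knows the business context, so the entry can be coarse grained.
        audit.log(userId, "Withdraw", "balance " + before + " -> " + balance);
    }
}
```

The audit calls sit directly inside withdraw, and every other operation in every other service would need similar lines, which is exactly the duplication described in the cons above.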

With the business service level approach, most of the cons of the trigger based approach have been handled. However, one major setback is the lack of abstraction, which results in repeated implementation in every business service and can be hard to maintain. Let's move to the next alternative.

Hibernate Interceptor

To solve the weak points of the service method based implementation, another approach is to use Hibernate interceptors. Of course, this approach is feasible only if the application uses Hibernate or a similar ORM. Following this approach, simple Hibernate interceptors can be set for all database operations. All DB calls made through the ORM then pass through them, which gives a centralized place to log all required audit data.

Pros:

Centralized place to put the audit logic, and hence no duplicate implementation
Business operation context, i.e. use case information, can also be passed as part of the data object itself

Cons:

It cannot capture scenarios where no DB interaction has happened.
Use case, i.e. business operation context, information has to be passed explicitly using data objects, which overloads the data objects
In case of any DB error, the whole operation can be rolled back along with its audit data if both are written in the same transaction, so the failed attempt leaves no audit record.
Applicable only if the application uses Hibernate or a similar technology
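In Hibernate, the Interceptor callback onFlushDirty receives the entity's property names together with its previous and current state arrays. The sketch below shows, in plain Java and without any Hibernate dependency, the kind of comparison such a callback could perform in one central place. The class DirtyStateAudit and the method describeChanges are hypothetical names for this example, and the output format is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical helper: turns before/after state arrays into audit lines. */
final class DirtyStateAudit {

    /** Returns one line per property whose value differs between the two states. */
    static List<String> describeChanges(String entityName, String[] propertyNames,
                                        Object[] previousState, Object[] currentState) {
        List<String> changes = new ArrayList<>();
        for (int i = 0; i < propertyNames.length; i++) {
            // Objects.equals handles null values on either side.
            if (!Objects.equals(previousState[i], currentState[i])) {
                changes.add(entityName + "." + propertyNames[i] + ": "
                        + previousState[i] + " -> " + currentState[i]);
            }
        }
        return changes;
    }
}
```

For a Customer whose email changed from a@example.com to b@example.com while the name stayed the same, describeChanges would return a single line for the email property. Note that nothing here knows which use case triggered the change, which is the context problem listed in the cons.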

Hence Hibernate interceptors solve many of the problem points; however, we still have a few more points to address.

AOP Interceptors at Business Service Layer

Here comes the next solution in the form of an application interceptor using AOP. AOP is a quite commonly used technique nowadays, and even if the application is not using it, it is not difficult to plug in. A generic AuditingInterceptor can be designed and implemented to intercept all operation calls coming to the services and to log the data using some generic 'Auditing Service'. Business context, i.e. the use case information, can be provided by the service method using annotations. Partial information that is bound to come from the user interface or client facing layer can be passed from that layer along with the other data. Using annotations, the service can also define the granularity of the data to be stored with auditing, and so on. This seems to address most of the weak points identified in the previous approaches.

Pros:

A generic implementation, no duplicate code
A clean implementation, flexible enough to pick up the changing context from the service methods themselves using annotations and from the data passed from the client layer
Easy to maintain and enhance
Completely context aware
Can capture audit data irrespective of whether the operation reaches the DB or not, and is not impacted by DB errors in the service operation

Cons:

A performance penalty on every call; in practice it is usually negligible, and other approaches incur a similar cost.
The client layer may also need design elements to pass some context information, if required (not an absolute con but a requirement)
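The idea of an annotation driven AuditingInterceptor can be sketched with nothing more than the JDK. The example below uses a java.lang.reflect.Proxy as a stand-in for a real AOP framework, which is a substitution made to keep the sketch self-contained. The @Audited annotation, OrderService, SimpleOrderService and the log format are all hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.List;

/** Hypothetical marker: a service method declares its use case here. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Audited {
    String useCase();
}

interface OrderService {
    @Audited(useCase = "PlaceOrder")
    String placeOrder(String item, int quantity);
}

final class SimpleOrderService implements OrderService {
    @Override
    public String placeOrder(String item, int quantity) {
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive");
        }
        return "order:" + item + "x" + quantity;
    }
}

/** Intercepts every call on the service interface and audits annotated methods. */
final class AuditingInterceptor implements InvocationHandler {
    private final Object target;
    private final List<String> auditLog; // stands in for a generic auditing service

    AuditingInterceptor(Object target, List<String> auditLog) {
        this.target = target;
        this.auditLog = auditLog;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        Audited audited = method.getAnnotation(Audited.class);
        try {
            Object result = method.invoke(target, args);
            if (audited != null) {
                auditLog.add(audited.useCase() + " SUCCESS");
            }
            return result;
        } catch (InvocationTargetException e) {
            // Failures are audited too, whether or not the call ever reached the DB.
            if (audited != null) {
                auditLog.add(audited.useCase() + " FAILURE: " + e.getCause().getMessage());
            }
            throw e.getCause();
        }
    }

    /** Wraps a service so that all calls through the interface are intercepted. */
    static <T> T wrap(T target, Class<T> serviceInterface, List<String> auditLog) {
        Object proxy = Proxy.newProxyInstance(serviceInterface.getClassLoader(),
                new Class<?>[] {serviceInterface}, new AuditingInterceptor(target, auditLog));
        return serviceInterface.cast(proxy);
    }
}
```

A caller would obtain the service with AuditingInterceptor.wrap(new SimpleOrderService(), OrderService.class, log) and use it as usual. The service code itself contains no audit calls; it only declares its use case through the annotation.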

With the last approach, we have reached a point where most of the concerns are addressed. However, as said previously, this does not imply that everyone should use this approach. Even the trigger based approach can be the best candidate if the application scope is limited, or if the application behavior can be mostly covered by triggers. In legacy applications, for example, it can work well without requiring many code changes.

Having discussed the various approaches to implementing an audit trail, let's briefly discuss a few points on managing the auditing data from the performance and management perspectives.

One solution to improve performance could be to design an asynchronous, queue based 'Audit Data Service'. The auditing interceptor keeps pushing data to this queue, and one or more other threads pick the data from the queue and persist it to the desired storage channel. This offloads some of the performance overhead and works well if the application can tolerate the very small chance of a missed audit entry due to an abrupt system failure. As a backup, the general application logs on the file system can be consulted to find any missed entry.
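A minimal sketch of such a queue based service, using only java.util.concurrent, might look as follows. The class name AsyncAuditDataService, its methods and the in-memory list that stands in for the real storage channel are assumptions made for this example.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical asynchronous audit service: callers enqueue, a writer thread persists. */
final class AsyncAuditDataService {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> store = new CopyOnWriteArrayList<>(); // stands in for real storage
    private final Thread writer = new Thread(this::drain, "audit-writer");
    private volatile boolean running = true;

    /** Creates the service and starts its background writer thread. */
    static AsyncAuditDataService start() {
        AsyncAuditDataService service = new AsyncAuditDataService();
        // Daemon thread: entries still queued at an abrupt exit are lost.
        service.writer.setDaemon(true);
        service.writer.start();
        return service;
    }

    /** Called by the interceptor; returns without waiting for storage. */
    void record(String auditEntry) {
        queue.add(auditEntry);
    }

    private void drain() {
        try {
            // Keep going after shutdown is requested until the queue is empty.
            while (running || !queue.isEmpty()) {
                String entry = queue.poll(50, TimeUnit.MILLISECONDS);
                if (entry != null) {
                    store.add(entry); // real code would write to a DB or file here
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Persists everything queued so far, then stops the writer thread. */
    void shutdown() {
        running = false;
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    List<String> stored() {
        return store;
    }
}
```

The unbounded queue keeps callers from ever blocking, at the cost of memory growth if storage falls behind; a bounded queue would trade that for back pressure on the application. Either way, entries that are still in the queue when the process dies are lost, which is the small risk mentioned above.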

Another major design challenge is how to store the data, i.e. in which format. One commonly used approach is to store all the data in an RDBMS. However, at times, it can be slow to store large amounts of auditing data this way. Rationalizing the full payload into a relational structure during the auditing process can be costly from a performance point of view, and it may not be necessary to dump the whole data into the RDBMS as it is. Hence the next optimization could be to simply dump the auditing data as it is to some fast storage medium, and optimize it for best use later.

One approach could be for the audit data processor to pick the data from the queue, convert it to an XML representation (using any XML serialization API) and dump the whole load of data into a NoSQL database. The NoSQL database will then hold all auditing data for a user, operation, date and time, module, etc. To show this data to the end user, a decision can be made either to read it directly from the NoSQL database or to schedule offline jobs that parse the relevant data from the NoSQL store and arrange it in an RDBMS. This gives an efficient audit information system for retrieving and searching the data.
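The XML conversion step could be sketched with the StAX writer that ships with the JDK (javax.xml.stream). The class AuditXml, the method toXml and the flat audit element layout are assumptions for this example; the sketch also assumes that every field name is a valid XML element name, and it leaves out the call that writes the resulting document to the NoSQL store.

```java
import java.io.StringWriter;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical serializer: one flat XML document per audit entry. */
final class AuditXml {

    /** Writes each field as a child element of <audit>; the writer escapes the values. */
    static String toXml(Map<String, String> fields) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("audit");
            for (Map.Entry<String, String> field : fields.entrySet()) {
                xml.writeStartElement(field.getKey()); // assumed to be a valid element name
                xml.writeCharacters(field.getValue());
                xml.writeEndElement();
            }
            xml.writeEndElement();
            xml.flush();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not serialize audit entry", e);
        }
        return out.toString();
    }
}
```

The returned string can be stored as it is under a key such as user, operation or date, and parsed later by an offline job if a relational view is needed.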

Any one of the above approaches, or a mix of them, can be used to implement an audit trail in an application. What is your approach? Please share it.

2 comments:

Anonymous said...: Good article Mohit. In fact we are facing some performance issue with audit trail and thinking of dumping all audit trail data into NoSql database for fast access.; April 15, 2015 at 1:02 AM
ashu said...: I was also thinking about a solution dumping the audit trail data in a NoSQL DB instead of a relational DB.Has anybody done something like this or have some pointers in this direction.; June 8, 2016 at 7:49 PM
