Data Auditing using Javers

August 28, 2019

Rezwan Nabi

Software Engineer

Almost every application deals with data, and users modify that data as they use the application: records are created, updated, or deleted. Eventually we need to audit those changes: who changed a record, when it was changed, and sometimes what the previous value was. These requirements call for a version-control system for data, similar to the one we use for our source code.

The magic of a change log is that, if it is complete, it holds not only the final version of a table but also everything needed to recreate every earlier version. It is effectively a backup of every previous state of the table. The cost is space and query complexity: audit data keeps growing and will eventually outgrow the live application data. So we should look for options that separate live data from audit data for scalability and that support efficient indexing.
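
To make the idea of a complete change log concrete, here is a minimal sketch in plain Java. It is only an illustration of the principle, not how Javers stores its data: the class names ChangeEntry and ChangeLog are hypothetical, and the log records one property change per entry for a single record.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** One audited change: who set which property to which value. */
final class ChangeEntry {
    final String author;
    final String property;
    final Object newValue;

    ChangeEntry(String author, String property, Object newValue) {
        this.author = author;
        this.property = property;
        this.newValue = newValue;
    }
}

/** Append-only log of changes for a single record (hypothetical helper). */
final class ChangeLog {
    private final List<ChangeEntry> entries = new ArrayList<>();

    /** Appends a change and returns the version number it produced (1-based). */
    int append(String author, String property, Object newValue) {
        entries.add(new ChangeEntry(author, property, newValue));
        return entries.size();
    }

    /** Rebuilds the record as it looked after the first {@code version} changes. */
    Map<String, Object> stateAt(int version) {
        if (version < 0 || version > entries.size()) {
            throw new IllegalArgumentException("No such version: " + version);
        }
        Map<String, Object> state = new LinkedHashMap<>();
        for (ChangeEntry entry : entries.subList(0, version)) {
            state.put(entry.property, entry.newValue);
        }
        return state;
    }

    int latestVersion() {
        return entries.size();
    }
}

class ChangeLogDemo {
    public static void main(String[] args) {
        ChangeLog log = new ChangeLog();
        log.append("alice", "name", "Lamp");
        log.append("alice", "price", 20);
        log.append("bob", "price", 25);

        System.out.println(log.stateAt(2));                   // {name=Lamp, price=20}
        System.out.println(log.stateAt(log.latestVersion())); // {name=Lamp, price=25}
    }
}
```

Because nothing is ever overwritten, any earlier state can be rebuilt by replaying a prefix of the log. The same property is what makes the log grow without bound, which is why keeping it apart from the live data matters.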

Javers addresses all of these requirements. It is a versatile open-source Java framework for data auditing. It lets us store audit data in a separate database, to the extent that audit data for a relational database such as MySQL can be stored in a non-relational database such as MongoDB. We can easily integrate the framework into our application to maintain and browse the history of changes in our data.

Features:

Object diff: Javers is built on top of an object diff engine. The engine can be used as a standalone tool to compare two object graphs and get the difference between them as a list of atomic changes.
Javers Repository: The Javers repository is the central part of the Javers data auditing engine. It tracks every change made to audited data, so we can easily identify what changed, when it changed, and who changed it. It provides three views on object history: changes, shadows, and snapshots. It also provides a powerful query language, JQL (Javers Query Language), to browse the detailed history of a given class, object, or property.
At the moment, Javers provides a MongoDB implementation and a SQL implementation for the following dialects: H2, PostgreSQL, MySQL/MariaDB, Oracle, and Microsoft SQL Server.
JSON serialization: Javers persists each state of an object as a snapshot in JSON format. It has a well-designed and customizable JSON serialization and deserialization module, based on Gson and Java reflection. The mapping of domain objects to the persistent JSON format is done by javers-core, and this common JSON format is shared by the Javers repository implementations.
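
To illustrate what "a list of atomic changes" means, here is a minimal, stdlib-only sketch of a property-level diff. It is not Javers's diff engine: it only handles flat objects of the same class, and the names FieldChange, FlatDiff, and SampleProduct are hypothetical. In Javers itself the equivalent entry point is javers.compare(left, right), which also handles nested object graphs and collections.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** One atomic change: a single property whose value differs. */
final class FieldChange {
    final String property;
    final Object left;
    final Object right;

    FieldChange(String property, Object left, Object right) {
        this.property = property;
        this.left = left;
        this.right = right;
    }

    @Override
    public String toString() {
        return property + ": " + left + " -> " + right;
    }
}

/** Compares the declared fields of two flat objects of the same class. */
final class FlatDiff {
    private FlatDiff() {}

    static List<FieldChange> compare(Object left, Object right) {
        if (left.getClass() != right.getClass()) {
            throw new IllegalArgumentException("Objects must be of the same class");
        }
        List<FieldChange> changes = new ArrayList<>();
        for (Field field : left.getClass().getDeclaredFields()) {
            // Skip static and compiler-generated fields; they are not object state.
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            field.setAccessible(true);
            try {
                Object l = field.get(left);
                Object r = field.get(right);
                if (!Objects.equals(l, r)) {
                    changes.add(new FieldChange(field.getName(), l, r));
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return changes;
    }
}

/** Hypothetical sample type, used only for this illustration. */
final class SampleProduct {
    final String name;
    final int price;

    SampleProduct(String name, int price) {
        this.name = name;
        this.price = price;
    }
}

class FlatDiffDemo {
    public static void main(String[] args) {
        List<FieldChange> changes = FlatDiff.compare(
                new SampleProduct("Lamp", 20), new SampleProduct("Lamp", 25));
        System.out.println(changes); // prints [price: 20 -> 25]
    }
}
```

An empty result means the two objects are identical, which is the same check the audit service later uses to identify the current version.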

Let’s see Javers in action. We will use a Spring Boot REST API project.

To keep things brief, we will have a single resource, product, for which we will create the following REST APIs, secured by basic authentication.

CRUD APIs:

POST API to create a new product
GET API to get the list of products
PUT API to update a product

Audit APIs:

GET API to get the list of versions of a product
GET API to get the list of changes/diff between two versions of a product
PUT API to roll the current version backward or forward

We will use an in-memory H2 database to store our data, and Javers will automatically create the tables required for its repository:

jv_global_id — domain object identifiers,
jv_commit — Javers commits metadata,
jv_commit_property — commit properties,
jv_snapshot — domain object snapshots.

You can find the complete project at https://github.com/Rizwanmisger/data-version-control

To get started, we will add a Javers dependency to pom.xml

    <dependency>
        <groupId>org.javers</groupId>
        <artifactId>javers-spring-boot-starter-sql</artifactId>
        <version>${javers.version}</version>
    </dependency>

Next, we will create a generic service class, ‘AuditService’, which will handle all the operations related to auditing our entities. It performs the following operations:

Commit: This operation saves a snapshot of an entity along with the author (the user making the change). Each new commit adds a new version of the entity, which Javers can later return as a shadow.
Get a version of an entity: This operation uses JQL to query for the version of an entity identified by its type, id, and version number. We can then compare the current version with the retrieved one, or set the retrieved one as the current version, which amounts to rolling backward or forward.
Get all versions of an entity: This operation uses JQL to query for all versions of an entity. Javers keeps versions in chronological order, but because we allow the current version to be rolled backward or forward, we can’t rely on that order to identify the current version. Instead, we use Javers’s diff tool to compare the entity’s current state in the live data with each available version; a version with no differences is the current one.

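    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.stream.Collectors;

    import org.javers.core.Javers;
    import org.javers.core.diff.changetype.ValueChange;
    import org.javers.repository.jql.QueryBuilder;
    import org.javers.shadow.Shadow;
    import org.springframework.stereotype.Service;

    // VersionDTO and VersionsDiffDTO are DTO classes from the linked project.
    @Service
    public class AuditService {

        private final Javers javers;

        // Constructor injection; the final field needs no @Autowired of its own.
        public AuditService(Javers javers) {
            this.javers = javers;
        }

        // Saves a snapshot of the entity together with the author of the change.
        public <T> void commit(String author, T currentVersion) {
            javers.commit(author, currentVersion);
        }

        // Lists all versions and flags those that match the live entity.
        public <T> List<VersionDTO<T>> getVersions(T currentVersion, Object id) {
            List<Shadow<T>> shadows = getShadows(currentVersion.getClass(), id);
            AtomicInteger index = new AtomicInteger();
            return shadows.stream().map(shadow -> {
                VersionDTO<T> version = new VersionDTO<>();
                version.setEntity(shadow.get());
                version.setVersion(index.getAndIncrement());
                version.setAuthor(shadow.getCommitMetadata().getAuthor());
                version.setCreatedAt(shadow.getCommitMetadata().getCommitDate());
                // No differences from the live entity means this is the current version.
                if (!javers.compare(currentVersion, shadow.get()).hasChanges()) {
                    version.setCurrentVersion(true);
                }
                return version;
            }).collect(Collectors.toList());
        }

        // Returns the property-level differences between two versions.
        public <T> List<VersionsDiffDTO> compare(Class<T> entity, Object id, int left, int right) {
            List<Shadow<T>> shadows = getShadows(entity, id);
            T v1 = shadows.get(left).get();
            T v2 = shadows.get(right).get();
            // Only ValueChange carries left/right values, so other change types
            // are filtered out instead of being cast (which would fail at runtime).
            List<ValueChange> changes = javers.compare(v1, v2).getChangesByType(ValueChange.class);
            return changes.stream().map(change -> {
                VersionsDiffDTO diff = new VersionsDiffDTO();
                diff.setPropertyName(change.getPropertyName());
                diff.setPropertyNameWithPath(change.getPropertyNameWithPath());
                diff.setLeft(change.getLeft());
                diff.setRight(change.getRight());
                return diff;
            }).collect(Collectors.toList());
        }

        // Returns the entity as it was at the given version number (list index).
        public <T> T getVersion(Class<T> entity, Object id, int version) {
            List<Shadow<T>> shadows = getShadows(entity, id);
            return shadows.get(version).get();
        }

        private <T> List<Shadow<T>> getShadows(Class<?> entity, Object id) {
            QueryBuilder jqlQuery = QueryBuilder.byInstanceId(id, entity);
            // Copy before reversing so we never mutate the list returned by Javers.
            List<Shadow<T>> shadows = new ArrayList<>(javers.<T>findShadows(jqlQuery.build()));
            Collections.reverse(shadows);
            return shadows;
        }
    }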
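
Note that getVersion indexes straight into the list of shadows, so a version number outside the list would raise an IndexOutOfBoundsException. The following is a minimal sketch of a bounds-checked lookup, assuming we would rather receive an empty Optional in that case. VersionList is a hypothetical helper and is not part of the linked project or of Javers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper for looking up a version by its list index. */
final class VersionList {
    private VersionList() {}

    /** Returns the version at the given index, or empty if it is out of range. */
    static <T> Optional<T> find(List<T> versions, int version) {
        if (version < 0 || version >= versions.size()) {
            return Optional.empty();
        }
        return Optional.ofNullable(versions.get(version));
    }
}

class VersionListDemo {
    public static void main(String[] args) {
        List<String> versions = Arrays.asList("v0", "v1", "v2");
        System.out.println(VersionList.find(versions, 1).isPresent()); // true
        System.out.println(VersionList.find(versions, 7).isPresent()); // false
    }
}
```

A caller could then translate an empty result into a 404 response instead of letting the exception surface as a server error.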

Finally, we will add a controller to expose the data and audit logs using REST APIs. This controller will perform the following functions:

Whenever a user calls the POST API to create a new product, we call the audit service’s commit method, which saves a snapshot as the first version of the product.
Whenever a user changes a product through the PUT API, we update the product and call the audit service’s commit method to track the change and save a snapshot as a new version of the product.
Users can view audit data through the audit APIs for products. These use the audit service’s functions to list versions, compare versions, and switch the current version.

    import java.security.Principal;
    import java.time.LocalDateTime;
    import java.time.ZoneOffset;
    import java.util.List;
    import java.util.UUID;

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.http.HttpStatus;
    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.*;

    @RestController
    @RequestMapping("/api/v1/products")
    public class ProductsController {

        @Autowired
        ProductsRepository productsRepository;

        @Autowired
        AuditService auditService;

        // Creates a product and commits its first version.
        @PostMapping("")
        ResponseEntity<Product> create(@RequestBody Product product, Principal principal) {
            product.setId(UUID.randomUUID());
            product.setCreatedAt(LocalDateTime.now(ZoneOffset.UTC).withNano(0));
            product.setUpdatedAt(LocalDateTime.now(ZoneOffset.UTC).withNano(0));
            productsRepository.save(product);
            auditService.commit(principal.getName(), product);
            return new ResponseEntity<>(product, HttpStatus.CREATED);
        }

        @GetMapping("")
        List<Product> list() {
            return (List<Product>) productsRepository.findAll();
        }

        // Updates a product and commits the change as a new version.
        @PutMapping("/{id}")
        ResponseEntity<Product> update(@PathVariable UUID id, @RequestBody Product product, Principal principal) {
            return productsRepository.findById(id).map(p -> {
                p.setName(product.getName());
                p.setDescription(product.getDescription());
                p.setUpdatedAt(LocalDateTime.now(ZoneOffset.UTC).withNano(0));
                productsRepository.save(p);
                auditService.commit(principal.getName(), p);
                return new ResponseEntity<>(p, HttpStatus.OK);
            }).orElse(new ResponseEntity<>(HttpStatus.NOT_FOUND));
        }

        // Lists all versions of a product.
        @GetMapping("/{id}/versions")
        ResponseEntity<List<VersionDTO<Product>>> getVersions(@PathVariable UUID id) {
            return productsRepository.findById(id).map(p -> {
                List<VersionDTO<Product>> list = auditService.getVersions(p, id);
                return new ResponseEntity<>(list, HttpStatus.OK);
            }).orElse(new ResponseEntity<>(HttpStatus.NOT_FOUND));
        }

        // Returns the differences between two versions of a product.
        @GetMapping("/{id}/versions/diff")
        ResponseEntity<List<VersionsDiffDTO>> getDiff(@PathVariable UUID id, @RequestParam int left, @RequestParam int right) {
            return productsRepository.findById(id).map(p -> {
                List<VersionsDiffDTO> diff = auditService.compare(Product.class, id, left, right);
                return new ResponseEntity<>(diff, HttpStatus.OK);
            }).orElse(new ResponseEntity<>(HttpStatus.NOT_FOUND));
        }

        // Rolls the product backward or forward to the version given in the "version" header.
        // The restored state is saved without a new commit, so the version list stays unchanged.
        @PutMapping("/{id}/versions")
        ResponseEntity<Product> changeVersion(@PathVariable UUID id, @RequestHeader Integer version) {
            return productsRepository.findById(id).map(cs -> {
                Product c = auditService.getVersion(Product.class, id, version);
                productsRepository.save(c);
                return new ResponseEntity<>(c, HttpStatus.OK);
            }).orElse(new ResponseEntity<>(HttpStatus.NOT_FOUND));
        }
    }
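
As a usage example, the audit endpoints above could be called from Java 11's built-in HTTP client. The sketch below only builds the requests; the base URL (http://localhost:8080), the credentials, and the AuditApiRequests class are assumptions made for illustration. The paths, the left and right query parameters, and the version header come from the controller.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.UUID;

/** Builds requests for the audit endpoints (hypothetical client-side helper). */
final class AuditApiRequests {
    private final String baseUrl;
    private final String authHeader;

    AuditApiRequests(String baseUrl, String user, String password) {
        this.baseUrl = baseUrl;
        // Basic authentication: "Basic " + base64("user:password").
        String token = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        this.authHeader = "Basic " + token;
    }

    /** GET /api/v1/products/{id}/versions */
    HttpRequest listVersions(UUID productId) {
        return builder("/api/v1/products/" + productId + "/versions").GET().build();
    }

    /** GET /api/v1/products/{id}/versions/diff?left=..&right=.. */
    HttpRequest diff(UUID productId, int left, int right) {
        return builder("/api/v1/products/" + productId
                + "/versions/diff?left=" + left + "&right=" + right).GET().build();
    }

    /** PUT /api/v1/products/{id}/versions with the target version in a header. */
    HttpRequest switchVersion(UUID productId, int version) {
        return builder("/api/v1/products/" + productId + "/versions")
                .header("version", String.valueOf(version))
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    private HttpRequest.Builder builder(String path) {
        return HttpRequest.newBuilder(URI.create(baseUrl + path))
                .header("Authorization", authHeader);
    }
}

class AuditApiRequestsDemo {
    public static void main(String[] args) {
        // Assumed local server and credentials; adjust to your own setup.
        AuditApiRequests api = new AuditApiRequests("http://localhost:8080", "user", "password");
        UUID id = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        HttpRequest request = api.diff(id, 0, 2);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send a request, pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) while the application is running.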

Conclusion

Javers is lightweight and versatile. Since it uses JSON for object serialization, we don’t have to provide detailed ORM-like mapping; Javers only needs to know some high-level facts about our data model. Integration is straightforward: the Spring Boot starters create and auto-configure all required Javers beans with reasonable defaults.