Posts

Showing posts with the label Performance

Java is Very Fast, If You Don’t Create Many Objects

You still have to watch how many objects you create. This article looks at a benchmark passing events over TCP/IP at 4 billion events per minute using the net.openhft.chronicle.wire.channel package in Chronicle Wire, and at why we still avoid object allocations.

One of the key optimisations is creating almost no garbage. Allocation is a very cheap operation, and collection of very short-lived objects is also very cheap. Does this really make a difference? What difference does one small object per event (44 bytes) make to performance in a throughput test where GC pauses are amortised? While allocation is as efficient as possible, it doesn’t avoid the memory pressure on the L1/L2 caches of your CPUs, and when many cores are busy, they contend for memory in the shared L3 cache.

Results

Benchmark on a Ryzen 5950X with Ubuntu 22.10.

JVM Vendor, Version | No objects: Throughput, Average Latency* | One object per event: Throughput, Average Latency*
Azul Zulu 1.8.0_322 | 60.6 M event/s, 528

Benchmarking Kafka vs Chronicle for Microservices: which is 750 times faster?

Apache Kafka is a common choice for inter-service communication. Kafka facilitates the parallel processing of messages and is a good choice for log aggregation. Kafka claims to offer low latency and high throughput. However, is Kafka fast enough for many microservices applications in the cloud? When I wrote Chronicle Queue Open Source, my aim was to develop a messaging framework with microsecond latencies, and banks around the world have adopted it for use in their latency-sensitive trading systems. In this article, I will describe how Kafka does not scale in terms of throughput as easily as Chronicle Queue for microservices applications. As a teaser, this chart shows that Chronicle Queue is around 750 times faster, even for lower throughput.

Visualising delay as a distance

In order to illustrate the difference, let me start with an analogy. Light travels through optic fibre and copper at about two thirds the speed of light in a vacuum, so to appreciate very short de

Distributed Unique Time Stamp Identifiers

Recently I published an article on using timestamps as unique identifiers, generated in a fraction of a microsecond. This article covers an implementation that supports distributed identifier generation directly. This specific implementation supports up to one billion new 64-bit identifiers every second, repeating only after 520 years. The identifiers can also be printed as timestamps containing the wall clock time to make them easier to read.

Concurrent identifier generation in a distributed system

Each host has a predefined, unique host identifier, or hostId. This TimeProvider assumes up to 100 hosts producing different identifiers concurrently. JVMs using the same hostId must be on the same physical machine using the same memory-mapped file, or you can give each JVM a different hostId.

A nanosecond timestamp with a host identifier

DistributedUniqueTimeProvider stores a host identifier in the lower two digits of the timestamp, making it easier to read. The previous implementation used bit shifting
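
To make the "lower two digits" idea concrete, here is a minimal sketch of the arithmetic only. It is not the code of DistributedUniqueTimeProvider: the class name `HostDigitsSketch` and its methods are hypothetical, and it assumes a non-negative nanosecond timestamp and a hostId in the range 0 to 99.

```java
/** Illustrative sketch only; names are hypothetical, not Chronicle's API. */
final class HostDigitsSketch {
    // Two decimal digits are reserved for the host identifier.
    static final int MAX_HOSTS = 100;

    /** Replaces the lower two decimal digits of a nanosecond timestamp with hostId. */
    static long withHostId(long timestampNs, int hostId) {
        if (hostId < 0 || hostId >= MAX_HOSTS)
            throw new IllegalArgumentException("hostId must be in [0, 100): " + hostId);
        if (timestampNs < 0)
            throw new IllegalArgumentException("timestamp must be non-negative");
        // Round down to a multiple of 100 ns, then add the host digits.
        return timestampNs - timestampNs % MAX_HOSTS + hostId;
    }

    /** Reads the host identifier back from the lower two decimal digits. */
    static int hostIdOf(long id) {
        return (int) (id % MAX_HOSTS);
    }

    /** Recovers the timestamp, rounded down to 100 ns. */
    static long timestampOf(long id) {
        return id - id % MAX_HOSTS;
    }
}
```

With this layout, a timestamp of 1_700_000_000_123_456_789 ns combined with hostId 42 becomes 1_700_000_000_123_456_742, so the host is visible when the value is printed in decimal. Each host then has one slot per 100 ns, which is consistent with 100 hosts sharing one billion identifiers per second.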

System wide unique nanosecond timestamps

Unique identifiers can be very useful for tracing, and even more so when they contain a high-resolution timestamp. Not only do they record the time of an event but, if unique, they can also help trace events as they pass through the system. Such unique timestamps, however, can be expensive depending on how they are implemented. This post explores a lightweight means of producing a unique, monotonically increasing, system-wide, nanosecond-resolution timestamp, available in our open-source library.

Uses for Unique Identifiers

Associating a unique identifier with a piece of information lets that information be referred to later unambiguously. This could be an event, a request, an order id, or a customer id. Such identifiers can naturally be used as a primary key in a database or key/value store to retrieve that information later. One of the challenges in generating these identifiers is avoiding duplicates without an ever-increasing cost. You could record every identi
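
As a rough illustration of a unique, monotonically increasing nanosecond timestamp, the sketch below uses a compare-and-set loop on an `AtomicLong`. This is an assumption-laden simplification, not the library's implementation: the class `UniqueNanoClockSketch` is hypothetical, and it is unique only within a single JVM, whereas a system-wide version would need state shared between processes.

```java
import java.time.Instant;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative sketch only; unique within one JVM, not system-wide. */
final class UniqueNanoClockSketch {
    // The last timestamp handed out by this clock.
    private final AtomicLong last = new AtomicLong();

    /** Returns a wall-clock nanosecond timestamp strictly greater than any previously returned. */
    long nextNanos() {
        long now = wallClockNanos();
        while (true) {
            long prev = last.get();
            // If the clock has not advanced past the last value, step forward by 1 ns.
            long next = now > prev ? now : prev + 1;
            if (last.compareAndSet(prev, next))
                return next;
            // Another thread won the race; retry against the updated value.
        }
    }

    /** Nanoseconds since the epoch; actual resolution depends on the platform clock. */
    static long wallClockNanos() {
        Instant t = Instant.now();
        return t.getEpochSecond() * 1_000_000_000L + t.getNano();
    }
}
```

The cost per call stays constant because only the most recent timestamp is remembered, rather than every identifier issued so far.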