Host environment must have:
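A local Java runtime is required; the Kafka quickstart assumes Java 8 or later (newer Kafka releases may require Java 11+). You can verify with:

java -version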
Download the latest Kafka release and extract it:
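For example, for a recent release (adjust the file name to whichever version you downloaded):

tar -xzf kafka_2.13-3.7.0.tgz
cd kafka_2.13-3.7.0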
Note: You can download an RPM package as well.
Run the following commands to start all services in the correct order:
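Assuming a ZooKeeper-based quickstart (KRaft-mode setups differ), start ZooKeeper first from the Kafka directory:

bin/zookeeper-server-start.sh config/zookeeper.properties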
The server.properties file, found in Kafka's config directory, controls several aspects of the broker, including message size limits. For nodes that do not have public IPs, the following needs to be verified prior to launch:
listeners=PLAINTEXT://0.0.0.0:9092
Uncomment: listener.security.protocol.map
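In server.properties the two relevant lines should look roughly like this (the protocol map shown here is the shipped default; confirm it against your release):

listeners=PLAINTEXT://0.0.0.0:9092
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL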
Open another terminal session and run:
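From the Kafka directory, start the broker:

bin/kafka-server-start.sh config/server.properties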
Once all services have successfully launched, you will have a basic Kafka environment running and ready to use.
Kafka is a distributed event streaming platform that lets you read, write, store, and process events (also called records or messages in the documentation) across many machines.
Before you can write your first events, you must create a topic. Open another terminal session and run:
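For example, substituting your own topic name:

bin/kafka-topics.sh --create --topic <topic-name> --bootstrap-server localhost:9092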
This command creates a single topic hosted on the single broker you just started.
There are several ways to populate the new topic; the SE team prefers to create a static JSON file containing several thousand messages. For smaller-scale testing, it is sufficient to use jq to stream the file into a Kafka topic using the following CLI command:
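A minimal sketch, assuming your messages live in a file named records.json (hypothetical name) as a single JSON array; jq -c '.[]' emits one compact record per line, and the console producer treats each line as one message:

jq -c '.[]' records.json | bin/kafka-console-producer.sh --topic <topic-name> --bootstrap-server localhost:9092

If the file is already newline-delimited JSON rather than an array, use jq -c . instead.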
Note: In the event incorrect data is populated into the topic, it's easy to delete the topic, then recreate and repopulate it. To delete the topic, use: bin/kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic <topic-name>
To test the populated topic, it is best practice to use the standard Kafka console consumer (as opposed to the FeatureBase consumer) to check the records on the CLI. Open another terminal session and run the console consumer client to read the events you just created:
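bin/kafka-console-consumer.sh --topic <topic-name> --from-beginning --bootstrap-server localhost:9092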
You can stop the consumer client with Ctrl-C at any time.
If the records are correct, move on to a FeatureBase consumer; in this case, the static consumer. The base command is as follows:
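A minimal sketch, assuming the static consumer binary is named molecula-consumer-kafka-static and that these flag names match your FeatureBase release (verify both against your installation; the index and field names are placeholders):

molecula-consumer-kafka-static \
  --kafka-hosts localhost:9092 \
  --topics <topic-name> \
  --index <index-name> \
  --header path/to/header.config \
  --primary-key-fields <field-name> \
  --batch-size 1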
For more information from the consumer, you can add -v for verbose output.
You must either provide primary-key fields or use auto-generation to assign external keys:
--primary-key-fields
OR
--auto-generate \
--external-generate \
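For example, an auto-generated-key run would replace the --primary-key-fields flag in the sketch above with the two flags just listed (again, binary and flag names assumed as before):

molecula-consumer-kafka-static \
  --kafka-hosts localhost:9092 \
  --topics <topic-name> \
  --index <index-name> \
  --header path/to/header.config \
  --auto-generate \
  --external-generate \
  --batch-size 1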
Additionally, it's recommended to set --batch-size 1 for the first test so the consumer pulls a single message; the consumer will fail to start if there is a problem in the initial batch it grabs from Kafka, e.g. an incorrect character in a field, missing keys, etc.
Here is a small snippet of data along with the correct header.config to use in a test.
JSON Data:
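For illustration only (hypothetical records; your data will differ), two single-line JSON messages might look like:

{"id": 1, "color": "red", "size": 10}
{"id": 2, "color": "blue", "size": 20}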
Header.config:
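A hypothetical header.config matching the records above, assuming the static consumer's header format of name/path/type entries (check the FeatureBase documentation for the exact schema your release expects):

[
  {"name": "id", "path": ["id"], "type": "id"},
  {"name": "color", "path": ["color"], "type": "string"},
  {"name": "size", "path": ["size"], "type": "int"}
]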