Key responsibilities
- Design and implement real-time data processing solutions using Kafka and Spark Streaming (and, potentially, Flink); a minimal pipeline sketch follows this list
- Collaborate with cross-functional teams to integrate real-time data solutions into our products and services
- Optimize the performance and scalability of real-time data systems
- Maintain and troubleshoot Kafka clusters and Spark Streaming applications
- Stay up to date with the latest trends and developments in real-time data processing
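To give a concrete feel for the day-to-day work, here is a minimal sketch of the kind of pipeline the role covers: a Spark Structured Streaming job that consumes JSON events from Kafka. The broker address, topic name and event schema are illustrative placeholders, not details of our actual stack.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import DoubleType, StringType, StructField, StructType

spark = SparkSession.builder.appName("realtime-sketch").getOrCreate()

# Hypothetical event schema; a real job would match the producer's contract.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("value", DoubleType()),
])

# Consume a Kafka topic as a streaming DataFrame and parse the JSON payload.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")  # placeholder broker
    .option("subscribe", "events")                        # placeholder topic
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Print to the console for the sketch; production jobs would write to a
# proper sink (another topic, a database, Delta Lake, ...).
query = events.writeStream.format("console").outputMode("append").start()
query.awaitTermination()
```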
Requirements
- University degree in IT or a related field
- English level B2 or higher
- 1+ years of experience implementing Kafka clusters and real-time analytics
- 1+ years of experience with Spark Streaming
- 2+ years of experience with Python
- Strong data manipulation skills, especially SQL, Spark SQL and the DataFrame API (see the sketch after this list)
- Ability to design, develop and maintain real-time data processing solutions
- Excellent problem-solving skills and ability to make informed decisions in real time
- Good knowledge of Git
- Good communication skills
- Results-driven, self-challenging mindset
- Eager to learn
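As an illustration of the data manipulation skills listed above, here is a small sketch showing the same aggregation written once with the DataFrame API and once in Spark SQL; the sample data and column names are invented for the example.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import avg, count

spark = SparkSession.builder.appName("sql-sketch").getOrCreate()

# Invented sample data for illustration.
df = spark.createDataFrame(
    [("sensor-1", 21.5), ("sensor-1", 22.0), ("sensor-2", 19.8)],
    ["sensor_id", "temperature"],
)

# DataFrame API: average reading and count per sensor.
summary = df.groupBy("sensor_id").agg(
    avg("temperature").alias("avg_temp"),
    count("*").alias("n_readings"),
)

# The equivalent query in Spark SQL over a temporary view.
df.createOrReplaceTempView("readings")
summary_sql = spark.sql("""
    SELECT sensor_id, AVG(temperature) AS avg_temp, COUNT(*) AS n_readings
    FROM readings
    GROUP BY sensor_id
""")

summary.show()
summary_sql.show()
```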
Nice to have
- Experience with Apache Flink
- Experience with other data-related Apache projects
- Experience with Kubernetes clusters or Rancher
- Experience with Delta Lake
- Experience with Spark in general and the Hadoop ecosystem